Ciao,
Questa mattina alcuni utenti mi hanno avvisato che la nostra installazione di Discourse era apparentemente offline, ma al momento del controllo era di nuovo online. Ho notato diversi comportamenti “offline” nel mio stack di monitoraggio, ma quando sono andato a controllare i log tramite launcher logs app sembra che vengano visualizzati senza data/ora (
) e solo quelli più recenti, senza distinzione dalla sorgente dei log, dato che un’installazione di Discourse è composta da diversi componenti (nginx, redis, psql, ecc.).
C’è un modo per avere log più accurati o, ancora meglio, esporli in qualche modo in modo che io possa catturarli con loki / promtail?
Tutti i grafici sono impostati su “ultimi 5 giorni” poiché ho aggiunto questo monitoraggio solo di recente. Questo è ciò che posso vedere nel mio monitoraggio:
Uptime del container:
Dal plugin discourse-prometheus, ci sono molti “buchi” che suppongo siano dovuti ai momenti in cui Discourse non risponde, poiché corrispondono a quando sono stato avvisato dagli utenti stamattina:
Ingrandito nelle ultime 6 ore per chiarire che non si tratta di una finestra “piccola” e che ovviamente non è accettabile per gli utenti:
Log ottenuti con ./launcher logs app
root@vmi1229594:/var/discourse# ./launcher logs app
x86_64 arch detected.
run-parts: executing /etc/runit/1.d/00-ensure-links
run-parts: executing /etc/runit/1.d/00-fix-var-logs
run-parts: executing /etc/runit/1.d/01-cleanup-web-pids
run-parts: executing /etc/runit/1.d/anacron
run-parts: executing /etc/runit/1.d/cleanup-pids
Cleaning stale PID files
run-parts: executing /etc/runit/1.d/copy-env
Started runsvdir, PID is 41
ok: run: redis: (pid 53) 0s
ok: run: postgres: (pid 55) 0s
supervisor pid: 51 unicorn pid: 87
Shutting Down
run-parts: executing /etc/runit/3.d/01-nginx
ok: down: nginx: 0s, normally up
run-parts: executing /etc/runit/3.d/02-unicorn
(51) exiting
ok: down: unicorn: 1s, normally up
run-parts: executing /etc/runit/3.d/10-redis
ok: down: redis: 0s, normally up
run-parts: executing /etc/runit/3.d/99-postgres
ok: down: postgres: 0s, normally up
ok: down: nginx: 5s, normally up
ok: down: postgres: 0s, normally up
ok: down: redis: 3s, normally up
ok: down: unicorn: 5s, normally up
ok: down: cron: 1s, normally up
ok: down: rsyslog: 1s, normally up
run-parts: executing /etc/runit/1.d/00-ensure-links
run-parts: executing /etc/runit/1.d/00-fix-var-logs
run-parts: executing /etc/runit/1.d/01-cleanup-web-pids
run-parts: executing /etc/runit/1.d/anacron
run-parts: executing /etc/runit/1.d/cleanup-pids
Cleaning stale PID files
run-parts: executing /etc/runit/1.d/copy-env
Started runsvdir, PID is 35
ok: run: redis: (pid 49) 0s
ok: run: postgres: (pid 48) 0s
supervisor pid: 43 unicorn pid: 81
(43) Reopening logs
(43) Reopening logs
Shutting Down
run-parts: executing /etc/runit/3.d/01-nginx
ok: down: nginx: 0s, normally up
run-parts: executing /etc/runit/3.d/02-unicorn
(43) exiting
ok: down: unicorn: 0s, normally up
run-parts: executing /etc/runit/3.d/10-redis
ok: down: redis: 0s, normally up
run-parts: executing /etc/runit/3.d/99-postgres
timeout: run: postgres: (pid 48) 34983s, want down, got TERM
run-parts: /etc/runit/3.d/99-postgres exited with return code 1
ok: down: nginx: 10s, normally up
ok: down: redis: 8s, normally up
ok: down: unicorn: 10s, normally up
ok: down: cron: 0s, normally up
ok: down: rsyslog: 0s, normally up
kill: run: postgres: (pid 48) 34991s, want down, got TERM
run-parts: executing /etc/runit/1.d/00-ensure-links
run-parts: executing /etc/runit/1.d/00-fix-var-logs
run-parts: executing /etc/runit/1.d/01-cleanup-web-pids
run-parts: executing /etc/runit/1.d/anacron
run-parts: executing /etc/runit/1.d/cleanup-pids
Cleaning stale PID files
run-parts: executing /etc/runit/1.d/copy-env
Started runsvdir, PID is 35
fail: redis: runsv not running
ok: run: redis: (pid 53) 1s
ok: run: postgres: (pid 48) 1s
supervisor pid: 79 unicorn pid: 83
(79) Reopening logs
Shutting Down
run-parts: executing /etc/runit/3.d/01-nginx
ok: down: nginx: 0s, normally up
run-parts: executing /etc/runit/3.d/02-unicorn
(79) exiting
ok: down: unicorn: 0s, normally up
run-parts: executing /etc/runit/3.d/10-redis
ok: down: redis: 0s, normally up
run-parts: executing /etc/runit/3.d/99-postgres
ok: down: postgres: 0s, normally up
ok: down: nginx: 5s, normally up
ok: down: postgres: 0s, normally up
ok: down: redis: 3s, normally up
ok: down: unicorn: 5s, normally up
ok: down: cron: 1s, normally up
ok: down: rsyslog: 1s, normally up
run-parts: executing /etc/runit/1.d/00-ensure-links
run-parts: executing /etc/runit/1.d/00-fix-var-logs
run-parts: executing /etc/runit/1.d/01-cleanup-web-pids
run-parts: executing /etc/runit/1.d/anacron
run-parts: executing /etc/runit/1.d/cleanup-pids
Cleaning stale PID files
run-parts: executing /etc/runit/1.d/copy-env
Started runsvdir, PID is 34
ok: run: redis: (pid 49) 0s
ok: run: postgres: (pid 44) 0s
supervisor pid: 41 unicorn pid: 80
(41) Reopening logs
(41) Reopening logs
(41) Reopening logs
(41) Reopening logs
(41) Reopening logs








