Discourse and Logging Errors

(Sam Saffron) #1

Back in the days when I worked at Stack Overflow, my day always started the same way.

  • Navigate to the errors page
  • Look through the errors we got overnight
  • Sort them out
  • Continue with my day

We used an error logging solution that was based on a project from the Google Code Archive (long-term storage for Google Code Project Hosting). Over time we rewrote most of it. Life was good. I even blogged about some of our extensions to it.

Fast forward a few years: I am working on Discourse and was very much dismayed to find there is no such thing for Rails apps.

Instead, a pretty disturbing pattern has emerged in the community. People are outsourcing error logging or simply leaving errors in an enormous, impossible-to-understand local log.

In the outsourcing camp you have http://www.exceptional.io/ , https://airbrake.io/ and the open source errbit (GitHub - errbit/errbit), an error catcher that's Airbrake API compliant, which we use.

There is also the built-in Rails text log. It has no web interface and logs so much data that it is very hard to make any sense of it.

### Why I strongly oppose the outsourcing approach for Discourse

  • Discourse should stand on its own without dependencies on third party services

We need the ability to quickly troubleshoot installs, pick up on various errors early and address them. If we expect users installing Discourse to go about installing and configuring yet another service, just to get something basic like logging, we have lost the game.

  • We need the service to be “close” to the application.

We have the technology to notify administrators when stuff is going pear-shaped. We already have an admin panel that has room for this. We need to remove all barriers to getting access to failures. When a request fails we want to know which user had the failure.

### What do I want?

It has always been our goal to help the Ruby and Rails ecosystem. I would really like our error log to be engineered externally to our app as Rack middleware, with hooks to notify the main application as needed on failure.

At a very basic level we want a row for each HTTP request that fails and each job that fails.

### Absolute musts for v0.1

  • Implemented as rack middleware, with minimal new gem dependencies (if any)
  • No new storage system dependencies (no mongo, no hadoop clusters etc.)
  • A flexible storage engine (file / postgres / redis)
  • View backtraces per error
  • Group the same errors in a row
  • Ships with Discourse
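To make the list above concrete, here is a minimal sketch of what such a middleware could look like. It is not an existing gem: `MemoryErrorStore`, `ErrorLogMiddleware`, the `on_error` hook and the grouping rule (same exception class raised from the same line) are all assumptions made for illustration. It uses only the Ruby standard library.

```ruby
require "digest"

# Hypothetical in-memory store. A file, Postgres or Redis backend would
# only need to expose the same two methods (record and rows).
class MemoryErrorStore
  Row = Struct.new(:key, :message, :backtrace, :path, :count)

  def initialize
    @rows = {}
  end

  # Identical errors share one row; repeats just bump the counter.
  def record(key:, message:, backtrace:, path:)
    row = (@rows[key] ||= Row.new(key, message, backtrace, path, 0))
    row.count += 1
    row
  end

  def rows
    @rows.values
  end
end

# Rack middleware is any object that wraps an app and responds to call(env).
class ErrorLogMiddleware
  def initialize(app, store:, on_error: nil)
    @app = app
    @store = store
    @on_error = on_error # optional hook so the main app can notify admins
  end

  def call(env)
    @app.call(env)
  rescue StandardError => e
    row = @store.record(
      key: group_key(e),
      message: "#{e.class}: #{e.message}",
      backtrace: e.backtrace || [],
      path: env["PATH_INFO"]
    )
    @on_error&.call(row, env)
    raise # re-raise so the app's normal error handling still runs
  end

  private

  # Assumed grouping rule: same exception class raised from the same line.
  def group_key(error)
    Digest::SHA1.hexdigest("#{error.class}|#{(error.backtrace || []).first}")
  end
end
```

In a Rack or Rails app this would be mounted with something like `use ErrorLogMiddleware, store: MemoryErrorStore.new`. Keeping the store behind two methods is what would let the file / postgres / redis choice stay a configuration detail.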

Going forward I would very much want to support JavaScript error logging as well. Discourse is VERY JS heavy and we need to know when stuff fails.

Before I go about reinventing wheels, is there a standalone Rack middleware that already fills this need?


(vipulnsward) #2

I don’t know of many logging solutions that do this, but I am pretty interested in helping build such a solution.
Some must haves that I could think of:

  • Minimal
  • Generic, able to hook into any stack
  • Good control over verbosity
  • A Rack-Middleware sounds good
  • Simple analytics based on the errors (going hand in hand with error grouping)

Looking forward to when you start on this, to help.

(Garry Shutler) #3

It’s not what you’re looking for at the moment but I can see a reasonably straightforward path to alter the logging gem I have written, Hatchet, to support most of these requirements.

  • It’s minimal in that there are no direct non-core dependencies
  • It has a middleware built in
  • It’s pretty Rails aware and tries to make the logs Rails generates more consumable when setting up its Railtie
  • Storage is entirely flexible through custom appenders
  • Formatting is also highly configurable
  • There’s high control over the verbosity

In fact, of the requirements outlined here, I think the only one that’s missing is the errors grouping, but I have a couple of ideas of how to implement that too.

Sound any good?

(I’d have linked to more stuff directly but I was only allowed 2 links as this is my first post)

(Garry Shutler) #4

I’ve started work on adding the ability to buffer and flush messages to Hatchet which would help achieve per-request error grouping.

Essentially it adds a flush! protocol to the logging library so that you may collect messages during a request and then be told to write/inspect them at a relevant point.

The branch includes a simple implementation which is tied to the thread context to give you an idea of how it could work. Discourse would probably buffer to an array on the Rack env instead. Then the middleware is altered to call flush! after each request.

For error grouping the flush! implementation could be more intelligent and inspect the buffer for ERROR or higher messages.
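As a rough illustration of the buffer-and-flush idea described above, the sketch below keeps the messages in an array on the Rack env and flushes them after each request. None of this is Hatchet's actual API: `BufferedLogger`, `FlushMiddleware`, the `example.log_buffer` env key and the "only write when the request logged ERROR or higher" rule are assumptions chosen for the example.

```ruby
# Hypothetical buffering logger; names are illustrative, not Hatchet's API.
class BufferedLogger
  LEVELS = %i[debug info warn error fatal].freeze
  ENV_KEY = "example.log_buffer" # assumed Rack env key for the buffer

  def initialize(sink)
    @sink = sink # anything that responds to <<, e.g. an Array
  end

  # Collect a message for the current request instead of writing it now.
  def add(env, level, message)
    (env[ENV_KEY] ||= []) << [level, message]
  end

  # Inspect the buffer and write it out only if the request logged
  # ERROR or higher; otherwise the buffered messages are dropped.
  def flush!(env)
    buffer = env.delete(ENV_KEY) || []
    threshold = LEVELS.index(:error)
    return if buffer.none? { |level, _| LEVELS.index(level).to_i >= threshold }

    buffer.each { |level, message| @sink << "#{level.to_s.upcase} #{message}" }
  end
end

# Middleware that calls flush! after every request, even if the app raises.
class FlushMiddleware
  def initialize(app, logger)
    @app = app
    @logger = logger
  end

  def call(env)
    @app.call(env)
  ensure
    @logger.flush!(env)
  end
end
```

Because the buffer lives on the env rather than on the thread, all messages from one request stay together, which is what would make per-request grouping possible.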

(Sam Saffron) #5

I will try to respond in more depth over the weekend. I had a look through the APIs and they look well thought out and documented. My main concern is the web UI: is there a front end anywhere for Hatchet?

(Garry Shutler) #6

No, there isn’t a UI. So far I’ve got by fine with grepping file logs and having messages conditionally forwarded to Hipchat/email/external error reporting/log shipping services.

I mentioned Hatchet as it deals with all the plumbing and would allow people to send their logs wherever they liked in whatever format they liked with a little configuration. For example, sending errors to an email address of their choice could be made a one-liner.

I’m more than happy to work with you guys to create a Discourse appender that would store errors in a way that could be used in the errors UI you’ve described.

(Josh Robb) #7

I know you’d rather not have any new gems, but I thought it was worth mentioning lograge.

It’s only a small part of what you’re asking for, but it seems like it might be interesting if nothing else.

(John Martirano) #8

We are also looking at error logging solutions and integrated logging solutions in general. I’d love to use an open source solution we can self-host, but I’ve seen nothing yet that compares with some of the fee-based online services already available. http://www.rollbar.com is my favorite so far. Check out the live demo. The ability to aggregate errors and view them on a timeline, mark the severity and status of an error, and comment on errors so teams can collaborate: those are fundamental features IMO. The GitHub integration looks nice too, though we are on Stash so that doesn’t help us.

(John Bachir) #9

exception_logger saves exceptions to the DB and has an admin CRUD interface for browsing them. I haven’t tried it.

(Jeff Atwood) #10

I believe @sam found a logging solution he likes, perhaps he can describe it.

(Sam Saffron) #11

That would be http://logstash.net/ which is awesome. I will post about it (and about the Discourse plugin I wrote for it); however, it’s a slightly different use case.

(nXqd) #12

Hi @sam, logstash is awesome. As far as I know, we need a gem called logstasher to make the Rails log work with logstash. Do you have any pointers on how to make them work together? That would be great, since I need logstash for my currently running Discourse :smiley:

Cheers !

(Sam Saffron) #13

FYI, I have been working on this problem and have made some progress.

For now dev logs are visible at localhost:3000/logs using the new logster gem. This week I will be working on improving this system and getting production logs in there as well.

(Lee_Ars) #14

Hey @sam—I see you rolled logster into prod a few hours ago. Is it a live feed, or do I need to refresh the page to see additions to the log?

(Sam Saffron) #15

It’s a live feed, but very much v0 at the moment.

I plan to make it way nicer and more informative. It’s reporting a bunch of noise now that I need to suppress, and I need to build row collapsing.

(Lee_Ars) #16

Hey, anything that means not having to manually grep that hairy production log file is a good thing. Thanks :smile:

(Sam Saffron) #17

Closing this, now that we have logster :slight_smile:

(Sam Saffron) #18