Getting Discourse running on JRuby

Hello friends!

I work on the JRuby project, and have tried a couple of times over the past year to get Discourse running on it. On JRuby you could run a single Discourse process for a whole site, and probably use less memory and CPU at the same time. I think it’s worth getting it to run.

As with most existing Ruby apps, there are a few missing C extensions.

The good news is that most of these extensions appear to have alternatives for JRuby, or they’re trivial enough that it should be easy to just make a JRuby version.

I wanted to start a discussion here so we can talk about some of the exts and possible replacements.

Any interest in Discourse on JRuby?

16 Likes

Yes absolutely.

We really want to work with JRuby, what are the current stumbling blocks?

8 Likes

Here’s the diff I have at the moment. This is not an exhaustive list of extensions with no JRuby support but it’s most of them.

https://gist.github.com/headius/99f7a177b67d635d4783e7f7b164a3c9

Some replacements I know of:

  • fast_xor has been superseded by xorcist, which has JRuby support.
  • pg has a JRuby equivalent in pg_jruby. It lags behind pg a lot (lack of resources).
  • oj may have JRuby support soon; Tom Enebo has a partial port.
  • cppjieba_rb might be extended to support JRuby using the jieba-analysis Java library. I have started a dialog with the author here: https://github.com/fantasticfears/cppjieba_rb/issues/1
  • fast_blank (transitive dependency of onebox) was trivial to port to JRuby. My PR is here: https://github.com/SamSaffron/fast_blank/pull/21
  • Presumably unicorn can just be replaced with puma. Both are in the Gemfile.
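As a rough illustration of the replacements listed above, a Gemfile can scope each gem to a platform with Bundler's `platforms:` option. This is only a sketch under assumptions: the grouping is hypothetical, it is not the actual Discourse Gemfile, and whether pg_jruby is a sufficient stand-in for pg is still open.

```ruby
# Hypothetical Gemfile fragment, not the real Discourse Gemfile.
source 'https://rubygems.org'

# xorcist ships both a C extension and a JRuby extension,
# so it needs no platform restriction.
gem 'xorcist'

# C-extension gems stay on MRI; JRuby gets the alternative.
gem 'pg',       platforms: :mri
gem 'pg_jruby', platforms: :jruby

# unicorn relies on fork and C extensions, so keep it MRI only;
# puma runs on both implementations.
gem 'unicorn', platforms: :mri
gem 'puma'
```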

Other exts will have to be discussed.

6 Likes

Yes this is totally safe, puma is supported. If we really want to conserve memory here though it may be interesting to run sidekiq inside puma.

Oh … fast_blank should not be a dependency for onebox, I just removed it, next release will not have it.

I think it probably makes sense to simply not depend on fast_blank on JRuby and mark the fast_blank dependency in Discourse as MRI only. String#match? does matching without needing to set globals, so in theory it should be fast enough on JRuby.

Happy to make this MRI dependency only for now.
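For anyone following along, here is a minimal sketch of a blank check built on `match?` (available since Ruby 2.4), which returns a boolean without populating `$~`. The helper name and the regex are assumptions made for illustration; this is not what fast_blank or ActiveSupport actually ship.

```ruby
# Hypothetical pure-Ruby fallback for platforms without fast_blank.
# The pattern matches strings made up only of whitespace.
BLANK_PATTERN = /\A[[:space:]]*\z/

def blank_string?(str)
  # match? returns true/false and does not set $~ or other match globals.
  str.empty? || BLANK_PATTERN.match?(str)
end
```

Something like `blank_string?(" \t\n")` is true, while `blank_string?(" a ")` is false.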

Oh, we should bench this. If perf is the same, we can move to xorcist.
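A benchmark along these lines could be a starting point. It is a sketch under assumptions: `xor_bytes` and `bench_xor` are hypothetical helpers, the pure-Ruby version is only a baseline, and the `Xorcist.xor(a, b)` call reflects my understanding of the xorcist README. The xorcist timing is included only when the gem is installed.

```ruby
require "benchmark"

begin
  require "xorcist" # optional; skipped when the gem is not installed
rescue LoadError
  # fall back to the pure-Ruby baseline only
end

# Hypothetical pure-Ruby baseline: XOR two equal-length byte strings.
def xor_bytes(a, b)
  raise ArgumentError, "inputs must have the same byte length" unless a.bytesize == b.bytesize
  a.bytes.zip(b.bytes).map { |x, y| x ^ y }.pack("C*")
end

# Returns a hash of wall-clock timings in seconds, keyed by implementation.
def bench_xor(iterations = 10_000)
  a = ("\x0f" * 32).b
  b = ("\xf0" * 32).b
  results = { pure_ruby: Benchmark.realtime { iterations.times { xor_bytes(a, b) } } }
  if defined?(Xorcist)
    results[:xorcist] = Benchmark.realtime { iterations.times { Xorcist.xor(a, b) } }
  end
  results
end
```

Running `bench_xor` on both MRI and JRuby, with fast_xor added as a third entry on MRI, would give the comparison discussed here.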

8 Likes

The fast_xor README actually says to use xorcist. I believe the code was simply moved there and expanded to support JRuby.

Here’s the complete list of exts with status as I know it today.

I updated the gist I linked above, now excluding or replacing all the gems above. With these changes, bundle install completes.

I have no idea if Discourse actually works yet though :slight_smile:

15 Likes

This is so exciting!

15 Likes

@sam are there any easy / obvious replacements we should do next week, just to get more obstacles out of the way of this happening?

3 Likes

I just did this.

https://github.com/discourse/discourse/commit/79e0cd7f529c14d482b2c5dc1394bc90d6070d56

and

https://github.com/discourse/discourse/commit/b301c9f6c12f517ee64e15599d7cac6ed2b9d7d8

and

https://github.com/discourse/discourse/commit/c234a14f0de458168adb5100cb7fd9d21a362d7d

This leaves cppjieba, libv8 / mini_racer, nokogumbo, oj, pg and rinku.

Regarding Rhino, you have to make sure therubyrhino works with the latest Rhino: https://github.com/mozilla/rhino It looks pretty quiet over there: https://github.com/cowboyd/therubyrhino/commits/master Ideally I would prefer V8, but at a minimum we would want the latest Rhino.

10 Likes

therubyrhino should just work; if not, there’s a pretty solid attempt to use the engine provided with Java 8, which is kind of a (direct) successor of Rhino: gem 'dienashorner', platform: :jruby … It really depends on what the JS engine is used for. If it is mostly for compiling with the asset pipeline, then both are expected to work fine.

I have also looked into xorcist and managed to run into one issue: it does not fail properly on frozen strings. It’s a pretty rare edge case that is easy to work around, and a PR has been submitted.

4 Likes

Aside: can y’all share with those of us on the sidelines a little bit about why? Is the notion that JRuby would replace MRI as the default implementation for Discourse in the future if you get this working?

1 Like

That certainly isn’t my goal! JRuby would always be an alternative, and in some cases it may be a better choice than MRI for larger deployments. A single JRuby instance can handle an entire server load, maxing out all cores. If you are getting to the point of running 3 or 4 or 5 MRI instances, there may be a good case to try JRuby.
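To make the "one process, all cores" point concrete, here is a small sketch. The helper names are hypothetical. The same code runs on both implementations, but JRuby threads are native JVM threads that can execute Ruby code in parallel, whereas on MRI the global VM lock lets only one thread run Ruby code at a time, which is why MRI deployments scale out with multiple processes.

```ruby
require "etc"

# Hypothetical CPU-bound task: sum the integers 1..n.
def busy_sum(n)
  (1..n).reduce(0) { |acc, i| acc + i }
end

# Runs the task once per thread and adds up the results.
# Defaults to one thread per available processor.
def parallel_busy_sum(n, thread_count = Etc.nprocessors)
  threads = Array.new(thread_count) { Thread.new { busy_sum(n) } }
  threads.sum(&:value) # value joins the thread and returns its result
end
```

Timing `parallel_busy_sum` with a large `n` on each implementation is one way to see the difference in core usage.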

We also usually perform better, but that may take some tweaking early on.

3 Likes

I realize looking at the Rinku README that it’s a drop-in replacement for Rails autolinking, which after 3.1 was pulled out as the rails_autolink gem. I am looking into doing a port of Rinku, but simply using rails_autolink works around this one right now.

Rinku is a drop-in replacement for Rails 3.1 `auto_link`
----------------------------------------------------

Auto-linking functionality has been removed from Rails 3.1,
and is instead offered as a standalone gem, `rails_autolink`. You can
choose to use Rinku instead.

2 Likes

On Discourse, our main use for V8 is the markdown cooking.

Since we have a live preview and extensible markdown pipeline, we guarantee the same behavior on the browser and on server side by running the exact same code.

Our markdown library is https://github.com/markdown-it/markdown-it

5 Likes

I started porting Rinku and quickly realized that 99% of it is just raw C code working with character arrays.

So I started poking around for a Java autolinking library, and I found this: http://javadoc.io/doc/org.nibor.autolink/autolink/0.8.0

Even better, it accepts CharSequence, which means we should just be able to pass it a Ruby string (or one of our representations of it) and it will function mostly without pre-transcoding everything into Java characters.

I’ll see if I can get some basic API equivalent wrapper around it.

4 Likes

I have enlisted the autolink-java author in our efforts: https://github.com/robinst/autolink-java/issues/20

And I have done a proof-of-concept wrapper here: https://github.com/headius/jruby-autolink

So far…it works. But it does eventually create Java strings, so we may (or may not) want to adapt or fork this library to work with Ruby strings more directly.

4 Likes

Regarding Nokogumbo, I’d just like to say here that it seems bad that Discourse is using both Nokogiri and Nokogumbo together.

Nokogumbo follows the HTML 5 parsing specification; Nokogiri is built on libxml2’s HTML 4 parser. They differ in behavior in ways that can introduce subtle bugs when handling tricky corner cases worked out during the development of the HTML 5 parser specification.

I recommend Nokogumbo over Nokogiri, because Nokogumbo matches what browsers do, and, more philosophically, because the HTML 5 parser is fully specified, as opposed to HTML 4 which left room for undefined behavior.

(It’s like the difference between kramdown and CommonMark.)

3 Likes

Yeah, I am totally for moving to nokogumbo if we can.

One big concern, though, is that Nokogiri::HTML5.fragment(string) is considered “experimental”, whatever that means.

Also, nokogumbo requires nokogiri, so there is that ;p

https://github.com/discourse/discourse/blob/master/Gemfile.lock#L194-L195

3 Likes