Getting Discourse running on JRuby

(Charles Oliver Nutter) #1

Hello friends!

I work on the JRuby project, and have recently (a couple times in the past year) attempted to get Discourse running. Running on JRuby would mean you can run a single Discourse process for a whole site, and probably use less memory and CPU at the same time. I think it’s worth getting it to run.

As with most existing Ruby apps, a few of the C extensions it depends on have no JRuby support.

The good news is that most of these extensions appear to have alternatives for JRuby, or they’re trivial enough that it should be easy to just make a JRuby version.

I wanted to start a discussion here so we can talk about some of the exts and possible replacements.

Any interest in Discourse on JRuby?

(Sam Saffron) #2

Yes absolutely.

We really want to work with JRuby, what are the current stumbling blocks?

(Charles Oliver Nutter) #3

Here’s the diff I have at the moment. This is not an exhaustive list of extensions with no JRuby support but it’s most of them.

Some replacements I know of:

Other exts will have to be discussed.

(Sam Saffron) #4

Yes this is totally safe, puma is supported. If we really want to conserve memory here though it may be interesting to run sidekiq inside puma.

Oh … fast_blank should not be a dependency for onebox. I just removed it, so the next release will not have it.

I think it probably makes sense just to not depend on fast_blank for JRuby and set the fast_blank dependency in Discourse to MRI only. String#match? does matching without needing to set globals, so it should in theory be fast enough for JRuby.

Happy to make this MRI dependency only for now.
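
To make the String#match? idea concrete, here is a minimal sketch of a pure-Ruby blank check that could stand in where fast_blank is unavailable. The names `BLANK_PATTERN` and `blank_string?` are made up for illustration; this is an assumption about how such a fallback might look, not the code fast_blank or Discourse actually uses.

```ruby
# Hypothetical fallback, not fast_blank's or Discourse's actual code.
BLANK_PATTERN = /\A[[:space:]]*\z/

def blank_string?(str)
  # String#match? returns true/false without building a MatchData
  # object or setting $~ and the other regexp globals.
  str.empty? || str.match?(BLANK_PATTERN)
end

puts blank_string?("")       # true
puts blank_string?(" \t\n")  # true
puts blank_string?(" x ")    # false
```

Whether this is fast enough on JRuby would still need to be benchmarked.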

Oh, we should bench this. If perf is the same, we can move to xorcist.

(Charles Oliver Nutter) #5

The fast_xor README actually says to use xorcist. I believe the code was simply moved there and expanded to support JRuby.

Here’s the complete list of exts with status as I know it today.

I updated the gist I linked above, now excluding or replacing all the gems above. With these changes, bundle install completes.
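
For readers following along, platform-specific dependencies of this kind are usually expressed in the Gemfile. The fragment below is only a sketch of the general shape, using gem names mentioned in this thread; the grouping is an assumption and it is not the content of the gist.

```ruby
# Gemfile fragment (hypothetical; not the actual gist from this thread)
source "https://rubygems.org"

# C extension with no JRuby support: load on MRI only.
gem "fast_blank", platforms: :mri

# Described above as supporting JRuby, so no platform guard here.
gem "xorcist"

platforms :mri do
  gem "mini_racer"
  gem "rinku"
end

platforms :jruby do
  gem "therubyrhino"
  gem "rails_autolink"
end
```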

I have no idea if Discourse actually works yet though :slight_smile:

(Robin Ward) #6

This is so exciting!

(Jeff Atwood) #7

@sam any of the easy / obvious replacements we should do next week just to get more obstacles out of the way of this happening.

(Sam Saffron) #8

I just did this.

This leaves cppjieba, libv8 / mini_racer, nokogumbo, oj, pg and rinku.

Regarding Rhino, you have to make sure therubyrhino works with the latest Rhino (GitHub: mozilla/rhino). It looks pretty silent there (see the commits at GitHub: cowboyd/therubyrhino). Ideally I would prefer V8, but at a minimum we would want the latest Rhino.

(Karol Bucek) #9

therubyrhino should just work. If not, there’s a pretty solid attempt to use the engine provided with Java 8, which is kind of a (direct) successor of Rhino: gem 'dienashorner', platform: :jruby. It really depends on what the JS engine is used for; if it is mostly for compiling with the asset pipeline, then both are expected to work fine.

I have also looked into xorcist and managed to run into one issue: it does not fail properly on frozen strings. It’s a pretty rare edge case that is easy to work around, and a PR has been submitted.
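
To illustrate what "failing properly on frozen strings" could mean, here is a small pure-Ruby sketch of an in-place XOR that refuses to touch a frozen receiver. The function name `xor_in_place!` is hypothetical and this is not xorcist's implementation or API; it only shows the expected behavior under that assumption.

```ruby
# Hypothetical pure-Ruby stand-in for an in-place XOR; not xorcist's code.
def xor_in_place!(target, mask)
  raise ArgumentError, "length mismatch" unless target.bytesize == mask.bytesize
  # Fail up front, before any byte is modified, if the string is frozen.
  raise FrozenError, "can't modify frozen String" if target.frozen?

  mask.each_byte.with_index do |byte, i|
    target.setbyte(i, target.getbyte(i) ^ byte)
  end
  target
end

buffer = "abc".dup
xor_in_place!(buffer, "   ")  # XOR with 0x20 flips ASCII letter case
puts buffer                   # ABC

begin
  xor_in_place!("abc".freeze, "   ")
rescue FrozenError => e
  puts "frozen input rejected: #{e.message}"
end
```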

(Dave McClure) #10

Aside: can y’all share with those of us on the sidelines a little bit about why? Is the notion that JRuby would replace MRI as the default implementation for Discourse in the future if you get this working?

(Charles Oliver Nutter) #11

That certainly isn’t my goal! JRuby would always be an alternative, and in some cases it may be a better choice than MRI for larger deployments. A single JRuby instance can handle an entire server load, maxing out all cores. If you are getting to the point of running 3 or 4 or 5 MRI instances, there may be a good case to try JRuby.

We also usually perform better, but that may take some tweaking early on.
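
As a rough illustration of the single-process point, the sketch below spreads CPU-bound jobs over one thread per core. On JRuby, Ruby threads are native JVM threads and can run in parallel; on MRI the same code runs, but the global VM lock keeps CPU-bound Ruby code from executing in parallel. The helper names `cpu_bound_work` and `run_in_threads` are made up for this example.

```ruby
require "etc"

# Stand-in for real request work: sum of squares from 1 to n.
def cpu_bound_work(n)
  (1..n).reduce(0) { |sum, i| sum + i * i }
end

# Hypothetical helper: run the jobs on a fixed pool of threads and
# return the results in the original job order.
def run_in_threads(jobs, thread_count: Etc.nprocessors)
  queue = Queue.new
  jobs.each_with_index { |job, index| queue << [index, job] }
  queue.close # pop returns nil once the closed queue is drained

  workers = Array.new(thread_count) do
    Thread.new do
      local = []
      while (item = queue.pop)
        index, n = item
        local << [index, cpu_bound_work(n)]
      end
      local
    end
  end

  # Each thread returns its own list, so no shared array is mutated.
  workers.flat_map(&:value).sort_by(&:first).map(&:last)
end

p run_in_threads([1, 2, 3], thread_count: 2) # [1, 5, 14]
```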

(Charles Oliver Nutter) #12

I realize looking at the Rinku README that it’s a drop-in replacement for Rails autolinking, which after 3.1 was pulled out as the rails_autolink gem. I am looking into doing a port of Rinku, but simply using rails_autolink works around this one right now.

Rinku is a drop-in replacement for Rails 3.1 `auto_link`

Auto-linking functionality has been removed from Rails 3.1,
and is instead offered as a standalone gem, `rails_autolink`. You can
choose to use Rinku instead.
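
For anyone unfamiliar with what autolinking does, here is a deliberately naive pure-Ruby sketch: it escapes the text and wraps http(s) URLs in anchor tags. It is not how Rinku or rails_autolink work internally, it handles far fewer cases than either, and the names `URL_PATTERN` and `naive_auto_link` are invented for this example.

```ruby
require "erb"

# Capture group so that String#split keeps the URLs in its result.
# The final character class keeps trailing punctuation out of the link.
URL_PATTERN = %r{(https?://[^\s<>"]*[^\s<>".,;:!?)])}

def naive_auto_link(text)
  # split yields plain text at even indexes and URLs at odd indexes.
  text.split(URL_PATTERN).each_with_index.map do |piece, index|
    escaped = ERB::Util.html_escape(piece)
    index.odd? ? %(<a href="#{escaped}">#{escaped}</a>) : escaped
  end.join
end

puts naive_auto_link("Visit https://example.com. <b>")
```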

(Rafael dos Santos Silva) #13

On Discourse, our main use for V8 is the markdown cooking.

Since we have a live preview and extensible markdown pipeline, we guarantee the same behavior on the browser and on server side by running the exact same code.

Our markdown library is markdown-it (GitHub: markdown-it/markdown-it).

(Charles Oliver Nutter) #14

I started porting Rinku and quickly realized that 99% of it is just raw C code working with character arrays.

So I started poking around for a Java autolinking library, and I found this: org.nibor.autolink:autolink:0.8.0 (API docs on Javadoc.IO).

Even better, it accepts CharSequence, which means we should just be able to pass it a Ruby string (or one of our representations of it) and it will function mostly without pre-transcoding everything into Java characters.

I’ll see if I can get some basic API equivalent wrapper around it.

(Charles Oliver Nutter) #15

I have enlisted the autolink-java author in our efforts: “Adapt autolink-java to replace rinku in JRuby” (Issue #20, robinst/autolink-java on GitHub).

And I have done a proof-of-concept wrapper here: headius/jruby-autolink on GitHub, a JRuby wrapper around the autolink-java library to provide autolinking like rinku.

So far…it works. But it does eventually create Java strings, so we may (or may not) want to adapt or fork this library to work with Ruby strings more directly.

(Dan Fabulich) #16

Regarding Nokogumbo, I’d just like to say here that it seems bad that Discourse is using both Nokogiri and Nokogumbo together.

Nokogumbo follows the HTML 5 parsing specification; Nokogiri is built on libxml2’s HTML 4 parser. They differ in behavior in ways that can introduce subtle bugs when handling tricky corner cases worked out during the development of the HTML 5 parser specification.

I recommend Nokogumbo over Nokogiri, because Nokogumbo matches what browsers do, and, more philosophically, because the HTML 5 parser is fully specified, as opposed to HTML 4 which left room for undefined behavior.

(It’s like the difference between kramdown and CommonMark.)

(Sam Saffron) #17

Yeah, I am totally for moving to nokogumbo if we can.

One big concern though is that Nokogiri::HTML5.fragment(string) is considered “experimental”, whatever that means.

Also, nokogumbo requires nokogiri, so there is that ;p