Upgrading MathJax to version 4

@sam and everyone interested in typing math in Discourse: I’ve updated the discourse-math plugin so that it uses MathJax V3 rather than the much slower and very outdated V2. As expected, the result is a much snappier user experience that still keeps the feature-rich environment MathJax offers compared to KaTeX.

I’d love to issue a pull request, if you think the results look good.


You can see it in action on my class Discourse site:

Most of the content on that site is private or unlisted. There should be several topics at the top in the MathJax V3 category that illustrate the ideas, though.

You can examine the code for the plugin in this standalone discourse-mathjax plugin repo. The file that has the most modifications by far is the initializer.

You can also use that repo to install it on a standalone site right now. Just be sure to remove the old repo during installation. Thus, you’d modify the standard plugin install technique to look like this:

hooks:
  after_code:
    - exec:
        cd: $home/plugins
        cmd:
          - rm -r discourse-math
          - git clone https://github.com/discourse/docker_manager.git
          - git clone https://github.com/mcmcclur/discourse-math.git

Comments

The most recent version of MathJax is actually 4.0.0. I chose to go with V3.2.2 for several reasons:

  • While V4 is certainly much faster than V2, it’s not quite as fast as V3.
  • The user experience is a bit different in V4, particularly if the user clicks on the output.
  • The 4.0.0 status makes me wonder how many bugs there might be.

Having said that, the API for V4 is identical to that of V3. It should be possible to upgrade later by simply dropping in the latest MathJax repo.

I had to make one small change in the locales/server.en.yml file. Of course, there are a ton more such files for various languages. My understanding is that those other files would be automatically translated later?

I really don’t use chat at all and haven’t tested it in that context.

4 Likes

Pull request to upgrade MathJax to V3 made with all tests passed!

2 Likes

Regarding:

This is fantastic :hugs: , but I wonder if we can use this as an opportunity to trim down our repo a bit.

Now that we moved mathjax to core, we can lean on pnpm to pull the package and avoid bundling all the source like we do for FullCalendar for example.

In particular, the goal is to have only “links” in our repo, so that the build process can pull the correct dependencies.

Give us a few days; I want to consult with the dev xp team here. Thanks so much for your efforts here!

4 Likes

Yes, I think that’s certainly the right thing to do. I always kinda wondered why you packaged the whole thing!

So, I guess you’ll build a loadMathJax function for your library that’s used to load MathJax?

I will say that rolling all the plugins into core has made it a bit trickier to play with them. Tying the dependencies to the build process would only make it harder still, though I’m sure I could pull MathJax or FullCalendar from a CDN.

I’m mostly just talking about when I tinker with plugins for use in my own forums, though, and I absolutely think you oughtta pull MathJax in during the build.

Absolutely! I’ve been using Discourse for years and am super happy you think this is fantastic! :rocket:

3 Likes

Yes, exactly. A good one to copy is morphlex:

1 Like

I wonder if you’ve been able to discuss with your developer experience folks yet? I’d be happy to help, if I can. My impression, though, is that there’s really nothing I can do without your feedback on that.

I’ve made a few additional changes in a separate branch that I’ll post about soon. I’m aware you’ve got a ton on your plate, so I don’t mean to be a bother!

I’ve modified the discourse-math plugin so that it can parse a lot more mathematical input.

@sam When I first contributed to this plugin in 2017, I recall that you were quite firm that you wanted very strict parsing. Let me say up front that my main motivation to relax and extend the parsing was so that it would work better with AI. In particular, when you chat about mathematics with an AI bot you will often find that it responds using LaTeX and there are a lot of ways that it might choose to delimit that LaTeX input. So, while I understand your motivation for strict parsing, the changes I’ve made are rather essential for that use case.

Of course, you still might not care about that use case, so I’ve put the changes in a separate branch from my V3 pull request. If you decide you like them, I’m happy to issue another pull request.

The specific changes to the pull request are:

It accepts slash-paren delimited inline math like \(a^2+b^2=c^2\).

It accepts single-line, double-dollar delimited display math like
$$a^2+b^2=c^2.$$

It accepts single-line, slash-bracket delimited display math like
\[a^2+b^2=c^2.\]

It accepts multi-line, slash-bracket delimited display math like
\[
a^2+b^2=c^2.
\]

Of course, it still accepts the inputs of the original:

Dollar delimited inline math: $a^2+b^2=c^2$.

Multi-line, double dollar delimited display math:
$$
a^2+b^2=c^2.
$$
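To make the accepted forms concrete, here is a minimal JavaScript sketch that classifies a string by its delimiters. It is illustrative only: the `DELIMITERS` table and the `classifyMath` function are hypothetical names, and the plugin’s real parsing happens inside its Markdown rules, which this does not reproduce.

```javascript
// Illustrative only: NOT the plugin's actual parser.
// Longer delimiters come first so "$$" is tried before "$".
const DELIMITERS = [
  { open: "$$", close: "$$", display: true },
  { open: "\\[", close: "\\]", display: true },
  { open: "\\(", close: "\\)", display: false },
  { open: "$", close: "$", display: false },
];

// Returns { display, tex } when the whole string is one delimited
// math expression, or null when no delimiter pair matches.
function classifyMath(source) {
  const text = source.trim();
  for (const { open, close, display } of DELIMITERS) {
    if (
      text.length > open.length + close.length &&
      text.startsWith(open) &&
      text.endsWith(close)
    ) {
      // Strip the delimiters; multi-line input keeps only the inner TeX.
      const tex = text.slice(open.length, text.length - close.length).trim();
      return { display, tex };
    }
  }
  return null;
}
```

Trying the longer delimiters first is what keeps `$$...$$` from being misread as inline math wrapped in stray dollar signs.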

You can find the relevant branch here.

The code also exists as a standalone plugin.

Oh, you can also see it in action!

2 Likes

@mcmcclur Thank you for your work. It would be great to see these features in the core.

1 Like

Thanks so much Mark.

My big blocker here is that I really want to move to the new patterns for dependency distribution, see:

Is this something you could look at?

Regarding relaxed syntax, it feels like a site setting to me, maybe even default on given all the LLMs out there?

3 Likes

@mcmcclur I was fiddling with this today:

Far from ready … but stuff sort of boots with 4.1 which is nice.

2 Likes

Yes, this is definitely progress!

The first key issue to address, as I suspect you know, is that the fonts are not found. In fact, I fiddled with this line in discourse-math-mathjax.js:

fontURL: getURLWithCDN("/assets/mathjax/woff-v2"),

As a test, I set the URL to simply point to a temporary directory on my own webserver, and the initial results look very good. So, it’s a matter of getting those fonts installed correctly in Discourse.

In a simple pnpm project on my machine, the following command installs the fonts:

pnpm install @mathjax/mathjax-newcm-font@4

When I run that command within discourse/frontend/discourse, the fonts appear in

/discourse/frontend/discourse/npm_modules/@mathjax/mathjax-newcm-font/chtml/woff2/

Those fonts don’t seem to land in /assets/mathjax/woff-v2 after build, though. I’ve tried a number of variations on the directory but haven’t got it to work. I assume this is some sort of routing magic that I’m no expert in. I’m pretty sure I could make good progress toward cleaning it up, once that path issue is settled.
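For reference, the configuration side of this is small. The sketch below builds the CHTML font option from a base URL. Here `withBase` is a hypothetical stand-in for Discourse’s getURLWithCDN, and I’m assuming that the `fontURL` key quoted above is the one MathJax reads. It does not solve the build problem; it only shows which URL the built files must end up at.

```javascript
// Hypothetical stand-in for getURLWithCDN: joins an optional CDN origin
// and a path without producing doubled slashes.
function withBase(base, path) {
  return `${base.replace(/\/+$/, "")}/${path.replace(/^\/+/, "")}`;
}

// Builds the output options so that fonts are requested from
// <base>/assets/mathjax/woff-v2 (the directory named in this post).
function chtmlOptions(assetBase = "") {
  return { fontURL: withBase(assetBase, "/assets/mathjax/woff-v2") };
}
```

With an empty base this yields the site-relative path, so the same helper covers both the CDN and the non-CDN case.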

1 Like

@sam I think I’ve made some pretty significant progress on this, with one significant caveat. I’m not sure where to load the desired components from. Expressed in code,

window.MathJax = {
    loader: {
      // This does not work:
      // paths: { mathjax: getURLWithCDN("/assets/mathjax") },
      // But this works great:
      paths: { mathjax: "https://cdn.jsdelivr.net/npm/mathjax@4.1.0" },
      load: ["core", "input/tex", "input/mml", "output/chtml", "output/svg"],
    },
    // More configuration ...
  };

When I say the commented-out version doesn’t work, I mean that I get the explicit message:
MathJax(core): Can't load "/assets/mathjax/core.js"

Note that, in both cases, the loadMathJax function is pulling the MathJax startup from the local copy. That is, I’ve got the following in
/discourse/frontend/discourse/app/static/mathjax-bundle.js

export * from "mathjax/startup.js";

Then, loadMathJax defined in
/discourse/frontend/discourse/app/lib/load-mathjax.js
calls

const bundle = await import("discourse/static/mathjax-bundle");

This suggests a couple of possibilities:

  1. Perhaps /assets/mathjax is not the correct location, or
  2. Perhaps these assets need to be registered in some way so that they appear in the dist.

Working off of the CDN version, it looks like I can make significant progress, but I assume that’s a major blocker for you.

I could share my code with you, if you like, but maybe that’s enough information for a diagnosis?
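In the meantime, the two loader variants above can be folded into one switch. This is only a sketch of the idea: `loaderConfig` and its `localBase` parameter are hypothetical names, and the CDN URL and component list are copied from the snippet earlier in this post.

```javascript
// CDN location taken from the working variant in this post.
const CDN_BASE = "https://cdn.jsdelivr.net/npm/mathjax@4.1.0";

// Hypothetical helper: use locally served components when a base path is
// given, otherwise fall back to the CDN copy.
function loaderConfig({ localBase = null } = {}) {
  return {
    loader: {
      paths: { mathjax: localBase ?? CDN_BASE },
      load: ["core", "input/tex", "input/mml", "output/chtml", "output/svg"],
    },
  };
}

// In a browser, the result would be assigned to window.MathJax before the
// MathJax startup script runs, e.g.:
//   window.MathJax = loaderConfig({ localBase: "/assets/mathjax" });
```

Keeping the component list in one place means the only difference between the two setups is the base path, which makes the failing case easier to isolate.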

1 Like

Absolutely, code will be very helpful here. Perhaps fork Discourse and push your changes to a branch; then I can pull changes from your branch into the PR.

So happy you are making progress, trying to diagnose this issue.

Can you also pull the latest? I did a round of cleanup.

1 Like

OK, here is the code:

https://github.com/mcmcclur/discourse/tree/mathjax-mcmcclur

Beware, though, I did not work directly from your latest commit. I started directly from Discourse main and made changes from there. Thus, I learned a fair amount from your work but the overall structure is different.

I think you could summarize the main difference as follows: Where you (naturally) use Discourse features inherited from Ember to coordinate the timings associated with things like loading and typesetting, I use MathJax features. Thus, my load-mathjax and mathjax bundles (one for svg and one for chtml) are much simpler than yours. The loading is all coordinated via the window.MathJax object in discourse-math-mathjax.

I still have the same problem that I described before, namely that this commented out loader doesn’t work; I’ve got to use this CDN version instead. I really don’t know why.

I think that your code suffers from the same problem. That’s why AsciiMath doesn’t seem to work.

1 Like

Can you double-check my latest commit? I think I added a funnel for Ember, so now the Ember build puts all the files in the right place.

2 Likes

OK, I’ve got some very good news and some frustrating news.

First, you’re absolutely right that adding the funnel places those files in the correct place. I added the funnel to my branch and it now works great without the CDN dependency. :tada:

Unfortunately, I am unable to run your code at the moment. Whenever I navigate to a page with math on it, the math does not typeset and I see the following error message in the console:
Uncaught (in promise) Error: State EXPLORER already exists

I’m certain I had your code working before so I suppose it’s something I did. To be clear, though, I literally started an entirely fresh directory using the techniques described in Install Discourse on macOS for development.

git clone https://github.com/discourse/discourse.git ./discourse
cd ./discourse
bundle install
pnpm install
bundle exec rake db:create
bundle exec rake db:migrate
RAILS_ENV=test bundle exec rake db:create db:migrate

# In one terminal
bundle exec rails server

# In another terminal
bin/ember-cli

I then grabbed your code with

git checkout 71ad0305f812311f2a4570edf7c33f97de46c457
git switch -c mathjax-sam

Even from that fresh setup, I get the error.


At this point, I’m pretty happy with my version of the code but still curious about what’s going on with yours. I need to take a break from this for the holiday, though. I’m happy to take another look at it in a few days’ time.

One final point, though: As far as I know,

await import("tex-mml-chtml.js") // followed by
await import("input/asciimath.js")

shouldn’t work, which is effectively what your code is doing, I think.

I’m being imprecise with paths there but my point is that I don’t know that consecutive dynamic calls to import lead to the correct MathJax structure. I think that loading MathJax components is pretty complicated and that’s why they’ve got such a detailed loading process with the MathJax object and all.

Thanks so much for your help and patience @sam!

2 Likes

Been making progress here:

I moved the giant javascript payloads into a dedicated gem

This will make it significantly easier to stay up to date, plus MathJax is no longer checked in to the repo.

3 Likes

Hey Sam - I’ve been playing with this quite a lot today. It looks great! I do think there’s still quite a lot to do, though. Some of it, I can definitely help with. Some of it is possibly beyond my capacity, particularly with my university starting back up.

Anyways, here are a few of my thoughts.

Zoom

Zoom on hover is no longer available in MathJax V4. It’s easy to set it to zoom on alt-click, though. I’ve done so here:

Note that there’s a known MathJax bug that needs to be addressed with a little CSS, as described in this GitHub Issue. I’ve included that fix in this code as well.
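For anyone who wants the gist without following the link, the zoom setting lives in the menu options. The sketch below is my reading of the MathJax menu settings (a `zoom` trigger plus an `alt` modifier flag), so treat the exact keys as an assumption to check against the MathJax docs. `withAltClickZoom` is a hypothetical helper name, and the CSS fix mentioned above is not included.

```javascript
// Hypothetical helper: returns a copy of a MathJax config with zoom
// triggered by Alt+click, leaving all other settings untouched.
function withAltClickZoom(config = {}) {
  const options = config.options ?? {};
  const menuOptions = options.menuOptions ?? {};
  return {
    ...config,
    options: {
      ...options,
      menuOptions: {
        ...menuOptions,
        settings: {
          ...(menuOptions.settings ?? {}),
          zoom: "Click", // zoom trigger (assumed key and value)
          alt: true, // require the Alt key (assumed key)
        },
      },
    },
  };
}
```

Returning a copy rather than mutating the input keeps the helper safe to apply on top of whatever configuration the site settings have already produced.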

Loading options

As it stands, AsciiMath cannot be turned on and Accessibility cannot be turned off. I think that’s down to the way that submodules are loaded sequentially in load-mathjax.js.

As I stated in my last message, it’s much more common to pre-define a window.MathJax object that specifies which components you want. The MathJax object is redefined when the main script is loaded. That’s how I was able to get this working in my V3 version. I think I could incorporate that approach in your code base during the first part of next week, if you’d like me to try?
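Roughly, the idea is to compute the component list up front and hand it to MathJax in one go. Here is a minimal sketch, assuming the standard component names `input/asciimath` and `a11y/explorer`. The function name and option names are hypothetical and are not the ones in either code base.

```javascript
// Hypothetical helper: turns site-level choices into a single MathJax
// configuration object, instead of importing submodules one after another.
function buildMathJaxConfig({
  output = "chtml", // "chtml" or "svg"
  asciimath = false,
  accessibility = true,
} = {}) {
  const load = ["input/tex", `output/${output}`];
  if (asciimath) load.push("input/asciimath");
  if (accessibility) load.push("a11y/explorer");
  return {
    loader: { load },
    // Let the page code decide when to typeset.
    startup: { typeset: false },
  };
}

// In a browser, this would be assigned before the startup script loads:
//   window.MathJax = buildMathJaxConfig({ asciimath: true });
```

Because every optional component is decided before MathJax starts, turning AsciiMath on or Accessibility off becomes a matter of changing one flag rather than reordering imports.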

Once we get options figured out, it might also be worth considering if there are new options available in V4 that should be included.

The rich editor

This is just great - I’m super happy to see it!

I wonder if it would be possible to get a sparkly AI context menu available within the modal? I ask because students (and professors :confused:) sometimes have difficulty typing LaTeX. A little AI proofreader can make that so much smoother. I’ve incorporated that into my classroom Discourse and am looking forward to using it this coming semester.


OK, I’m sure there’s a lot more, but I’m about done for the day.

Thanks so much!!! :rocket: :fire: :tada:

3 Likes

I understand that the discourse-math plugin relies on the separate MathJax/KaTeX asset gem rather than vendoring those libraries directly, which keeps the plugin lightweight and allows the math libraries to be updated independently.

I’d like to help validate this ahead of the first production release by doing some real-world testing. My initial thought was to spin up a separate, disposable instance and enable the plugin there to test math-heavy content, asset loading via the standard pipeline, CSP behaviour, and performance.

Before doing so, I wanted to ask what the recommended environment is at this stage - whether early testing in a production-like setup is appropriate, or whether you’d prefer this to be done using a development environment until the first production release.

I’m very happy to test in whichever way is most useful and to report any issues or edge cases I encounter upstream. I can’t commit to a fixed testing schedule due to university commitments, but I’m happy to do best-effort testing when time allows, and I’m likely to have significantly more availability after 6 June.

I’ve got the options working well, now; you can see the code here:

Here are a few comments:

  • All configuration and loading is handled by the MathJaxInitConfig object defined in math-renderer.js.
  • I removed a fair amount of inert code from load-mathjax.js.
  • The ‘ui/safe’ extension is always loaded.
  • I’ve added a “Discourse math enable menu” option, which is true by default. When false, this removes the menu entirely, which makes MathJax even speedier.
  • The next two menu items are
    • Discourse math zoom on click and
    • Discourse math enable accessibility.
      These have no effect if the menu is disabled but are independent of one another when it is enabled.
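The way the three settings interact can be sketched as follows. This is not the code in math-renderer.js; `resolveMathOptions` and its parameter names are hypothetical, and only the `ui/safe` and `ui/menu` component names come from MathJax itself.

```javascript
// Hypothetical sketch of how the three settings combine.
function resolveMathOptions({
  enableMenu = true,
  zoomOnClick = false,
  enableAccessibility = false,
} = {}) {
  const load = ["ui/safe"]; // always loaded, per the list above
  if (enableMenu) load.push("ui/menu");
  return {
    load,
    // Both options depend on the menu but not on each other.
    zoomOnClick: enableMenu && zoomOnClick,
    accessibility: enableMenu && enableAccessibility,
  };
}
```

For example, `resolveMathOptions({ enableMenu: false, zoomOnClick: true })` reports zoom as off, which matches the rule that the two sub-options are inert without the menu.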

The whole menu looks like so:

I’ve not added any tests just yet but could give it a try, if you’d like a pull request.