sam

Given:

Hi. The current Mathjax plugin can be quite annoying, for it only supports English punctuation.

For instance, $1+1$ is ok

However,$1+1$ fails. This really misses the point of rendering MathJax, because it only works for English text.

I use Mathjax for my blog. The original Mathjax or KaTeX works well with Chinese punctuation.

Look at the source of the discourse mathjax plugin, find the area of code where the issue is and suggest a fix.

sam

try again, the source is at: GitHub - discourse/discourse-math: Official MathJax support for Discourse

gpt4o_bot GitHub Helper
GitHub code search

Searched for 'MathJax' in discourse/discourse-math

GitHub file content

Retrieved content of assets/javascripts/initializers/discourse-math-mathjax.js from discourse/discourse-math@main

In the file assets/javascripts/initializers/discourse-math-mathjax.js, the initMathJax, decorate, and mathjax functions should be checked and potentially modified to ensure that Chinese punctuation is properly supported.

Identifying the Issue:

The main issue lies in the function decorate, where it’s handling the creation of mathScript and mathWrapper elements that render the MathJax content. If Chinese punctuation is not being processed correctly, it’s likely due to how the mathScript.innerText is being populated and subsequently rendered by MathJax.

Potential Fix:

To support Chinese punctuation, we should ensure that the content being passed to MathJax can handle multiple punctuation types, including those used in Chinese. MathJax’s configuration should allow for this.

Here’s a potential fix:

  1. Improving Regex (if needed): Modify the text processing part to ensure punctuation is correctly interpreted by MathJax.
  2. Additional MathJax Configuration: Customize MathJax configuration to include additional character sets or update the current input logic.

Proposed Update to decorate function:

function decorate(elem, isPreview) {
  if (elem.dataset.appliedMathjax) {
    return;
  }

  elem.dataset.appliedMathjax = true;

  let tag, classList, type;

  if (elem.classList.contains("math")) {
    tag = elem.tagName === "DIV" ? "div" : "span";
    const display = tag === "div" ? "; mode=display" : "";
    const displayClass = tag === "div" ? "block-math" : "inline-math";
    type = `math/tex${display}`;
    classList = `math-container ${displayClass} mathjax-math`;
  } else if (elem.classList.contains("asciimath")) {
    tag = "span";
    classList = "math-container inline-math ascii-math";
    type = "math/asciimath";
  }

  const mathScript = document.createElement("script");
  mathScript.type = type;

  // Fix: Ensure proper handling of multi-language punctuation
  mathScript.innerHTML = elem.innerText.replace(/,/g, ",").replace(/。/g, ".");

  const mathWrapper = document.createElement(tag);
  mathWrapper.classList.add(...classList.split(" "));
  mathWrapper.style.display = "none";

  mathWrapper.appendChild(mathScript);

  elem.after(mathWrapper);

  later(
    this,
    () => {
      window.MathJax.Hub.Queue(() => {
        // don't bother processing previews removed from DOM
        if (elem?.parentElement?.offsetParent !== null) {
          window.MathJax.Hub.Typeset(mathScript, () => {
            elem.style.display = "none";
            mathWrapper.style.display = null;
          });
        }
      });
    },
    isPreview ? 200 : 0
  );
}

This modification ensures that Chinese punctuation marks such as , and 。 are replaced with their English counterparts before MathJax processes the content. This might solve the rendering issue. If further customization is needed, you might need to delve deeper into MathJax configurations and extensions.

sam

I think you are barking up the wrong tree… look at discourse-math/assets/javascripts/lib/discourse-markdown/discourse-math.js at main · discourse/discourse-math · GitHub

gpt4o_bot GitHub Helper
GitHub file content

Retrieved content of assets/javascripts/lib/discourse-markdown/discourse-math.js from discourse/discourse-math@main

Thank you for pointing out the correct file. Let’s review the discourse-math.js for the MathJax integration:

Key Functions:

  1. isSafeBoundary function:
    This function checks if the character at a position is a valid boundary for MathJax processing.

  2. math_input function:
    This function processes inline math delimited by specific characters ($ for TeX, % for AsciiMath).

  3. inlineMath and asciiMath functions:
    These functions utilize math_input to handle $ and % delimited inline math, respectively.

  4. blockMath function:
    This function processes block math delimited by double $ signs.

  5. setup function:
    This function sets up the Markdown-it plugin for handling MathJax inline and block elements.

Suggested Fix:

The problem with Chinese punctuation might stem from the punctuation boundary checks. The isSafeBoundary function uses Western punctuation checks. We should modify this function to consider Chinese punctuation as valid boundaries.

Update isSafeBoundary Function:

Before:

function isSafeBoundary(character_code, delimiter_code, md) {
  if (character_code === delimiter_code) {
    return false;
  }

  if (md.utils.isWhiteSpace(character_code)) {
    return true;
  }

  if (md.utils.isMdAsciiPunct(character_code)) {
    return true;
  }

  if (md.utils.isPunctChar(character_code)) {
    return true;
  }

  return false;
}

After (including additional punctuation):

function isSafeBoundary(character_code, delimiter_code, md) {
  const chinesePunctuation = [
    0xFF0C, // ,
    0xFF1A, // :
    0xFF1B, // ;
    0xFF0E, // .
    0x3002, // 。
    0xFF1F, // ?
    0xFF01, // !
  ];

  if (character_code === delimiter_code) {
    return false;
  }

  if (md.utils.isWhiteSpace(character_code)) {
    return true;
  }

  if (md.utils.isMdAsciiPunct(character_code)) {
    return true;
  }

  if (md.utils.isPunctChar(character_code)) {
    return true;
  }

  if (chinesePunctuation.includes(character_code)) {
    return true;
  }

  return false;
}

Summary:

  • Updated isSafeBoundary to recognize Chinese punctuation as valid boundaries for MathJax processing.
  • This should allow inline math within Chinese punctuation to be correctly processed by MathJax.

Applying this change should resolve the issue with MathJax not working correctly with Chinese punctuation.
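A caveat worth checking against the installed markdown-it version: as far as I know, md.utils.isPunctChar tests a one-character string against a Unicode punctuation pattern rather than accepting a numeric code. If that holds, passing character_code directly would never match, and converting it with String.fromCharCode might be enough on its own, without a hand-written list. The sketch below illustrates the coercion pitfall using a hypothetical isPunctString helper built on /\p{P}/u. It is a stand-in, not markdown-it's own function.

```javascript
// Assumption: a string-based punctuation test, standing in for a library helper.
function isPunctString(ch) {
  return /\p{P}/u.test(ch);
}

const FULLWIDTH_COMMA = 0xff0c;

// Passing the numeric code: RegExp.prototype.test coerces 65292 to the
// string "65292", which contains only digits, so nothing matches.
console.log(isPunctString(FULLWIDTH_COMMA)); // false

// Converting the code to a character first gives the expected answer.
console.log(isPunctString(String.fromCharCode(FULLWIDTH_COMMA))); // true
```

If the same behaviour shows up with the real helper, wrapping the argument is a smaller change than maintaining a per-language list.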

sam

are there any other languages one should add?

gpt4o_bot GitHub Helper

It’s a good question. There are various punctuation marks used in different languages that could potentially interfere with MathJax rendering. While Chinese is the one mentioned specifically, it’s prudent to consider a broader set of punctuation marks from other languages commonly used with MathJax.

Common Punctuation Marks from East Asian Languages:

  1. Japanese:

    • 、 (ideographic comma): U+3001
    • 。 (ideographic full stop): U+3002
    • ? (full-width question mark): U+FF1F
    • ! (full-width exclamation mark): U+FF01
  2. Korean:

    • 、 (ideographic comma): U+3001
    • 。 (ideographic full stop): U+3002
    • : (full-width colon): U+FF1A
    • ; (full-width semicolon): U+FF1B

Common Punctuation Marks from Other Languages:

  • Arabic:

    • ، (Arabic comma): U+060C
    • ؛ (Arabic semicolon): U+061B
    • ؟ (Arabic question mark): U+061F
  • Thai:

    • ฯ (paiyannoi, used as an abbreviation mark): U+0E2F

Revised isSafeBoundary Function:

function isSafeBoundary(character_code, delimiter_code, md) {
  const additionalPunctuation = [
    // Chinese and Japanese punctuation
    0x3001, // 、
    0x3002, // 。
    
    // Full-width punctuation used in East Asian languages
    0xFF0C, // ,
    0xFF1A, // :
    0xFF1B, // ;
    0xFF0E, // .
    0xFF1F, // ?
    0xFF01, // !
    
    // Arabic punctuation
    0x060C, // ،
    0x061B, // ؛
    0x061F, // ؟

    // Thai punctuation
    0x0E2F, // ฯ
  ];

  if (character_code === delimiter_code) {
    return false;
  }

  if (md.utils.isWhiteSpace(character_code)) {
    return true;
  }

  if (md.utils.isMdAsciiPunct(character_code)) {
    return true;
  }

  if (md.utils.isPunctChar(character_code)) {
    return true;
  }

  if (additionalPunctuation.includes(character_code)) {
    return true;
  }

  return false;
}

Summary:

  • Ensure safe boundaries for a variety of languages.
  • Include punctuation commonly used in East Asian languages (Chinese, Japanese, Korean) and some punctuation from Arabic and Thai.

By making the isSafeBoundary function inclusive of these punctuation characters, MathJax rendering should work correctly for a wider range of languages and their respective punctuation marks.
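As the list grows, hex codes and their comments can drift apart. One possible variation, sketched here under my own naming (EXTRA_BOUNDARY_CHARS, extraBoundaryCodes, and isExtraBoundary are hypothetical and not part of the plugin), derives the code points from the literal characters and stores them in a Set for constant-time lookup.

```javascript
// Hypothetical helper: build the lookup from literal characters so the
// code points cannot fall out of sync with hand-written hex comments.
const EXTRA_BOUNDARY_CHARS = [
  "、", "。", // ideographic comma and full stop
  ",", ":", ";", ".", "?", "!", // full-width forms
  "،", "؛", "؟", // Arabic comma, semicolon, question mark
  "ฯ", // Thai paiyannoi
];

const extraBoundaryCodes = new Set(
  EXTRA_BOUNDARY_CHARS.map((ch) => ch.codePointAt(0))
);

// Same contract as the includes() check: takes a numeric character code.
function isExtraBoundary(characterCode) {
  return extraBoundaryCodes.has(characterCode);
}

console.log(isExtraBoundary(0x3002)); // true (。)
console.log(isExtraBoundary(0x0041)); // false (A)
```

The check inside isSafeBoundary would then call isExtraBoundary(character_code) in place of additionalPunctuation.includes(character_code).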

sam

look at pretty text spec and suggest a test for this change

gpt4o_bot GitHub Helper
GitHub search files

Searched for 'test' in discourse/discourse-math/main

Since there are no existing test files identified in the main branch for discourse-math, it would be helpful to establish a new test file.

Suggested Test using Pretty Text Spec

Here is one way to add a test case to check that the new punctuation marks are handled correctly:

Folder Structure

  1. Create a new file, for example: test/javascripts/lib/discourse-markdown/discourse-math-test.js

Test Content

Below is a basic test structure using a JavaScript testing framework, assuming you are using something like QUnit or Mocha.

import { setup as setupDiscourseMath } from 'discourse/plugins/discourse-math/discourse-markdown/discourse-math';
import { module, test } from 'qunit'; // or Mocha, Jest, etc.
import MarkdownIt from 'markdown-it';

module('Discourse Math Plugin - Punctuation Support', function (hooks) {
  let md;

  hooks.beforeEach(function () {
    md = new MarkdownIt();
    setupDiscourseMath({
      markdownIt: md,
      registerOptions: () => {},
      registerPlugin: (cb) => cb(md)
    });
  });

  test('it renders inline math with Chinese punctuation', function (assert) {
    let result = md.render('这是一个测试,$a^2 + b^2 = c^2$,这是另一个测试。');
    assert.ok(result.includes('<span class=\'math\'>a^2 + b^2 = c^2</span>'), 'Successfully processed Chinese punctuation');
  });

  // Regression check: this sample contains no non-English punctuation.
  test('it still renders block math', function (assert) {
    let result = md.render('$$\na^2 + b^2 = c^2\n$$');
    assert.ok(result.includes("<div class='math'>\na^2 + b^2 = c^2\n</div>"), 'Successfully processed block math');
  });
  
  test('it renders inline math with Japanese punctuation', function (assert) {
    let result = md.render('これはテストです、$a^2 + b^2 = c^2$、これもテストです。');
    assert.ok(result.includes('<span class=\'math\'>a^2 + b^2 = c^2</
sam

you searched for test… search for spec…

gpt4o_bot GitHub Helper
GitHub search files

Searched for 'spec' in discourse/discourse-math/main

GitHub file content

Retrieved content of spec/pretty_text_spec.rb from discourse/discourse-math@main

The pretty_text_spec.rb file contains tests that check how PrettyText renders math expressions with the Discourse Math plugin enabled. We can add new test cases to verify that inline math is recognized next to Chinese, Japanese, and other non-English punctuation.

Adding Tests for Various Punctuation Handling

Here’s how you can extend the current test suite to include tests for Chinese and other punctuation:

# frozen_string_literal: true

require "rails_helper"

describe PrettyText do
  context "with discourse math" do
    before { SiteSetting.discourse_math_enabled = true }

    # Existing tests...
    # ...

    it "can handle inline math with Chinese punctuation" do
      cooked = PrettyText.cook('这是一个测试,$a^2 + b^2 = c^2$,这是另一个测试。')
      html = '<p>这是一个测试,<span class="math">a^2 + b^2 = c^2</span>,这是另一个测试。</p>'
      expect(cooked).to eq(html)
    end

    it "can handle inline math with Japanese punctuation" do
      cooked = PrettyText.cook('これはテストです、$a^2 + b^2 = c^2$、これもテストです。')
      html = '<p>これはテストです、<span class="math">a^2 + b^2 = c^2</span>、これもテストです。</p>'
      expect(cooked).to eq(html)
    end

    it "can handle inline math with Arabic punctuation" do
      cooked = PrettyText.cook('هذا اختبار،$a^2 + b^2 = c^2$،هذا اختبار آخر.')
      html = '<p>هذا اختبار،<span class="math">a^2 + b^2 = c^2</span>،هذا اختبار آخر.</p>'
      expect(cooked).to eq(html)
    end

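    # Regression check: this sample contains no non-English punctuation.
    # The expected markup is unverified; mirror the existing block math example in this spec.
    it "still handles block math" do
      cooked = PrettyText.cook("$$\na^2 + b^2 = c^2\n$$")
      html = "<p><div class='math'>\na^2 + b^2 = c^2\n</div></p>"
      expect(cooked).to eq(html)
    end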
  end
end

Explanation:

  • Chinese Punctuation Test: Checks inline math placed directly between full-width commas.
  • Japanese Punctuation Test: Checks inline math placed directly between ideographic commas.
  • Arabic Punctuation Test: Checks inline math placed directly between Arabic commas.
  • Block Math Test: Checks that block math still renders after the change. The sample contains no non-English punctuation, so it acts as a regression check only.

By adding these tests, you can ensure that the plugin correctly interprets various punctuation marks from different languages, enhancing its robustness in multilingual environments.