Developer's guide to Markdown extensions


(Sam Saffron) #1

Discourse recently moved to a new Markdown engine called Markdown-it.

Here are some dev notes that will help you either fix bugs in core or create new plugins.

The Basics

Discourse only adds a few helpers on top of the engine, so most of what you need to learn is Markdown It itself.

The docs directory contains the current documentation.

I strongly recommend reading:

While I develop extensions for the engine I usually open up a second editor looking at existing rules. The engine consists of a long list of rules and each rule is in a dedicated file that is reasonably easy to follow.

If I am working on an inline rule I will think of what existing inline rule works more or less like it and base my work on it.

Keep in mind that you can sometimes get the desired functionality just by changing a renderer, which is usually much easier.

How to structure an extension?

When the markdown engine initializes it searches through all the modules.

If a module's name matches /discourse-markdown\/|markdown-it\// (meaning it lives in a discourse-markdown or markdown-it directory) it will be a candidate for initialization.

If the module exports a method called setup it will be called by the engine during initialization.
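As a rough illustration of the discovery step, the sketch below filters a list of module names with the regular expression quoted above. The module names and the findCandidates function are made up for this example; the engine's actual loader code is not shown here.

```javascript
// Pattern quoted in the guide: a module qualifies if its name contains
// a discourse-markdown/ or markdown-it/ path segment.
const CANDIDATE_PATTERN = /discourse-markdown\/|markdown-it\//;

// Hypothetical helper, for illustration only (not a Discourse API).
function findCandidates(moduleNames) {
  return moduleNames.filter((name) => CANDIDATE_PATTERN.test(name));
}

// Example module names, invented for this sketch.
const names = [
  "discourse/plugins/my-plugin/discourse-markdown/awesome-extension",
  "discourse/plugins/my-plugin/initializers/setup-ui",
  "some-package/markdown-it/another-extension",
];

// Only the first and third names contain a matching path segment.
console.log(findCandidates(names));
```

A module that matches is still only a candidate: it is initialized only if it also exports setup.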

The setup protocol

/my-plugins/assets/javascripts/discourse-markdown/awesome-extension.js.es6

export function setup(helper) {
   // ... your code goes here
} 

A setup method gets access to a helper object it can use for initialization. This contains the following methods and vars:

  • bool markdownIt : this property is set to true when the new engine is in use. For proper backwards compatibility you want to check it.

  • registerOptions(cb(opts, siteSettings, state)) : the provided function is called before the markdown engine is initialized; you can use it to decide whether to enable or disable the engine.

  • whiteList([spec, ...]): this method is used to whitelist HTML with our sanitizer.

  • registerPlugin(func(md)): this method is used to register a Markdown It plugin.

Putting it all together

function amazingMarkdownItInline(state, silent) {
   // standard markdown it inline extension goes here.
   return false;
}

export function setup(helper) {
   if(!helper.markdownIt) { return; }

   helper.registerOptions((opts,siteSettings)=>{
      opts.features['my_extension'] = !!siteSettings.my_extension_enabled;
   });

   helper.whiteList(['span.amazing', 'div.amazing']);

   helper.registerPlugin(md=>{
      md.inline.ruler.push('amazing', amazingMarkdownItInline);
   });
}
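One way to sanity check a setup function without booting Discourse is to hand it a stub helper that only records what was registered. This is a sketch under assumptions: createStubHelper and the shape of its calls object are invented here, and the export keyword is dropped so the snippet runs as a plain script.

```javascript
// Hand-rolled stand-in for the helper object (hypothetical, not a Discourse API).
// It records every registration so the result can be inspected afterwards.
function createStubHelper() {
  const calls = { options: [], whiteList: [], plugins: [] };
  return {
    markdownIt: true,
    registerOptions: (cb) => calls.options.push(cb),
    whiteList: (specs) => calls.whiteList.push(...specs),
    registerPlugin: (fn) => calls.plugins.push(fn),
    calls,
  };
}

// Placeholder inline rule; a real rule would inspect state.src at state.pos.
function amazingMarkdownItInline(state, silent) {
  return false;
}

// Same shape as the setup function in the guide, minus `export`.
function setup(helper) {
  if (!helper.markdownIt) { return; }

  helper.registerOptions((opts, siteSettings) => {
    opts.features["my_extension"] = !!siteSettings.my_extension_enabled;
  });

  helper.whiteList(["span.amazing", "div.amazing"]);

  helper.registerPlugin((md) => {
    md.inline.ruler.push("amazing", amazingMarkdownItInline);
  });
}

const helper = createStubHelper();
setup(helper);

// Run the recorded options callback against fake opts and site settings.
const opts = { features: {} };
helper.calls.options[0](opts, { my_extension_enabled: 1 });

console.log(opts.features, helper.calls.whiteList, helper.calls.plugins.length);
```

The stub never runs the registered plugin function, so it checks the wiring only, not the Markdown It rule itself.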

Discourse specific extensions

BBCode

Discourse contains 2 rulers you can use for custom BBCode tags: an inline ruler and a block level ruler.

Inline bbcode rules are ones that live in an inline paragraph like [b]bold[/b]

Block level rules apply to multiple lines of text like:

[poll]
- option 1

- options 2
[/poll]

md.inline.bbcode.ruler holds a list of inline rules that are applied in order.

md.block.bbcode.ruler holds a list of block level rules

There are many examples for inline rules at: bbcode-inline.js.es6

Quotes and polls are good examples of bbcode block rules.

Inline BBCode rules

An inline BBCode rule is an object containing information about how to handle a tag.

For example:

md.inline.bbcode.ruler.push('underline', {
    tag: 'u',
    wrap: 'span.bbcode-u'
});

Will cause

test [u]test[/u]

To be converted to:

test <span class='bbcode-u'>test</span>
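To make the string form of wrap concrete, here is a small sketch of how a "tag.class" spec could map onto the HTML shown above. parseWrapSpec and renderWrapped are hypothetical names for this example; how Discourse parses the spec internally is an assumption, and the inner HTML is assumed to be already escaped.

```javascript
// Hypothetical helper: splits a "tag.class" wrap spec into its parts.
function parseWrapSpec(spec) {
  const [tag, ...classes] = spec.split(".");
  return { tag, className: classes.join(" ") };
}

// Hypothetical renderer: wraps already-escaped inner HTML in the described element.
function renderWrapped(spec, innerHtml) {
  const { tag, className } = parseWrapSpec(spec);
  const classAttr = className ? ` class='${className}'` : "";
  return `<${tag}${classAttr}>${innerHtml}</${tag}>`;
}

console.log(renderWrapped("span.bbcode-u", "test"));
// prints: <span class='bbcode-u'>test</span>
```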

Inline rules can either wrap or replace text. When wrapping you can also pass in a function to gain extra flexibility.

 md.inline.bbcode.ruler.push('url', {
      tag: 'url',
      wrap: function(startToken, endToken, tagInfo, content) {
        const url = (tagInfo.attrs['_default'] || content).trim();

        if (simpleUrlRegex.test(url)) {
          startToken.type = 'link_open';
          startToken.tag = 'a';
          startToken.attrs = [['href', url], ['data-bbcode', 'true']];
          startToken.content = '';
          startToken.nesting = 1;

          endToken.type = 'link_close';
          endToken.tag = 'a';
          endToken.content = '';
          endToken.nesting = -1;
        } else {
          // just strip the bbcode tag
          endToken.content = '';
          startToken.content = '';

          // edge case, we don't want this detected as a onebox if auto linked
          // this ensures it is not stripped
          startToken.type = 'html_inline';
        }

        return false;
      }
    });

The wrapping function provides access to:

  • The tagInfo, whose attrs property is a dictionary of key/values specified via bbcode.

    [test=testing] -> {_default: "testing"}
    [test a=1] -> {a: "1"}

  • The token starting the inline

  • The token finishing the inline

  • The content of the bbcode inline

Using this information you can handle all sorts of wrapping needs.
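The mapping from an opening tag to its key/values can be sketched as below. This is not the Discourse parser: parseTagInfo and stripQuotes are hypothetical helpers that only reproduce the two cases listed above ([test=testing] and [test a=1]), and quoting rules are an assumption.

```javascript
// Hypothetical helper: removes one pair of matching surrounding quotes, if present.
function stripQuotes(value) {
  return value.replace(/^(["'])(.*)\1$/, "$2");
}

// Hypothetical parser, NOT the Discourse implementation.
// Turns "[tag=value]" or "[tag key=value ...]" into { tag, attrs }.
function parseTagInfo(openingTag) {
  const inner = openingTag.trim().replace(/^\[/, "").replace(/\]$/, "");
  const nameMatch = inner.match(/^([a-z][a-z0-9]*)/i);
  if (!nameMatch) {
    return null;
  }

  const tag = nameMatch[1].toLowerCase();
  const rest = inner.slice(nameMatch[0].length);
  const attrs = {};

  // [tag=value] form: the whole remainder is the default attribute.
  if (rest.startsWith("=")) {
    attrs._default = stripQuotes(rest.slice(1).trim());
    return { tag, attrs };
  }

  // [tag key=value other=value] form: collect every key/value pair.
  const pairPattern = /([a-z0-9_-]+)=("[^"]*"|'[^']*'|[^\s]+)/gi;
  let match;
  while ((match = pairPattern.exec(rest)) !== null) {
    attrs[match[1]] = stripQuotes(match[2]);
  }
  return { tag, attrs };
}

console.log(parseTagInfo("[test=testing]").attrs); // { _default: 'testing' }
console.log(parseTagInfo("[test a=1]").attrs);     // { a: '1' }
```

Note that every value is a string, which is why the url example above trims tagInfo.attrs['_default'] before testing it.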

Occasionally you may want to replace the entire BBCode block; for that you can use replace:

  md.inline.bbcode.ruler.push('code', {
      tag: 'code',
      replace: function(state, tagInfo, content) {
        let token;
        token = state.push('code_inline', 'code', 0);
        token.content = content;
        return true;
      }
    });

In this case we are replacing an entire [code]code block[/code] with a single code_inline token.

Block BBCode rules

Block bbcode rules allow you to replace an entire block. The block APIs are the same for simple cases:

md.block.bbcode.ruler.push('happy',{
   tag: 'happy',
   wrap: 'div.happy'
});

[happy]
hello
[/happy] 

will become

<div class='happy'>
hello
</div>

The function wrapper has a slightly different API because there are no wrapping tokens.

md.block.bbcode.ruler.push('money', {
   tag: 'money',
   wrap: function(token, tagInfo) {
      token.attrs = [['data-money', tagInfo.attrs['_default']]];
      return true;
   }
});

[money=100]
**test**
[/money]

Will become

<div data-money='100'>
<b>test</b>
</div>

You can gain full control over block rendering with the before and after rules. This allows you to do things like double nesting a tag.

md.block.bbcode.ruler.push('ddiv', {
    tag: 'ddiv',
    before: function(state, tagInfo) {
        state.push('div_open', 'div', 1);
        state.push('div_open', 'div', 1);
    },
    after: function(state) {
        state.push('div_close', 'div', -1);
        state.push('div_close', 'div', -1);
     }
});

[ddiv]
test
[/ddiv]

will become

<div>
<div>
test
</div>
</div>
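To see what before and after contribute, the sketch below runs the ddiv rule against a stub state and checks that the open and close tokens balance. createStubState and runBlockRule are hypothetical stand-ins; the real engine parses the block body itself and its state object has more to it than a push method.

```javascript
// Minimal stand-in for the state object (hypothetical):
// push() records a token and returns it, like the calls in the rule above.
function createStubState() {
  const tokens = [];
  return {
    tokens,
    push(type, tag, nesting) {
      const token = { type, tag, nesting, content: "" };
      tokens.push(token);
      return token;
    },
  };
}

// The ddiv rule from the guide, as a plain object.
const ddivRule = {
  tag: "ddiv",
  before(state, tagInfo) {
    state.push("div_open", "div", 1);
    state.push("div_open", "div", 1);
  },
  after(state) {
    state.push("div_close", "div", -1);
    state.push("div_close", "div", -1);
  },
};

// Hypothetical driver: before(), then the body, then after().
function runBlockRule(rule, body) {
  const state = createStubState();
  rule.before(state, { tag: rule.tag, attrs: {} });
  state.push("text", "", 0).content = body;
  rule.after(state);
  return state.tokens;
}

const tokens = runBlockRule(ddivRule, "test");

// Every +1 pushed in before() needs a matching -1 in after().
const depth = tokens.reduce((sum, token) => sum + token.nesting, 0);
console.log(tokens.map((token) => token.type), depth);
```

If the nesting values do not sum to zero, the rule leaves an element open.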

Handling text replacements

Discourse ships with an extra special core rule for applying regular expressions to text.

md.core.textPostProcess.ruler

To use:

md.core.textPostProcess.ruler.push('onlyfastcars', {
   matcher: /(car)|(bus)/,  //regex flags are NOT supported
   onMatch: function(buffer, matches, state) {
        let token = new state.Token('text', '', 0);
        token.content = 'fast ' + matches[0];
        buffer.push(token);
    }
});

I like cars and buses

Will become

<p>I like fast cars and fast buses</p>
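Because regex flags are not supported on the matcher, a case-insensitive match has to be spelled out in the pattern itself. One possible approach, sketched below, is to expand each letter into a character class. caseInsensitiveSource is a hypothetical helper written for this example and assumes plain ASCII words.

```javascript
// Hypothetical helper: builds a flag-free pattern source that matches `word`
// in any letter case by turning each letter into a [xX] character class.
function caseInsensitiveSource(word) {
  return word
    .split("")
    .map((ch) => {
      const lower = ch.toLowerCase();
      const upper = ch.toUpperCase();
      if (lower !== upper) {
        return `[${lower}${upper}]`;
      }
      // Not a letter: escape regex metacharacters and keep it literal.
      return ch.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
    })
    .join("");
}

// Equivalent of /(car)|(bus)/i, but without using the i flag.
const matcher = new RegExp(
  `(${caseInsensitiveSource("car")})|(${caseInsensitiveSource("bus")})`
);

console.log(matcher.source); // ([cC][aA][rR])|([bB][uU][sS])
console.log(matcher.flags === ""); // true
```

The resulting matcher can be passed to md.core.textPostProcess.ruler.push in place of the /(car)|(bus)/ literal.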

#2

I've been playing around with md.core.textPostProcess.ruler and noticed that the matcher doesn't recognize the case-insensitive i regex flag.

Is this by design or a bug?


(Sam Saffron) #3

By design; it is a limitation of the rule. You will need to adjust your regex.


#4

Thanks! I already redid my regex, but it did take me a while to figure out what was wrong.


(Sam Saffron) #5

I just made the guide a wiki. Do you mind updating it with that info?


(Eli the Bearded) #6

Is there an easy way to include existing markdown-it plugins? I can see this one as being useful:


(Sam Saffron) #7

It is pretty straightforward, I just need to write a tutorial for it. But in this specific example the Font Awesome syntax would collide with emoji, which in turn would lead to many tears.


(Minsik Yoon) #8

replaceBlock is deprecated, please use the new markdown it APIs

I got this message after upgrading to Discourse 1.9.
Is there any advice on how to fix this problem?

Below is the plugin code.

helper.replaceBlock({
  start: new RegExp("\\[schedule((?:\\s+(?:" + WHITELISTED_ATTRIBUTES.join("|") + ")=(?:['\"][^\\n]+['\"]|[^\\s\\]]+))+)\\]([\\s\\S]*)", "igm"),
  stop: /\[\/schedule\]/igm,

  emitter(blockContents, matches) {
    const attributes = { "class": "discourse-calendar-schedule discourse-ui-card" };
    const contents = [];

    if (blockContents.length){
      const postProcess = bc => {
        if (typeof bc === "string" || bc instanceof String) {
          const processed = this.processInline(String(bc));
          if (processed.length) {
            contents.push(["p"].concat(processed));
          }
        } else {
          contents.push(bc);
        }
      };

      let b;
      while ((b = blockContents.shift()) !== undefined) {
        this.processBlock(b, blockContents).forEach(postProcess);
      }
    }

Thank you!


(Sam Saffron) #9

Have a look at the implementation here for spoiler alert for an example of bbcode processing:


(Jesse Griffin) #10

I’m trying to develop an extension that adds Mermaid JS support to Discourse. The problem I’m running into is that by the time Mermaid JS gets the data in my [mermaid] block, it has HTML intermixed in it. How can I prevent Markdown-it / Discourse from adding HTML into my [mermaid] block?

I’ve tried the regular wrap, the before and after route, and most recently the replace function. What I’m aiming for is to get a block like this:

[mermaid]
gantt
    title A Gantt Diagram
    dateFormat  YYYY-MM-DD
    section Section
    A task           :a1, 2014-01-01, 30d
    Another task     :after a1  , 20d
    section Another
    Task in sec      :2014-01-12  , 12d
    another task      : 24d
[/mermaid]

to be wrapped in a mermaid div class so that Mermaid JS library can pick it up and turn it into an SVG.

Latest plugin code attempt (using inline, though I’ve tried block too):

export function setup(helper) {
   if(!helper.markdownIt) { return; }

   helper.whiteList(['div.mermaid']);

   helper.registerPlugin(md=>{
      md.inline.bbcode.ruler.push('mermaid',{
        tag: 'mermaid',
        replace: function(state, tagInfo, content) {
          let token;
          token = state.push('code_inline', 'code', 0);
          state.push('div_open', 'div', 1);
          token.attrs = [['class', 'mermaid']];
          token.content = content;
          state.push('div_close', 'div', -1);
          return true;
        }
      });
   });
}

The error I’m getting is:

flow.js:316 Uncaught Error: Parse error on line 1:
<p>gantt<br/>dateFo
^
Expecting 'NEWLINE', 'SPACE', 'GRAPH', got 'TAGSTART'
    at Yt.parseError (flow.js:314)
    at Yt.parse (flow.js:384)
    at Object.e.getClasses (flowRenderer.js:220)
    at Object.render (mermaidAPI.js:367)
    at s (mermaid.js:96)
    at Object.init (mermaid.js:78)
    at s (mermaid.js:131)
    at mermaid.js:149

Should I be going about this in a completely different way or is there something I’m missing?

My code is at GitHub - unfoldingWord-dev/discourse-mermaid: Adds the Mermaid JS library to discourse


(Kane York) #11

Did you try doing block.bbcode?


(Sam Saffron) #12

The easiest way to unblock yourself here would be to base this on the discourse math implementation. It is more verbose, but this will probably work:

const escaped = state.md.utils.escapeHtml(content);
token.content = `<div class='math'>\n${escaped}\n</div>\n`;

Also, a cool thing for you to check out is that Graphviz has a JS port.

Note that the token type is going to have to be html_raw to bypass the internal escaping.


(Jesse Griffin) #13

Awesome, thanks for the advice! I have the code here working now. The relevant snippet is:

  helper.registerPlugin(md=>{
    md.inline.bbcode.ruler.push('mermaid',{
      tag: 'mermaid',
      replace: function(state, tagInfo, content) {
        let token = state.push('html_raw', '', 0);
        const escaped = state.md.utils.escapeHtml(content);
        token.content = `<div class='mermaid'>\n${escaped}\n</div>\n`;
        return true;
      }
    });
  });



(Sam Saffron) #14

Awesome, do you mind doing a dedicated post in #plugin about it?