GitHub onebox does not skip leading tabs

(Discourse.PRO) #1

An incorrect rendering (the code uses tabs):

A correct rendering (the code uses spaces):

(cpradio) #2

I don’t understand why this is a bug. The file contains tabs… so it is displaying tabs. Discourse isn’t adding any additional tabs, it is still displaying 4 of them.

Are you saying the onebox should replace tabs with spaces? As why should a onebox alter content of the file it is referencing?

(Gerhard Schlager) #3

The Github onebox strips leading spaces, but not tabs.

(Jeff Atwood) #4

If that’s the case then we might extend this to tabs as well for consistency so @dmitry_fedyuk has a point.

(cpradio) #5

Ah, so that seems to be as simple as changing (and its other usage ^[ ]{#{min_space}})
^[ ]*


(Dean Taylor) #6

There are two lines these change would need to be made on:

  m = l.match /^[ ]*/ # find leading spaces 0 or more


  re = "^[ ]{#{min_space}}"  #match the minimum spaces of the line

Please be specific about white spacing (i.e. use [ \t]), using \s may cause unicode white-space ghosts past, present and future to come back and kick your ass.

(cpradio) #7

Actually, I don’t think that is a concern, if Ruby’s regex uses PCRE, it only matches a subset of values unless it is specifically compiled in a way that can put a strain on performance using the PCRE_UCP option.

Plus that non-whitespace would have to be at the start of a line, which is unlikely too, granted, I guess someone could write a literal string across multiple lines.

Fairly certain \s is safe, but I also have nothing against specifying \t. Another approach would be to replace all instances of \t at the beginning of a line with 4 spaces and let the logic continue on its way since it already works with spaces.

So I’d say there is a very very small chance that \s would be problematic. Especially given that it would have to be a multi-line literal string. Otherwise, there would be zero chance of it the line starting with a unicode whitespace.