Incorrect attachment markup generation with underscores in the filename

ValdikSS · April 7, 2026, 4:26

I found a small bug when uploading a file:

If the filename starts and ends with an underscore (_test_file_.txt in this case), the markup generation code does not escape the underscores, which produces the following "filename":
test_file.txt|attachment (23 bytes)

[_test_file_.txt|attachment](upload://eSJGButIpkpu4IEifmmispiFRJu.txt) (23 bytes)

If I escape the first underscore with a backslash, it is parsed correctly:

_test_file_.txt (23 bytes)

[\_test_file_.txt|attachment](upload://eSJGButIpkpu4IEifmmispiFRJu.txt) (23 bytes)
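
The manual workaround above can be generalized. Here is a minimal sketch, not Discourse's actual code: the helper names `escapeFilenameForMarkdown` and `buildAttachmentMarkdown` are hypothetical, and the set of escaped characters (including the backslash itself) is an assumption. Unlike the hand-edited example, it escapes every underscore rather than only the first.

```javascript
// Characters assumed to carry inline meaning in Markdown link text.
// The backslash is included so that existing backslashes stay literal.
const MARKDOWN_SPECIAL = /[\\_*~`[\]|]/g;

// Hypothetical helper: prefix each special character with a backslash.
function escapeFilenameForMarkdown(filename) {
  return filename.replace(MARKDOWN_SPECIAL, "\\$&");
}

// Hypothetical helper: build the attachment markup shown in the report.
// The "|attachment" marker is appended after escaping, so it stays unescaped.
function buildAttachmentMarkdown(filename, uploadUrl) {
  return `[${escapeFilenameForMarkdown(filename)}|attachment](${uploadUrl})`;
}

console.log(buildAttachmentMarkdown("_test_file_.txt", "upload://example.txt"));
// prints: [\_test\_file\_.txt|attachment](upload://example.txt)
```

Escaping at generation time keeps the raw post readable while ensuring the link text reaches the Markdown parser as plain text.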

zogstrip · April 7, 2026, 2:33

Thanks for the report, @ValdikSS. Here's an attempt to fix this once and for all:

github.com/discourse/discourse

FIX: Escape markdown characters in upload filenames (#39133)

main ← fix/escape-markdown-in-upload-filenames

opened 02:33PM - 07 Apr 26 UTC

ZogStriP

+159 -58

Filenames containing markdown formatting characters (`_`, `*`, `~`, `` ` ``, `[`, `]`, `|`) would break upload markup when cooked. For example, uploading `_test_file_.txt` generated:

[_test_file_.txt|attachment](upload://...)

The underscores triggered emphasis parsing inside the link text, which both rendered the filename incorrectly (with italics) and prevented the `|attachment` marker from being recognized, losing the `class="attachment"` on the resulting `<a>` tag.

**Markdown generation (defense in depth)**

Add `escapeMarkdownCharacters` (JS) and `UploadMarkdown.escape_markdown` (Ruby) to backslash-escape all inline formatting characters in filenames before embedding them in markdown link text. Applied in:

- `UploadMarkdown`: image, attachment, and playable media methods
- `uploads.js`: `attachmentMarkdown` and `markdownNameFromFileName`
- `inline_uploads.rb`: HTML anchor conversion and hotlinked image URLs
- `to-markdown.js`: HTML-to-markdown attachment link reconstruction
- `sanitizeAlt` in `markdown-image-builder.js`: image alt text

**Parser resilience (belt and suspenders)**

The markdown-it `renderAttachment` renderer and ProseMirror's link parser both assumed `tokens[idx+1]` was a single text token containing the full link text. When emphasis/bold/strikethrough/code was parsed inside the link text, the token sequence included formatting tokens and the `|attachment` marker was lost. Both now scan forward through all tokens between `link_open` and `link_close` to find the marker.

The image renderer (`renderImageOrPlayableMedia`) split alt text on `|`, assuming the first segment was always the alt and everything after was structured suffixes (dimensions, video/audio, data attributes). A pipe in the filename would produce extra segments that confused the dimension parser. It now scans from the right, consuming known suffixes, and treats everything remaining as alt text.

https://meta.discourse.org/t/400079
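
The two parser-resilience ideas in the description can be illustrated with a small sketch. It is not the code from the pull request. The token objects are a simplified, hypothetical stand-in for markdown-it inline tokens, the function names `findAttachmentMarker` and `splitImageAlt` are invented for illustration, and the suffix grammar (dimensions such as `690x388` or `690x388, 50%`, plus `video` and `audio`) is an assumption that leaves out data attributes.

```javascript
// Hypothetical, simplified token shape modelled on markdown-it inline tokens:
// { type: "link_open" | "link_close" | "em_open" | "em_close" | "text", content: string }
const ATTACHMENT_MARKER = "|attachment";

// Scan every token between link_open and the next link_close instead of
// trusting tokens[openIdx + 1] to hold the whole link text.
// Returns the index of the text token ending with the marker, or -1.
function findAttachmentMarker(tokens, openIdx) {
  for (let i = openIdx + 1; i < tokens.length; i++) {
    const token = tokens[i];
    if (token.type === "link_close") {
      break;
    }
    if (token.type === "text" && token.content.endsWith(ATTACHMENT_MARKER)) {
      return i;
    }
  }
  return -1;
}

// Assumed suffix grammar: "690x388", "690x388, 50%", "video", "audio".
const DIMENSIONS = /^\d+x\d+(,\s*\d+%)?$/;
const MEDIA_KINDS = new Set(["video", "audio"]);

// Consume known suffixes from the right; whatever remains is alt text,
// even when it contains "|" itself.
function splitImageAlt(raw) {
  const parts = raw.split("|");
  const suffixes = [];
  while (parts.length > 1) {
    const last = parts[parts.length - 1].trim();
    if (!DIMENSIONS.test(last) && !MEDIA_KINDS.has(last)) {
      break;
    }
    suffixes.unshift(last);
    parts.pop();
  }
  return { alt: parts.join("|"), suffixes };
}

// Assumed token sequence for the reported link text once "_test_file_"
// has been parsed as emphasis.
const exampleTokens = [
  { type: "link_open", content: "" },
  { type: "em_open", content: "" },
  { type: "text", content: "test_file" },
  { type: "em_close", content: "" },
  { type: "text", content: ".txt|attachment" },
  { type: "link_close", content: "" },
];

console.log(findAttachmentMarker(exampleTokens, 0)); // prints: 4
console.log(splitImageAlt("a|b.png|690x388").alt); // prints: a|b.png
```

In the example sequence, the token right after `link_open` is `em_open`, which is why a lookup limited to `tokens[idx+1]` misses the marker while the forward scan finds it.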
