Improving pinned topic excerpts

Our current pinned topic excerpt algorithm leaves much to be desired.


It basically takes all the words in the post that fit in the first 220 chars, strips formatting, mushes them together, and TADA.

This leaves moderators little to no control over what is displayed in the excerpt and can lead to a cluttered view.

Instead I would like to simplify the algorithm to:

Take all the words in the first paragraph (P), up to 220 characters. Don’t cut words halfway through.
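To make the proposal concrete, here is a rough sketch of the rule in Python (not the actual Discourse code). It assumes plain-text input where a blank line ends a paragraph; the function name `first_paragraph_excerpt` and the constant are made up for illustration.

```python
import re

MAX_EXCERPT_LENGTH = 220  # the limit mentioned in the proposal


def first_paragraph_excerpt(text: str, max_length: int = MAX_EXCERPT_LENGTH) -> str:
    """Return the first paragraph, trimmed to max_length without splitting a word."""
    # Assumption: a blank line separates paragraphs in the raw text.
    paragraphs = re.split(r"\n\s*\n", text.strip(), maxsplit=1)
    # Collapse runs of whitespace so the length count is predictable.
    first = " ".join(paragraphs[0].split())
    if len(first) <= max_length:
        return first

    kept = []
    length = 0
    for word in first.split(" "):
        # Count the joining space for every word after the first.
        extra = len(word) + (1 if kept else 0)
        if length + extra > max_length:
            break
        kept.append(word)
        length += extra
    return " ".join(kept)
```

Note that a single word longer than the limit yields an empty excerpt in this sketch; a real implementation would need to decide what to do in that case.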

This will give moderators way better control and assist in cleaning up the topic lists where pinned topics are involved.

Thoughts?

cc @PJH


One side-effect problem is that posts like this: http://discourse.stonehearth.net/t/stonehearth-announced-features/2361 will end up with the crappy excerpt "hello everyone,". The fix is trivial though: just remove the salutation.

I actually think all the ones you posted screenshots of look fine.

Blindly stripping pin excerpts at the first para would literally break the excerpt of every single one of the pinned topics you just screenshotted… hard to see why you’re even proposing that.

Have you looked at Gmail? It also concatenates in a similar way.


Notice that Gmail does NOT stop at the first paragraph, and neither should we…

Maybe the thing to do is add some other kind of markup people can use, who want finer control than this. Certainly not a V1 thing.


I thought the simplest thing here that at least grants control is to break on an HTML comment, <!-- break -->. Unfortunately we now strip HTML comments out of rendered markdown, so it’s non-trivial.

The other option is an HR, but it litters the post. The last option is custom markup, but that is also confusing.

My major issue is that there is zero mod control here: you have an excerpt and cannot control how it looks or where it stops.


<algorithm/thoughts prepared, but forgot to post, 4 hrs ago deleted>

Because there’d be no point, apparently.

I dunno, I think the major pain point is lack of control. No matter what fancy algorithm we come up with, we will mess up sometimes. There is plenty of improvement we can make to the current system though, like no longer cutting words in half (Gmail does not) and some other bits and pieces.


Not with the stuff I came up with; had 3 config parameters, an easy override for individual posts…

I just added support for custom excerpts; it was a fairly trivial change:

<span class='excerpt'>My excerpt here</span>

This allows you to control the excerpt as a mod. I will fix the word truncation as well; surprisingly, it is ever so slightly more complicated than this change.
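For anyone curious what picking out that span might involve, here is a minimal sketch in Python using the standard library's `html.parser`. It is not how Discourse implements it; the class and function names are hypothetical, and it simply returns the text of the first `<span class='excerpt'>` in the cooked HTML, or `None` when there is none.

```python
from html.parser import HTMLParser


class ExcerptSpanParser(HTMLParser):
    """Collect the text inside the first <span class="excerpt"> element."""

    def __init__(self):
        super().__init__()
        self.depth = 0      # how many <span> levels deep we are inside the excerpt
        self.done = False   # set once the first excerpt span has been closed
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if self.done or tag != "span":
            return
        if self.depth > 0:
            # A nested span inside the excerpt; track it so its close tag
            # does not end the excerpt early.
            self.depth += 1
        elif "excerpt" in (dict(attrs).get("class") or "").split():
            self.depth = 1

    def handle_endtag(self, tag):
        if tag == "span" and self.depth > 0:
            self.depth -= 1
            if self.depth == 0:
                self.done = True

    def handle_data(self, data):
        if self.depth > 0:
            self.parts.append(data)


def custom_excerpt(cooked_html):
    """Return the moderator-chosen excerpt text, or None if no span is present."""
    parser = ExcerptSpanParser()
    parser.feed(cooked_html)
    parser.close()
    text = " ".join("".join(parser.parts).split())
    return text or None
```

For example, `custom_excerpt("<p><span class='excerpt'>My excerpt here</span> and more</p>")` gives back only the text inside the span; the rest of the post is ignored.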


Yes, it is more complex than it would first seem to be.
It gets real messy when it cuts between opening and closing tags.

Not following, is that related to the new feature?

Sorry to confuse.

I was commenting on your

A while back I wrote some code to parse HTML and create excerpts, and it surprised me how involved it was to get it to come out OK.

Thanks for this. I just found that it only seems to work if the markup is at the beginning of the post. Would be nice if we could snip an excerpt from anywhere in the post.


Since this is such a power feature, I don’t mind if you change it and submit a PR


OK, sounds good. I’ll take a look at some point in the next week.


Small improvement made

https://github.com/discourse/discourse/pull/2746

A notable limitation is that the <span class='excerpt'> part still needs to start early enough in the post, before the max excerpt length.


I wonder if this limitation will just end up causing confusion here.


Certainly could.

I thought about removing it, but wasn’t sure it was worth parsing the whole post just to support a power-user feature…

You can do a cheap “look for string” first, and then do the more expensive parse only if it is there.
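A rough Python sketch of that idea, purely illustrative: `choose_excerpt` is a hypothetical name, and the two helpers are simplified stand-ins (a regex instead of a real HTML parse, and a plain character cut for the default excerpt). The point is only the ordering: a substring scan first, the costly parse second.

```python
import re


def regex_custom_excerpt(cooked_html):
    """Stand-in for the expensive parse: find the excerpt span anywhere in the post."""
    match = re.search(r"<span class=['\"]excerpt['\"]>(.*?)</span>", cooked_html, re.S)
    return match.group(1).strip() if match else None


def leading_excerpt(cooked_html, max_length=220):
    """Stand-in for the default excerpt: strip tags and keep the leading text."""
    text = " ".join(re.sub(r"<[^>]+>", " ", cooked_html).split())
    return text[:max_length]


def choose_excerpt(cooked_html, parse_custom, default_excerpt):
    # Cheap pre-check: a substring scan costs far less than a full parse.
    # A false positive only costs one wasted parse; the check is kept loose
    # so that it never misses a real excerpt span.
    if "excerpt" in cooked_html:
        custom = parse_custom(cooked_html)
        if custom:
            return custom
    return default_excerpt(cooked_html)
```

With this shape, posts that never mention the marker skip the parse entirely, and a post that merely contains the word "excerpt" in its text falls through to the default.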


Makes sense. I’ll keep this on my list in that case.

@sam - do you think the length of the content within the <span class='excerpt'> tags should obey the max excerpt length site setting? Or should it override it in some manner (for example, be allowed up to 2x the length of the site setting)?

I know I’m overthinking it, but figured I’d ask in case there are any strong opinions either way…
