Is there a word cloud plugin for discourse?
Carl
There is not ... is there a specific reason you'd like one? How would it be used?
It would be cool in two ways. One, a word cloud I could click on, which would then bring up all the topics that match the clicked word, like "subscriber".
Two, you could display other types of searches like this, or top posters, or whatever you want.
It could probably be something that runs in a cron job once a day or more often.
I thought this was a fun idea ... so I created it*
It's at a very early "just working" stage and needs a lot of refinement and additional options and potentially some click functionality:
https://github.com/merefield/discourse-word-cloud
It adds a link on your Hamburger Menu.
Be aware that it currently builds the word stats from all Posts, regardless of type and location. This could effectively act as a very-round-the-houses mild privacy leak (it might need some additional safeguards to exclude words from posts in private areas). You have to be logged in to see it and access the data though ... and the words are rendered as SVGs ... and it only shows the top x hundred words, so it is unlikely to be much of a concern to most sites. I'll work on making it more secure, but this way the query runs very fast.
Enjoy.
*It leverages some pretty nifty existing libraries which I've credited in the repo. Shout out to @DiscourseMetrics whose query I leveraged.
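For anyone curious what "word stats from all Posts, top x hundred words" boils down to, here is a rough Python sketch. It is not the plugin's code (the plugin runs a database query inside Discourse); the `top_words` name, the post dictionaries with `raw` and `private` keys, and the default limit of 100 are all assumptions made for illustration. It also shows one possible safeguard for the privacy point above: skipping private posts unless explicitly included.

```python
from collections import Counter
import re

# Hypothetical tokenizer: lowercase words, optionally with one inner apostrophe.
WORD_RE = re.compile(r"[a-z]+(?:'[a-z]+)?")

def top_words(posts, limit=100, include_private=False):
    """Return the `limit` most frequent words as (word, count) pairs.

    Each post is assumed to be a dict with a 'raw' text field and an
    optional boolean 'private' flag (both names are illustrative).
    """
    counts = Counter()
    for post in posts:
        if post.get("private") and not include_private:
            continue  # leave private areas out of the stats
        counts.update(WORD_RE.findall(post["raw"].lower()))
    return counts.most_common(limit)
```

Calling `top_words(posts, limit=300)` would give the kind of list a cloud renderer needs: words plus weights.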
very cool. i think you would also want to not include certain words in the word cloud?
Sure, it needs a whole load of sensible exclusions and the regexes need work to get rid of markdown formatting etc. whilst not making it overly complicated. This is just a start. I've just added some colour.
Just to be clear though it's awesome lol
Added a localised list of ignore words:
https://github.com/merefield/discourse-word-cloud/commit/066529ed048b004d6f9c6859697cde6aed24a9fd
which should make results a little more interesting ...
I've also added a lot of sanitising logic, so the result is much better.
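As a minimal illustration of a localised ignore list, the idea can be sketched in Python as below. The word lists, the locale keys, and the `drop_ignored` helper are made up for this example; the real list lives in the plugin's locale files linked in the commit above.

```python
# Hypothetical per-locale ignore lists; the real ones are far longer.
IGNORE_WORDS = {
    "en": {"the", "and", "that", "have", "for"},
    "de": {"der", "die", "und", "das", "ist"},
}

def drop_ignored(words, locale="en"):
    """Remove words found in the ignore list for `locale` (case-insensitive).

    Unknown locales fall back to an empty ignore list, so nothing is dropped.
    """
    ignore = IGNORE_WORDS.get(locale, set())
    return [w for w in words if w.lower() not in ignore]
```

Filtering before counting keeps the common filler words from crowding out the interesting ones.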
Nice! I like this effort. Nice job. If I could request features:
/wordcloud/category
Here's how it looks on my neighborhood forum.
Great feedback, thanks, and some good ideas!
Yes that sounds like a good approach. 3 metres deep in client work atm but will look at Category selection for next update.
Category selection is in:
https://github.com/merefield/discourse-word-cloud/commit/0777adc19516688ec651f3c1439b981dc8367ec0
If you select no Category (default) you get a scan of all forum Posts (PMs and all). If you add just one Category, word stats are restricted to that etc.
As are humungous improvements to the regexes, which now clean up the "raws" nicely and get rid of most if not all the Markdown.
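To give a feel for this kind of clean-up, here is a rough Python sketch of regex-based Markdown stripping. These patterns are illustrative guesses, not the ones in the plugin, and `strip_markdown` is a hypothetical name. Order matters: links are unwrapped before bare URLs are removed, otherwise the URL pattern would swallow the closing bracket.

```python
import re

def strip_markdown(raw):
    """Very rough Markdown clean-up of a post's raw text (illustrative only)."""
    text = re.sub(r"```.*?```", " ", raw, flags=re.DOTALL)  # fenced code blocks
    text = re.sub(r"`[^`]*`", " ", text)                     # inline code
    text = re.sub(r"!\[[^\]]*\]\([^)]*\)", " ", text)        # images
    text = re.sub(r"\[([^\]]*)\]\([^)]*\)", r"\1", text)     # links: keep the label
    text = re.sub(r"https?://\S+", " ", text)                # bare URLs
    text = re.sub(r"[*_#>~]+", " ", text)                    # emphasis, headings, quotes
    return re.sub(r"\s+", " ", text).strip()                 # collapse whitespace
```

One known rough edge of this sketch: removing underscores also splits snake_case words in two, which is the sort of trade-off meant by "not making it overly complicated".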
NB Word stats are updated every hour now (which is probably still excessive, but for the time being makes it easier to check out changes in Production as we go through a lot of initial code evolution).
NB#2 I've not yet considered other languages here beyond English (it's certainly not tested). The current word manipulation may not work well in some languages. Suggestions & PRs welcome.
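Going back to the category selection described above, the rule (no Category selected means scan everything, otherwise restrict to the selected ones) can be sketched in Python like this. The `posts_in_scope` name and the `category_id` field are assumptions for the example; the plugin applies the restriction in its query rather than in a loop.

```python
def posts_in_scope(posts, category_ids=()):
    """Pick the posts to scan for word stats.

    An empty selection (the default) returns every post, PMs included.
    Otherwise only posts whose hypothetical 'category_id' is selected are kept.
    """
    if not category_ids:
        return list(posts)
    wanted = set(category_ids)
    return [p for p in posts if p.get("category_id") in wanted]
```

Posts without a category, such as PMs, drop out as soon as at least one Category is selected.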
Cool! Here's an updated wordle just including the most relevant categories.
Mine is a small community and still fairly new. To be honest, though, the info presented in the wordle looks pretty but is not especially meaningful or useful. I guess it could be used as a visual in a retrospective topic about the community or something along those lines. Would be fun to see more examples of how people use this.
Some of the included words are common and meaningless, e.g. youd, off, got, add etc. I wonder if the "word cloud ignore portion" setting (which is 100 for me, the default) is doing its job? Or maybe there is another/better list of words to ignore?
Yeah, happy to consider a larger list (I'd found a 200 word list here, but deferred to Wikipedia as a more "authoritative source")
OK, I've:
NB if there are still words you want to exclude, just add them to the beginning of:
like I've done here (e.g. "ive", "its", "topic", "post")
To see the impact of any changes more quickly, simply re-trigger the job from Sidekiq:
That's it for a while, I suggest. I may create a dedicated Topic.
OK, you might like this:
https://github.com/merefield/discourse-word-cloud/commit/84770618bec7e17457faff2b31a54aa894ee5743
Update: I've now simplified the ignore list arrangement, so there's no longer a setting for the "portion" of the ignore list employed. You simply delete or add words in the ignore list using the native localised setting:
https://github.com/merefield/discourse-word-cloud/commit/074e0902269e752c11c3c29018f8c68c813327d3
Do we need to uninstall the old version to get this?
You should only need to upgrade the plugin. Having issues?
I apologize, we figured it out.
No problem at all