After all these years I still find myself resorting to Google every time I need to find something. Today I was looking for the exact options in a theme’s settings.yml.
Revisiting Discourse’s built-in search, the topic I was looking for is the 25th result when “sort by relevance” is enabled, and the 19th when AI is enabled.
Other examples: “ad plugin” does not give me the plugin topic in the first 70 (!) results, while “meta.discourse.org ad plugin” in Google gives me an immediate hit.
(I thought maybe I’d been using the wrong terms, but “advertising plugin” gives me the topic in 9th place with in-Discourse search, and 17th with AI enabled.)
If you use the “Most viewed” option, your search mimics what Google is doing a bit better (by showing you results that others may also have found useful):
“settings in themes settings.yml” puts it in 6th.
“advertising plugin” puts it in 2nd.
Not perfect, definitely there is room for improvement. But I’m not sure how realistic it is to compare search here to one of the most advanced search tools on the planet that is literally synonymous with “doing an internet search”.
That being said, I think Discourse could be better even if it never gets good enough to beat Google.
Search is notoriously difficult to get right, but we agree that there’s definitely room for improvement.
I’ve struggled to turn up that same topic from time to time, so maybe we can improve our own keywording a little within it. Including the documentation category or the how-to tag improves the results dramatically (I agree that it shouldn’t be necessary!)
We’re also experimenting with a new search plugin that may help improve search using Typesense (hopefully we’ll have something to test on Meta within the next few weeks) — searching “settings in themes” using our internal demo returns that topic as the 4th result, so that seems a little promising.
I was excited to see this from an authorized person, because I had started thinking about adding Google search to Discourse. The search is really bad, and I’m saying this because it is. I hope we will see some serious innovation in this regard.
It’s a fair point but the difficulty of search isn’t about scale, it’s about predicting what the user wants to see based on a couple words. Having the other 99.99999999999% of the internet and 8.5 billion searches per day to learn from is pretty helpful in that regard.
But again, I agree Discourse search can be improved. But I don’t know if Google should be the expected standard.
One thing I’ll add is that I wonder if this is even a “search” problem as much as it’s a “lookup” problem. In this case, the search results aren’t necessarily bad, they just aren’t bringing up the exact page you are specifically looking for. Maybe the solution is to make the bookmark search a more prominent feature? Or some other solution that prioritizes important topics that are frequently referenced?
This may be getting too much into semantics but I think the distinction is important. Search brings up results related to your search terms, not the results you are looking for in your head.
“settings in themes settings.yml” is giving you results with “settings.yml” and “themes” in it. So the results are not wrong. The problem is that some of the context is left out of what you actually want, i.e. the how-to guide for adding settings to a theme. If you were more specific about the topic you are looking for, you could find it easily.
The magic of Google is that it can infer a lot of the hidden context from the search terms because it leverages the billions of search examples it receives daily.
Anyway, I think the overall point I’m trying to reach here is that if you were going to the library, the way you search for “cookbooks” generally vs a copy of “Gordon Ramsay’s Home Cooking” is going to be different. In this analogy, Discourse is good enough at giving you all the cookbooks you want but there is not really a good way to look up Gordon Ramsay’s Home Cooking. Especially if you don’t remember the specific title. I find a lot of important topics on my own Discourse are often lost in the abyss. Maybe the solution is to improve my docs section, or maybe an improvement to search could help. Maybe something like recommended search results that appear at the top? I don’t have an answer, just trying to flesh out the problem a bit more.
Especially when people like me are likely to say “uh, did you try search?” And search doesn’t work very well. It does seem like it’s gotten worse in the past couple of years. I suspect it’s worse because the haystack is bigger.
I’m excited to see how and whether Typesense will help!
Yes, it’s getting too much into semantics. My point is that the search functionality is not living up to my expectations, my expectations are (IMO) not very unrealistic and there are other systems out there that do much better, and that’s not just Google.
Zooming in on your examples, I would at least expect that searching for just the relevant nouns (“settings theme”) would give me good results. But it doesn’t.
Now we have Docs for this kind of thing which limits the search to certain categories — very important for a support site! I was going to suggest that, except we can’t order the results by relevance.
This drops the utility of this search WAY down… the options are ordering by activity date or topic name.
Was this your instinct before you (maybe subconsciously) saw the title of the topic?
Maybe it’s because English is not my native language, maybe it’s because I didn’t want to add settings, I added those years ago, but it never occurred to me to include the word “add”.
And that means “if you know the exact location”. That is not a bad option either, but it is not part of real life.
On my forum I disabled semantic search. It is just another “here are some random topics” feature, and that is not what I need when something is missing. Same here: semantic search just doesn’t work, it only adds noise.
Don’t get me wrong. I don’t expect another Google, and coding a decent search engine must be awfully hard, because no such thing exists outside real search engines, and I wouldn’t say Bing is any better than native Discourse.
On the contrary, that’s exactly how real life works. You don’t have a magic search function in physical reality so you have to store all your stuff in an organized way because when you want to find it, you need the exact location. That’s the entire premise libraries and archives are built on.
Anyway this is pretty tangential. My point wasn’t that we should memorize all titles of topics. It was that there is a layer of context missing that tells the search engine you are looking for one specific result. Google has the magic ability to infer that which gives it a huge advantage.
Maybe what is needed in Discourse is a more prompted search. Typesense is a very good start, looking forward to that. But a frustration I have on my own site is that I spend so long curating categories and tags and yet I feel like I never get the full juice out of them. I wonder if it’s possible to prompt the user with tags or categories in their search. So just collate all the search results and count up their tags. Then you can serve them as a 1-click filter for the user. In this case the how-to tag is basically the one piece of context separating an undesirable result from the exact result.
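To make the tag-counting idea a bit more concrete, here is a rough sketch in Python. It is not how Discourse search works internally. The result shape (a list of dicts with "title" and "tags"), the function names, and the sample topics are all made up for illustration.

```python
from collections import Counter


def suggest_tag_filters(results, top_n=3):
    """Count tags across all search results and return the most common ones.

    `results` is a hypothetical list of dicts with a "tags" list; this shape
    is an assumption, not a real Discourse API response.
    """
    counts = Counter()
    for result in results:
        # set() so a topic that lists the same tag twice is only counted once
        counts.update(set(result.get("tags", [])))
    return counts.most_common(top_n)


def filter_by_tag(results, tag):
    """What a 1-click filter would do: keep only results carrying `tag`."""
    return [r for r in results if tag in r.get("tags", [])]


# Made-up search results for a query like "settings theme"
results = [
    {"title": "Add settings to your theme", "tags": ["how-to", "themes"]},
    {"title": "Theme settings not saving", "tags": ["themes", "bug"]},
    {"title": "Guide to developing themes", "tags": ["how-to", "themes"]},
    {"title": "Question about settings.yml", "tags": ["support"]},
]

# Tags to offer as filter buttons, most frequent first
suggestions = suggest_tag_filters(results, top_n=2)

# Result list after the user clicks the "how-to" suggestion
narrowed = filter_by_tag(results, "how-to")
```

With the sample data, "themes" and "how-to" come out as the top two suggestions, and clicking "how-to" cuts four results down to the two guides. Whether the counts should come from the full result set or only the first page is a design choice I haven’t thought through.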
Well, I have memorized several of them that got renamed and it took me months to learn the new ones. The new titles were better, but I never found them again.