I have tags that contain emojis, e.g. new-york-city-🇺🇸.
When I use search or advanced search with such a tag, the matching topics are not returned.
Here is an example:
Query:
https://urbantech-forum.cornelltech.io/search?expanded=true&q=tags%3Anew-york-city-🇺🇸subway
And here is a topic that should be returned:
Not sure if this is resolvable. What is your take, @neil?
j.jaffeux
(Joffrey Jaffeux)
February 13, 2020, 9:40am
3
AFAIK it’s only because the regex we have here:
  end
  advanced_filter(/\Awith:images\z/i) { |posts| posts.where.not(posts: { image_upload_id: nil }) }
  advanced_filter(/\Acategor(?:y|ies):(.+)\z/i) do |posts, terms|
    category_ids = []
    matches =
      terms
        .split(",")
        .map do |term|
          if term[0] == "="
            [term[1..-1], true]
          else
            [term, false]
          end
        end
        .to_h
    if matches.present?
      sql = <<~SQL
does not allow emojis. If it matched emoji, the search would work.
We might want to consider using something like ticky/ruby-emoji-regex on GitHub (a set of Ruby regular expressions for matching Unicode emoji symbols) to have a good emoji regex.
What do you think, @sam?
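To illustrate the point, here is a minimal sketch, assuming a tag filter whose character class lists only ASCII characters. Both patterns and the `extract_tags` helper are hypothetical and not copied from Discourse. `\p{So}` covers the regional-indicator symbols that make up flag emoji, but it is not a complete emoji matcher (skin-tone modifiers and ZWJ sequences would need more).

```ruby
# Hypothetical patterns for illustration only; not the regex Discourse uses.
ASCII_ONLY_TAG_FILTER = /\Atags?:([a-z0-9_\-,+]+)\z/i
# Assumption: also allow Unicode letters, numbers, and "Symbol, other" (So),
# the general category that regional-indicator (flag) symbols belong to.
EMOJI_AWARE_TAG_FILTER = /\Atags?:([\p{L}\p{N}\p{So}_\-,+]+)\z/i

# The US flag is two regional-indicator code points (U+1F1FA, U+1F1F8).
FLAG_US = "\u{1F1FA}\u{1F1F8}"

# Returns the list of tag names captured by the filter, or nil when the
# search term does not match the pattern at all.
def extract_tags(pattern, term)
  match = pattern.match(term)
  match ? match[1].split(",") : nil
end

term = "tags:new-york-city-#{FLAG_US}"

ascii_result = extract_tags(ASCII_ONLY_TAG_FILTER, term)  # no match, so nil
emoji_result = extract_tags(EMOJI_AWARE_TAG_FILTER, term) # one tag captured
```

With the ASCII-only pattern the whole `tags:` term fails to match, so a filter built this way would never run for an emoji tag. The wider character class captures the tag name intact.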
sam
(Sam Saffron)
February 13, 2020, 10:22am
4
Oh my, this feels like one big monster to support for a major edge case. Do we already have a validating regex on the Tag class?
j.jaffeux
(Joffrey Jaffeux)
February 13, 2020, 10:35am
5
I don’t think we do, though I might be wrong. @neil probably knows better.
AFAIK we just have this clean_tag function:
  TAG_GROUP_RESTRICTIONS_SQL = <<~SQL
    tag_group_restrictions AS (
      SELECT t.id as tag_id, tgm.id as tgm_id, tg.id as tag_group_id, tg.parent_tag_id as parent_tag_id,
        tg.one_per_topic as one_per_topic
      FROM tags t
      LEFT OUTER JOIN tag_group_memberships tgm ON tgm.tag_id = t.id /*and_name_like*/
      LEFT OUTER JOIN tag_groups tg ON tg.id = tgm.tag_group_id
    )
  SQL
  CATEGORY_RESTRICTIONS_SQL = <<~SQL
    category_restrictions AS (
      SELECT t.id as tag_id, ct.id as ct_id, ct.category_id as category_id, NULL AS category_tag_group_id
      FROM tags t
      INNER JOIN category_tags ct ON t.id = ct.tag_id /*and_name_like*/
      UNION
      SELECT t.id as tag_id, ctg.id as ctg_id, ctg.category_id as category_id, ctg.tag_group_id AS category_tag_group_id
      FROM tags t
      INNER JOIN tag_group_memberships tgm ON tgm.tag_id = t.id /*and_name_like*/
That function is more or less duplicated on the client side here:
https://github.com/discourse/discourse/blob/master/app/assets/javascripts/select-kit/mixins/tags.js.es6#L80
An emoji regex would probably also be useful in other places. For example, we currently generate one on the client side: https://github.com/discourse/discourse/blob/master/app/assets/javascripts/pretty-text/emoji.js.es6#L24
neil
(Neil Lalonde)
February 13, 2020, 4:25pm
6
I doubt that we have any tests for emoji tag names. Maybe we can look into supporting this in 2.5?
I see that the search returns only one topic when I search with just the tag (without the subway keyword), and it’s not the topic posted in the OP:
https://urbantech-forum.cornelltech.io/search?expanded=true&q=tags%3Anew-york-city-🇺🇸
It also works with other keywords, but only for the topic returned by the previous search:
https://urbantech-forum.cornelltech.io/search?expanded=true&q=tags%3Anew-york-city-🇺🇸%20personal
If I don’t use the tags: filter, the search returns the expected results:
https://urbantech-forum.cornelltech.io/search?expanded=true&q=new-york-city-🇺🇸
When the tag name is used as a plain keyword, the topic that the tag search returned earlier comes first, but the matching post is a different one. That may simply be because the search is matching some metadata or other text in the post rather than the tag itself, but I can’t say for sure.
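For anyone reproducing these searches, here is a minimal sketch of building the query string with Ruby's standard `URI` module, so that the `tags:` filter and the keyword are separated by whitespace and the emoji is percent-encoded. The `search_query_string` helper is hypothetical; the `expanded` and `q` parameter names are taken from the URLs above.

```ruby
require "uri"

# The US flag is two regional-indicator code points (U+1F1FA, U+1F1F8).
FLAG_US = "\u{1F1FA}\u{1F1F8}"

# Hypothetical helper: builds the query string for a search that combines
# a tags: filter with an optional free-text keyword.
def search_query_string(tag, keyword = nil)
  q = ["tags:#{tag}", keyword].compact.join(" ")
  URI.encode_www_form(expanded: "true", q: q)
end

query = search_query_string("new-york-city-#{FLAG_US}", "subway")
# Decoding the query string recovers the original search terms.
decoded = URI.decode_www_form(query).to_h
```

`URI.encode_www_form` encodes the space as `+` and the colon as `%3A`, and decoding gives back `tags:new-york-city-🇺🇸 subway` with the tag and the keyword as separate terms.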