Search by link

Hi,

I’m writing a Python script that needs to search my forum and check whether an external link exists in the content of my forum threads.

The problem is that when I checked manually, the search found some of the posts containing the link but not others, even though the link is in the post.

Is this expected behavior?

What are you searching? The raw or cooked?

The raw. Ex:

And it’s oneboxed in the OP of the thread. Sometimes the search works for the same domain with a different ending, sometimes not.

Without knowing exactly how you’re searching it’s hard to guess what’s going on.

If I were given this task to implement I would probably make a Data Explorer query that searched for any posts with the link in raw or cooked.
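
As a rough client-side sketch of that "raw or cooked" check (this is not a Data Explorer query), something like the following could work once you have the post contents. It assumes each post is available as a dict with `raw` and `cooked` string fields; `post_contains_link` and `sample_post` are hypothetical names used only for illustration.

```python
import html


def post_contains_link(post, link):
    """Return True if `link` appears in the post's raw or cooked text.

    Assumption: `post` is a dict with optional "raw" and "cooked" strings.
    """
    raw = post.get("raw") or ""
    # Cooked HTML escapes characters such as "&" as "&amp;",
    # so unescape it before doing a plain substring comparison.
    cooked = html.unescape(post.get("cooked") or "")
    return link in raw or link in cooked


# Hypothetical sample data, not taken from a real forum.
sample_post = {
    "raw": "Check this out: https://example.com/a?x=1&y=2",
    "cooked": '<p>Check this out: <a href="https://example.com/a?x=1&amp;y=2">link</a></p>',
}
print(post_contains_link(sample_post, "https://example.com/a?x=1&y=2"))
```

Checking both fields matters because a oneboxed link can look quite different in the cooked HTML than in the raw text.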

For now it’s only a normal search on the site itself, like this:

In my forum, I just created a topic with 1 link, and I still can’t find the topic by searching for the link.

I haven’t tested it here, since I’m not sure if it’s allowed to post links or not.

I’m still testing it. It seems that I can’t search for an entire long link (max 101 characters), so I need to trim it down a bit if it’s longer. Let me know if I’m allowed to post a sample link, and whether this is expected behavior.
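
One possible workaround, sketched under the assumption that the limit really is around 100 characters (the figure observed in this thread, not a documented value): search with a shortened term, then compare the full link against the returned text yourself. `MAX_QUERY_LEN`, `shorten_query`, and `filter_matches` are hypothetical names.

```python
# Assumption: approximate limit observed in this thread, not a documented value.
MAX_QUERY_LEN = 100


def shorten_query(link, max_len=MAX_QUERY_LEN):
    """Trim the link so the search term stays within the assumed limit."""
    return link[:max_len]


def filter_matches(full_link, texts):
    """Return the indexes of texts that contain the full, untrimmed link.

    A shortened search term can match unrelated links that share the same
    prefix, so each candidate is re-checked against the complete link.
    """
    return [i for i, text in enumerate(texts) if full_link in text]


# Hypothetical example: a link longer than the assumed limit.
long_link = "https://example.com/" + "a" * 120
print(len(shorten_query(long_link)))
```

The second step is what keeps the trimmed search from producing false positives.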

Discourse keeps track of any link inside posts.
For your use case, I would use this data.
As far as I know, there is no API to access these links.
Implementing one via plugin should not be hard.


Thanks for your reply

I’m using this automation. It works very well as long as the link doesn’t exceed 100 characters. If it does, the search reports the link as non-existent, even though there is a topic with that link.

async def search_discourse_topic(session, link):
    headers = {"Api-Key": USER_API_KEY, "Api-Username": USER_ID}
    cleaned_link = clean_url(link)  # Clean the given URL to ensure consistency
    try:
        log(f"Searching for topic with link: {cleaned_link}")  # Log when the search starts
        async with session.get(f"{DISCOURSE_API_URL}/search.json", headers=headers, params={"q": cleaned_link}) as response:
            search_results = await response.json()
            topics = search_results.get("topics", [])
            if not topics:
                log(f"No topics found for link: {cleaned_link}")  # Log if no results are found
            for topic in topics:
                if cleaned_link in topic.get("blurb", ""):  # Check whether the link appears in the topic's blurb
                    log(f"Found existing topic with link: {cleaned_link}")  # Log if a matching topic is found
                    return topic["id"]
    except Exception as e:
        log(f"Error searching for topic with link: {e}")
    return None