Does Discourse support exporting conversations as an organized bulk of data?

Does Discourse support exporting conversations as an organized bulk of data that could be reused offline? My use cases in Slack:

  1. Sometimes I extract data from the history to prepare documents for newcomers.
  2. In the same way, I extract useful data from several discussions to organize external product documentation.
  3. I try to parse a lot of badly sorted topics/conversations with Python to get structured data with highlighted words, as a step toward defining links between different conversations (a kind of data analysis).

I hope I’ve been clear :grinning:

Hmm, have you used TensorFlow? :wink: You should probably avoid having so much unstructured data in the first place. I have the same problem here, but I try to get it sorted at the source. I have a Slack chat hell here that I will get around by letting people structure the data…

I’m not sure that AI fits my case. I collect data from different sources, and the main idea is to find the really important highlights (words, URLs, proofs, etc.) and create structured data that could answer these questions:

  1. In what order were the steps taken that led to accepting a certain solution for a given task? The goal is to reconstruct the actual sequence of events.
  2. The algorithm should detect important mentions in tons of poor-quality conversations (especially emails with many levels of attachments and untrusted web publications).
  3. Define valuable links between different actions, roughly like this: news → blog → public mood and needs → chat/email decision → strategy used → real actions → approved assumptions → related persons → explanation of the result.

So I use a Python template for this:

PRODUCT_RELATIVE_SOURCES = {
    "websites": {
        "company1": [
            "blog",
            "vacancies",
            "news",
            "tags"
        ]
    },
    "social-networks": {
        "network1": [
            "feed",
            "story",
            "public",
            "direct",
            "tags"
        ]
    },
    "messengers": {
        "messenger1": [
            "chat1",
            "room1",
            "bot1",
            "direct",
            "tags"
        ]
    },
    "mailboxes": {
        "box1": [
            "subject",
            "body",
            "sender",
            "cc",
            "meta"
        ]
    }
}

EXCLUDE_SOURCES = {
    "main",
    "libs",
    "opt"
}
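
To make the goals above a bit more concrete, here is a rough sketch of how highlights might be pulled out of messages and put in chronological order. Everything in it is an assumption for illustration: the `extract_highlights` and `build_timeline` helpers, the message fields (`source`, `timestamp`, `text`), the simple URL pattern, and the sample messages are hypothetical and not part of the template above.

```python
import re
from datetime import datetime

# Simplified URL pattern (an assumption; real-world URLs may need a stricter parser).
URL_RE = re.compile(r"https?://[^\s<>\"')]+")


def extract_highlights(text, keywords):
    """Return the URLs and the keywords (case-insensitive) found in text."""
    urls = URL_RE.findall(text)
    lowered = text.lower()
    found = [kw for kw in keywords if kw.lower() in lowered]
    return {"urls": urls, "keywords": found}


def build_timeline(messages, keywords):
    """Sort messages by timestamp and keep only those that contain highlights."""
    ordered = sorted(messages, key=lambda m: datetime.fromisoformat(m["timestamp"]))
    timeline = []
    for msg in ordered:
        highlights = extract_highlights(msg["text"], keywords)
        if highlights["urls"] or highlights["keywords"]:
            timeline.append(
                {"source": msg["source"], "timestamp": msg["timestamp"], **highlights}
            )
    return timeline


# Hypothetical sample messages, labeled with source paths in the style of the template.
messages = [
    {"source": "messengers/messenger1/chat1", "timestamp": "2021-02-20T10:00:00",
     "text": "We decided to go with plan B, see https://example.com/plan-b"},
    {"source": "mailboxes/box1", "timestamp": "2021-02-18T09:30:00",
     "text": "Proposal attached, please review."},
    {"source": "websites/company1/blog", "timestamp": "2021-02-19T12:00:00",
     "text": "Nothing relevant here."},
]

timeline = build_timeline(messages, ["decided", "proposal"])
for entry in timeline:
    print(entry["timestamp"], entry["source"], entry["keywords"], entry["urls"])
```

Plain substring matching is only a starting point; it would likely need stemming or a proper tokenizer for noisy conversations.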

I would probably like to have a similar data structure exported from Discourse (maybe through the API). I originally asked the question about Discourse for Teams because I found many similarities with Slack, and our team is not satisfied with Slack: its paid history feature is almost useless.

In this regard, everything you can do with Discourse you can do with Discourse for Teams. This is why I moved your post into its own new topic. Maybe others have suggestions for you.

Are you familiar with JSON? You can add .json to pretty much any URL in Discourse to see the page in a more portable format. Maybe that helps?

For example, this topic:

https://meta.discourse.org/t/does-discourse-support-export-conversations-as-an-organized-bulk-of-data/180537.json
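
A minimal sketch of how that JSON might be read from Python, using only the standard library. The helper names (`topic_json_url`, `fetch_topic`, `summarize_posts`) are made up for this example, and the sample payload is a hand-written stand-in that only mimics the `post_stream` / `posts` layout, not real data from this topic. A site that requires login would presumably need authentication as well, which is not shown here.

```python
import json
import urllib.request


def topic_json_url(topic_url):
    """Append .json to a topic URL unless it is already there."""
    base = topic_url.rstrip("/")
    return base if base.endswith(".json") else base + ".json"


def fetch_topic(topic_url):
    """Download and parse the topic JSON (needs network access)."""
    with urllib.request.urlopen(topic_json_url(topic_url)) as resp:
        return json.load(resp)


def summarize_posts(topic):
    """Pull a few fields per post out of the parsed topic JSON."""
    posts = topic.get("post_stream", {}).get("posts", [])
    return [
        {
            "post_number": p.get("post_number"),
            "username": p.get("username"),
            "created_at": p.get("created_at"),
            "html": p.get("cooked", ""),  # rendered HTML body of the post
        }
        for p in posts
    ]


# Hand-written stand-in payload so the sketch runs offline.
sample_topic = {
    "title": "Example topic",
    "post_stream": {
        "posts": [
            {"post_number": 1, "username": "alice",
             "created_at": "2021-02-20T10:00:00.000Z", "cooked": "<p>Question</p>"},
            {"post_number": 2, "username": "bob",
             "created_at": "2021-02-20T11:00:00.000Z", "cooked": "<p>Answer</p>"},
        ]
    },
}

summary = summarize_posts(sample_topic)
for post in summary:
    print(post["post_number"], post["username"], post["created_at"])
```

With network access, `summarize_posts(fetch_topic(url))` could be called on the topic URL above instead of the sample payload. Long topics may return only the first batch of posts in `post_stream`, so a complete export would probably need additional requests.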

Wow, it looks great, thank you very much, Tobias! I think it is enough for me :+1:
