Discourse public data dump

Discourse · May 12, 2023, 5:22 AM

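With the advent of AI and the need for large datasets on local development machines, I put together a simple pattern for getting a "workable" copy of all public data (data visible to anonymous users) from a Discourse forum.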

The latest version of the documentation is maintained here:

github.com/SamSaffron/discourse_public_import

README.md

main

### Public Data Dump for your forum

This repo attempts to establish a pattern for a public data dump. It includes 2 data explorer queries you can use to export all your public data.

Public data is defined as forum topics and posts that anonymous users can access.

### How to use this?

First you need to define 2 queries using data explorer:

1. Topic query: [here](topic_query.sql)
2. Post query: [here](post_query.sql)

Once defined, note the data explorer query ids as shown in each query's URL.

Next, define an API key with rights to run the 2 queries.

### config.json

Create a [config.json](config.json.sample) specifying the domain of your Discourse site, the API key, and the data explorer query ids.
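To illustrate how the pieces above fit together, here is a minimal Python sketch of running one of the two queries with the API key. It is not the code from this repo. The config key names (`domain`, `api_key`, `api_username`, `topic_query_id`, `post_query_id`) are hypothetical, since the contents of config.json.sample are not shown here. The endpoint path and the `columns`/`rows` response shape are assumptions based on how the Data Explorer plugin is usually called through the API, so check them against your own site.

```python
import json
import urllib.parse
import urllib.request


def build_run_request(config: dict, query_id: int) -> urllib.request.Request:
    """Build (but do not send) a POST request that runs one Data Explorer query.

    Assumed endpoint: /admin/plugins/explorer/queries/<id>/run
    Assumed (hypothetical) config keys: domain, api_key, api_username.
    """
    url = f"https://{config['domain']}/admin/plugins/explorer/queries/{query_id}/run"
    # Query parameters travel as a JSON string in a form field named "params";
    # the sketch sends an empty parameter set.
    body = urllib.parse.urlencode({"params": json.dumps({})}).encode("utf-8")
    headers = {
        "Api-Key": config["api_key"],
        "Api-Username": config.get("api_username", "system"),
    }
    return urllib.request.Request(url, data=body, headers=headers, method="POST")


def rows_to_dicts(result: dict) -> list:
    """Turn a result with 'columns' and 'rows' arrays into a list of dicts."""
    columns = result["columns"]
    return [dict(zip(columns, row)) for row in result["rows"]]


def run_query(config: dict, query_id: int) -> list:
    """Send the request and return the rows as dicts (performs network I/O)."""
    with urllib.request.urlopen(build_run_request(config, query_id)) as response:
        return rows_to_dicts(json.load(response))
```

With a config loaded via `json.load(open("config.json"))`, calling `run_query(config, config["topic_query_id"])` and then `run_query(config, config["post_query_id"])` would fetch the two result sets, under the assumptions stated above.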


Why does this matter?

- I want a local database with a very large number of topics
- I do not want any personal information on my system

This is still very rough, but it is workable for early experiments and gives you a local environment with a very large amount of data.

This document is version controlled. Please suggest changes on GitHub.

bigkid · May 13, 2025, 8:06 AM

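Hello, and thank you for this work. I am fairly new to the Discourse API, but I would like to try it. From the README, topic_query and post_query appear to be the key files in this repository. __Is it possible to customize these files to adapt them to the dump I want?__ For example, if I only want to dump topics in a specific category or with a specific tag. Thank you.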
