Given the advent of AI and the need for large datasets on local development machines, we have pulled together a quick pattern for getting a “workable” copy of all public data (anything visible to anonymous users) from a Discourse forum.
Keeping the documentation up to date at:
Why would you care?
You want a local database with LOTS of topics
You don’t want ANY personal data on your system
This is still in very rough shape, but it is workable for initial experiments and gives you a well-populated local setup.
This document is version controlled - suggest changes on GitHub.