Finally coming back to this after tabling it for a while. I’m willing to run a backup and try something if I can grok a chance of success, but I crave a bit more confidence here. I lack much scripting experience, but I’d really like to understand how the CSV importer would preserve posts (replies) and dates, as @nathank suggests, since the script doesn’t seem to define any handling of them.
It imports limited fields for: users, emails, custom user fields, categories, and topics.
I don’t need custom user fields or new categories, so the relevant CSVs and their specified fields are:
== CSV files format
File name: users
headers: id,username
File name: emails
headers: user_id,email
File name: topics_new_users
headers: id,user_id,title,category_id,raw
File name: topics_existing_users
headers: id,user_id,title,category_id,raw
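To check that I’m reading the format right, here is a minimal Python sketch of what I understand those files to contain. Only the header names come from the script’s comment; the sample values and the `render_csv` helper are my own inventions for illustration, and I’m not claiming this is how the importer itself reads the data.

```python
import csv
import io

# Header names copied from the script's format comment.
# Everything else in this sketch (sample rows, helper name) is made up.
HEADERS = {
    "users": ["id", "username"],
    "emails": ["user_id", "email"],
    "topics_new_users": ["id", "user_id", "title", "category_id", "raw"],
    "topics_existing_users": ["id", "user_id", "title", "category_id", "raw"],
}


def render_csv(name, rows):
    """Return CSV text for one file, with the header row first."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=HEADERS[name], lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


users_csv = render_csv("users", [{"id": "u1", "username": "alice"}])
emails_csv = render_csv("emails", [{"user_id": "u1", "email": "alice@example.com"}])
topics_csv = render_csv(
    "topics_new_users",
    [{"id": "t1", "user_id": "u1", "title": "Hello, world",
      "category_id": "5", "raw": "First post body"}],
)
print(topics_csv)
```

Laid out like this, none of the headers leaves room for a creation date or for a reference from a reply back to its parent topic, which is exactly the gap I’m asking about.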
From a squint at this data model, Discourse Topics and Posts are two different creatures with some differentiating fields.
I don’t see anything in the script to handle Posts — or dates.
Maybe I’m supposed to lump incoming Topic and Post data together, but if so, how would Discourse infer the topic/reply relationship? Is it just the sequence of the input? Are replies attached to whichever Topic first appears with a shared ID? All it says about IDs is:
except for the topics_existing_users, the IDs in the data can be anything as long as they are consistent among the files.
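My reading of "consistent among the files" is simply that every `user_id` in the emails and new-user topics files should match some `id` in the users file. A small Python sketch of that check follows; the `dangling_user_ids` helper and the sample data are hypothetical, and this is my interpretation rather than anything the script documents.

```python
import csv
import io


def read_rows(text):
    """Parse CSV text (header row first) into a list of dicts."""
    return list(csv.DictReader(io.StringIO(text)))


def dangling_user_ids(users_text, *referencing_texts):
    """Return user_id values that appear in other files but not in users."""
    known = {row["id"] for row in read_rows(users_text)}
    missing = set()
    for text in referencing_texts:
        for row in read_rows(text):
            if row["user_id"] not in known:
                missing.add(row["user_id"])
    return sorted(missing)


# Made-up sample data: topic t2 points at a user that was never defined.
users = "id,username\nu1,alice\n"
emails = "user_id,email\nu1,alice@example.com\n"
topics = (
    "id,user_id,title,category_id,raw\n"
    "t1,u1,Hi,5,Body\n"
    "t2,u2,Yo,5,Body\n"
)

print(dangling_user_ids(users, emails, topics))
```

I left topics_existing_users out of the check on purpose, since the quoted note exempts it. Even with this kind of consistency, the IDs only seem to link users to their emails and topics, not replies to topics.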
If the script isn’t missing something, then I must be. I appreciate any clarifying thoughts!