Keeping old post IDs when importing into Discourse

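I am in the middle of a migration from Kunena. I have run a test migration that mostly succeeded, but the permalink and internal-link issues have left me with a question.

How hard would it be to have Discourse create the new posts while keeping the old post IDs originally assigned by Kunena? The approach has obvious advantages.

It would probably require the following: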

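  1. Some tweaks to the import script. Can anybody estimate how "deep" this change would be? Are post IDs assigned very deep in the Discourse code, or somewhere in the base import script?

  2. A way to keep Discourse from later assigning an already-used post ID to a new post. For example, is there somewhere to configure Discourse to start assigning post IDs from 100,000?

  3. Can post IDs be non-numeric? If so, something like "k12345" (k for Kunena) could be a good halfway solution: it keeps a way to refer to the old post IDs without clashing with the IDs newly assigned by Discourse.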

Thanks for any pointers!

The import script saves the import id in a post custom field. You can use it to create permalinks. Several importers do that. You can look at others for examples.
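
As a rough sketch of that idea (not the importer's actual code), the old-id to new-id mapping can be turned into permalink rows. The custom field is assumed here to be called `import_id`; the URL pattern, the function name `build_permalinks` and the sample ids are made up for illustration.

```python
# Illustrative only: models the old-id -> new-id mapping as plain Python data.
# The URL pattern and the sample ids are hypothetical, not taken from Kunena or Discourse.

def build_permalinks(imported_posts, url_pattern="forum/post/{old_id}"):
    """Return (old_url, new_post_id) pairs, one per imported post."""
    rows = []
    for post in imported_posts:
        old_id = post["import_id"]  # id the post had in the old forum
        rows.append((url_pattern.format(old_id=old_id), post["id"]))
    return rows


imported_posts = [
    {"id": 1, "import_id": "12345"},
    {"id": 2, "import_id": "12399"},
]

permalinks = build_permalinks(imported_posts)
for old_url, new_id in permalinks:
    print(old_url, "->", new_id)
```

The real importers do the equivalent in Ruby while they create the posts, so their code is the reference, not this sketch.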

Thanks Jay, I know that, and I use that to go from old ids to new ids.

But it would be more practical to simply use the same number and avoid that de-referencing. I am doing parts of my import with SQL and it complicates queries.

Also, the permalinks table can get quite huge. A regexp redirect is better, and easy if the id is constant.

The permalinks table is only as big as the post table. You know about permalink normalizations?

But to answer your question: yes, having Discourse use different topic ids would be very hard.

It seems Discourse relies on PostgreSQL to assign the id, which is numeric:

(screenshot not preserved)

And it seems PostgreSQL allows manipulation of that “sequence object”:

https://www.postgresql.org/docs/8.3/static/functions-sequence.html

So I guess my number 2 is answered “yes” and my number 3 is answered “no”. My question number 1 is probably what you mean by “very hard”, right? :frowning:

EDIT: let me just add that posts use the same mechanism as topics.

I had heard about these regexp operations but I was thinking they required post ids to match so a simple find-and-replace could take me from the old post id to the new one.

Or is the mechanism more intelligent and it does a lookup on the post custom field to translate ids?

Yes. You still need to have the post (or topic or category) ID in a permalink.
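
To make the two steps concrete, here is a small Python model of how I understand it: a normalization only collapses the many variants of an old URL into one key, and the id translation still comes from a table lookup. The URL shapes, the regexp and the table contents are all hypothetical, and this is not Discourse's actual implementation.

```python
import re

# Hypothetical rule: drop the slug and any query string, keep only the numeric old id.
# e.g. "forum/general/12345-some-title?start=6" -> "forum/post/12345"
NORMALIZATION = (re.compile(r"^forum/[^/]+/(\d+)-[^?]*(\?.*)?$"), r"forum/post/\1")

# Hypothetical permalink table: normalized old URL -> new Discourse post id.
PERMALINKS = {"forum/post/12345": 1, "forum/post/12399": 2}


def normalize(path):
    """Rewrite an incoming old URL path into the form stored in the table."""
    pattern, replacement = NORMALIZATION
    return pattern.sub(replacement, path)


def resolve(path):
    """Return the new post id for an old URL path, or None if no permalink matches."""
    return PERMALINKS.get(normalize(path))


print(resolve("forum/general/12345-some-title?start=6"))
print(resolve("forum/general/99999-unknown"))
```

So the regexp never has to know the new ids; it only has to produce the same string that was stored when the permalink was created.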

Your solution might work with a plugin that you would have to continually maintain, but it’ll be really hard, and the recommended way works and has been used dozens of times, including several with millions of posts, and that’s just me.

The approach (without keeping a plugin all the time) could be:

  1. Disable automatic serial numbers for topics
  2. Import everything with the old ids
  3. Re-enable serials, starting at a number larger than the MAX currently used
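
The steps above can be modelled with a toy sequence in Python. This is only a simulation of the nextval/setval behaviour, not code that touches Discourse or PostgreSQL; the `Sequence` and `Table` classes are made up for illustration. In this model step 1 is not needed, because an explicit id simply bypasses the sequence; whether the same holds inside the Discourse import code is not verified here.

```python
class Sequence:
    """Toy stand-in for a PostgreSQL sequence (nextval / setval)."""

    def __init__(self, start=1):
        self._next = start

    def nextval(self):
        value = self._next
        self._next += 1
        return value

    def setval(self, value):
        # Like PostgreSQL's two-argument setval: the following nextval returns value + 1.
        self._next = value + 1


class Table:
    """Toy table whose ids come from a sequence unless given explicitly."""

    def __init__(self):
        self.rows = {}
        self.id_seq = Sequence()

    def insert(self, body, id=None):
        # An explicit id bypasses the sequence, as an import with old ids would.
        if id is None:
            id = self.id_seq.nextval()
        if id in self.rows:
            raise ValueError(f"duplicate id {id}")
        self.rows[id] = body
        return id


topics = Table()
for old_id in (1, 2, 7):  # step 2: import everything with the old ids
    topics.insert(f"imported topic {old_id}", id=old_id)

try:
    topics.insert("new topic")  # the sequence was never advanced, so it hands out 1
    collided = False
except ValueError:
    collided = True

topics.id_seq.setval(max(topics.rows))  # step 3: restart above the MAX id in use
new_id = topics.insert("new topic")
print(collided, new_id)
```

In PostgreSQL itself, step 3 would presumably be something like `SELECT setval('topics_id_seq', (SELECT MAX(id) FROM topics));`, assuming the default sequence name for the `topics.id` column.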

But ok, I get the point - it’s easier with the permalinks feature. I have found many posts with specific questions about this, mostly people trying to get their regexps right, but I haven’t found any page documenting the actual mechanism in general… is there such a page?

As always, thanks a lot for your help.

Can anybody point to any documentation on how to use the permalinks feature? Even if it’s minimal docs. I don’t mind helping to write more extended documentation once I understand how this works. Thanks.

The best thing to do is to `grep Permalink script/import-scripts/*rb` and look at how people have used it.