> That’s more complicated.
Lord, did this turn out to be true!
I took a run at this. Now, understand our Discourse instance has something like 15-20 years’ worth of mailing list archives imported into it. So there are a lot of old posts that look like this…
…where you get a ton of links, and email replies that quote the entire previous email repeating those links, and email signatures with links to unrelated things, etc.
So at some point, the `topic_links` table screwed up, and started making new rows for links it already had:
So there might be a row like:
```
69534 | 18675 | 86333 | 6631 | http://lists.libsdl.org/listinfo.cgi/sdl-libsdl.org | lists.libsdl.org | f |  | 2017-03-26 12:13:06.365742 | 2017-03-26 12:13:06.365742 | f | 0 |  | SDL Info Page | 2017-03-26 13:18:08.491592 | f |
```
That row was created in 2017. But then I’ve got hundreds of dupes across all these posts, where the system came along in 2024 and added duplicate rows with id values something like 200000 higher.
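If you want to see how bad the damage is up front, a query along these lines should list the duplicate groups. I’m assuming here that the uniqueness the failing index enforces is per (topic_id, post_id, url); check your own schema before trusting it:

```sql
-- List (topic_id, post_id, url) groups that have more than one row,
-- along with the ids involved. Assumption: this matches the uniqueness
-- the failing index enforces; verify against your schema first.
SELECT topic_id, post_id, url,
       array_agg(id ORDER BY id) AS duplicate_ids
FROM topic_links
GROUP BY topic_id, post_id, url
HAVING count(*) > 1
ORDER BY topic_id, post_id;
```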
So it’s a lot to clean out.
Basically I’d wait for the reindexing to fail, see what it spit out, find the ID of the duplicate row, and adjust the URL string so it no longer duplicated…it looks like the system already deals with this by inserting `**` into the string semi-randomly, so I did the same. I was afraid to delete the rows outright, because I have no idea what references them elsewhere.
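So each fix was a one-row `UPDATE`. A hedged example, with a made-up id (269534) standing in for whatever the failed `REINDEX` actually reported:

```sql
-- Mangle the duplicate row's URL the same way the system seems to,
-- by inserting ** somewhere in the string. The id here is hypothetical;
-- use the one the failing REINDEX complained about.
UPDATE topic_links
SET url = 'http://lists.libsdl.org/**listinfo.cgi/sdl-libsdl.org'
WHERE id = 269534;
```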
It did seem to be confined to a handful of topics (presumably different posts on the same topics), because I was fixing the same handful of URLs over and over. It was tedious, and I don’t recommend it; I probably should have found a way to automate it, something like the sketch below.
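If I were automating it, it would probably look like this: keep the oldest row in each duplicate group and mangle the rest. This is an untested sketch (and it assumes the same (topic_id, post_id, url) uniqueness as above), so back up the database before trying anything like it:

```sql
-- Untested sketch: for every duplicate group, leave the lowest id alone
-- and insert ** right after the URL scheme on the newer rows.
-- Note: only touches http(s) URLs; anything else is left as a duplicate.
UPDATE topic_links
SET url = regexp_replace(url, '^(https?://)', '\1**')
WHERE id NOT IN (
  SELECT min(id)
  FROM topic_links
  GROUP BY topic_id, post_id, url
);
```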
It’s also worth noting that the `CONCURRENTLY` option on `REINDEX` will build a temporary index (with a `_ccnew*` string appended to the name), and if the process fails, it’ll leave it there. If you’re going to use `CONCURRENTLY` and have to clean up lots of things, expect to drop a LOT of temporary indices by the time you clean everything out. (Also, if, like me, you have never used Postgres directly, know that I had to issue a `\connect discourse;` command at the start before the `REINDEX` command could be used.)
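Finding the leftovers is straightforward once you’re connected: the abandoned `_ccnew` indexes are marked invalid in `pg_index`, so something like this (run from psql after the `\connect discourse;`) should list them, and then each one gets dropped by hand. The index name in the `DROP` is just an illustration:

```sql
-- List invalid leftover indexes from failed REINDEX CONCURRENTLY runs.
SELECT indexrelid::regclass AS leftover_index
FROM pg_index
WHERE NOT indisvalid
  AND indexrelid::regclass::text LIKE '%ccnew%';

-- Then drop each one (hypothetical name shown).
DROP INDEX CONCURRENTLY index_topic_links_on_url_ccnew1;
```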
I’m still in the process of doing this; it doesn’t seem to be making things worse, so I’m continuing to plug along on it.