Tracking and resolving a schema drift cause

jrack · January 8, 2024, 7:02pm

So trying to migrate a site by restoring a backup from an existing deploy. I am getting the restore fail because of the schema mismatch (source is newer than target).

Now reviewing the admin/plugins endpoint for both deploys they match for the ones listed, their status, and their version so scratching my head a bit. I also tried putting all Themes and Components back just in case and no change either. Both sites are at app 3.1.3 so that doesn’t seem to be the root cause in this case.

Guessing there was a plugin installed on the site then the instance was redeployed and the db preserved without this plugin installed, but that is just a guess. Is there a deterministic way given a an instance or database that I can figure out what contributed to the schema drift? And is there a way to “down rev” or is the only method to make the target site match or superset?

merefield · January 8, 2024, 7:14pm

Could it be the old one was on beta or tests passed and not stable, so the latest stable is in fact, older?

I’d check the commit numbers match (if possible).

The backup is of the db, so I doubt plugin population matters, it will simply add the plugin data but not actually use it …

jrack · January 8, 2024, 7:24pm

I don’t believe so, but it’s a dev environment so anything is possible I guess.

Is there anything you know of in the schema_migrations table (or the container cloned code) that I could manually check and correlate the schema version to whatever change?

The rename of the file can get things uploaded via the UI, but I was using the merger which blocks based on max(schema_migrations) and I am really trying to avoid hacking things up too much.

Some of this is getting ahead of any operational task where migration version mismatches might show up. Better understand how to trace the migration version back to changes so when this comes up again I can hopefully figure out a runbook on reconciling.

merefield · January 8, 2024, 7:26pm

rails db:version

from the command line in the discourse directory… but if you’ve not restored that might not help …

or from psql:

SELECT * FROM schema_migrations ORDER BY version DESC LIMIT 1;

You could then look that up on Github to cross-ref the commit …

e.g. https://github.com/search?q=repo%3Adiscourse%2Fdiscourse+20231222030024&type=code

Browse the migration file then note the commit when it was added to confirm date etc.

NB: this is not necessarily the same as the code commit obviously! Might be older (which you can get git log -1)

jrack · January 8, 2024, 7:43pm

Yeah I can see the max schema_migration (even just looking at a backup filename), but checked the table and its just the date value. No indications where it came from.

example the “good” is 20230823100627 and the site is 20231022224833. I can’t even find files for “20231022” in the post_migrate folder (or elsewhere in the repo).

I am sure it is staring me in the face. And yeah mining past changes and emails to try and figure out if i can match some action after August where a rogue version might have snuck in.

merefield · January 8, 2024, 7:47pm

wait, you are trying to restore a dev instance to a Production database or vice versa?

I’ve never attempted that, only ever Production to Production (which will have consistent database names and other configuration).

jrack · January 8, 2024, 8:10pm

In this case its the Dev instance to a newly provisioned “Merge” instance which I will then use merger to import another Dev instance as testing a instance consolidation effort we have going on. The schema migration rev being in sync is a pre-req (not surprised). In this case the target env is on 1022 and the source is 0823. In all the 3.1.3s I have we are 0823 so been a head scratcher where 1022 came from and that’s what I am trying to back into, but I can’t find a trace.

merefield · January 8, 2024, 8:20pm

OK, your workflow is very … exotic!

Ideally you shouldn’t need to keep any data in dev and should be able to simply drop the database and re-run migrations.

To merge two diverging dev instances, you would normally merge code the branches, including the migrations, then create a new instance from scratch?

This is partly why there is a nice rake task to pre-populate some fixtures so there’s something to work with: rake dev:populate

RGJ · January 8, 2024, 8:27pm

We happen to have a database with all migration ID’s for 400+ plugins so we can map them to a plugin easily. This one comes from discourse-automation.

jrack · January 8, 2024, 8:33pm

Heh yeah all the farm animals escaped the barn a while back so I am trying to get everyone back in the pen. Or at least figure out if its even feasible.

And we found the missing piece looking at the instance file system

Folks had been looking at automate, but on the other envs they had never activated it so it was available in the Plugin list and no schema change had been made. So far from ideal, but the hint for me on this work is to check the repos for all plugins installed even if they are disabled as maybe they were enabled at one point.

We are doing a redeploy removing some of these R&D plugins as well as keeping a much closer eye on plugin /db entries and doing better record keeping there.

RGJ · January 8, 2024, 8:35pm

BTW Github search is your friend here

jrack · January 8, 2024, 8:38pm

Yeah we finally figured it out too looking at the instance runtime itself, but far from the ideal. Lesson learned for an operational runbook if we need it. I just didn’t check the automation repo in the org since it looked to be disabled and no record of anyone using it. Bad assumption on my part.

@RGJ any chance that database is publicly accessible anywhere? Using the instance file system “works” but gets messier than I would prefer.

merefield · January 8, 2024, 8:40pm

You mean people didn’t push their commits?

What are you trying to save?

jrack · January 8, 2024, 8:45pm

Yup I was searching in the discourse repo itself vs sweeping up the world as I wasn’t sure if it would even show. Searching w/o a scope is too expensive, but didn’t step out to the Org to see if there were hits for Discourse official efforts.

jrack · January 8, 2024, 8:46pm

We have 3 or 4 Discourse instances that independent teams started up to solve a shared business problem and we are seeing if we can bring everyone into a single instance while also not losing their previous work.

merefield · January 8, 2024, 8:51pm

I can’t believe it’s good practice to rely on retaining data in dev.

If the data is important, that work should probably exist in Production under more controlled circumstances.

I don’t know the full nature of what you are trying to do, but being opinionated, solutions should almost certainly be plugins that can be deployed anywhere and not even have to fully rely on a specific version of Discourse, nor care if specific data is pre-populated outside of seeding its own fixtures.

jrack · January 8, 2024, 8:59pm

Yup

In this case we are doing feasibility of these merge ops for the Prod instances using a couple Dev instances. If we can make the runbook solid than all the data and instances will be prod level, but have been maintained by independent teams thus far. So knowing what the blockers are to having a successful merge is what I am working on now. And the schema version is clearly a key one and both app and plugins can and will affect the “mergability”. Fortunately the prod instances have shown 0823 so this specific issue wouldn’t happen in a prod run, but knowing how to analyze a schema drift was needed and will really help our opdocs.

merefield · January 8, 2024, 9:06pm

Ah ok so you prototyping the merging of Production DBs, interesting.

But what are you trying to merge?

You know that moving Topics (and their users!) between instances is officially supported?:

jrack · January 9, 2024, 2:34pm

Yeah its thousands of existing topics and related content so the one offs a are bit of a mess

Merge two Discourse sites into one which uses a different script, but the same basic idea.

Did discover another nuance of schemas. So we removed the automation plugin from the deployment and redeployed. Then i noticed that schema_migration seemed to revert back to 0823 as the latest. So I thought I was good to go without installing the automation plugin into the instance that I am merging in. Well when i did another run of import i got a PG::UndefinedTable: ERROR: relation "discourse_automation_automations" does not exist error so even though the migrations version rolled back the schema changes tied to it in the actual db were still around it seems.

system · February 8, 2024, 2:35pm

This topic was automatically closed 30 days after the last reply. New replies are no longer allowed.

Topic		Replies	Views
During Restore: "EXCEPTION: You're trying to restore a more recent version of the schema. You should migrate first!" Support	3	971	March 24, 2022
Can't restore backup from business 7 to fresh install due to version error Support	3	623	January 14, 2022
Failed restore - You should migrate first? Installation	7	509	October 15, 2022
Determining the version from the database Support	3	417	October 21, 2020
Restore hosted discourse on local self-hosted Support	8	733	August 24, 2019

Tracking and resolving a schema drift cause

Related topics