I would like to leave some feedback on my operations experience over the past several years of occasionally looking after one fairly small Discourse installation:
- Single self-hosted VM
- Discourse installed via the official Docker setup
- 3 official plugins
Every time I update from an older version (3-9 months old), it’s a nightmare. I have never had an upgrade that didn’t take days of head scratching, reading source code and forums, trying unconventional backup restore strategies, etc. So far, I have encountered these issues, some of them multiple times:
- Backup cannot be restored (into the same version of Discourse the backup was made from).
- Updated discourse_docker is incompatible with Discourse, so the rebuild fails.
- Plugins randomly break after a rebuild.
Discourse is an ecosystem of the main app, discourse_docker and plugins, yet neither discourse_docker nor the plugins have any tags on GitHub, or any versioning, that would say “Discourse version x.y.z will definitely work with discourse_docker version x.y.z and plugins that have version x.y.z”.
Proposed solutions:
- Have a CI setup that tests backup / restore for each release.
- Start versioning and tagging discourse_docker and plugins in sync with Discourse releases.
- Have some CI to test interoperability of the various ecosystem parts.
When developing, please remember that not everyone is a Discourse developer who always uses the latest master branch of every discourse/* repo. I hope this feedback proves useful.
As someone who has been running Discourse for six years now in a broad variety of environments, none of the situations described above remotely mirror my experience.
Another thing from the same upgrade (v2.1.5). When you go to the admin panel and try to upgrade automatically, it says:
You are running an old version of the Discourse image.
Upgrades via the web UI are disabled until you run the latest image.
To do so log in to your server using SSH and run:
cd /var/discourse
git pull
./launcher rebuild app
Guess what happens if I do git pull in /var/discourse and try to rebuild? The mini_racer gem fails to build, and I’m stuck with a broken installation. I then have to find an older discourse_docker commit that works with v2.1.5 just to be able to rebuild and get it running again. It would be easier if there were tags.
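Without tags, about the only handle for finding such a commit is its date. A minimal sketch of that search, assuming you know roughly when the Discourse version you need was current; `commit_before` is a hypothetical helper and the date in the usage comment is a placeholder, not the actual release date of v2.1.5:

```shell
# Print the newest commit on a branch that was committed on or before a date.
# Usage: commit_before <repo_dir> <date> [branch]
# (hypothetical helper, not part of discourse_docker)
commit_before() {
  repo_dir="$1"
  cutoff="$2"
  branch="${3:-master}"
  # rev-list walks the branch history; --before filters by commit date,
  # and -n 1 keeps only the newest commit that passes the filter.
  git -C "$repo_dir" rev-list -n 1 --before="$cutoff" "$branch"
}

# Example usage (placeholder date, adjust to your own installation):
#   cd /var/discourse
#   git checkout "$(commit_before . '2018-12-01')"
#   ./launcher rebuild app
```

This only narrows the search by date. It does not prove that the commit it finds actually builds the Discourse version in question, so a test rebuild is still needed.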
P.S. I always do upgrades in a clean VM: I first reproduce my current environment, restore the production backup, test that it works, then upgrade it there to see whether the upgrade fails. It always fails. Maybe I’m just really unlucky.
I’m curious if you’ve been having these problems for a couple of years now why you only joined Meta two hours ago? The peer support here is generally excellent, and the first step for most when they hit any form of problem.
I’m not the forum dweller type myself. Usually it’s faster to find a solution to a problem on my own than to post and discuss it in a dev forum. This time I just had to vent, in the hope that the Discourse devs would introduce some versioning for all components, and my life would get easier for the next upgrade.
I don’t think you’ve made your experiences any easier by taking this tack.
How do you repro your production environment exactly? If you aren’t running a staging copy in parallel you run a very high risk of just introducing more variability here.
We have configuration management, and Chef can provision an identical VM locally, check out the production version of discourse_docker, and build an empty forum, where I can then import the backup from production. The only moving part here is the plugins, which are checked out from the master branch (because they have no tags).
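Even without tags, that moving part could be pinned by recording which commit each plugin is on in production and checking out the same commits in the test VM. A rough sketch only: `snapshot_plugins` is a hypothetical helper, and the plugins directory you pass in is an assumption about where the plugin checkouts live in your setup:

```shell
# Print "<plugin-name> <commit-sha>" for every git checkout in a directory.
# Usage: snapshot_plugins <plugins_dir>
# (hypothetical helper, not part of Discourse or discourse_docker)
snapshot_plugins() {
  plugins_dir="$1"
  for dir in "$plugins_dir"/*/; do
    # Skip anything that is not a git checkout (also covers an empty directory,
    # where the glob stays unexpanded).
    [ -d "$dir/.git" ] || continue
    name=$(basename "$dir")
    sha=$(git -C "$dir" rev-parse HEAD)
    printf '%s %s\n' "$name" "$sha"
  done
}

# Example usage (the path is a placeholder):
#   snapshot_plugins /path/to/plugins > plugins.lock
```

Keeping the resulting file next to each production backup would record which plugin commits that backup was running with.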
I’m curious whether your Chef recipes are a factor here. The only common issue with restoring the database relates to version 1.9, which you’re well past.
I generally find that Discourse updates cleanly and smoothly.
Over the three years I’ve run Discourse forums there have been a small handful of upgrade problems, generally caused by third-party unsupported plugins I’d installed. Not once have I had to do anything drastic like restore from a backup. The ecosystem around Discourse (including its plugin developers) is very committed and conscientious, and problems are resolved in a matter of hours.
Coming from a corporate IT background, I find Discourse’s speed of development and responsiveness to bug fixes extraordinarily good. In my experience using open source projects, Discourse is up there with the best supported.
I would recommend against creating a complex dependency graph by emphasising version numbers. This will add extra complexity to the development process, and is only partially useful in solving compatibility problems. It’s better IMO if we have more CI going on, and if plugins are regularly tested against the latest version of Discourse core. This is the responsibility of plugin developers IMO, but perhaps CI tips and templates could be shared here for everyone’s benefit.
Discourse team - keep doing what you’re doing. It’s a fantastic project, and, remarkably - free of charge.
It wouldn’t have to be a dependency graph. For example, when you release v2.4.0 stable, just make sure that everything works, and tag everything v2.4.0. That way there would be no question which commit of a plugin or of discourse_docker works with that given version.
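The tagging step itself could be as small as a loop over the repositories. A sketch only: `tag_all` and the repository paths in the usage comment are hypothetical, and this is not an existing Discourse release script:

```shell
# Create the same annotated tag at the current HEAD of several repositories.
# Usage: tag_all <tag> <repo_dir>...
# (hypothetical helper; assumes each repo is already checked out at the
# commit that was tested against the release)
tag_all() {
  tag="$1"
  shift
  for repo in "$@"; do
    # Stop at the first failure, e.g. if the tag already exists in a repo.
    git -C "$repo" tag -a "$tag" -m "Tested with Discourse $tag" || return 1
  done
}

# Example usage (placeholder paths):
#   tag_all v2.4.0 ./discourse_docker ./plugins/some-plugin
#   git -C ./discourse_docker push origin v2.4.0
```

Anyone setting up an older version could then check out that tag in each repository instead of guessing at commits.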
Actually the primary impact of this is that it becomes slower to make fixes and enhancements available to older discourse versions. It’s really not common for plugins to be rendered incompatible by Discourse updates.
Most likely not. All it does is install the OS, create the /var/discourse directory, check out a given commit of discourse_docker, and fill in the containers/app.yml template.
It does not do pre-flight checks or create swap. But if the system runs out of memory or Docker is too old, those things are obvious, and I don’t blame Discourse for them. discourse-setup would not save me from plugins being incompatible.
Anyway, I’m not attacking Discourse developers. You guys are doing a great job; I’m just trying to suggest some changes to the development process that would make operations people’s lives easier.
Thanks for the feedback @spajus. It’s always good if we can find specific tasks to improve the experience for users and administrators. Generalising things in one large topic makes it difficult to turn them into actionable tasks.
Looking specifically at your second bullet point. We aim for discourse_docker to be compatible with the latest stable. Are you seeing something different?
As far as I know, this should be resolved and is no longer being investigated. If it’s still broken, let us know. The sed fix is a bad idea, as it will probably break more things in future.
If Chef is only provisioning an OS, and you are using discourse_docker, then it does seem unlikely that Chef is causing the issues.
@david one very specific suggestion was to tag discourse_docker and plugins when a new Discourse stable is released, so there would be a way to tell which discourse_docker commit is compatible with an older version of Discourse, should someone want to set that up instead of the latest (e.g. for restoring an older backup, or to test an upgrade procedure).
The mini_racer issue is solved in the latest discourse_docker and the latest Discourse stable.
UPDATE: I cannot post again for 1 day due to new user restrictions, so read the updated post to see why not everyone would always want to install latest discourse.