Smoke-testing plugins during upgrade process

pr-welcome

#1

Continuing the discussion from Blank page loaded after todays upgrade due to incompatible plugin:

Plugin compatibility is an issue that has bitten me a couple of times. Yes - I have been using unsupported plugins. But no - these shouldn’t be able to take down my forum when I innocently hit the upgrade button on Discourse core.

Firstly, I accept that the hard-working core Discourse team have enough on their plate, and they shouldn’t have to support myriad third party plugins. This would slow core development down and create all sorts of awkward developer coordination bottlenecks.

However, there may be a simple and universal solution here that doesn’t require ongoing effort from the core team. I propose that on hitting the upgrade button, a new docker image is built in the background, and within it, acceptance tests of all plugins are run before the switch-over happens from old to new docker image (apologies if I have mis-stated any technical terminology). If any tests fail, the upgrade is cancelled and the admin is alerted to the failing tests. No site downtime. No errors seen by users.
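
A minimal sketch of the gate described above. The callables `build_image`, `plugin_tests` and `switch_to` are hypothetical stand-ins for whatever the real upgrade tooling would do; none of them exist in Discourse or docker_manager.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class UpgradeResult:
    switched: bool             # True if the new image went live
    failed_plugins: List[str]  # plugins whose acceptance tests failed


def gated_upgrade(
    build_image: Callable[[], str],
    plugin_tests: Dict[str, Callable[[str], bool]],
    switch_to: Callable[[str], None],
) -> UpgradeResult:
    """Build the new image, test every plugin in it, switch only if all pass.

    All three callables are assumptions made for illustration.
    """
    # Built in the background; the old image keeps serving traffic meanwhile.
    image = build_image()

    # Run each plugin's acceptance tests against the freshly built image.
    failed = [name for name, test in plugin_tests.items() if not test(image)]

    if failed:
        # Cancel the upgrade: the old image was never touched, so no downtime.
        return UpgradeResult(switched=False, failed_plugins=failed)

    switch_to(image)
    return UpgradeResult(switched=True, failed_plugins=[])
```

The point of the design is that the switch-over is the last step, so a failing plugin can only block the upgrade and never break the running site.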

All third party plugins should come with acceptance tests that will fail if the plugin has b0rked the Discourse instance upon core upgrade (e.g. blank screens, JavaScript errors, etc.).

With this in place I can be confident that upgrading the forum is not a “Russian roulette” operation.


(Jeff Atwood) #2

We don’t accept responsibility for plugins we didn’t write. Also, who decides which plugins “make the cut” for testing here, and how?


#3

Apologies, I don’t think I explained my idea fully.

Yes, I totally accept that the core team doesn’t support third-party plugins. So, my proposal avoids the core team having to do any direct support or ongoing development around third-party plugins.

All tests on all plugins would be run during the upgrade process. If any fail, the upgrade is seamlessly and automatically cancelled.

This happens on self-hosted instances, not on meta. It is the self-hosting admin’s responsibility to disable failing third party plugins, and to contact the developer of those plugins.


(David Taylor) #4

I agree the user experience right now is pretty bad - plugins break regularly, and uninstalling one-by-one is incredibly tedious when using the default install.

I don’t think it’s a case of responsibility, more about making the Discourse product more resilient to change. I don’t think anyone would expect the core team to maintain 3rd party plugins.

I think a good first step towards making things better would be to add some kind of “disable plugin” button to Docker_manager. Then a future thing could be to make disabling broken plugins automatic based on acceptance tests.

I think it should be possible to implement that so that it only requires restarting the server, rather than rebuilding… but I’m not 100% sure of that.


(Jeff Atwood) #5

Not sure, what do you think @sam?


(Sam Saffron) #6

Totally fine with this, #pr-welcome


(Dean Taylor) #7

I would propose something simple in the way WordPress / WooCommerce does it…
… no mythical perhaps complex smoke tests…

This blog post outlines the recent introduction specifically for WooCommerce related plugins:

When an upgrade check is performed, Discourse could give helpful warnings to prevent plugin-related errors that could lead to broken features or downtime.

A plugin predefined file should contain two Discourse version number values:

  • “Requires at least”
  • “Tested up to”

The “Tested up to” value is the Discourse version number that the plugin developer has tested the plugin against.

The blog post goes into the benefits of requiring the plugin developer to update this version number. Ultimately it reduces the immediate work required of the plugin developer, because the user is warned prior to upgrade.
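
To make the two values concrete, here is a rough sketch of the check an upgrade screen could run. The function names and the plain `major.minor.patch` version format are assumptions for illustration. Discourse does not currently read such a file, and pre-release suffixes such as `beta1` are not handled.

```python
from typing import Optional, Tuple


def parse_version(text: str) -> Tuple[int, ...]:
    """Turn '2.3.0' into (2, 3, 0), padding short versions like '2.3' with zeros."""
    parts = [int(part) for part in text.split(".")]
    parts += [0] * (3 - len(parts))
    return tuple(parts)


def compatibility_warning(
    core: str, requires_at_least: str, tested_up_to: str
) -> Optional[str]:
    """Return a warning for the admin, or None if the plugin claims compatibility."""
    core_v = parse_version(core)
    if core_v < parse_version(requires_at_least):
        return f"requires Discourse {requires_at_least} or newer"
    if core_v > parse_version(tested_up_to):
        return f"only tested up to Discourse {tested_up_to}"
    return None
```

Versions are compared as tuples of integers rather than as strings, so that 2.10.0 sorts after 2.9.0.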


(Jay Pfaffman) #8

So that’s where to put it. And then docker_manager would have to pull that file for each plugin? That seems like a drag, but also seems like a safe solution. Then people could see on the upgrade screen that a plugin hadn’t been tested and they’d know to wait or ask the developer.

This could reduce substantially the number of “Rebuild broke” . . . “Disable plugins” topics.


#9

I don’t think the version numbers declaration will work in my use case. I stay on the latest discourse code, upgrading every couple of days. Continuous deployment is great. No Big Bang changes. And version numbers are largely meaningless.

I wouldn’t expect all plugin developers to manually test against every single commit of discourse-core and manually declare version compatibility on an ongoing basis.

Surely it’s easier for the upgrade process to automatically run a test suite for all plugins, and then to cancel the upgrade if any of the tests fail?


#10

Plugin developers should be keeping their code compatible with Discourse core on an ongoing basis.

I don’t want to have to choose between a) moving forward and losing key features of my site because of plugin incompatibility and b) staying put and failing to get new security updates in discourse core.

An automated testing approach would help plugin developers because they’d be made aware at exactly the point at which their plugin loses compatibility with Discourse core, so they (or others via a GitHub PR) are able to diagnose the incompatibility more easily and fix it.

Automated tests are not “mythical” or hard. Most plugins already have them. They’re just not currently being run when a Discourse instance is upgraded.

I work in the finance industry. If automated testing wasn’t possible, we wouldn’t be able to iterate our software. Simple as that. Luckily automated testing is easy, and it’s something that enables us all to iterate confidently without unexpectedly breaking the system as a whole.

Using dynamically typed languages like Ruby and JavaScript makes automated testing even more vital.


(Sam Saffron) #11

For the record we always smoke test every official plugin (and run the tests the plugin provides) during our CI process which happens before meta is deployed on every commit.

I don’t see it as the job of the installer to be running the tests; it is a continuous integration server that should be doing this for everyone. But I am not sure how much extra work I want to take on for the ecosystem right now. Long term I am fine to have a DigitalOcean instance that is constantly testing plugins against the latest commits.


(Jeff Atwood) #12

I think at minimum we should invalidate all unofficial plugins on every version change. As in, they should all be default-off.


#13

:scream:

This would kill third party functionality I’ve come to rely on, and kill it on a regular yet unpredictable basis (new Discourse versions are cut on a completely arbitrary basis).

The intent of this topic was to propose ways to make third party plugins more reliable and to improve continuous integration. Not to have plugins regularly and arbitrarily disabled!


(Erlend Sogge Heggen) #14

Some prior discussion:

Also noteworthy, the WordPress community developed a plugin to mitigate these types of issues:


In short, I’m in favour of adding headers for Requires at least and Tested up to, which will prompt a warning if the Discourse version doesn’t match.


(Jeff Atwood) #15

Well, if the goal is for the stability to improve, unknown plugins should be disabled on version update, until such time as they’ve demonstrated they work on the current version.


(Sam Saffron) #16

That’s a bit of a chicken vs egg problem

My call on this is to defer any work on this for another 6 months or so

If anyone from the community wants to take this on:

  • They can build up a system to run all tests on all plugins and flag broken plugins somehow. We are 100% fine to donate the DigitalOcean resources required.

  • Easy plugin disable per @David_Taylor’s suggestions is #pr-welcome
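
As a very rough sketch of what the first item might look like at its core: the `run_tests` callable is a hypothetical placeholder for "check out this core commit, install this plugin, run its test suite", and nothing here reflects how Discourse's own CI works.

```python
from typing import Callable, Dict, Iterable, List


def sweep_plugins(
    core_commit: str,
    plugins: Iterable[str],
    run_tests: Callable[[str, str], bool],
) -> Dict[str, bool]:
    """Run every plugin's tests against one core commit; return {plugin: passed}."""
    report: Dict[str, bool] = {}
    for plugin in plugins:
        try:
            report[plugin] = run_tests(core_commit, plugin)
        except Exception:
            # A crashing test run counts as a failure instead of aborting the sweep.
            report[plugin] = False
    return report


def broken_plugins(report: Dict[str, bool]) -> List[str]:
    """List the plugins to flag as broken, in a stable order."""
    return sorted(name for name, passed in report.items() if not passed)
```

How the flags are published (a badge, a status page, a warning on the upgrade screen) is left open, as it is in the thread.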

I think it would be brutal to auto disable every plugin on every update that is outside of the official plugin list. That is pretty much shutting down all plugins.


(Jeff Atwood) #17

To clarify, they would auto-disable but you could re-enable them as needed. That seems 100% safer to me.


(Sam Saffron) #18

For starters, the only way of enabling a plugin now is by doing a full rebuild, so that is a killer here. We would need to build a system that allows you to enable/disable plugins from /admin which requires a whole bunch of work.

I think the first task here regardless is “enable / disable” buttons in /admin/upgrade that know to talk to the server and restart it. (Note: none of this work is really reusable for our internal hosting, which makes me not super enthused about doing it now.)


#19

Please don’t auto-disable plugins unless a test has failed and shown them to be incompatible.

Randomly-disabled plugins would be a bug, not a feature, IMHO.

Sam’s suggestion of a CI server sounds very reasonable.


(Jeff Atwood) #20

Be careful what you ask for then, I guess :wink: