Adding jsonb columns for custom fields

Often times when developing plugins, I’ve had to use the CustomField or PluginRowStore tables to store arbitrary data related to the plugin (for example, retort stores a string representation of reactions to a particular post as a PostCustomField)

This works well for simple data, but for anything a bit more complex, I’ve found myself wanting access to more than a postgres text-valued column can provide. Postgres has supported the jsonb column type for several versions now (since 9.4), which provides some super-great features like finding and filtering by nested json fields, as well as indexing.

I wonder what the feasibility would be of adding a jsonb column to PluginStoreRow, or possibly converting the existing value column to jsonb to allow for more complicated models to be stored there?

2 Likes

The use case I’m working with at the moment is a real-time collaborative editor, which involves storing a series of Changesets to a document; each Changeset holds a post_id, author_id, document length, and string representation of the changes. It would be feasible to store this information as JSON like so:

{
  post_id: 1,
  author_id: 2,
  length: 10,
  changes: ['1-5', 'world']
}

but putting it into a text representation

"1|2|10|['1-5', 'world']"

, or simply storing that JSON as text in the DB, doesn’t really work because I need to be able to search for rows by post_id:

PluginStoreRow.where(plugin_name: :my_plugin, key: :changeset).where("value ->>'post_id' = 1")
1 Like

A method I’ve used before is:

PluginStoreRow.where("value::json->>'provider'=?", provider)

There is no need for the database column to actually be set to json for that to work. (see here).

The other thing I’ve used to help with JSON in plugins is ActiveRecord::Store. It allows storing data in json, while keeping all the activerecord validation/serialization magic.

https://github.com/discourse/discourse-chat-integration/blob/master/app/models/channel.rb#L3

Switching to jsonb would certainly be more performant for looking up values - I like the idea of adding an additional column, so that we don’t break existing plugins. If we add jsonb, it looks like activerecord has good support for it as well :slight_smile:

7 Likes

Yes, I’m happy to start with that suggestion, although with this particular use case lookup performance is fairly critical (y’know, real-time and all that), and without an index on the value column it’s likely to be Not Quite Good Enough.

2 Likes

There is an index on the key column of PluginStoreRow, so if your lookups are always against post_id, you could utilise that.

Setting the key to post_12_<random id>

means you could use a lookup like:

post_id = 12
PluginStoreRow.where("key LIKE ?", "post_#{post_id}_%")

That breaks down as soon as you want to ‘index’ against more than one value, but for a lot of applications it is enough.

6 Likes

As time goes by I become less and less of a fan of plugin store.

Can you just use a migration here and add a proper table, it gives proper data ownership to the plugin and is way more easy to clean up, data is easier to access and so on.

I get we want to make something dynamic here, but feel it is way cleaner just to have plugins stage the tables they need with proper models and so on.

Longer term we can even have a “registry” file committed to core the lists what plugin owns what tables.

6 Likes

I ended up doing this thing:

Which is adding (and allowing the removal of) a column from within the plugin (using this format to add and remove tables would be certainly possible as well).

The removal bit’s pretty manual atm, but maybe the registry you mention could handle removing related tables if the plugin is taken out of app.yml [it would be bad to wipe the database of all the plugin’s data just because the admin deactivated it] EDIT: No wait erasing data like that is probably a real nasty side effect and it probably wouldn’t be the worst thing in the world to have an orphan table or column that you could get rid of with a little work.

Code that might be cool to write:

# plugin.rb
register_plugin_table :my_plugin, :wingbats do |t|
  # this block is passed through to create_table
  t.integer :id 
  t.timestamps  
end

register_plugin_column :my_plugin, :post_custom_fields, :my_column, :jsonb, options: {
  default: {},
  index: true
}
1 Like

My pref here today is just to lean on:

https://github.com/discourse/discourse/blob/06f82a7d72f79f1419b4ea8c1d264476dd794d87/lib/plugin/instance.rb#L456-L457

So as long as your migrations are all there we will run them on db:migrate. Only caveat is we need to run them on the test database (which we should fix so it is done by default)

4 Likes

Ok, I will try with migrations inside the plugin and see how it goes. My main concern with plugins shipping migrations is support requests. Right now, uninstalling a plugin removes pretty much all traces of it. If we start encouraging migrations inside plugins, uninstalling the plugin won’t necessarily fix the problem.

As long as plugin developers don’t modify any core tables, it should be reasonably safe :crossed_fingers:.

3 Likes

Couldn’t we fix the limitations of plugin store? or you just think it’s broken by design?

1 Like

To provide some context, Sam’s post was in response to this PR:

https://github.com/discourse/discourse/pull/6269

That change would help the problem of storing structured data, but it’s never going to be as efficient or flexible as using separate tables. :man_shrugging:

3 Likes

Maybe we should provide some guidelines on when we think pluginstore is appropriate and when we think it’s not and people should use migrations.

3 Likes

The general guidelines for using plugin store would be for

  • Tiny amounts of data
  • You do not think you will need to query it

Overall, actual use cases for PluginStore should be very very small. For example even our existing cases where we use it are misuses imo cause tidy tables here would help a lot.

  • staff notes should be a table
  • data explorer is really messy in the way it uses it and should use a table
  • oauth2 basic belongs in the new tables @david is suggesting
  • canned replies should be in a table

Regarding the “but I really really want to remove all traces of a plugin, problem”

Firstly, we are not cleaning up plugin store properly anyway. Secondly, if you make the migrations reversible we can just run them in reverse. The big risk around “adding extra tables” is that 2 plugins may fight for 1, which is why I suggested a central registry in core.

7 Likes

That’s how I ended up using it in discourse-mozilla-iam, PluginStore for storing a couple of global variables which we need to persist across restarts. Full blown models & tables for everything else.

I got around this by setting up an engine in my plugin, and placing the model under that namespace. If any other plugin decides to touch a mozilla_iam_group_mappings table (or MozillaIAM::GroupMappings model), I would be :astonished:

That would be :ok_hand:, I should’ve thought to report it when implementing this slight monstrosity:

https://github.com/mozilla/discourse-mozilla-iam/blob/master/spec/iam_helper.rb#L20-L21

3 Likes

It is currently possible by doing RAILS_ENV=test LOAD_PLUGINS=1 rake db:migrate. I think that’s probably fine, given that plugins aren’t loaded for any test environment by default.

What about RAILS_ENV=test rake plugin:spec[foobar]? That command won’t work unless you remember to run the migrations (which I think is a bit unexpected).

rake spec doesn’t run core migrations, so I think it would be inconsistent for rake plugin:spec to run migrations :thinking:. (not necessarily opposed to them both running migrations though)

1 Like

There’s also https://github.com/discourse/discourse/blob/master/lib/tasks/docker.rake which doesn’t seem to run plugin migrations (unless you’ve set LOAD_PLUGINS in the env somewhere prior to running it, which would seem to be redundant if you’ve already set SINGLE_PLUGIN).

1 Like