A small ruby/DB refactoring task

(Robin Ward) #1

One pattern we’re trying to use at Discourse is to break up our giant User table into several smaller tables. There are many advantages to doing this, but one obvious one is that by default in Rails if you say User.find(1) it will load all the columns from the database by default, which can be quite expensive.

Recently I added a location string to a User so users can optionally say where they’re from on their profile. This is definitely not a column that you want returned from the database all the time. Instead, I created a new UserProfile model and added the field there.

Now, when you view a user’s profile, it will include the location field from that table. It works great!

However, it’s obvious that there are least a couple of other fields on User that should be moved over to this table too:

  • website: the user’s personal web site
  • bio_raw and bio_cooked: their bio information
  • profile_background their profile background image

Some of those fields like website should be easier to move over than others, but I believe all should be brought over.

If anyone is looking for a Rails specific task to bring them up to speed with Discourse, this might be a good one to try! Of course we’d an expect a pull request that does this to:

  • Works properly
  • Includes a migration to move old data over so nothing is lost
  • Pass tests
  • Add more tests if necessary for coverage (some of those fields might not be tested)

We are also not against smaller PRs that don’t do the whole task. For example if you just want to move website and not bother with the other two, that’s totally fine.

Feel free to ask questions/comments here if you’re interested. Thanks!

(Jens Maier) #2

Hm, I learned in database theory that vertical partitioning has no significant benefits towards query performance and throughput apart from specific use cases involving distributed databases where this can be used to optimize network overhead.

Wouldn’t narrower selects via ActiveRecord scopes be closer to the canonical solution? Or does ActiveRecord’s design force every single query to go through the query parser again and again?

(Vikhyat Korrapati) #3

Using partial selects is a source of bugs with ActiveRecord models, see: Enemy of the State - Evil Trout's Blog

(Jens Maier) #4

Ah, yes, bugs and their ugly workarounds… :sweat_smile:

(Robin Ward) #5

This is definitely one of those cases where we’re changing the schema to acommodate ActiveRecord’s patterns.

Having said that, there is another advantage down the road if you think of models as long lived. In the Ember application we will eventually have an identity map that keeps track of objects. Most of the time we only want a User without its profile. But if we visit their profile we can just look up its UserProfile object. Both can be stored in different identity maps and retrieved from memory if we have them around. So it’s a nice way of organizing objects.

(Andrew Bezzub) #6

Did anyone start to work on this? I am interested in doing this over the weekend.

(Sam Saffron) #7

nope, nobody picked this up yet.

(Andrew Bezzub) #8

Great, I’ll work on it then

(Robin Ward) #9

Don’t be afraid to start small and do a PR for just a small amount of work at first.

(Andrew Bezzub) #10

Sure, I think I’ll start from moving only website

(Andrew Bezzub) #11

This should be done in Move bio from User to UserProfile by abezzub · Pull Request #2430 · discourse/discourse · GitHub and Move profile_background from User to UserProfile by abezzub · Pull Request #2437 · discourse/discourse · GitHub

(Robin Ward) #12

Awesome work, we will review soon!

(Robin Ward) #13