A data model to make perusing the database easier

Ah! Gotcha! Any ideas on what this could look like? We are always looking for ways to make it easier here :slight_smile:

2 Likes

Of course! Something like the Northwind (MS Access fame!) data model would be brilliant. There may even be a tool that’ll generate one by investigating the Postgres schema

2 Likes

Can you share that like you’re talking to a noob? Like:

  • What would adding this help solve?
  • How would you expect to see this if it is added? How would it look?

I am basically taking your suggestion and asking for more details (as much as you can give :wink: ) so we can improve the experience for the next Data Architect that comes around. I have no background in data, so your detailed suggestion will be helpful :slight_smile:

2 Likes

Haha Osioke,

A very fair shout! Touché…

So most folk who work with databases (not just data architects! Developers too!) find it very helpful to have some sort of a data model that shows them how the various tables are connected to each other.

For example, let’s take my query ;), I needed several pieces of information about a user - I required information about a user who:

  • was in (or not in) a particular group
  • had solved topic
  • within a certain date range

To answer the above, I need the users table, the user_actions table, and the groups table. A data model would have shown me that I can link a user to a user action via id/user_id, and to link a user to a group via their primary_group_id/id visually.

It helps to visualise not only what data is available, but how to join it, especially if there are some long-winded queries at play.

Yes you could click through every single table in the data explorer to figure out what fields are available and write them down so that you don’t forget, but having a data model may be a bit more humane for some of us :slight_smile:

5 Likes

Ah! Gotcha again! :sweat_smile:

I am not technical, so this was not clear to me. I do see the need though, so yes. A data model is something that would be helpful. Let me see what I can do. :slight_smile:

In the meantime I have moved this conversation to a new topic in our site feedback category so we keep the other discussion clean.

2 Likes

That would be amazing, thank you!

1 Like

So I am talking with the team and this is not something that would be straightforward for so many reasons. We also moved it to feature because that shows more of what this would be.

Currently, those of us that work on/with data inhouse mostly use the models that are available in the source code:

I also got a look at the Northwind data model:

That is definitely easy to understand and fit on paper or one screen. 13 tables in all.

Comparing that to Discourse, we have a whole lot more tables over 180+ or more, visualising this would be … a journey. Especially because there are also tables from plugins (and these change from install to install) and data in *_custom_fields tables which should be included as well if you really want to have a complete picture.

Also because of how our database is designed, we can’t use most tools for data modeling, we’d need to find one that works with ActiveRecord models. And I think that also makes this tricky too, all these data conversations are over my head. :sweat_smile:

But that is not to say this isn’t something we don’t want to do though, this is just commentary. I’d love to hear suggestions from you or anyone else on ways we could make this better. :wink: :slight_smile:

6 Likes

It’s really not that useful, as the sheer size of it, the lack of foreign keys and we simply leaving very little logic to the RDMS means it’s hard to understand Discourse DB without reading Discourse source.

But if you really need one, RubyMine can generate it for you.

14 Likes

You can generate one with relationships with rails-erd: GitHub - voormedia/rails-erd: Generate Entity-Relationship Diagrams for Rails applications

Not sure how useful this is though

10 Likes

@lju I hope all our explanations help here, especially with the added context. I’ll be closing this off in the next day or two. If you still feel you need some extra detail, feel free to ask.

3 Likes

Hi @osioke ,

Sorry for the delay in replying, I’ve been a bit jammed.

I’ve got some ideas for what would be helpful - if you can give me a few days I’ll write something up.

Cheers,

Lju

1 Like

Awesome! A few days you now have :slight_smile: thank you for giving this attention.

Hi All,

So my argument would be that a data model would be useful, but we don’t necessarily need to include all of the tables. I suspect there are probably the ‘key’ 15-25 odd tables that 90% of all queries use/what people are looking for. In fact, looking at the various tables made available - there are probably a series of data models that could be created, based on the types of queries/data you are looking to explore.

I can take a stab over the next few days to pull together what I think would be the most commonly queried tables - this wouldn’t be extensive research, just a stab in the dark. I’m sure the various questions asked in the Data Explorer category will also shine a light on the popular tables.

There could also be another diagram to represent ‘zones’ of interest to make navigation of different parts of the available data.

Does that make sense?

Cheers,

Lju

6 Likes

Data explorer already lists the 9 most important tables first in the query edit UI panel, and you can see all of the tables column structure and types with a click:

3 Likes

So we could take those 9 tables and turn them into a simplified data model? :thinking: :wink:

5 Likes

Sure, go ahead and share the results!

4 Likes

damn this is huge; one of the largest db schema i have seen online

was that created with https://dbdiagram.io? mind sharing the public url of the diagram?

i am more interested in the relationships and connections between these tables

users,
user_options,
api_keys,
user_api_keys,
user_auth_tokens,
user_auth_token_logs,
notifications

thanks

very helpful but will be great to have a shareable url so we can see the table relationships and also the primary/foreign keys on the tables

are there any 1-to-1 relationships in the schema? will like to know especially between users and user_options tables

anyone willing to help with the relationships between these tables? from the schema diagram

users,
user_options,
api_keys,
user_api_keys,
user_auth_tokens,
user_auth_token_logs,
notifications

interested to know if there are 1-to-1 relationships

will appreciate it…thanks

cc @Falco @sam

It’s mostly 1-N, as users have multiple notifications, auth tokens, etc.

user_options is 1-1.