Ah! Gotcha! Any ideas on what this could look like? We are always looking for ways to make it easier here
Of course! Something like the Northwind (MS Access fame!) data model would be brilliant. There may even be a tool that’ll generate one by investigating the Postgres schema
Can you share that like you’re talking to a noob? Like:
- What would adding this help solve?
- How would you expect to see this if it is added? How would it look?
I am basically taking your suggestion and asking for more details (as much as you can give ) so we can improve the experience for the next Data Architect that comes around. I have no background in data, so your detailed suggestion will be helpful
A very fair shout! Touché…
So most folk who work with databases (not just data architects! Developers too!) find it very helpful to have some sort of a data model that shows them how the various tables are connected to each other.
For example, let’s take my query ;), I needed several pieces of information about a user - I required information about a user who:
- was in (or not in) a particular group
- had solved topic
- within a certain date range
To answer the above, I need the users table, the user_actions table, and the groups table. A data model would have shown me that I can link a user to a user action via id/user_id, and to link a user to a group via their primary_group_id/id visually.
It helps to visualise not only what data is available, but how to join it, especially if there are some long-winded queries at play.
Yes you could click through every single table in the data explorer to figure out what fields are available and write them down so that you don’t forget, but having a data model may be a bit more humane for some of us
Ah! Gotcha again!
I am not technical, so this was not clear to me. I do see the need though, so yes. A data model is something that would be helpful. Let me see what I can do.
In the meantime I have moved this conversation to a new topic in our #site-feedback category so we keep the other discussion clean.
That would be amazing, thank you!
So I am talking with the team and this is not something that would be straightforward for so many reasons. We also moved it to #feature because that shows more of what this would be.
Currently, those of us that work on/with data inhouse mostly use the models that are available in the source code:
I also got a look at the Northwind data model:
That is definitely easy to understand and fit on paper or one screen. 13 tables in all.
Comparing that to Discourse, we have a whole lot more tables over 180+ or more, visualising this would be … a journey. Especially because there are also tables from plugins (and these change from install to install) and data in
*_custom_fields tables which should be included as well if you really want to have a complete picture.
Also because of how our database is designed, we can’t use most tools for data modeling, we’d need to find one that works with ActiveRecord models. And I think that also makes this tricky too, all these data conversations are over my head.
But that is not to say this isn’t something we don’t want to do though, this is just commentary. I’d love to hear suggestions from you or anyone else on ways we could make this better.
It’s really not that useful, as the sheer size of it, the lack of foreign keys and we simply leaving very little logic to the RDMS means it’s hard to understand Discourse DB without reading Discourse source.
But if you really need one, RubyMine can generate it for you.
You can generate one with relationships with rails-erd: GitHub - voormedia/rails-erd: Generate Entity-Relationship Diagrams for Rails applications
Not sure how useful this is though
@lju I hope all our explanations help here, especially with the added context. I’ll be closing this off in the next day or two. If you still feel you need some extra detail, feel free to ask.
Hi @osioke ,
Sorry for the delay in replying, I’ve been a bit jammed.
I’ve got some ideas for what would be helpful - if you can give me a few days I’ll write something up.
Awesome! A few days you now have thank you for giving this attention.
So my argument would be that a data model would be useful, but we don’t necessarily need to include all of the tables. I suspect there are probably the ‘key’ 15-25 odd tables that 90% of all queries use/what people are looking for. In fact, looking at the various tables made available - there are probably a series of data models that could be created, based on the types of queries/data you are looking to explore.
I can take a stab over the next few days to pull together what I think would be the most commonly queried tables - this wouldn’t be extensive research, just a stab in the dark. I’m sure the various questions asked in the Data Explorer category will also shine a light on the popular tables.
There could also be another diagram to represent ‘zones’ of interest to make navigation of different parts of the available data.
Does that make sense?
Data explorer already lists the 9 most important tables first in the query edit UI panel, and you can see all of the tables column structure and types with a click:
So we could take those 9 tables and turn them into a simplified data model?
Sure, go ahead and share the results!
damn this is huge; one of the largest db schema i have seen online
was that created with https://dbdiagram.io? mind sharing the public url of the diagram?
i am more interested in the relationships and connections between these tables
very helpful but will be great to have a shareable url so we can see the table relationships and also the primary/foreign keys on the tables
are there any 1-to-1 relationships in the schema? will like to know especially between
anyone willing to help with the relationships between these tables? from the schema diagram
users, user_options, api_keys, user_api_keys, user_auth_tokens, user_auth_token_logs, notifications
interested to know if there are 1-to-1 relationships
will appreciate it…thanks
It’s mostly 1-N, as users have multiple notifications, auth tokens, etc.
user_options is 1-1.