Can Discourse handle BIG groups?


(Rafael dos Santos Silva) #1

People are asking to create some subcategories with regional access on my instance.

I have created 4 subcategories and 4 groups. One for each.

The problem is: each group is around 5000 people.

Anyone tried something like this?

The index on the group_users table only helps searches by group_id, not by user_id (needed to list all of a user's groups, like on the profile page). Maybe adding an index on (user_id, group_id) would avoid the excess of seq scans?
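For illustration, the composite index being suggested would look something like this (the index name is just a placeholder, not Discourse's actual naming):

```sql
-- Hypothetical composite index on (user_id, group_id).
-- With both columns in the index, the "list all groups for a user"
-- query can be answered with an index-only scan.
-- CONCURRENTLY avoids taking a write lock on a live table.
CREATE INDEX CONCURRENTLY idx_group_users_user_id_group_id
    ON group_users (user_id, group_id);
```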


(Jeff Atwood) #2

I think it should work. I do not feel 5k group members is excessively large.


(Sam Saffron) #3

Keep in mind that the automatic trust level groups are already gigantic. Where are you seeing queries missing an index?


(Rafael dos Santos Silva) #4

Now that you mention it, that's true. Since trust levels use the same group infrastructure, groups already get heavily used. :thumbsup:

Just theorycrafting and looking at postgres stats (I'm kind of a postgresql nerd), the existing index doesn't cover this query:

SELECT group_id
  FROM group_users
 WHERE user_id = 800

-----

"Seq Scan on group_users  (cost=0.00..41.62 rows=1 width=4) (actual time=0.138..0.230 rows=1 loops=1)"
"  Filter: (user_id = 800)"
"  Rows Removed by Filter: 1408"
"Planning time: 0.063 ms"
"Execution time: 0.244 ms"

So you get a sequential scan, whose cost grows linearly with the table, instead of a cheap index-only scan or bitmap heap scan. Maybe I'm worrying prematurely :smile:.

Since I log all queries and run them through pgBadger, I'll see what happens. Thanks!
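Besides pgBadger, the cumulative statistics views are an easy way to watch for this. A quick check of sequential vs. index scan counts on the table (pg_stat_user_tables is a standard PostgreSQL view; the counters accumulate until pg_stat_reset() is called):

```sql
-- How often is group_users being sequentially scanned vs. hit
-- through an index? A rising seq_scan count on a growing table
-- is a hint that an index is missing.
SELECT relname, seq_scan, seq_tup_read, idx_scan, idx_tup_fetch
  FROM pg_stat_user_tables
 WHERE relname = 'group_users';
```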


(Sam Saffron) #5

It does sound like a missing index. I'll have a quick look and add one today.


(Rafael dos Santos Silva) #6

Thanks!

By the way, I'm currently handling groups by exposing the Discourse tables in another database through a writable FDW (foreign data wrapper).
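A minimal sketch of that kind of setup with postgres_fdw, for anyone curious; every name, host, and credential below is a placeholder assumption, not my actual config:

```sql
-- Expose Discourse tables in another PostgreSQL database via a
-- writable foreign data wrapper. postgres_fdw supports INSERT,
-- UPDATE, and DELETE on foreign tables.
CREATE EXTENSION IF NOT EXISTS postgres_fdw;

-- Placeholder connection details for the Discourse database.
CREATE SERVER discourse_db
    FOREIGN DATA WRAPPER postgres_fdw
    OPTIONS (host 'discourse-host', dbname 'discourse');

CREATE USER MAPPING FOR CURRENT_USER
    SERVER discourse_db
    OPTIONS (user 'discourse', password 'secret');

-- Import just the group tables into a local schema.
CREATE SCHEMA IF NOT EXISTS discourse_fdw;

IMPORT FOREIGN SCHEMA public
    LIMIT TO (groups, group_users)
    FROM SERVER discourse_db
    INTO discourse_fdw;
```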

Not sure what's the best approach here, but I'll probably need to create a plugin to handle mass group attribution better.

Those 5k groups trade people daily :smile: