Not exactly. It might be more clear if we reformat @simon’s query slightly:
FROM category_users cu
JOIN user_emails ue ON ue.user_id = cu.user_id
WHERE (category_id = '10' AND ue.primary = true)
ue.primary conditions are both part of the
WHERE clause, joined with
AND. If you remove one of the conditions, you remove the
AND but you keep the
Most simple SQL queries follow this form:
You can omit the
WHERE clause altogether, but then you’ll get back every row from the tables you specified.
Here’s your original query (reformatted):
WHERE category_id = '10'
SELECT *” means that you want the query to return every column from all the tables involved.
FROM category_users” indicates the table that you want to query. The
category_users table contains rows looking something like this:
user_id are called foreign keys because they point at a row in another table (in this case, the
users tables). So the 3 rows above mean that the user with id
1 is watching categories
3, and the user with id
2 is watching category
notification_level indicates whether they are Watching, Watching First Post or Tracking.
WHERE category_id = '10'” means that you are only interested in rows where the value in the
category_id column is
10. Without that line, you would get every row from the
@simon gave you a new version that added the user’s email address:
This query made a few changes from your original, for 2 reasons: Email addresses are stored in a different table (the
user_emails table), and users can have more than one email address.
cu.*” means “all the columns from the
ue.email” means “the
email column from the
category_users table now has an alias, “
cu”, which saves some typing if you need to refer to it more than once.
We’ve JOINed to the
user_emails table, and given it the alias
user_emails table contains rows like this:
Which means that the user with id
1 has 2 email addresses, email@example.com (the primary address) and firstname.lastname@example.org (a secondary address). The user with id
2 only has one address.
JOIN 2 tables in SQL, you normally need to tell the database what the join condition is. If you don’t do that, the database doesn’t know which values in each of the tables are meant to match up, so you’ll end up with every possible combination of rows in the 2 tables. If you wrote this query:
…with the example data above, you’d get 9 rows back: you’d get the first row from
category_users three times, once with each of the
user_emails rows, then similarly you’d get the second
category_users row three times, then finally you’d get the third
category_users row three times.
The join condition normally tells the database which column in the 2 tables represents the same value. In this case, the
category_users.user_id column and the
user_emails.user_id column both represent the same value. By writing
ON ue.user_id = cu.user_id after
JOIN user_emails ue, we tell the database to match the
user_emails rows up with the appropriate
Even with the
JOIN condition, we’re still going to get 4 rows back for the user with ID
1, because they are watching 2 categories and they have 2 email addresses - we’ll get a row for each combination. So @simon added an extra condition to the
WHERE clause so that the query would only return rows with the user’s primary email address. This condition is in addition to the condition that already existed (restricting the category ID) - in order to be returned rows must have
category_id = '10' AND
ue.primary = true.
Then, because you didn’t want to restrict your search to a single category, you just needed to remove the
category_id filter. You don’t want to remove the whole
WHERE clause, because you still want to only return primary email addresses. In other words, your filter condition changed from:
category_id = '10' AND ue.primary = true
ue.primary = true
Phew! I hope that all makes sense