Beware of Many to Many Joins!
Row counts looking off or data not looking quite right after a join? Well, you could possibly be doing a many-to-many join if the columns in your join conditions aren't distinct!
Let's take the following example. Say you have this table, t1.
And this table, t2.
Notice that there are multiple entries for A and B in both tables. If we join t1 to t2 on the condition t1.id = t2.id, we will join each combination of rows from A and B that satisfies this condition. In other words. t1 as 2 A's and t2 as 3 A's. Therefore, the joined table will return 2 * 3 = 6 rows where id = A.
Here's the result of the below query
select t1.id as t1_id, t2.id as t2_id, t1.val as t1_val, t2.val as t2_val from t1 join t2 on t1.id = t2.id
Now what if you didn't want to see a many to many join? Then, it's important to remember to pre-aggregate your table (or remove duplicates if you have any). Before joining, you want to ensure that each id shows up at most once in each table.
If you want to read more about joins, check out the community post here!
Please sign in to leave a comment.
Comments
0 comments