You may find instances where you have common symptoms of a many to many (M2M) relationship, but cannot find the root cause of the relationship. Such symptoms of a many to many relationship include:
- High RAM usage during query,
- ElastiCube size on disk increasing during query,
- Build failing with 100% RAM
The common check for a M2M is the below query.
Select count(a.id), distinct_count(a.id) From dim as a
However, the numbers may be identical which may not initially indicate a M2M so the below query must be run for a stronger check:
Select count(a.ID), distinct_count(a.ID),
sum(case when a.ID is null then 1 else 0 end) as [null count]
From dim as a
If the null count field shows more than 1 in both tables then you have a many-to-many relationship on null values.
Cause of Issue
There could be many reasons why null values get to the dimension tables, such as cases when the dimension table is a custom table that's taken from the fact table or if the data is imported with null values from the data source itself.
Dimension tables should always be unique, even if they have null values, there shouldn't be more than one null value.
If the dimension table is a custom table that's taken from the fact table, make sure you use the "distinct" expression.
If the nulls are imported from the data source try to remove them either in the import query or by creating a custom table that's taking only unique values from the table.