For general information on types of data relationships, potential complications and how to resolve them, see the following articles:
- Data Relationships: Introduction to Many-to-Many
- Data Relationships: Check Type of relationship (You are here)
- Data Relationships: One-to-Many Relationship
- Data Relationships: The Invisible Many to Many
- Data Relationships: Many-to-Many Relationship Resolutions
Check Type of Relationship
In this post we'll test and determine the relationship type of the data set. In order to do so, we'll need to check the cardinality of the relationship. We'll need to determine the number of unique and duplicate values on each side of the relationship. Here is the breakdown of what to look for:
- Same value for both the unique and duplicate values means that there are no duplications. This will either be a One-to-One or One-to-Many Relationship.
- If the number of duplicate values is larger than the number of unique values, then this side of the relationship has duplicated values, and we'll need to investigate the other side of the relationship.
- If the other side of the relationship yields unique values, this is again a One-to-Many Relationship
- If the other side of the relationship yields duplicate values, we've got a Many-to-Many relationship.
Check for Potential Many-to-Many Relationships in the Elasticube
1. Open the the Elasticube with the schema (ecube / ecdata)
2. Click Add Data> Custom SQL Expression
3. Enter and adjust the SQL statement below. See the image below for the necessary changes.
SELECT [Do I have duplications?]
(SELECT distinct_count(t1.col1)<>count(t1.col1) AS [Do I have duplications?]
FROM [Table1] t1
FROM [Table2] t2) AS temp
GROUP BY [Do I have duplications?]
Image 1. Explained Query
4. At the top right of the expression editor window click the 'Parse SQL Expression' (F6) button .If the expression parses successfully click the 'Preview result table' (F5) button in the top right.
Image 2. Location of Parse & Preview Results
5. Analyze your results:
- If the two values are equal, all guest ids appear only once, making all values unique. We can stop investigating at this stage, given that even if the other side of the relationship has duplicate values for guest id, we'll still be dealing with a One-To-Many relationship, where the unique values are the reservations side, and the duplicate values are on the Payments side.
- If the result returned is 'True' in both lines a many-to-many relationship exists and will need to be considered in the ElastiCube design (See the image below for "Many-to-Many relationship prior to resolution"). Please continue to Data Relationships: Many-to-Many Relationship Resolutions article to learn how to resolve the Many-to-Many relationship.Image 3. Many to Many Schema - Wrong Connection
You may find instances where you have common symptoms of a many to many (M2M) relationship, but cannot find the root cause of the relationship. Such symptoms of a many to many relationship include, high RAM usage during query, ElastiCube size on disk increasing during query, and build failing with 100% RAM. In such cases, you may experience The Invisible Many to Many. You will find out how to resolve this issue there.