A relatively common database design strategy is to have a tag column with an arbitrary number of tags
Here is an example in which we look at numbers between 1 and 10 and tag them as even, odd, perfect, or parts of the Fibonacci sequence.
Now we want to be able to select rows based on the tags.
Option 1 is to create filters using contains. But this will quickly become unworkable for large numbers of tags as new filters would have to be created for every tag.
Option 2 is a more flexible solution, doing this work on the elasticube side.
Step 1 (The Combo Tags) table : Create a table of unique tag combinations in the fact table. For example
SELECT DISTINCT tags
Step 2 (The Tag Mapper): Here we need to create a table where every individual tag links to every tag combination it is in. In SQL this is more straightforward so if you can do this in your database look up STRING SPLIT and UNROLL functions. In Sisense we have to be more round about. For Sisense we will need to check the first tag in every tag combo, then the second, then the third, etc etc for whatever the maximum number of tags in a record is. The provided example supports max three tags per row.
SELECT DISTINCT combo_tag, tag FROM (
SELECT tags AS combo_tag, tags AS tag
WHERE STRPARTS(tags, ',', 1) IS NULL OR STRPARTS(tags, ',', 1) = ''
SELECT tags AS combo_tag, STRPARTS(tags, ',', 1) AS tag
WHERE STRPARTS(tags, ',', 1) IS NOT NULL AND STRPARTS(tags, ',', 1) != ''
SELECT tags AS combo_tag, STRPARTS(tags, ',', 2) AS tag
WHERE STRPARTS(tags, ',', 2) IS NOT NULL AND STRPARTS(tags, ',', 2) != ''
SELECT tags AS combo_tag, STRPARTS(tags, ',', 3) AS tag
WHERE STRPARTS(tags, ',', 3) IS NOT NULL AND STRPARTS(tags, ',', 3) != ''
) AS tmp
When done, the table will look like this, with each combo tag linking individually to each component. So the combo_tag Even,Prime,Fib appears three times, once with Even, once with Prime, and once with Fib.
Note: These results were ordered for display reasons, do not use an order by in production environments as it unnecessarily slows down the cube. You can order your table while testing to ensure correctness, but remove when done.
Step 3 (Dim Tag): Now we just need to make a unique collection of individual tags to create our dim.
SELECT DISTINCT tag
Step 4 (Make the joins)
Join DIM_Tag.tag to Tag_Mapper.tag
Join Tag_Mapper.combo_tags to Combo_Tags.tags
Join Combo_Tags.tags to FactTable.tags
Let's review what happens in this example when the tags "Odd" and "Perfect" are selected.
Dim_Tag joins to Tag Mapper and grabs the following combo_tags
- Odd, Fib
- Odd, Prime, Fib
- Even, Perfect
- Odd, Prime
When that table is joined to combo tags and then to our fact table, we see that the following ID's are revealed
The Combo_tags table is needed between the Tag_Mappper and the fact table to prevent Many to Many joins.
This approach will run into space issues as the tag lists become larger. The best recommendation I can give here is to ensure that the tags are always in the same order (alphabetical etc) just so that we don't have Tag1,Tag2 as well as Tag2,Tag1. This will help constrain the data size.
And a dash
Please sign in to leave a comment.