Sometimes, we would like to create calculations from the fact table and we need to have a continuous range of dates or in other cases, we need to count the number of employees (just as an example) that we have in the company.
We strongly recommend using fields from the fact tables for the calculations and fields from dimensions fro filters/categories/columns etc.
But what do we do if not all dates appear in the fact tables (such as weekends, holidays etc.) and not all employees have transactions in the facts?
We then tempted to create a calculation on the dimension tables. However, in return might have undesired implications (M2M, "RandomPath") as it might force a connection between two dimension tables that are connected indirectly through 1 or more fact tables.
We want to be able to calculate from the fact table. For this, we need all the values from the relevant dimension tables to be also in the fact table. Then, we could count for example the number of employees from the fact table, even if they did not have transactions in reality.
We have 3 ways to create the desired fact table:
1) Union the fact table with a Cross Join between the relevant dimensions (from which we want to fill the missing values) and then filter the duplicated records using the RankDesc approach.
In this use case, we will have all the combinations of the dimensions involved (date and employees in our example) in the fact table.
,rankdesc(a.KeyDateEmployee, a.Amount) RankKey
, (left(tostring(s.date),10) + '|' + s.Employee) KeyDateEmployee
FROM Sales s
,(left(tostring(d.date),10) + '|' + e.Employee) KeyDateEmployee
FROM dimDate d, dimEmployees e
WHERE a1.RankKey = 1
2) Using the "WHERE NOT IN" - In this scenario, we will only add to the fact the missing values from each dimension (date and employees) but without the combinations between the dimensions.
--Original fact table
FROM [Sales] s
--Employees who don't have any transactions in the fact
FROM [dimEmployees] e
WHERE e.[Employee] NOT IN (SELECT DISTINCT s.[Employee] FROM [Sales] s)
--Dates which are not appearing in the fact
FROM [dimDate] d
WHERE d.[Date] NOT IN (SELECT DISTINCT s.[Date] FROM [Sales] s)
3) Using Right Join with the dimensions table - use case similar to the "Wher Not In".
,coalesce(s.[Date],createdate(getyear(now()), 1,1)) DateJoin
FROM [Sales] s RIGHT JOIN [dimEmployees] e
ON s.[Employee] = e.[Employee]
RIGHT JOIN [dimDate] d
ON a.DateJoin = d.Date
Here are the supporting files: