A common use case is to know the number of new users in the system.
It could be that same user appear in various months but we want to count each unique user only once.
If we will use a simplistic approach of count distinct calculation on each month and then SUM the results (for a running sum), we will be including the same user more than once (for example, if the user appeared on January and also in March I would like to count him only once).
As an example, let's look at the following sample set: this is the fact table called data.
1. Generate a custom table with the earliest date for each User and call it EarliestDate
2. Add a Key column from date and user.
3. Add a column flagEarliest with the value 1.
FROM [Data] d
GROUP BY d.[User]
4. Add a similar Key to the data table from date and user.
Key = concat(tostring(date),user)
5. Add a custom table that will left join the tables: data and EarliestDate. Call it DataWithEarliestDate
FROM [data] d
LEFT JOIN EarliestDate e
ON d.key = e.[Key]
Order By d.[Date]
6. Hide the data and EarliestDate tables. We will not be needing them on the dashboard.
7. The DataWithEarliestDate table should look similar to this:
8. We can now perform a simple SUM on the flagEarliest column and the result would be the Running Sum of the distinct users.