Data accumulation is SiSense's method for adding data to an ElastiCube without completely refreshing a table. This post describes how data accumulation works so that you can customize and optimize your ElastiCube build process.
There are two options for accumulating data: by table or by index. Accumulate by table appends all of the source data to the ElastiCube table, while accumulate by index uses an index column in the ElastiCube table to filter which source rows are loaded.
Accumulate by Table
This option appends all of the data selected for the build to the current table. No comparisons are made, and no data from the source is omitted. To enable it, open the additional preferences options for a table in the ElastiCube Manager and select 'Accumulate Data'.
If your source data contains rows that have already been loaded, the ElastiCube table will contain duplicate rows. The image below demonstrates a scenario where duplicate rows would be inserted.
By contrast, the following example shows a use of accumulate by table that doesn't generate duplicate rows.
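The two scenarios above can be sketched in a few lines of Python. This is only an illustration of the append-everything behavior, not SiSense code: the function name accumulate_by_table, the tuple rows, and the sample values are all assumptions made up for the example.

```python
def accumulate_by_table(cube_rows, source_rows):
    """Append every source row to the table; no comparison is made (illustrative only)."""
    return cube_rows + list(source_rows)


# Load 1: the table starts empty and receives three rows (hypothetical data).
cube = accumulate_by_table([], [(1, "a"), (2, "b"), (3, "c")])

# Scenario A: the source still contains rows 2 and 3, so they are appended again.
overlapping = accumulate_by_table(cube, [(2, "b"), (3, "c"), (4, "d")])

# Scenario B: the source contains only new rows, so no duplicates appear.
incremental = accumulate_by_table(cube, [(4, "d"), (5, "e")])

print(len(overlapping), len(set(overlapping)))   # total rows vs. distinct rows
print(len(incremental), len(set(incremental)))
```

In scenario A the total row count exceeds the distinct row count, which is the duplicate case. In scenario B the two counts match, because the source itself was limited to rows that had not been loaded before.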
Accumulate by Index
The index is a column in an ElastiCube table that is used to determine whether or not a row from the source data should be inserted. Its data type must be either integer or date.
When the index is an integer column, only source rows with an index value greater than the maximum index value in the ElastiCube table are inserted. This option never causes existing data in the ElastiCube table to be modified or deleted. The following image demonstrates this logic:
After Load 1, the maximum index value is 3. In Load 2, the source index value of 2 is not inserted (since it is less than 3), but the source index value of 4 is inserted (since it is greater than 3).
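The same filter can be written as a minimal Python sketch. It mirrors the rule described above for an integer index and is not SiSense's implementation: the function name accumulate_by_index, the row layout, and the sample values are hypothetical, and the handling of an empty table (insert everything) is an assumption.

```python
def accumulate_by_index(cube_rows, source_rows, index=0):
    """Insert only source rows whose index exceeds the table's current maximum."""
    if not cube_rows:
        # Assumption: with an empty table there is no maximum, so every row is inserted.
        return list(source_rows)
    max_index = max(row[index] for row in cube_rows)
    new_rows = [row for row in source_rows if row[index] > max_index]
    # Existing rows are kept untouched; qualifying rows are appended.
    return cube_rows + new_rows


# Load 1: index values 1, 2, 3 (hypothetical data), so the maximum becomes 3.
cube = accumulate_by_index([], [(1, "a"), (2, "b"), (3, "c")])

# Load 2: index 2 is not greater than 3 and is skipped; index 4 is inserted.
cube = accumulate_by_index(cube, [(2, "b-changed"), (4, "d")])

print(cube)
```

Note that the changed row with index 2 is ignored rather than used to update the existing row, which matches the statement that current data in the table is never modified or deleted.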