Rules To Live By When Designing An Elasticube
A list of recommendations which should be considered when developing an elasticube.
Build your ElastiCube incrementally. Start with only two tables, create an appropriate relationship between them and then conduct a build to import the data. Use the Sisense Web Application to combine data from both tables and verify the results returned are correct.
Once ou are usre the results are correct you can return to the Elasticube Manager and add one or two more tables to the existing cube. Repeat the process above as you develop your Elasticube.
Remember it's very easy to add additional data sources and tables as needed.
Be In Good Relationship
Relationships specify the logic used to combine data from one or more tables. A relationship is created by connecting fields between two or more tables. This determines what data is returned in a widget, such as a pivot table, when data is combined from two or more tables.
There are several types of relationships that may be created. By default the elasticube creates a relationship called a "Inner Join". The important thing to note is an inner join creates a data set including only those values which are present in both tables.
For example a sales system may have a table for Orders and another table for Inventory. The Orders and Inventory table may both contain a Product ID. Now a relationship with an inner join was created in the Elasticube Manager by connecting the Product ID from the Inventory table to the Product ID in the Orders table. When a widget is created using data from both tables the Inventory table and the amount sold from the order table ONLY inventory items contained in both tables would be returned.
What this means is if there was a type of product which has never been sold it would not be included in the data set described above as the product id for this product would not be contained in the Orders table.
Avoid Many To Many Relationships
Many-to-Many relationships exist when the value in each field used to create a relationship between tables is contained multiple times in each table. For example a hotel may have a table with Reservations data and a table with Payments data. In both tables the name of the guest is stored. A guest can have multiple reservations under their name as well as multiple payments for their stay recorded in their name. If a relationship between the Reservation and Payments table was created based on the guests name a many-to-many relationship would be created. In general when a field from two or more tables contains the same value, and these values are duplicated in both tables a connection created based on this field will result in a many-to-many relationship. The problem with this kind of relationship is it can create complex data sets which do not return the correct results or use excessive computer resources and do not return any results. A solution to working with many-to-many relationship can be found here.
In the ElastiCube Manager tables can have multiple relationships between them. When data is represented in a widget the ElastiCube decides which route to take between the tables in order to return the data set. A calculation path is made up of all relationships leading from one table to another. In general the shortest and most direct route between the tables is taken. In some cases, more than one path exists leading from one field to another. Sometimes this is due to poor database design, and other times it is just a realistic necessity due to the way the data is structured. In this case, there is no absolute way to determine the required path to take to calculate results. A solution to working with cycles can be found here.
In addition to incrementally building an ElastiCube, starting with two or three tables, we recommend verifying results in the Sisense Web Application at each stage before adding new tables to the cube.
Verifying results is done in the Sisense Web Application and entails combining fields from multiple tables in a single pivot to validate the data is correct. For example a sales system may have a table for Orders and another table for Inventory. The Orders and Inventory table may both contain a Product ID. Now a relationship with an inner join was created in the ElastiCube Manager by connecting the Product ID from the Inventory table to the Product ID in the Orders table. When a widget is created using data from both tables say the inventory name from the Inventory table and the amount sold from the orders table ONLY inventory items contained in both tables would be returned.
For example assume you have two tables in the ElastiCube named Orders and Inventory which have a relationship on the Product ID field. One test may be to use the Order Date field from Orders table and count the amount of inventory from the Inventory table. In practical terms this entails placing the Order Date in the rows panel and a Sum of Inventory sold in the measures panel of a pivot table in BI Studio. Results could then be compared to determine if they match with other reports and systems.
Such testing needs to be based on the following principles:
i. A base line to verify and compare must exist
ii. Data must be combined from multiple tables in a single pivot table.
iii. Tests must focus on data sets that will be used for business purposes
iv. Inaccuracies can be checked it by considering recommendations 1 to 4 listed above