Recently I have tried to speed the queries against our MySQL db that we import data into our elasticubes with. I have been adjusting the ODBC settings for the DSN we set up against our database. The ODBC gui available via the data source administrator panel in windows server is a quick way to manipulate most of the available functionality and settings of the driver.
As a brief overview we have been running custom MySQL queries using the ODBC driver because our average build imports ~5-7 million rows as aggregate keys and it is not possible or efficient to achieve the aggregation we need by using a MySQL dump. These 5-7 million rows account for 3 days worth of data in an 180 day cube and we use the sliding window functionality to delete and replace the last 3 days to keep builds and report downtime to under 5 minutes. Full builds are not an option for us because the data required to support our reporting needs is on the order of 350MM rows and 75GB. We delete the last 3 days because we have constant and unpredictable upserts on certain tables in our database (in this case our sales table) and aggregating our counts by buyer and country as a key keeps the number of rows around 5-7 as opposed to around 25 MM. Finally due to constant updates and inserts on these types of tables and they are often locked by virtue of INNODB.
We recently ran into a few road blocks regarding table locks that were slowing the build process to around 2 hours for a 5 million row query that normally takes 3 minutes. I wanted to share some of the settings that have enabled me to overcome an overloaded database and allowed me to regain 5 minute build times for millions of rows on an overburdened server.
First If you have configured the DSN with default settings (see our server specs below) the driver is most likely caching the results and not allowing the elasticube build process to add the data to the cube as it is sent. To turn this off use the "Don't cache results for forward-only cursors" checkbox under the cursors/results tab (nocache.png).
As for the table locks, we decided that since we are rebuilding the data every 4 hours that we do not care about transactional integrity for our queries against our larger tables and thus defaulting our queries to Read uncommitted is optimal for us. To do this we have included an initial statement in our driver:
SET SESSION TRANSACTION ISOLATION LEVEL READ UNCOMMITTED ;
This works beautifully on the larger queries in combination with compression support (connection.png).
So finally I ask if anyone else has experimented with these settings to speed query performance and if there are additional settings that would benefit us?
Driver version: MySQL ODNC 5.2 ANSI Driver
Windows Version: Windows Server 2008 R2 Enterprise Version 6.1.7601 Service Pack 1 Build 7601
2 x Processor Intel(R) Xeon(R) CPU E5-2640 0 @ 2.50GHz, 2500 Mhz, 6 Core(s), 12 Logical Processor(s)
Installed Ram: 192 GB
SiSense Version: 4.4.7700.7 64 Bit Long Index Version