When trying to pull data from a single XML file into the Elasticube, the most common approach is to leverage an XML ODBC driver, such as this offering by CData. However, these drivers typically work with only a single file. If you have a folder of XML file and want to import them all at once, this post will show you how.
Here, we will leverage a python script to parse through all your XML files and convert them to JSON. Once we have the data as JSON objects, they get pushed to a MongoDB database (already installed on your Sisense Server). Sisense Elasticube has a MongoDB ODBC Driver, so we can connect directly to your files once they are in Mongo. The last step is to automate this process as a pre-build plugin.
REQUIREMENTS & SETUP
1. Download and install the Sisense MongoDB ODBC Driver.
2. Download and run the Python 2.7 Installer
3. Install Python Dependencies - Open windows command prompt and type the following
Pip install xmltodict
Pip install pymongo
Setup the Python scripts
Download and unzip the python application here to the root of your C drive.
Open the Xml2MongoDB.py file and find the User Defined Settings at the top of the file. Make sure these settings are correct for your environment. If installing directly on your Sisense server, the only setting that needs to be changed is xmlFolder, which is the path to where your XML files are stored
On line 90 of this python script, there is a reference to the fileJson object. This example mentions a property personDataList and it's sub-property personData. In this example, our XML files contained data within this structure, but your XML files are likely a bit different. Modify this line to reflect the proper path to the data array contained within your XML files.
Open windows command prompt and change directory to where you unzipped the python scripts. For an initial test, we want to verify the script is working. In the interest of time, try first with just a few xml files. Type in the following command to test the application
You should see an output like the below
Find the RunScriptBeforeBuild.dll file, right click and view its properties. Check here to see if the file is locked by windows, and if it is then click the unblock button. Copy and paste the dll to C:\Program Files\Sisense\Prism\Server\plugins. You may need to create the plugins folder if it doesn’t exist already. Now, restart your Elasticube service
In your Elasticube Manager, open up the Pre & Post Execution Plugins window and add a Pre Execution Plugin. Create the label as RunScript and copy and paste the script below using the script icon.
If you changed the directory where you unzipped the files, just update the Script attribute to match. Also, it’s important that backslashes in this field are escaped, just like in the same code/image below.
Once set, execute a build and watch the collection using a tool like Robomongo, to ensure the python application is running properly. Now you should have the data stored in MongoDB, so follow the typical instructions to create an Elasticube from Mongo
There are a few other variables within the python script that you can edit
- mongoServer - Where to find your MongoDB Server
- mongoPort - The port that MongoDB is running on
- mongoDB - The name of the database you want to create and store data in
- mongoCollection - The name of the collection, within your MongoDB database
Please sign in to leave a comment.