Working with XML files
PURPOSE
When trying to pull data from a single XML file into the Elasticube, the most common approach is to leverage an XML ODBC driver, such as this offering by CData. However, these drivers typically work with only a single file. If you have a folder of XML file and want to import them all at once, this post will show you how.
Here, we will leverage a python script to parse through all your XML files and convert them to JSON. Once we have the data as JSON objects, they get pushed to a MongoDB database (already installed on your Sisense Server). Sisense Elasticube has a MongoDB ODBC Driver, so we can connect directly to your files once they are in Mongo. The last step is to automate this process as a pre-build plugin.
REQUIREMENTS & SETUP
1. Download and install the Sisense MongoDB ODBC Driver.
2. Download and run the Python 2.7 Installer
3. Install Python Dependencies - Open windows command prompt and type the following
cd C:\Python27\scripts
Pip install xmltodict
Pip install pymongo
If you installed python to a different location, you will need to adjust the cd command accordingly. This will make sure the xmltodict and pymongo libraries are added to your python installation
Setup the Python scripts
Download and unzip the python application here to the root of your C drive.
Open the Xml2MongoDB.py file and find the User Defined Settings at the top of the file. Make sure these settings are correct for your environment. If installing directly on your Sisense server, the only setting that needs to be changed is xmlFolder, which is the path to where your XML files are stored
On line 90 of this python script, there is a reference to the fileJson object. This example mentions a property personDataList and it's sub-property personData. In this example, our XML files contained data within this structure, but your XML files are likely a bit different. Modify this line to reflect the proper path to the data array contained within your XML files.
Open windows command prompt and change directory to where you unzipped the python scripts. For an initial test, we want to verify the script is working. In the interest of time, try first with just a few xml files. Type in the following command to test the application
python.exe Xml2MongoDB.py
You should see an output like the below
Automation
Find the RunScriptBeforeBuild.dll file, right click and view its properties. Check here to see if the file is locked by windows, and if it is then click the unblock button. Copy and paste the dll to C:\Program Files\Sisense\Prism\Server\plugins. You may need to create the plugins folder if it doesn’t exist already. Now, restart your Elasticube service
In your Elasticube Manager, open up the Pre & Post Execution Plugins window and add a Pre Execution Plugin. Create the label as RunScript and copy and paste the script below using the script icon.
<Attributes
Script="C:\\prebuildPlugins\\runscript.bat">
</Attributes>
If you changed the directory where you unzipped the files, just update the Script attribute to match. Also, it’s important that backslashes in this field are escaped, just like in the same code/image below.
Once set, execute a build and watch the collection using a tool like Robomongo, to ensure the python application is running properly. Now you should have the data stored in MongoDB, so follow the typical instructions to create an Elasticube from Mongo
ADDITIONAL NOTES
There are a few other variables within the python script that you can edit
- mongoServer - Where to find your MongoDB Server
- mongoPort - The port that MongoDB is running on
- mongoDB - The name of the database you want to create and store data in
- mongoCollection - The name of the collection, within your MongoDB database
-
I’m trying to bring an xml document into the elasticube and I am receiving some errors. Is this something you can help me with?
I was using the above documentation and was using an xml directory.
https://support.sisense.com/hc/en-us/community/posts/115002913547-Working-with-XML-files
C:\>cd prebuildPlugins
Directory of C:\prebuildPlugins
C:\prebuildPlugins>python.exe Xml2MongoDB.py
Connecting to MongoDB, prepping the database
Traceback (most recent call last):
File "Xml2MongoDB.py", line 113, in <module>
main()
File "Xml2MongoDB.py", line 69, in main
collection.drop()
File "C:\Python27\lib\site-packages\pymongo\collection.py", line 1062, in drop
self.__database.drop_collection(self.__name, session=session)
File "C:\Python27\lib\site-packages\pymongo\database.py", line 657, in drop_collection
session=session)
File "C:\Python27\lib\site-packages\pymongo\database.py", line 439, in _command
client=self.__client)
File "C:\Python27\lib\site-packages\pymongo\pool.py", line 517, in command
collation=collation)
File "C:\Python27\lib\site-packages\pymongo\network.py", line 125, in command
parse_write_concern_error=parse_write_concern_error)
File "C:\Python27\lib\site-packages\pymongo\helpers.py", line 145, in _check_command_response
raise OperationFailure(msg % errmsg, code, response)
pymongo.errors.OperationFailure: not authorized on ExternalData to execute command { drop: "PersonData", writeConcern: {} }
Please sign in to leave a comment.
Comments
3 comments