Advanced Forecasting with R
ARIMA Model Forecasting
ARIMA Model Forecasting Mean and Upper 95% Limit
Overview
In this article, we tackle how to produce a robust forecast by using the Sisense and R integration.
Understanding the Nuance of R
For forecasting functionality, we must consider this law of the R integration:
The length of the vector of data we pass to R, determined by the Xaxis, equals the vector of data we receive from R.
Length of vector out (nout) = vector in (nin)
This raises a question, however:
Doesn’t a forecast return additional days to what we have in the X axis, and therefor nin > nout?
This is precisely the issue we tackle in this article.
HighLevel Procedure
 Data Modeling
 Creating records for future dates
 R Script Modification
 Modifying the R script to slice the nin vector to future and past components
 Running the forecast on past values and returning forecasted values
 Concatenating past and forecasted values for returning to Sisense

FactData 

Date 
Units 
dimension of interest Company 
10/3/2018 
368 
a 
10/4/2018 
793 
a 
10/5/2018 
557 
a 
10/3/2018 
419 
b 
10/4/2018 
297 
b 
10/5/2018 
769 
b 
10/3/2018 
498 
c 
10/4/2018 
940 
c 
We will use custom SQL expressions to extend our data into the future with dummy records. We will need to do this for every dimension we expect to analyze our data by. Above is the table we will be using to demonstrate our modeling strategy. We’ll call it “factData”. Not all dimensions may have data available for the last day, in this case, the 5^{th}. Company c data is only available until the 4^{th}, so we will need to populate our future values dynamically across dimensions.
Data Modeling Procedure
 Create a csv or excel document with a single column called “nums” and values 1 – n. n represents how many days out you will forecast. We’ll call this document “extraDays”
 Create a custom SQL table and create the following expression
SELECT *, ‘base’ as type
FROM “factData”
UNION ALL
SELECT
*
FROM (
SELECT ADDDAYS(a.Date, b.nums) AS newDate, 1 as Units, Company, ‘forecast’ as type
FROM (
SELECT
Company, MAX(Date) AS maxDate
FROM FactData
GROUP BY Company
) a
CROSS JOIN extraDays b
) z
R Script Modification
With our data modeling done we will have a full vector of data incoming to R from Sisense. This data will include valuable historical data and temporary dummy future values, in the case above, 1. We will handle slicing the data to represent past values and future values with R vector slicing.
One convenient way to control how to select historical values is by creating a function that will accept the vector of records we run our analysis on, and another value that specifies how many days out we will forecast (matching the n values in the excel document)
module < function(fullArgs, forecastDays=30, frq=30){#function body}
We will use the forecastDays variable for both setting our forecast and slicing our vector. This can be done using the following syntax:
# Start of function
truncArgs < fullArgs[1:(length(fullArgs)forecastDays)]
We will run our analysis on a dataframe we build off truncArgs:
df3<data.frame(fcastValues=truncArgs)
In this example we are running an ARIMA model and returning 30 days:
mySeries = ts(df3[, c(‘fcastValues’)])
library('forecast')
tsSeries = ts(na.omit(mySeries), frequency=frq)
decomp = stl(tsSeries, s.window="periodic")
deseasonal_cnt < seasadj(decomp)
fit_w_seasonality = auto.arima(deseasonal_cnt, seasonal=TRUE)
seas_fcast < forecast(fit_w_seasonality, h=forecastDays)
futureValues < seas_fcast[["mean"]]
We will combine our original truncArgs with our future values into a single vector and return this vector back to Sisense. Doing this ensures we maintain the length of vector out (nout) = vector in (nin) rule.
resultSet < c(truncArgs, futureValues)
return(resultSet)
# End of function
Save the R script on the server for it to be called during the widget query.
Writing the Sisense Function
In Sisense we simply use the RDouble function and pass in the R function arguments as expected. In the R function defined above we included default values for arguments, so we do not need to include them as they default to 30.
Replace [Average Units] with a measure you are forecasting. It can be of any aggregation.
RDOUBLE(True,
"save(args, file='C:/Users/user/Documents/Rlog/log.txt');
source('C:/Users/user/Documents/Sisense/Rscript.R');
module (args[[1]]);"
, [Average Units])
We created a "type" column in our data, so we can now use it to differentiate the forecast from past values by using the “break by” in the widget editor. This is how we achieved the green to orange shift in the first example picture.
Key Words
Forecast, Seasonal Forecast, Seasonal Decomposition, ARIMA forecasting, Forecasting with Sisense, Advanced BI Forecasting

Hi Artem,
Could you maybe help me out? When using the script, I keep getting the following error:
Failed to execute the R model  Error in source("C:/Users/user/Documents/forecast.R") :
C:/Users/user/Documents/forecast.R:5:23: unexpected input
4: df3<data.frame(fcastValues=truncArgs)
5: mySeries = ts(df3[, 
Hi Hamza,
The script uses a placeholder directory and Rscript name.
Once you've written your model function you can save it to an R script and then reference the function from the Sisense function editor.
In the example above, I created an R script with a function module and then called it using the source command.
Does this help?
Cheers
Artem 
Can you attach your RDouble formula?
Considering the error is vague I would first test the function and forecasting model in R and make sure that it is returning appropriate results. Start simple by using a CSV with a single dimension's time series data and use that as the source of your R input. Run the model one line at a time and review the results in Rstudio's Environment Data and Values panel. This should give you a good idea of if your model is correct in the first place. Once validated you can bring the entire function body into the RDouble function and test by replacing your csv input with args[[1]]. If your output is correct you can package the R script into a function and Rscript file, confirm the file's permissions and reference it in RDouble. I've had different environments behave differently with the "/", so you may need to escape it appropriately with the double "//" syntax. Hope this helps!
Cheers,
Artem 
Artem Yevtushenko thanks for such a great guide.
Could you please explain on how to build such a plot as on your third screen?

Yaroslav Revenko It's actually quite simple! You have an average and upper 80th (or 90th) alpha vector to work with. Each one is represented by a separate measure in the chart. You can layer the measures visualized as area charts, as shown in example #2 and #3, to show the average first and the alpha level behind. If you make the average measure white than it will hide the measure's area underneath giving the effect of the diverging range. From there it's just a matter of increasing the area opacity.
Another option for visualizing the interval is using this post: https://support.sisense.com/hc/enus/community/posts/360016783614AddingaRangeAreatoametricinaLineChart
Please sign in to leave a comment.
Comments
9 comments