Advanced Forecasting with R
FeaturedARIMA Model Forecasting
ARIMA Model Forecasting Mean and Upper 95% Limit
Overview
In this article, we tackle how to produce a robust forecast by using the Sisense and R integration. This usecase specifically focuses on a scenario where we only have data up until today.
Understanding the Nuance of R
For forecasting functionality, we must consider this law of the R integration:
The length of the vector of data we pass to R, determined by the Xaxis, equals the vector of data we receive from R.
Length of vector out (nout) = Vector In (nin)
This raises a question, however:
Doesn’t a forecast return additional days to what we have in the X axis, and therefor nin > nout?
This is precisely the issue we tackle in this article.
HighLevel Procedure
 Data Modeling
 Creating records for future dates
 R Script Modification
 Modifying the R script to slice the nin vector to future and past components
 Running the forecast on past values and returning forecasted values
 Concatenating past and forecasted values for returning to Sisense

FactData 

Date 
Units 
dimension of interest Company 
10/3/2018 
368 
a 
10/4/2018 
793 
a 
10/5/2018 
557 
a 
10/3/2018 
419 
b 
10/4/2018 
297 
b 
10/5/2018 
769 
b 
10/3/2018 
498 
c 
10/4/2018 
940 
c 
Assuming data is uptodate to the current day, we will use custom SQL expressions to extend our data into the future with dummy records. We will need to do this for every dimension we expect to analyze our data by. Above is the table we will be using to demonstrate our modeling strategy. We’ll call it “factData”. Not all dimensions may have data available for the last day, in this case, the 5^{th}. Company c data is only available until the 4^{th}, so we will need to populate our future values dynamically across dimensions.
Data Modeling Procedure
 Create a csv or excel document with a single column called “nums” and values 1 – n. n represents how many days out you will forecast. We’ll call this document “extraDays”
 Create a custom SQL table and create the following expression
SELECT *, ‘base’ as type
FROM “factData”
UNION ALL
SELECT
*
FROM (
SELECT ADDDAYS(a.Date, b.nums) AS newDate, 1 as Units, Company, ‘forecast’ as type
FROM (
SELECT
Company, MAX(Date) AS maxDate
FROM FactData
GROUP BY Company
) a
CROSS JOIN extraDays b
) z
R Script Modification
With our data modeling done we will have a full vector of data incoming to R from Sisense. This data will include valuable historical data and temporary dummy future values, in the case above, 1. We will handle slicing the data to represent past values and future values with R vector slicing.
One convenient way to control how to select historical values is by creating a function that will accept the vector of records we run our analysis on, and another value that specifies how many days out we will forecast (matching the n values in the excel document)
module < function(fullArgs, forecastDays=30, frq=30){#function body}
We will use the forecastDays variable for both setting our forecast and slicing our vector. This can be done using the following syntax:
# Start of function
truncArgs < fullArgs[1:(length(fullArgs)forecastDays)]
We will run our analysis on a dataframe we build off truncArgs:
df3<data.frame(fcastValues=truncArgs)
In this example we are running an ARIMA model and returning 30 days:
mySeries = ts(df3[, c(‘fcastValues’)])
library('forecast')
tsSeries = ts(na.omit(mySeries), frequency=frq)
decomp = stl(tsSeries, s.window="periodic")
deseasonal_cnt < seasadj(decomp)
fit_w_seasonality = auto.arima(deseasonal_cnt, seasonal=TRUE)
seas_fcast < forecast(fit_w_seasonality, h=forecastDays)
futureValues < seas_fcast[["mean"]]
We will combine our original truncArgs with our future values into a single vector and return this vector back to Sisense. Doing this ensures we maintain the Length of vector out (nout) = Vector In (nin) rule.
resultSet < c(truncArgs, Sisense_Result)
return(resultSet)
# End of function
Save the R script on the server for it to be called during the widget query.
Writing the Sisense Function
In Sisense we simply use the RDouble function and pass in the R function arguments as expected. In the R function defined above we included default values for arguments, so we do not need to include them as they default to 30.
RDOUBLE(True,
"save(args, file='C:/Users/user/Documents/Rlog/log.txt');
source('C:/Users/user/Documents/Sisense/Rscript.R');
module (args[[1]]);"
, [Average Units])
We created a column in our data, so we can now use it to differentiate the forecast from past values by using the “break by” in the widget editor. This is how we achieved the green to orange shift in the first example picture.
Key Words
Forecast, Seasonal Forecast, Seasonal Decomposition, ARIMA forecasting, Forecasting with Sisense, Advanced BI Forecasting
Please sign in to leave a comment.
Comments
0 comments