Monday, December 19, 2016

Advanced Analytics: Missing Data? Fret no more, fill in the Missing Data on Oracle DV using R

More often than not, our data sources have missing values. How do we deal with them? We either remove the affected records or fill in the values manually. Both approaches can skew your data and lead to wrong insights and flawed analysis. There is no need for manual guessing, hunch-based substitution, or elimination of missing data in Oracle DV. You can impute (fill in) the missing data in your analysis with predictive analytics algorithms using this custom R script. The script uses imputation algorithms from R data imputation packages such as mice and Hmisc.
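To give a feel for what such imputation looks like in plain R, here is a minimal sketch using the mice package. It is not the custom script itself: the choice of the nhanes example data (shipped with mice), predictive mean matching as the method, and five imputations are all assumptions made for illustration.

```r
# Minimal sketch of imputation with mice (not the DV custom script).
# Assumes the mice package is installed.
library(mice)

# nhanes is a small example data set bundled with mice;
# several of its columns contain missing values.
print(colSums(is.na(nhanes)))

# Impute with predictive mean matching ("pmm").
# m = 5 imputations and the seed are illustrative choices.
imp <- mice(nhanes, m = 5, method = "pmm", seed = 123, printFlag = FALSE)

# Take the first completed data set as an ordinary data frame.
completed <- mice::complete(imp, 1)

# The completed table keeps the same shape with the gaps filled in.
print(colSums(is.na(completed)))
print(head(completed))
```

The completed data frame has the same rows and columns as the input, which is the kind of tabular result that can be saved or exported.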

It is quite easy to deploy this data imputation R script on your local machine, and the technique can be used in Dataflows as well. The R script returns the imputed data in tabular format, which can be saved as a data source in DV or exported/downloaded to an Excel sheet that can then be used in Dataflows. Here are the steps to install and deploy this R script in Oracle DV:

1) Install the Advanced Analytics feature in Oracle DV by clicking on the icon below. This will install the Oracle R deployment. Alternatively, you can install Advanced Analytics by running install_advanced_analytics.cmd, located in <DV_INSTALL_DIRECTORY>

2) If the mice and Hmisc R packages are not already installed, install them as follows:
    Open the R console (double-click Rgui.exe in <Advanced_Analytics_Install_Dir>\bin\x64) and
    install the mice and Hmisc packages.
    The R commands to run are:
     Set the proxy (use the value appropriate to your network settings):
        > Sys.setenv(http_proxy="http://<your_proxy_host>:<port>")
     Install the packages:
        > install.packages("mice")
        > install.packages("Hmisc")
3) Download the archive from the OracleBI Public Store and unzip it.
4) Copy R.ImputeValues.xml to <DV_INSTALL_DIRECTORY>\OracleBI1\bifoundation\advanced_analytics\script_repository
5) Import the .dva project into Oracle DV. The password for the .dva is Admin123
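
If the imported project does not run, one thing worth checking is whether the packages from step 2 actually installed. The following is a small sketch, using only base R, that can be pasted into the same R console; the variable names are illustrative.

```r
# Check whether the packages from step 2 are available to this R installation.
required <- c("mice", "Hmisc")

# requireNamespace() returns TRUE if the package can be loaded, FALSE otherwise.
installed <- vapply(required, requireNamespace, logical(1), quietly = TRUE)
print(installed)

missing_pkgs <- required[!installed]
if (length(missing_pkgs) > 0) {
  message("Still missing: ", paste(missing_pkgs, collapse = ", "))
} else {
  message("Both packages are available.")
}
```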

Here is a snapshot:
