Monday, December 19, 2016

Advanced Analytics: Perform Sentiment Analytics on DV using Custom R Scripts

In this blog we will discuss how to perform Sentiment Analysis on Oracle DV on textual data like, Product reviews, customer feedback and social media posts etc. It is well known that OracleDV supports R-Integration and allows users to run their Custom R-scripts. This integration is quite versatile and powerful because Oracle DV allows users to fetch results from R-Scripts in a tabular format and mash it up with data sources. In this example Sentiment Analysis is implemented using a custom R-Script which returns the tonality of the textual data. This example can be downloaded from Oracle BI Public Store.

The R-Script takes textual data as input and categorizes input into 6 categories based on tonality of the data: Very Positive, Positive, Neutral, Negative, Very Negative and Sarcasm. This tonality information can be mashed up with your source data to gain further insights. In Dataflows, you can enrich the data with sentiment information returned by the R-script. Results returned by the R-script can also be downloaded/exported to excel sheets, which can then be used in Dataflows.

Following are the steps to deploy this example in OracleDV desktop:

1) Install Advanced Analytics feature in Oracle DV by clicking on the below icon. This will install Oracle R deployment. Alternatively you can install Advanced Analytics by running install_advanced_analytics.cmd present in <DV_INSTALL_DIRECTORY>


2) If not installed RSentiment Package already, please install it using following instructions
    Open R console(double click Rgui.exe present in <Advanced_Analytics_Install_Dir>\bin\x64),
    install arules Package. Following are the R-commands to install:
     Set Proxy:
        $ Sys.setenv(http_proxy="<your_proxy_host>:<port_number>")
           set proxy appropriate to your network config.
     Install Package(updated instructions):
        $ install.packages("http://cran.r-project.org/src/contrib/Archive/RSentiment/RSentiment_1.0.4.tar.gz",repos=NULL, type="source")
3) Download Sentiment_Analysis_V1.zip from OracleBI Public Store and unzip it.
4) Copy R.Sentiment.xml to <DV_INSTALL_DIRECTORY>\OracleBI1\bifoundation\advanced_analytics\script_repository
5) Import the .dva project to Oracle DV. Password for the .dva is Admin123

Here is a snapshot:







6 comments:

Recycle/Reuse/Re-sew said...

Unfortunately, the sample demo did not work properly for me. Here is the error message:
Odbc driver returned an error (SQLExecDirectW).
State: HY000. Code: 10058. [NQODBC] [SQL_STATE: HY000] [nQSError: 10058] A general error has occurred.
(HY000)
State: HY000. Code: 43113. [nQSError: 43113] Message returned from OBIS.
(HY000)
State: HY000. Code: 43119. [nQSError: 43119] Query Failed:
Error(s): Error in data.frame(r1, words): arguments imply differing number of rows: 9, 10 (HY000)
SQL Issued: SET VARIABLE ENABLE_DIMENSIONALITY = 1; SELECT
0 s_0,
EVALUATE_SCRIPT('filerepo://R.Sentiment.xml','sentiment','recid=%1;txt=%2',XSA('weblogic'.'Customer_Reviews')."Columns"."Review ID",XSA('weblogic'.'Customer_Reviews')."Columns"."Feedback") s_1,
XSA('weblogic'.'Customer_Reviews')."Columns"."Product Category" s_2,
CAST(NULL AS INTEGER) s_3,
CAST(NULL AS INTEGER) s_4,
REPORT_AGGREGATE(XSA('weblogic'.'Customer_Reviews')."Columns"."# of Reviews" BY ) s_5,
REPORT_SUM(CASE WHEN XSA('weblogic'.'Customer_Reviews')."Columns"."# of Reviews" <0 THEN XSA('weblogic'.'Customer_Reviews')."Columns"."# of Reviews" ELSE 0 END BY XSA('weblogic'.'Customer_Reviews')."Columns"."Product Category") s_6,
REPORT_SUM(CASE WHEN XSA('weblogic'.'Customer_Reviews')."Columns"."# of Reviews" >0 THEN XSA('weblogic'.'Customer_Reviews')."Columns"."# of Reviews" ELSE 0 END BY XSA('weblogic'.'Customer_Reviews')."Columns"."Product Category") s_7,
XSA('weblogic'.'Customer_Reviews')."Columns"."# of Reviews" s_8
FROM XSA('weblogic'.'Customer_Reviews')
WHERE
(XSA('weblogic'.'Customer_Reviews')."Columns"."Product Category" IN ('Cameras', 'Computers', 'Music Players'))
FETCH FIRST 5000001 ROWS ONLY

ORACLE BI TECHDEMO said...

Hi, Can you try installing RSentiment package version 1.0.4 using this command:

install.packages("http://cran.r-project.org/src/contrib/Archive/RSentiment/RSentiment_1.0.4.tar.gz",repos=NULL, type="source")

Also can you please try it with the latest script

can you please confirm if you encountered that problem even with

Unknown said...

I installed the package from the command you pasted, however, I am still getting error:
Odbc driver returned an error (SQLExecDirectW).
State: HY000. Code: 10058. [NQODBC] [SQL_STATE: HY000] [nQSError: 10058] A general error has occurred.
(HY000)
State: HY000. Code: 43113. [nQSError: 43113] Message returned from OBIS.
(HY000)
State: HY000. Code: 43119. [nQSError: 43119] Query Failed:
Error(s): Error in loadNamespace(i, c(lib.loc, .libPaths()), versionCheck = vI[[i]]): there is no package called 'stringi' (HY000)
SQL Issued: SET VARIABLE ENABLE_DIMENSIONALITY = 1; SELECT
0 s_0,
EVALUATE_SCRIPT('filerepo://R.Sentiment.xml','sentiment','recid=%1;txt=%2',XSA('weblogic'.'Customer_Reviews')."Columns"."Review ID",XSA('weblogic'.'Customer_Reviews')."Columns"."Feedback") s_1,
XSA('weblogic'.'Customer_Reviews')."Columns"."Feedback" s_2,
XSA('weblogic'.'Customer_Reviews')."Columns"."Product Category" s_3,
XSA('weblogic'.'Customer_Reviews')."Columns"."Review ID" s_4
FROM XSA('weblogic'.'Customer_Reviews')
WHERE
(XSA('weblogic'.'Customer_Reviews')."Columns"."Product Category" IN ('Cameras', 'Computers', 'Music Players'))
FETCH FIRST 5000001 ROWS ONLY
SQL Issued: SET VARIABLE ENABLE_DIMENSIONALITY = 1; SELECT
0 s_0,
EVALUATE_SCRIPT('filerepo://R.Sentiment.xml','sentiment','recid=%1;txt=%2',XSA('weblogic'.'Customer_Reviews')."Columns"."Review ID",XSA('weblogic'.'Customer_Reviews')."Columns"."Feedback") s_1,
XSA('weblogic'.'Customer_Reviews')."Columns"."Feedback" s_2,
XSA('weblogic'.'Customer_Reviews')."Columns"."Product Category" s_3,
XSA('weblogic'.'Customer_Reviews')."Columns"."Review ID" s_4
FROM XSA('weblogic'.'Customer_Reviews')
WHERE
(XSA('weblogic'.'Customer_Reviews')."Columns"."Product Category" IN ('Cameras', 'Computers', 'Music Players'))
FETCH FIRST 5000001 ROWS ONLY

Amit Sehrawat said...

installed the stringi package and it's working now. Thanks.

Amit Sehrawat said...

I have few questions over packages effectiveness. If you look at your own screenshot in the post above, Feedback: "Highly recommended" is categorized as Very Negative, which should not be case. Is there a way to tune the package?

ORACLE BI TECHDEMO said...

Hi Amit, We agree about the packages being not so effective. We are looking at other alternatives, Syuzhet is one. Please let us know if you know any packages that can perform better.

Post a Comment