Again I was at the point, where I needed to store and save to external file all the R code that was executed through sp_execute_external_script.
Soon, you will find out several interesting things. To show the example, I will start with following example:
USE [WideWorldImporters]; GO EXEC sys.sp_execute_external_script @language = N'R' ,@script = N' d <- InputDataSet c <- data.frame(Num_V1 = c(1,2,3)) c OutputDataSet <- c' ,@input_data_1 = N'SELECT 1 AS Nmbrs_From_R' WITH RESULT SETS ((Numbers_From_R INT));
The result is a column called “Numbers” with three rows, represented from the data frame. This is very easy and straight-forward.
By using dynamic management view sys.dm_exec_query_stats as following:
SELECT QM_ST.[TEXT] AS [Query] ,DM_QS.last_execution_time ,DM_QS.query_hash ,DM_QS.query_plan_hash FROM sys.dm_exec_query_stats AS DM_QS CROSS APPLY sys.dm_exec_sql_text(DM_QS.sql_handle) AS QM_ST ORDER BY DM_QS.last_execution_time DESC
Surprisingly I get only the following query returned:
sp_execute_external_script: SELECT 1 AS Nmbrs_From_R
which is far what was executed in the first place!
When using sys.dm_exec_query_plan dynamic management view to generate executed query plan, I get similar result with no R code and little sign of SQL query that was introduced to sp_execute_external_query procedure.
Relative the same results emerges when showing actual execution plan in SSMS. Only XML-UDX is showed.
So far, very slim possibility to get some extra and additional information from query statistics DMV or execution plan.
SQL SERVER PROFILER
So opening SQL Profiler and running the example sp_execute_external_script code, I was finally able to see the actual R code within profiler:
Upon closer look, we can see that profiler wraps execution of external procedure with following command SET STATISTICS XML ON/OFF. So we can store the results from profiler into a table or trace file and later filter out the R-code!
Query store is very very useful and new feature with flagship MSSQL2016. Storing the queries and execution times is therefore needed in order to do later performance analysis. So in this phase, let’s just see, if we can store external procedure code in query store.
With execution of R external procedure, I execute following query to check the Query Store (QS):
SELECT QSQT.query_text_id ,QSQT.query_sql_text ,QSP.plan_id FROM sys.query_store_plan AS QSP JOIN sys.query_store_query AS QSQ ON QSP.query_id = QSQ.query_id JOIN sys.query_store_query_text AS QSQT ON QSQ.query_text_id = QSQT.query_text_id
And the results are – in a way – not surprising at all, since many of query store statistics base on DMV. So result for my external procedure is again, very little informative in order to extract R code:
Something, we have seen already couple of times. And no sign of execution of R Script. In fact, looking from this, it is hard even to tell, this was passed to RLaunchpad.exe external program.
Sink is a R function to store the output of the executed R code into external file. With execution of any of the two T-SQL code, I will never be able to either get the results nor the R code itself.
In case of results:
EXEC sys.sp_execute_external_script @language = N'R' ,@script = N' sink("C:DataTKlogRSQLsession3.txt") d <- InputDataSet c <- data.frame(Num_V1 = c(1,2,3)) c sink() OutputDataSet <- c' ,@input_data_1 = N'SELECT 1 AS Nmbrs_From_R' WITH RESULT SETS ((Numbers_From_R INT)); EXEC sys.sp_execute_external_script @language = N'R' ,@script = N' c <- data.frame(Num_V1 = c(1,2,3)) c sink("C:DataTKlogRSQLsession3.txt")' ,@input_data_1 = N'SELECT 1 AS Nmbrs_From_R' WITH RESULT SETS NONE;
In both cases the file is created, but it is just that. Empty file. No content whatsoever.
Load will store intermediate results into file for later analysis or for semi aggreagated data, used for further calculations. So, I have tested it as following:
EXEC sys.sp_execute_external_script @language = N'R' ,@script = N' c <- data.frame(Num_V1 = c(1,2,3)) c save(c, file="C:DataTKlogRSQLsession3.rda") #load(file="C:DataTKlogRSQLsession3.rda")' ,@input_data_1 = N'SELECT 1 AS Nmbrs_From_R' WITH RESULT SETS NONE; -- LOAD RESULTS EXEC sys.sp_execute_external_script @language = N'R' ,@script = N' load(file="C:DataTKlogRSQLsession3.rda") OutputDataSet <- c' ,@input_data_1 = N'SELECT 1 AS Nmbrs_From_R' WITH RESULT SETS ((Num_V1 INT));
Extensibility Log will store information about the session but it will not store the R or R environment information or data, just session information and data. Navigate to:
C:Program FilesMicrosoft SQL ServerMSSQL13.MSSQLSERVERMSSQLLOGExtensibilityLog
to check the content and to see, if there is anything useful for your needs.
We are very limited in terms of exporting executed R code, results or Logs. Same applies for importing any additional code. We have seen that import, source are not working, whereas Load for loading *.rda files is working. At least something There should be more ways to get into the, especially with Rterm or Vanilla R, but the idea was to have everything run comfortably from the SSMS environment.
As you can see, there is little possibilities to store R code separately or store execution R logs in external files. But I presume, I haven’t exhausted all the possibilities, so there should be still some ways to try and do this.
As always, the code is available at Github.
R-bloggers.com offers daily e-mail updates about R news and tutorials on topics such as: Data science, Big Data, R jobs, visualization (ggplot2, Boxplots, maps, animation), programming (RStudio, Sweave, LaTeX, SQL, Eclipse, git, hadoop, Web Scraping) statistics (regression, PCA, time series, trading) and more…
Source:: R News