I am testing the parallel processing provided by the RevoScaleR package with SQL Server 2016 R Services (In-Database), following the example provided by Microsoft here: https://docs.microsoft.com/en-us/sql/advanced-analytics/tutorials/r-tutorial-custom-r-functions?view=sql-server-2016 . However, contrary to what the documentation claims, I do not see any parallelism happen. Does anyone know why?
The SQL Server instance is installed on-premises on a machine with 8 cores. The only extra settings made on top of the example are:
set elemType = 'cores' for rxExec.
set consoleOutput = TRUE for RxInSqlServer.
My testing script in T-SQL is:
EXEC sp_execute_external_script @language = N'R',
@script = N'
# set up the connection string
sqlConnString <- "Driver=SQL Server;server=.;
database=master;
Trusted_Connection=True"
sqlCompute <- RxInSqlServer(connectionString = sqlConnString, consoleOutput = TRUE, numTasks= 4)
rxSetComputeContext(sqlCompute)
rollDice <- function()
{
    cat(paste0("R Process ID = ", Sys.getpid(), " started at ", Sys.time()))
    cat("\n")
    result <- NULL
    point <- NULL
    count <- 1
    while (is.null(result))
    {
        roll <- sum(sample(6, 2, replace=TRUE))
        if (is.null(point))
        { point <- roll }
        if (count == 1 && (roll == 7 || roll == 11))
        { result <- "Win" }
        else if (count == 1 && (roll == 2 || roll == 3 || roll == 12))
        { result <- "Loss" }
        else if (count > 1 && roll == 7 )
        { result <- "Loss" }
        else if (count > 1 && point == roll)
        { result <- "Win" }
        else { count <- count + 1 }
    }
    cat(paste0("R Process ID = ", Sys.getpid(), "completed at ", Sys.time()))
    cat("\n")
    result
}
sqlServerExec <- rxExec(rollDice, timesToRun=8, elemType = "cores", RNGseed="auto")
return(NULL)',
@parallel = 1
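Based on the console output, the 8 runs are clearly executed sequentially: each run starts roughly one second after the previous one has completed, and each run uses a different R process ID: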
STDOUT message(s) from external script:
====== WIN-6L7QANR32DF ( process 1 ) has started run at 2019-08-29 11:37:10.60 ======
R Process ID = 7620 started at 2019-08-29 11:37:10.97
R Process ID = 7620completed at 2019-08-29 11:37:11.03
====== WIN-6L7QANR32DF ( process 1 ) has completed run at 2019-08-29 11:37:11.08 ======
====== WIN-6L7QANR32DF ( process 1 ) has started run at 2019-08-29 11:37:12.27 ======
R Process ID = 9072 started at 2019-08-29 11:37:12.80
R Process ID = 9072completed at 2019-08-29 11:37:12.84
====== WIN-6L7QANR32DF ( process 1 ) has completed run at 2019-08-29 11:37:12.88 ======
====== WIN-6L7QANR32DF ( process 1 ) has started run at 2019-08-29 11:37:14.29 ======
R Process ID = 8728 started at 2019-08-29 11:37:15.07
R Process ID = 8728completed at 2019-08-29 11:37:15.10
====== WIN-6L7QANR32DF ( process 1 ) has completed run at 2019-08-29 11:37:15.15 ======
STDOUT message(s) from external script:
====== WIN-6L7QANR32DF ( process 1 ) has started run at 2019-08-29 11:37:16.31 ======
R Process ID = 8444 started at 2019-08-29 11:37:16.87
R Process ID = 8444completed at 2019-08-29 11:37:16.91
====== WIN-6L7QANR32DF ( process 1 ) has completed run at 2019-08-29 11:37:16.97 ======
====== WIN-6L7QANR32DF ( process 1 ) has started run at 2019-08-29 11:37:18.18 ======
R Process ID = 8244 started at 2019-08-29 11:37:18.72
R Process ID = 8244completed at 2019-08-29 11:37:18.85
====== WIN-6L7QANR32DF ( process 1 ) has completed run at 2019-08-29 11:37:18.93 ======
====== WIN-6L7QANR32DF ( process 1 ) has started run at 2019-08-29 11:37:20.00 ======
R Process ID = 2332 started at 2019-08-29 11:37:20.54
R Process ID = 2332completed at 2019-08-29 11:37:20.59
====== WIN-6L7QANR32DF ( process 1 ) has completed run at 2019-08-29 11:37:20.63 ======
STDOUT message(s) from external script:
====== WIN-6L7QANR32DF ( process 1 ) has started run at 2019-08-29 11:37:21.62 ======
R Process ID = 336 started at 2019-08-29 11:37:22.24
R Process ID = 336completed at 2019-08-29 11:37:22.27
====== WIN-6L7QANR32DF ( process 1 ) has completed run at 2019-08-29 11:37:22.32 ======
====== WIN-6L7QANR32DF ( process 1 ) has started run at 2019-08-29 11:37:23.38 ======
R Process ID = 8280 started at 2019-08-29 11:37:23.88
R Process ID = 8280completed at 2019-08-29 11:37:23.91
====== WIN-6L7QANR32DF ( process 1 ) has completed run at 2019-08-29 11:37:23.96 ======
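The sequential pattern in the log above can also be checked programmatically rather than by eye. The following is a minimal base R sketch, assuming the log format shown above; the helper name `stamp` and the variable names are illustrative only, and only the first three runs of the log are copied in to keep it short. It extracts the "started" and "completed" timestamps and tests whether any run begins before the previous run has finished.

```r
# First three runs, copied verbatim from the console output above.
log_lines <- c(
  "====== WIN-6L7QANR32DF ( process 1 ) has started run at 2019-08-29 11:37:10.60 ======",
  "====== WIN-6L7QANR32DF ( process 1 ) has completed run at 2019-08-29 11:37:11.08 ======",
  "====== WIN-6L7QANR32DF ( process 1 ) has started run at 2019-08-29 11:37:12.27 ======",
  "====== WIN-6L7QANR32DF ( process 1 ) has completed run at 2019-08-29 11:37:12.88 ======",
  "====== WIN-6L7QANR32DF ( process 1 ) has started run at 2019-08-29 11:37:14.29 ======",
  "====== WIN-6L7QANR32DF ( process 1 ) has completed run at 2019-08-29 11:37:15.15 ======"
)

# Hypothetical helper: pull the timestamp out of each line and parse it,
# keeping the fractional seconds (%OS).
stamp <- function(x) {
  ts <- regmatches(x, regexpr("[0-9]{4}-[0-9]{2}-[0-9]{2} [0-9]{2}:[0-9]{2}:[0-9.]+", x))
  as.POSIXct(ts, format = "%Y-%m-%d %H:%M:%OS", tz = "UTC")
}

started   <- stamp(grep("has started run", log_lines, value = TRUE))
completed <- stamp(grep("has completed run", log_lines, value = TRUE))

# A run overlaps its predecessor if it starts before the predecessor completes.
overlaps <- started[-1] < completed[-length(completed)]
print(any(overlaps))  # FALSE for these lines: no run starts before the previous one ends
```

If the runs were executed in parallel, at least one element of `overlaps` would be expected to be TRUE.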
Microsoft's documentation seems to be misleading. Setting the compute context to RxInSqlServer does not seem to run the tasks in parallel; using RxLocalParallel instead worked.
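A minimal sketch of the RxLocalParallel variant, under the assumption that it runs in an R session where RevoScaleR is installed (for example the R runtime used by R Services). The worker `whoAmI` is a hypothetical stand-in for `rollDice` that only returns its process ID, so the result itself shows how many R processes handled the 8 runs. The exact script used in the original test is not shown in this post, so treat this as an illustration rather than a reproduction.

```r
# Hypothetical worker: report which R process executed this run.
whoAmI <- function() Sys.getpid()

if (requireNamespace("RevoScaleR", quietly = TRUE)) {
  library(RevoScaleR)
  # Switch from RxInSqlServer to the local parallel compute context.
  rxSetComputeContext(RxLocalParallel())
  # rxExec returns a list with one element per run.
  pids <- unlist(rxExec(whoAmI, timesToRun = 8))
  # Several distinct PIDs would suggest the runs were spread over worker processes.
  print(table(pids))
} else {
  message("RevoScaleR is not installed; skipping the rxExec call.")
}
```

Distinct process IDs alone do not prove concurrency (the sequential log above also shows a new PID per run), so it is worth also printing start and completion times inside the worker, as `rollDice` does, and checking that they overlap.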