R scripts in Microsoft SQL Server Management Studio

Question

My problem is that I cannot make sense of the error message this environment gives me. It is very vague, and I do not understand where the problem is.

EXEC sp_execute_external_script
  @language = N'R',
  @script = N'
    count = 0; x=1; y=2; m="that is good until here"
    data = as.vector(data);
    for(i in data){
        if(data[y]>data[x]){count=count+1; x=x+1; y=y+1}
        else{x=x+1; y=y+1}};
    count <- data.frame(count)',
    @output_data_1_name = N'count',
    @input_data_1_name = N'data',
    @input_data_1 = N'SELECT alcohol FROM [wine].[dbo].[wineT]'

Answer 1

Untested, try this:

EXEC sp_execute_external_script
  @language = N'R',
  @script = N'
    data = unlist(data);
    count = data.frame(count = sum(data[-length(data)] > data[-1]));',
  @output_data_1_name = N'count',
  @input_data_1_name = N'data',
  @input_data_1 = N'SELECT alcohol FROM [wine].[dbo].[wineT]'

Issues:

as.vector does not do much to a data.frame, hence the switch to unlist(data).
Your missing-value error occurs because y runs past the end of data. For instance, I can reproduce the error on the R console with this:
```
for (i in data) {
  if (data[y] > data[x]) { count=count+1; x=x+1; y=y+1 }
  else { x=x+1; y=y+1 }
}
# Error in if (data[y] > data[x]) { (from #1): missing value where TRUE/FALSE needed
count
# [1] 4
x
# [1] 10
y
# [1] 11
```
Since length(data) is 10, data[y] is data[11], which is NA. The condition becomes NA > 3, which evaluates to NA, and NA does not work in an if conditional. (FYI, an if condition must always be length-1, and it must be clearly "truthy" or not, meaning TRUE or FALSE, or a number where 0 is false and anything else is true.)
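To see that behaviour in isolation, here is a minimal sketch in plain R. The vector values and the names beyond, cmp and result are made up for illustration; they do not come from the question's table.

```r
# Illustrative values only; not the asker's wine data.
data <- c(1, 5, 10)

# Indexing one past the end does not raise an error; it silently yields NA.
beyond <- data[length(data) + 1]

# Any comparison involving NA is itself NA.
cmp <- beyond > data[3]

# if() refuses an NA condition; capture the error message instead of stopping.
result <- tryCatch(
  if (cmp) "greater" else "not greater",
  error = function(e) conditionMessage(e)
)

print(beyond)       # NA
print(cmp)          # NA
print(result)       # the "missing value where TRUE/FALSE needed" message
print(isTRUE(cmp))  # FALSE, because isTRUE() treats NA as not true
```

Wrapping a condition in isTRUE() is one way to make a loop tolerate NA, though in this case the cleaner fix is simply not to index past the end.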
An alternative is to loop with i as an index into data, starting at 2.
```
count <- 0
for (i in seq_along(data)[-1]) {
  # compare each element with the one before it
  if (data[i-1] > data[i]) { count=count+1 }
}
count
# [1] 4
```
where seq_along(data) produces (in this example) 1:10, but [-1] removes the first element, 1, so we can index safely from 2 up to the length of data.
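One reason to prefer seq_along(data)[-1] over the more obvious 2:length(data) is the edge case of a one-element input. A small sketch, using a made-up vector named single:

```r
single <- c(42)                 # hypothetical one-element input

naive <- 2:length(single)       # 2:1 counts DOWN, giving the indices 2 and 1
safe  <- seq_along(single)[-1]  # empty index vector, so a loop body never runs

print(naive)         # 2 1
print(length(safe))  # 0
```

With the naive range, the loop would still run and would index data[2], which brings back the same NA problem.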
Better yet, though, is that we don't need to loop at all: all you want to do is compare each value (except the first) with the preceding value and count how many times the previous number is greater. R vectorizes very well, so we can determine in one expression which meet that condition, and sum them up just as quickly.
```
data
#  a1  a2  a3  a4  a5  a6  a7  a8  a9 a10
#   1   5  10   8   2   4   6   9   7   3
data[-length(data)] > data[-1]
#    a1    a2    a3    a4    a5    a6    a7    a8    a9
# FALSE FALSE  TRUE  TRUE FALSE FALSE FALSE  TRUE  TRUE
```
Then wrap that in sum(...) to get the needed result.
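Putting this together on the R console, as a sketch using the ten example values shown above (the element names are dropped, and decreases and increases are labels of my own). Note that data[-length(data)] > data[-1] counts the steps where the previous value is greater, whereas the question's loop tests data[y] > data[x], i.e. whether the next value is greater. Flipping the comparison gives that count instead.

```r
data <- c(1, 5, 10, 8, 2, 4, 6, 9, 7, 3)

# Steps where the previous value is greater than the next one.
decreases <- sum(data[-length(data)] > data[-1])

# Steps where the next value is greater, matching the question's data[y] > data[x].
increases <- sum(data[-1] > data[-length(data)])

# diff() gives the same counts via the signs of consecutive differences.
stopifnot(decreases == sum(diff(data) < 0),
          increases == sum(diff(data) > 0))

print(decreases)  # 4
print(increases)  # 5
```

Which of the two you want depends on whether you are counting rises or falls in the column, so it is worth checking against a small hand-counted sample before running it on the full table.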

Answer 2

I know this is not a tidy or efficient answer, but I get the right result with this code.

  EXEC sp_execute_external_script
      @language = N'R',
      @script = N'
        count=0; x=1; y=2; z=NA;
        data = unlist(data);
        for(i in data){
            if(is.na(z)){z=FALSE}else{
            if(data[y]>data[x]){count=count+1; x=x+1; y=y+1}
            else{x=x+1; y=y+1}}};
        count <- data.frame(count)',
        @output_data_1_name = N'count',
        @input_data_1_name = N'data',
        @input_data_1 = N'SELECT column1 FROM [wine].[dbo].[data]'
