R: use a character string to retrieve a data column

I am struggling with something that looks simple, but I have been stuck on it for quite some time now.

I have a fairly long data.frame; here is a small sample that represents it.

my.dataframe<-data.frame(PointA.X=sample(100,4))
my.dataframe$PointA.Y<-sample(100,4)
my.dataframe$PointB.X<-sample(100,4)
my.dataframe$PointB.Y<-sample(100,4)

     PointA.X PointA.Y PointB.X PointB.Y
1       93       98       46       45
2       58        3       80       89
3       61       64       17       14
4       56       46       65       23

I want to write a function that takes two arguments, from which the names of the columns to use are derived.

MyFunction<-function(Start, End){
XStart <- get(as.character(paste0("Mydataframe$" , Start , ".X")))
XEnd   <- get(as.character(paste0("Mydataframe$" , End   , ".X")))
YStart <- get(as.character(paste0("Mydataframe$" , Start , ".Y")))
YEnd   <- get(as.character(paste0("Mydataframe$" , End   , ".Y" )))
sqrt(((XStart - XEnd) ^ 2 + (YStart - YEnd) ^ 2))
} # End of My Function

In this case I would define the start point and the end point to calculate the length of the segment between them, for example MyFunction("PointA", "PointB").

To my understanding, in

MyFunction("PointA", "PointB")

the following

as.character(paste0("Mydataframe$" , Start , ".X")) 

returns

"Mydataframe$PointA.X"

which is a valid column in my data frame. However, get() looks for an object with that exact name instead of retrieving the column's data.

That's where I am stuck. Is there a function that returns the value I am looking for?

Thank you all in advance.

Try this. It may help.


MyFunction<-function(Start, End){
XStart <- eval(parse(text=paste("my.dataframe$",Start,".X", sep = "")))
XEnd   <- eval(parse(text=paste("my.dataframe$",End,".X", sep = "")))
YStart <- eval(parse(text=paste("my.dataframe$",Start,".Y", sep = "")))
YEnd   <- eval(parse(text=paste("my.dataframe$",End,".Y", sep = "")))
sqrt(((XStart - XEnd) ^ 2 + (YStart - YEnd) ^ 2))
}
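To illustrate the difference, here is a minimal sketch with a small made-up data frame (the values are hypothetical and only for illustration). get() treats the whole string as one object name, whereas parse() turns the string into the expression my.dataframe$PointA.X, which eval() then evaluates. The last line shows that get() only needs the object name, and the column can then be extracted by string.

```r
# Small deterministic data frame (hypothetical values, for illustration only)
my.dataframe <- data.frame(PointA.X = c(1, 2), PointA.Y = c(3, 4))

col_string <- paste0("my.dataframe$", "PointA", ".X")

# get() searches for an object literally named "my.dataframe$PointA.X",
# which does not exist, so it raises an error that is caught here
via_get <- tryCatch(get(col_string), error = function(e) "object not found")

# parse() builds the expression my.dataframe$PointA.X and eval() evaluates it
via_eval <- eval(parse(text = col_string))

# get() on the object name alone works; the column is then extracted by string
via_brackets <- get("my.dataframe")[["PointA.X"]]
```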

As suggested by Richard, a data frame column can be extracted with a character string inside double brackets [[ ]], but not with the $ operator.

So advice for the future: use brackets...

MyFunction<-function(Start, End){
  XStart <- my.dataframe[[paste0(Start, ".X")]]
  YStart <- my.dataframe[[paste0(Start, ".Y")]]

  XEnd <- my.dataframe[[paste0(End, ".X")]]
  YEnd <- my.dataframe[[paste0(End, ".Y")]]

  sqrt(((XStart - XEnd) ^ 2 + (YStart - YEnd) ^ 2))
} # End of My Function

MyFunction("PointA", "PointB") # Note the arguments are provided as characters
> [1] 39.20459 80.52950 34.17601  6.00000
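Because the sample data above is random, the result is hard to check by eye. A possible variant, sketched here under my own assumptions, passes the data frame as an explicit argument instead of relying on a global object. The names segment_length and pts are hypothetical, and the coordinates are chosen so that each row forms a segment of length 5.

```r
# Hypothetical variant: the data frame is passed in explicitly
segment_length <- function(data, start, end) {
  cols <- c(paste0(start, c(".X", ".Y")), paste0(end, c(".X", ".Y")))
  # Fail early with a clear message if a column name is misspelled
  stopifnot(all(cols %in% names(data)))

  x_start <- data[[cols[1]]]
  y_start <- data[[cols[2]]]
  x_end   <- data[[cols[3]]]
  y_end   <- data[[cols[4]]]

  # Euclidean distance, computed row by row
  sqrt((x_start - x_end)^2 + (y_start - y_end)^2)
}

# Made-up coordinates: (0,0)-(3,4) and (1,1)-(1,6)
pts <- data.frame(PointA.X = c(0, 1), PointA.Y = c(0, 1),
                  PointB.X = c(3, 1), PointB.Y = c(4, 6))

result <- segment_length(pts, "PointA", "PointB")
# result is c(5, 5)
```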

More interestingly, I can also loop the function over the column names, for example when the data frame has more columns:

my.dataframe<-data.frame(PointA.X=sample(100,4))
my.dataframe$PointA.Y<-sample(100,4)
my.dataframe$PointB.X<-sample(100,4)
my.dataframe$PointB.Y<-sample(100,4)
my.dataframe$PointC.X<-sample(100,4)
my.dataframe$PointC.Y<-sample(100,4)

And the function remains the same:

MyFunction<-function(Start, End){
XStart <- my.dataframe[[paste0(Start, ".X")]]
YStart <- my.dataframe[[paste0(Start, ".Y")]]

XEnd <- my.dataframe[[paste0(End, ".X")]]
YEnd <- my.dataframe[[paste0(End, ".Y")]]

sqrt(((XStart - XEnd) ^ 2 + (YStart - YEnd) ^ 2))
} # End of My Function

I can build a for loop:

# Step through the .X columns two at a time. Stop two columns before the end
# so that VariableI + 2 always refers to an existing column.
for (VariableI in seq(from=1, to=ncol(my.dataframe)-2, by=2)){
Start<-unlist(strsplit(colnames(my.dataframe)[VariableI], "[.]"))[1]
End<-unlist(strsplit(colnames(my.dataframe)[VariableI+2], "[.]"))[1]
assign(paste0(Start,End), MyFunction(Start, End))
}

This creates the following objects:

PointAPointB
[1] 32.57299 74.30343 73.08215 83.25863
PointBPointC
[1]  5.385165 90.609050 68.883960 58.137767
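As an alternative to creating separate objects with assign(), the results could be collected in a named list. This is only a sketch under my own assumptions: segment_length, pts and lengths_list are hypothetical names, the coordinates are made up, and the point names are derived by stripping the .X / .Y suffix from the column names.

```r
# Hypothetical helper: distance between two named points, row by row
segment_length <- function(data, start, end) {
  sqrt((data[[paste0(start, ".X")]] - data[[paste0(end, ".X")]])^2 +
       (data[[paste0(start, ".Y")]] - data[[paste0(end, ".Y")]])^2)
}

# Made-up coordinates for three points
pts <- data.frame(PointA.X = c(0, 0), PointA.Y = c(0, 0),
                  PointB.X = c(3, 6), PointB.Y = c(4, 8),
                  PointC.X = c(3, 6), PointC.Y = c(0, 0))

# Strip the ".X" / ".Y" suffix to get the distinct point names
point_names <- unique(sub("[.][XY]$", "", colnames(pts)))

# Store each consecutive segment in a named list instead of using assign()
lengths_list <- list()
for (i in seq_len(length(point_names) - 1)) {
  start <- point_names[i]
  end   <- point_names[i + 1]
  lengths_list[[paste0(start, end)]] <- segment_length(pts, start, end)
}
```

The results can then be accessed by name, for example lengths_list[["PointAPointB"]], without filling the workspace with one object per segment.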

I guess I am just missing PointAPointC. I might use the combn function to work around this:

 combn(colnames(my.dataframe), 2)

    [,1]       [,2]       [,3]       [,4]       [,5]       [,6]       [,7]       [,8]       [,9]       [,10]     
[1,] "PointA.X" "PointA.X" "PointA.X" "PointA.X" "PointA.X" "PointA.Y" "PointA.Y" "PointA.Y" "PointA.Y" "PointB.X"
[2,] "PointA.Y" "PointB.X" "PointB.Y" "PointC.X" "PointC.Y" "PointB.X" "PointB.Y" "PointC.X" "PointC.Y" "PointB.Y"
     [,11]      [,12]      [,13]      [,14]      [,15]     
[1,] "PointB.X" "PointB.X" "PointB.Y" "PointB.Y" "PointC.X"
[2,] "PointC.X" "PointC.Y" "PointC.X" "PointC.Y" "PointC.Y"
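The output above pairs column names, including pairs such as PointA.X with PointA.Y that are not segments. One way this might be adapted, sketched here under my own assumptions, is to run combn() on the distinct point names instead, so that every pair of points (including PointA and PointC) is covered. The names segment_length, pts and all_lengths are hypothetical and the coordinates are made up.

```r
# Hypothetical helper: distance between two named points, row by row
segment_length <- function(data, start, end) {
  sqrt((data[[paste0(start, ".X")]] - data[[paste0(end, ".X")]])^2 +
       (data[[paste0(start, ".Y")]] - data[[paste0(end, ".Y")]])^2)
}

# Made-up coordinates for three points
pts <- data.frame(PointA.X = c(0, 0), PointA.Y = c(0, 0),
                  PointB.X = c(3, 6), PointB.Y = c(4, 8),
                  PointC.X = c(3, 6), PointC.Y = c(0, 0))

# Distinct point names, obtained by stripping the ".X" / ".Y" suffix
point_names <- unique(sub("[.][XY]$", "", colnames(pts)))

# All unordered pairs of points, as a list of length-2 character vectors
pairs <- combn(point_names, 2, simplify = FALSE)

# One vector of segment lengths per pair, named e.g. "PointAPointC"
all_lengths <- lapply(pairs, function(p) segment_length(pts, p[1], p[2]))
names(all_lengths) <- vapply(pairs, paste0, character(1), collapse = "")
```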
