
How do I subset columns from data?

rankhospital <- function(state = factor(), outcome = factor(), num = factor()) {
  #read data
  caremeasures <- read.csv("D:/data science specialization/course stuff/rprogw3/outcome-of-care-measures.csv", na.strings = "NA", stringsAsFactors = FALSE)
  #separate required columns
  requiredOutcomes <- caremeasures[11, 17, 23]
  #assign columns names
  names(requiredOutcomes[3]) <- "heart attack"
  names(requiredOutcomes[4]) <- "heart failure"
  names(requiredOutcomes[5]) <- "pneumonia"
  arrangedData <- order(requiredOutcomes[caremeasures$State == state, c(caremeasures$Hospital.Name, outcome)])
  if (num == "best"){
    result <- arrangedData[1, 1]
    return(result)
  } else if (num == "worst"){
    result <- arrangedData[nrow(arrangedData[,1]),1]
    return(result)
  } else result <- arrangedData[num, 1]
  return(result)
}

This code is supposed to return the name of a single hospital that corresponds to the inputs given to the function, yet I'm getting an error stating:

Error in requiredOutcomes[caremeasures$State == state, c(caremeasures$Hospital.Name,  : 
  incorrect number of dimensions

I don't have your data, so I'll guess at what I think is leading up to the problem.

The reference to caremeasures[11, 17, 23] is likely not doing what you need, so it returns something you aren't expecting. Try it with caremeasures[, c(11, 17, 23)], which leaves the row slot empty (all rows) and passes the three column numbers as a single vector.

I'll try to show what's going on using mtcars:

requiredOutcomes <- mtcars[1,2,3]
requiredOutcomes
# [1] 6
requiredOutcomes[1,2]
# Error in requiredOutcomes[1, 2] : incorrect number of dimensions
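
For contrast, here is a small sketch of the same mtcars example with the column numbers wrapped in c() and placed after the comma. The name threeCols is only illustrative and is not part of the original code.

```r
# Column selection: leave the row slot empty and pass the column
# numbers as a single vector in the column slot.
threeCols <- mtcars[, c(1, 2, 3)]

class(threeCols)   # "data.frame"
dim(threeCols)     # 32 rows, 3 columns
names(threeCols)   # "mpg" "cyl" "disp"

# Two-dimensional indexing works because threeCols is still a data.frame.
threeCols[1, 2]    # 6, the cyl value in the first row
```

The result keeps both dimensions, so later row-and-column indexing no longer fails.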

Because mtcars is a data.frame, your [ indexing uses [.data.frame under the hood. This translates to something like

# equivalent
mtcars[1, 2]
`[.data.frame`(mtcars, 1, 2)

The "arguments" (yes, it's just a regular function) are:

str(formals(`[.data.frame`))
# Dotted pair list of 4
#  $ x   : symbol 
#  $ i   : symbol 
#  $ j   : symbol 
#  $ drop: language if (missing(i)) TRUE else length(cols) == 1

which means that your 11, 17, 23 arguments are effectively

`[.data.frame`(caremeasures, 11, 17, 23)

which is matched to the formal arguments as

`[.data.frame`(x = caremeasures, i = 11, j = 17, drop = 23)

Okay, so x= makes sense (the data). i= gives your row selection (11), and j= gives the column selection (17). However, where R expects a logical, any nonzero number is treated as TRUE, so this is effectively

`[.data.frame`(x = caremeasures, i = 11, j = 17, drop = TRUE)

which completely loses your intent (I suspect) by returning a scalar (a single value, which in R is a vector of length 1). Side note: had you used 0 or FALSE as the third argument, you would have gotten back a data.frame with 1 row and 1 column.
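
The matching described above can be reproduced on a small scale. This is only a sketch: showArgs is a hypothetical helper that borrows the argument names of [.data.frame so the positional matching is visible. It is not the real method.

```r
# Hypothetical helper with the same argument names as `[.data.frame`.
# It only reports which value landed in which argument.
showArgs <- function(x, i, j, drop = TRUE) {
  list(i = i, j = j, drop = drop)
}
matched <- showArgs(mtcars, 11, 17, 23)
str(matched)             # i = 11, j = 17, drop = 23

# A nonzero number is treated as TRUE where a logical is needed.
as.logical(23)           # TRUE

# drop = TRUE collapses a 1x1 selection to a bare vector ...
dropped <- mtcars[1, 2, drop = TRUE]
is.data.frame(dropped)   # FALSE

# ... while drop = FALSE keeps the data.frame structure.
kept <- mtcars[1, 2, drop = FALSE]
is.data.frame(kept)      # TRUE
dim(kept)                # 1 1
```

Passing drop = FALSE explicitly is a common way to make sure a subset stays a data.frame even when it shrinks to a single cell.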

Here's a method for debugging what's going on so that you are able to find this for yourself next time.

myfunc <- function(x) {
  res <- x[1,2,3]
  return(res[1:3,])
}
myfunc(mtcars)
# Error in res[1:3, ] (from #3) : incorrect number of dimensions

Okay, we see the same error. We'll use debug(myfunc) (substitute whatever your function name is), though you can achieve similar results by placing browser() at a specific place within your function.

debug(myfunc)
myfunc(mtcars)
# debugging in: myfunc(mtcars)
# debug at #1: {
#     res <- x[1, 2, 3]
#     return(res[1:3, ])
# }
# Browse[2]> 

We're now in R's debugger, which lets us step through the function one line at a time. Typing n executes the next line; you can read about more of the commands with ?browser.

n
# debug at #2: res <- x[1, 2, 3]
# Browse[2]> 
n
# debug at #3: return(res[1:3, ])
# Browse[2]> 
res
# [1] 6

("debug at" shows the next line to be executed, so we have not yet run return(...).) With this, we can see that res, which we think should be a data.frame, is just a single number. Huh. Now look back at the code and figure out what happened. To me (and in this simple example), it's clearly x[1,2,3] that's the problem.
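
Putting the pieces together, a corrected ranking function might look roughly like the sketch below. Everything here is an assumption rather than the asker's actual solution: rankHospitalSketch, outcomeCols, toy and toyCols are made-up names, the toy data is invented so the example runs without the CSV, and lower rates are taken to be better because the original code sorts in ascending order and treats the first row as "best".

```r
# Sketch only. The data is passed in so the function can be tried without the CSV.
# outcomeCols maps outcome names to columns; the defaults 11, 17, 23 come from
# the question and are assumed to be the right positions in the real file.
rankHospitalSketch <- function(data, state, outcome, num = "best",
                               outcomeCols = c("heart attack" = 11,
                                               "heart failure" = 17,
                                               "pneumonia" = 23)) {
  stopifnot(is.data.frame(data), outcome %in% names(outcomeCols))

  # Row filter goes before the comma; drop = FALSE keeps a data.frame.
  inState <- data[data$State == state, , drop = FALSE]

  # Build a two-column frame: hospital name and numeric rate.
  # as.numeric() is used in case the rates were read in as text.
  ranked <- data.frame(
    hospital = inState$Hospital.Name,
    rate = suppressWarnings(as.numeric(inState[[outcomeCols[[outcome]]]])),
    stringsAsFactors = FALSE
  )
  ranked <- ranked[!is.na(ranked$rate), , drop = FALSE]

  # order() returns row positions, so use it to index the rows.
  ranked <- ranked[order(ranked$rate), , drop = FALSE]

  if (identical(num, "best")) num <- 1L
  if (identical(num, "worst")) num <- nrow(ranked)
  if (num < 1 || num > nrow(ranked)) return(NA_character_)
  ranked$hospital[num]
}

# Invented toy data, for illustration only.
toy <- data.frame(
  Hospital.Name = c("Hospital A", "Hospital B", "Hospital C", "Hospital D"),
  State = c("TX", "TX", "TX", "MD"),
  ha = c("14.2", "12.1", "Not Available", "10.0"),
  stringsAsFactors = FALSE
)
toyCols <- c("heart attack" = "ha")

bestTX  <- rankHospitalSketch(toy, "TX", "heart attack", "best", toyCols)
worstTX <- rankHospitalSketch(toy, "TX", "heart attack", "worst", toyCols)
fifthTX <- rankHospitalSketch(toy, "TX", "heart attack", 5, toyCols)
```

With the toy data, bestTX is "Hospital B", worstTX is "Hospital A", and fifthTX is NA because only two Texas rows have a usable rate.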

