Skipping last column in R with read.csv

I was on the post "read.csv and skip last column in R" but did not find my answer there, and I tried to ask directly in an answer ... but that's not the right way (thanks mjuarez for taking the time to get me back on track).

The original question was:

I have read several other posts about how to import csv files with read.csv while skipping specific columns. However, all the examples I have found had very few columns, so it was easy to do something like:

columnHeaders <- c("column1", "column2", "column_to_skip")
columnClasses <- c("numeric", "numeric", "NULL")
data <- read.csv(fileCSV, header = FALSE, sep = ",", col.names = columnHeaders, colClasses = columnClasses)

All the answers were good, but they do not work for what I intended to do. So I asked myself and others:

And in one function call, could data <- read_csv(fileCSV)[,(ncol(data)-1)] work?

I've tried, in one line of R, to load into data the first 5 of the 6 columns, so everything except the last one. To do so, I would like to use "-" in the column index. Do you think it's possible? How can I do that?

Thanks!

In base R it has to be a two-step operation. Example:

> data <- read.csv("test12.csv")
> data
# 3 columns are returned
          a b c
1 1/02/2015 1 3
2 2/03/2015 2 4

# last column is excluded 
> data[,-ncol(data)]
          a b
1 1/02/2015 1
2 2/03/2015 2

One cannot write data <- read.csv("test12.csv")[,-ncol(data)] in base R.

But if you know the number of columns in your csv (say 3 in my case) then you can write:

df <- read.csv("test12.csv")[,-3]
df
          a b
1 1/02/2015 1
2 2/03/2015 2

It's not possible in one line as the data variable is not yet initialized when you call it. So the command ncol(data) will trigger an error.

You would need to use two lines of code: first load your data into the data variable, then remove the last column using either data[,-ncol(data)] or data[,1:(ncol(data)-1)].
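A minimal, self-contained sketch of the two-step approach. The CSV contents and the names `tmp`, `dropped_neg` and `dropped_seq` are made up for illustration and are not from the original post. It checks that the two indexing forms give the same result:

```r
# Write a small illustrative CSV to a temporary file (made-up data).
tmp <- tempfile(fileext = ".csv")
writeLines(c("a,b,c", "1/02/2015,1,3", "2/03/2015,2,4"), tmp)

# Step 1: read the whole file.
data <- read.csv(tmp)

# Step 2: drop the last column, in either of two equivalent ways.
dropped_neg <- data[, -ncol(data)]          # negative index removes a column
dropped_seq <- data[, 1:(ncol(data) - 1)]   # positive range keeps the rest

identical(dropped_neg, dropped_seq)
names(dropped_neg)
```

If the file might have only two columns, adding drop = FALSE (or indexing as data[-ncol(data)]) keeps the result a data frame instead of collapsing it to a vector.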

The right-hand side of an assignment is evaluated first, so this line from the question:

data <- read.csv(fileCSV)[,(ncol(data)-1)]

is trying to use data before it is defined. Also note that what the above asks for is only the second-to-last field. To get all but the last field:

data <- read.csv(fileCSV)
data <- data[-ncol(data)]
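To make the difference between the two index expressions concrete, here is a small sketch on a made-up six-column data frame (the column names c1 to c6 and the variable names are hypothetical):

```r
# Made-up data frame with six columns, as in the question.
data <- data.frame(c1 = 1:2, c2 = 3:4, c3 = 5:6,
                   c4 = 7:8, c5 = 9:10, c6 = 11:12)

# Positive index: selects ONLY column number ncol(data) - 1, i.e. c5.
second_last <- data[ncol(data) - 1]
names(second_last)

# Negative index: drops column number ncol(data), keeping c1 to c5.
all_but_last <- data[-ncol(data)]
names(all_but_last)
```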

If you know the name of the last field, say it is lastField, then the following works. Unlike the code above, it does not read the whole file and then remove the last field; it only reads in the fields other than the last. It is also only one line of code.

read.csv(fileCSV, colClasses = c(lastField = "NULL"))
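A runnable sketch of this, assuming a made-up temporary CSV whose last column happens to be called lastField:

```r
# Write a small illustrative CSV whose last column is named lastField.
tmp <- tempfile(fileext = ".csv")
writeLines(c("a,b,lastField", "1/02/2015,1,3", "2/03/2015,2,4"), tmp)

# A named colClasses entry of "NULL" tells read.csv to skip that column;
# columns that are not named keep the default type detection.
kept <- read.csv(tmp, colClasses = c(lastField = "NULL"))
names(kept)
```

Matching colClasses by name relies on the file having a header row, since that is where the column names come from.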

If you don't know the name of the last field but you do know how many fields there are, say n, then either of these would work:

read.csv(fileCSV)[-n]

read.csv(fileCSV, colClasses = replace(rep(NA, n), n, "NULL"))

Another way to do it without first reading in the last field is to read only the header and first line to calculate the number of fields (assuming that all records have the same number) and then re-read the file using that:

n <- ncol(read.csv(fileCSV, nrows = 1))

and then make use of one of the prior two statements involving n.
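Put together, the two passes might look like the sketch below. The temporary file and the name `col_classes` are illustrative assumptions:

```r
# Write a small illustrative CSV to a temporary file (made-up data).
tmp <- tempfile(fileext = ".csv")
writeLines(c("a,b,c", "1/02/2015,1,3", "2/03/2015,2,4"), tmp)

# Pass 1: read only the header and the first data row to count the fields.
n <- ncol(read.csv(tmp, nrows = 1))

# Pass 2: re-read the file, marking the n-th column as "NULL" so it is skipped.
# NA entries mean "use the default type detection for this column".
col_classes <- replace(rep(NA, n), n, "NULL")
data <- read.csv(tmp, colClasses = col_classes)
names(data)
```

The first pass is cheap because nrows = 1 stops after a single record, so the last column of the full file is never loaded.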

Not a single function, but at least a single line, using dplyr (disclaimer: I never use dplyr or magrittr, so a more optimized solution must exist using these libraries):

library(dplyr)
# read.csv (rather than read.table) so that the commas and the header row are handled
dat <- read.csv(fileCSV) %>% select(., which(names(.) != names(.)[ncol(.)]))
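For comparison, a base R sketch of the same single-line idea without extra packages. Wrapping the subsetting in an anonymous function gives the freshly read data frame a name (`d`, a hypothetical name) that ncol() can be called on, so nothing refers to a variable that does not exist yet. The temporary file is made up for illustration:

```r
# Write a small illustrative CSV to a temporary file (made-up data).
tmp <- tempfile(fileext = ".csv")
writeLines(c("a,b,c", "1/02/2015,1,3", "2/03/2015,2,4"), tmp)

# One expression: read the file, then hand the result to an anonymous
# function that can safely call ncol() on its own argument d.
data <- (function(d) d[-ncol(d)])(read.csv(tmp))
names(data)
```

This still reads the last column before discarding it, so it saves a line of code but not memory or time.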
