Cannot read file with “#” and space using read.table or read.csv in R
I have a file whose first row is a header. The header can contain spaces and the # symbol (there may be other special characters as well). I am trying to read this file using read.csv or read.table, but it keeps throwing errors:
undefined columns selected
more columns than column names
My tab-delimited file (chromFile) looks like:
Chromosome# Chr chr Size UCSC NCBI36/hg18 NCBIBuild36 NCBIBuild37
1 Chr1 chr1 247199719 247249719 247249719 249250621
2 Chr2 chr2 242751149 242951149 242951149 243199373
Command:
chromosomes <- read.csv(chromFile, sep="\t",skip =0, header = TRUE, )
I would first like to find a way to read the file as is, without replacing the space or # with some other readable symbol.
From the documentation (?read.csv):
comment.char: character: a character vector of length one containing a single character or an empty string. Use "" to turn off the interpretation of comments altogether.
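In read.table the default is comment.char = "#", which is what is causing you trouble: everything from the # in the header onward is treated as a comment, so only one column name is read. (read.csv and read.delim already default to comment.char = "".) Following the documentation, you should set comment.char = "" explicitly.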
Spaces in the header are another issue which, as mrdwab kindly pointed out, can be addressed by setting check.names = FALSE. This keeps the column names exactly as they appear in the file instead of converting them to syntactically valid R names.
chromosomes <- read.csv(chromFile, sep = "\t", skip = 0, header = TRUE,
comment.char = "", check.names = FALSE)
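To try this without the original file, here is a minimal, self-contained sketch. It writes a temporary file shaped like the sample in the question and reads it back with both options set. The exact tab positions in the header are an assumption (the sample is taken to have seven fields, with "Size UCSC" as a single name), and the temporary file is purely illustrative.

```r
# Build a small tab-delimited file resembling the one in the question.
# ASSUMPTION: the header has 7 tab-separated fields and "Size UCSC" is one name.
chromFile <- tempfile(fileext = ".txt")
writeLines(c(
  "Chromosome#\tChr\tchr\tSize UCSC\tNCBI36/hg18\tNCBIBuild36\tNCBIBuild37",
  "1\tChr1\tchr1\t247199719\t247249719\t247249719\t249250621",
  "2\tChr2\tchr2\t242751149\t242951149\t242951149\t243199373"
), chromFile)

# comment.char = "" stops "#" from starting a comment;
# check.names = FALSE keeps the header names unchanged.
chromosomes <- read.table(chromFile, sep = "\t", header = TRUE,
                          comment.char = "", check.names = FALSE)

# Names that are not syntactically valid must be quoted when used.
names(chromosomes)
chromosomes[["Chromosome#"]]   # access by string
chromosomes$`Size UCSC`        # access with backticks

# For comparison, this is what check.names = TRUE would turn the names into.
make.names(c("Chromosome#", "Size UCSC", "NCBI36/hg18"))
```

Because the names keep their spaces and special characters, they have to be referenced with [[ ]] or backticks, as shown above, rather than as bare names after $.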