Cannot read a file with "#" and spaces using read.table or read.csv in R

Question

I have a file where the first row is a header. The header can contain spaces and the # symbol (and possibly other special characters). I am trying to read this file using read.csv or read.table, but it keeps throwing errors:

undefined columns selected 

more columns than column names

My tab-delimited file, chromFile, looks like this:

Chromosome# Chr chr Size    UCSC NCBI36/hg18    NCBIBuild36 NCBIBuild37
1   Chr1    chr1    247199719   247249719   247249719   249250621
2   Chr2    chr2    242751149   242951149   242951149   243199373

Command:

chromosomes <- read.csv(chromFile, sep = "\t", skip = 0, header = TRUE)

I would first like to find a way to read the file as it is, without replacing the spaces or the # with some other readable symbol.

Answer 1

From the documentation ( ?read.csv ):

comment.char: character: a character vector of length one containing a single character or an empty string. Use "" to turn off the interpretation of comments altogether.

For read.table the default is comment.char = "#", which is what causes the trouble: everything after the # in the header is treated as a comment and dropped. (read.csv already defaults to comment.char = "".) Following the documentation, set comment.char = "" explicitly.

Spaces in the header are a separate issue which, as mrdwab kindly pointed out, can be addressed by setting check.names = FALSE . Without it, R rewrites the header into syntactically valid names.

chromosomes <- read.csv(chromFile, sep = "\t", skip = 0, header = TRUE,
                        comment.char = "", check.names = FALSE)
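To check the behaviour end to end, here is a minimal sketch that writes the sample rows to a temporary file and reads them back. The temporary file stands in for chromFile, and the tab positions in the header are an assumption (the question's listing does not show exactly where the tabs fall). read.table is used here because it is the function whose default treats # as a comment.

```r
# Hypothetical stand-in for chromFile: a temporary tab-delimited file.
# Assumption: "UCSC NCBI36/hg18" is a single column name containing a space.
chromFile <- tempfile(fileext = ".txt")
writeLines(c(
  "Chromosome#\tChr\tchr\tSize\tUCSC NCBI36/hg18\tNCBIBuild36\tNCBIBuild37",
  "1\tChr1\tchr1\t247199719\t247249719\t247249719\t249250621",
  "2\tChr2\tchr2\t242751149\t242951149\t242951149\t243199373"
), chromFile)

# comment.char = "" stops "#" from starting a comment;
# check.names = FALSE keeps the header text exactly as written.
chromosomes <- read.table(chromFile, sep = "\t", header = TRUE,
                          comment.char = "", check.names = FALSE)

names(chromosomes)
# "Chromosome#" "Chr" "chr" "Size" "UCSC NCBI36/hg18" "NCBIBuild36" "NCBIBuild37"

# Non-syntactic names need [[ ]] or backticks to be accessed.
chromosomes[["Chromosome#"]]
chromosomes$`UCSC NCBI36/hg18`

# For comparison, this is the renaming that check.names = TRUE would apply.
make.names(c("Chromosome#", "UCSC NCBI36/hg18"))
# "Chromosome." "UCSC.NCBI36.hg18"
```

Keeping the original names means later code has to quote them with backticks or use [[ ]]; if that becomes awkward, the columns can be renamed after reading while the file itself stays untouched.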
