
read.table in R is not reading all columns

I previously posted a related question, but the solutions did not completely solve my problem.

I have a table that contains the characters "'" and "#". When I read it with read.table(), it cannot skip the rows that contain those characters.

I am reading the file using the command:

table <- read.table("table.txt", header = TRUE, sep = "\t", quote = "'", skip = 8, fill = TRUE, comment.char = "#", check.names = FALSE)

This reads only the first column of the table instead of the entire table. Any suggestions on how to solve this?

An example line of the table containing # is:

Homo sapiens    Unigene Hs.549823   ILMN_110080 HS.549823   Hs.549823       Hs.549823       5053715 AI732602            ILMN_1846799    5910129 S   320 GCAGGTTGTTATTGTTGCTGAGCGGGGTGTGTGGGTGGCTAACGAGAGGG  11  +   61276241-61276290       zo26g12.x5 Stratagene colon (#937204) Homo sapiens cDNA clone IMAGE:588070 3, mRNA sequence
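
For illustration, here is a small sketch of how the quote and comment.char settings in the command above interact with a line like this one. The real table.txt is not available, so the two-column sample file below is invented, and the apostrophe in "3'" is an assumption about what the original description text looked like. This only shows the documented behavior of those two arguments; it may or may not be the whole cause of the problem in the actual file.

```r
# Hypothetical sample file; it stands in for table.txt, which is not available.
tmp <- tempfile(fileext = ".txt")
writeLines(c(
  "id\tdescription",
  "A1\tStratagene colon (#937204) cDNA clone 3', mRNA sequence",
  "A2\tplain description"
), tmp)

# Settings like those in the question: "#" starts a comment, so the text
# after it on a line is dropped, and "'" is treated as a quote character.
as_asked <- read.table(tmp, header = TRUE, sep = "\t", quote = "'",
                       fill = TRUE, comment.char = "#", check.names = FALSE,
                       stringsAsFactors = FALSE)

# Same call with quoting and comment handling switched off, so "#" and "'"
# are read as ordinary characters.
literal <- read.table(tmp, header = TRUE, sep = "\t", quote = "",
                      fill = TRUE, comment.char = "", check.names = FALSE,
                      stringsAsFactors = FALSE)

print(as_asked$description)
print(literal$description)
```

Comparing the two printed vectors shows whether the description field survives intact. If it does with quote = "" and comment.char = "", the same two arguments may be worth trying on the real file.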

Try using readLines() instead to get the raw lines, then split them according to your delimiter:

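library(stringr)

# Open a connection to the file
pathToFile <- path.expand("~/path/to/file/myfile.txt")
f <- file(pathToFile, "rb")

# Read in the raw lines, then close the connection
rawText <- readLines(f)
close(f)

# Split every line on the delimiter; the result is a list with one
# character vector of fields per line
problemFreeTable <-
  str_split(rawText, "\t")  # replace "\t" with "," or the appropriate delimiter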
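
The split step gives a list of field vectors rather than a table, so one more step is needed to get something like the output of read.table(). A minimal sketch in base R follows, assuming the first line is the header. The sample lines and the names fields, rows and tbl are invented for illustration, and strsplit() is used in place of str_split() so the example needs no packages.

```r
# Hypothetical lines standing in for the output of readLines().
rawText <- c("id\tsymbol\tdescription",
             "A1\tILMN_1\tcolon (#937204) clone 3', mRNA",
             "A2\tILMN_2\tplain")

# Split each line on literal tabs; "#" and "'" get no special treatment.
fields <- strsplit(rawText, "\t", fixed = TRUE)

# Pad short rows with NA so every row has as many fields as the header.
# (strsplit() drops a trailing empty field, so rows can come up short.)
n <- length(fields[[1]])
rows <- lapply(fields[-1], function(x) { length(x) <- n; x })

# Bind the rows into a character matrix, then convert to a data frame.
tbl <- as.data.frame(do.call(rbind, rows), stringsAsFactors = FALSE)
names(tbl) <- fields[[1]]

str(tbl)
```

All columns come out as character here; utils::type.convert() can be applied afterwards if numeric columns are wanted.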
