How to read a one-line CSV in R?
I have been working on a dummy dataset recently and found that the data provided to me was all on a single line. A similar example is shown below:
Name,Age,Gender,Occupation A,10,M,Student B,11,M,Student C,11,F,Student
I want to import the data and obtain the following output:
Name Age Gender Occupation
A 10 M Student
B 11 M Student
C 11 F Student
A value might sometimes be missing, so the import logic needs to handle that case. Can anyone help me work out the logic for importing such data sets?
I tried a normal import with the read.csv() function, but it did not give me the expected result.
EDIT: What if the data looks like this:
Name,Age,Gender,Occupation ABC XYZ,10,M,Student B,11,M,Student C,11,F,Student
and I want output like this:
Name Age Gender Occupation
ABC XYZ 10 M Student
B 11 M Student
C 11 F Student
You could read your file in with readLines, turn the spaces into line breaks, and then parse the result with read.csv (this assumes spaces occur only between records, not inside a field):
# txt <- readLines("my_data.txt") # with a real data file
txt <- readLines(textConnection("Name,Age,Gender,Occupation A,10,M,Student B,11,M,Student C,11,F,Student"))
read.csv(text=gsub(" ","\n",txt))
Output:
Name Age Gender Occupation
1 A 10 M Student
2 B 11 M Student
3 C 11 F Student
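For the case in the EDIT, where a name such as ABC XYZ contains a space, replacing every space would split that name across two lines. One possible workaround is sketched below. It assumes that the number of columns is known (4 here) and that the last column never contains a space, so the first space after the last field of a record marks the record boundary. The names n_cols and pattern are illustrative and not part of the original answer.

```r
# One-line input from the EDIT: the first Name value contains a space.
txt <- "Name,Age,Gender,Occupation ABC XYZ,10,M,Student B,11,M,Student C,11,F,Student"

# Assumption: the column count is known in advance.
n_cols <- 4

# Match (n_cols - 1) comma-terminated fields, then a last field made of
# characters that are neither commas nor spaces, then the separating space.
pattern <- sprintf("((?:[^,]*,){%d}[^, ]*) ", n_cols - 1)

# Keep the matched record and turn only the separating space into a newline.
fixed <- gsub(pattern, "\\1\n", txt, perl = TRUE)

df <- read.csv(text = fixed)
print(df)
```

Because each field is matched with [^,]*, an empty field between two commas still counts as a field, so an empty value in a numeric column such as Age should come through as NA rather than shifting the columns. If the last column can contain spaces, the record boundary is ambiguous and this sketch does not apply.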
If you have millions of records, you will probably want to speed this up, so I suggest using data.table's fread instead of read.csv. fread can also take a shell command that pre-processes the file before it is read into R, and sed will be a lot faster than doing the string manipulation in R.
E.g., if you have this CSV stored at /tmp/x.csv, you can try something like:
> data.table::fread("sed 's/ /\\n/g' /tmp/x.csv")
Name Age Gender Occupation
1: A 10 M Student
2: B 11 M Student
3: C 11 F Student
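The same idea can be written with fread's cmd argument, which marks the string explicitly as a shell command. The sketch below is a variation rather than the original answer's method: it swaps sed for tr, since the "\n" replacement syntax is not handled the same way by every sed implementation, and it writes a temporary demo file (the path variable is hypothetical) so the example is self-contained. It assumes a Unix-like shell with tr available and the data.table package installed.

```r
library(data.table)

# Hypothetical demo file; in practice this would be the real one-line CSV.
path <- tempfile(fileext = ".csv")
writeLines(
  "Name,Age,Gender,Occupation A,10,M,Student B,11,M,Student C,11,F,Student",
  path
)

# tr replaces every space with a newline before fread parses the stream.
# shQuote() protects the path in case it contains special characters.
dt <- fread(cmd = paste("tr ' ' '\\n' <", shQuote(path)))
print(dt)
```

Like the sed version, this treats every space as a record separator, so it is only suitable when no field contains a space.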