
Importing long character field with numbers with read.table()

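I am trying to import a large dataset that has a column of document numbers. This field contains a 25-digit number with leading zeros. I tried importing the data with read.table(), but even when I assign "character" as the class during import, this particular field always comes out as "1e+19".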
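As background on why "1e+19" is a sign that the field was treated as a number somewhere, here is a minimal sketch using two document numbers from the sample data. The variable names (`doc_a`, `doc_b`, `num_a`, `num_b`) are made up for illustration. A double holds only about 15 to 17 significant digits, so a 25-digit identifier cannot survive a numeric conversion, and by default such a value would be expected to print in scientific notation.

```r
# Two distinct 25-digit document numbers taken from the sample data.
doc_a <- "0000010000000000000011524"
doc_b <- "0000010000000000000011541"

# As character strings, all 25 digits and the leading zeros are kept.
nchar(doc_a)

# Converting to numeric stores the value as a double, which drops the
# leading zeros and keeps only about 15 to 17 significant digits.
num_a <- as.numeric(doc_a)
num_b <- as.numeric(doc_b)

# Default formatting of such a large double uses scientific notation.
format(num_a)

# After conversion the two different document numbers compare as equal.
num_a == num_b
```

If a column shows this behaviour, it is worth checking `class()` of the column right after the import and again after every later processing step, since any step that coerces the column to numeric would lose the digits.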

# import elyte
colnames<-c("patnr","name","birthday","sex","casenr","Bew","Art","docnr","date","time","none","Na","K","Cl","Ca","corCa")
classes <- rep("character",length(colnames))
ELYTE <- read.table(file="ELYTE.TXT",skip=3,comment.char="",sep="|",col.names=colnames, header=FALSE, colClasses=classes)

The raw data looks like this:

0010000005|Weber|19091220|1|0000337340|00000|LAB|0000010000000000000011524|20000127|084800||140|3.7|100|2.1|
0010000005|Weber|19091220|1|0000337340|00000|LAB|0000010000000000000011541|20000127|080200||||||
0010000005|Weber|19091220|1|0000337340|00000|LAB|0000010000000000000011562|20000127|101800||140|4.6|101|2.2|
0010000005|Weber|19091220|1|0000337340|00000|LAB|0000010000000000000011579|20000127|134500||138|4.0||2.2|
0010000005|Weber|19091220|1|0000337340|00000|LAB|0000010000000000000011591|20000128|084200||138|3.6|98|2.1|
0010000005|Weber|19091220|1|0000337340|00000|LAB|0000010000000000000011593|20000128|085900||||||
0010000005|Weber|19091220|1|0000337340|00000|LAB|0000010000000000000011653|20000129|093400||140|4.2|99|2.2|
0010000005|Weber|19091220|1|0000337340|00000|LAB|0000010000000000000011717|20000129|094100||||||

What I get is the following:

       patnr  name birthday sex     casenr   Bew Art docnr     date   time none  Na   K  Cl  Ca corCa
1 0010000005 Weber 19091220   1 0000337340 00000 LAB 1e+19 20000127 084800      140 3.7 100 2.1
2 0010000005 Weber 19091220   1 0000337340 00000 LAB 1e+19 20000127 080200
3 0010000005 Weber 19091220   1 0000337340 00000 LAB 1e+19 20000127 101800      140 4.6 101 2.2
4 0010000005 Weber 19091220   1 0000337340 00000 LAB 1e+19 20000127 134500      138 4.0     2.2
5 0010000005 Weber 19091220   1 0000337340 00000 LAB 1e+19 20000128 084200      138 3.6  98 2.1
6 0010000005 Weber 19091220   1 0000337340 00000 LAB 1e+19 20000128 085900

How can I prevent "docnr" from being converted to "1e+19"?

...for example, by setting the column type to character, just as you did:

txt <- "0010000005|Weber|19091220|1|0000337340|00000|LAB|0000010000000000000011524|20000127|084800||140|3.7|100|2.1| 0010000005|Weber|19091220|1|0000337340|00000|LAB|0000010000000000000011541|20000127|080200|||||| 0010000005|Weber|19091220|1|0000337340|00000|LAB|0000010000000000000011562|20000127|101800||140|4.6|101|2.2| 0010000005|Weber|19091220|1|0000337340|00000|LAB|0000010000000000000011579|20000127|134500||138|4.0||2.2| 0010000005|Weber|19091220|1|0000337340|00000|LAB|0000010000000000000011591|20000128|084200||138|3.6|98|2.1| 0010000005|Weber|19091220|1|0000337340|00000|LAB|0000010000000000000011593|20000128|085900|||||| 0010000005|Weber|19091220|1|0000337340|00000|LAB|0000010000000000000011653|20000129|093400||140|4.2|99|2.2| 0010000005|Weber|19091220|1|0000337340|00000|LAB|0000010000000000000011717|20000129|094100||||||"
txt <- gsub(" ", "\n", txt)
colnames<-c("patnr","name","birthday","sex","casenr","Bew","Art","docnr","date","time","none","Na","K","Cl","Ca","corCa")
classes <- rep("character",length(colnames))
# skip=3 is carried over from the question, so the first three records of txt are dropped here
ELYTE <- read.table(text = txt, skip=3,comment.char="", sep="|", col.names=colnames, header=FALSE, colClasses=classes)
ELYTE
#        patnr  name birthday sex     casenr   Bew Art                     docnr     date   time none  Na   K Cl  Ca corCa
# 1 0010000005 Weber 19091220   1 0000337340 00000 LAB 0000010000000000000011579 20000127 134500      138 4.0    2.2      
# 2 0010000005 Weber 19091220   1 0000337340 00000 LAB 0000010000000000000011591 20000128 084200      138 3.6 98 2.1      
# 3 0010000005 Weber 19091220   1 0000337340 00000 LAB 0000010000000000000011593 20000128 085900                          
# 4 0010000005 Weber 19091220   1 0000337340 00000 LAB 0000010000000000000011653 20000129 093400      140 4.2 99 2.2      
# 5 0010000005 Weber 19091220   1 0000337340 00000 LAB 0000010000000000000011717 20000129 094100                          
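A variation on the same idea, sketched here under some assumptions: `colClasses` can also be a named vector, so only selected columns are pinned to a type and the rest are left to `read.table()` to guess. Which columns count as identifiers and which as measurements is my assumption, not something stated in the question. The sketch uses three records from the sample data and checks that `docnr` keeps all 25 digits.

```r
# Three records from the sample data, one per line.
txt <- paste(
  "0010000005|Weber|19091220|1|0000337340|00000|LAB|0000010000000000000011524|20000127|084800||140|3.7|100|2.1|",
  "0010000005|Weber|19091220|1|0000337340|00000|LAB|0000010000000000000011541|20000127|080200||||||",
  "0010000005|Weber|19091220|1|0000337340|00000|LAB|0000010000000000000011562|20000127|101800||140|4.6|101|2.2|",
  sep = "\n"
)

colnames <- c("patnr", "name", "birthday", "sex", "casenr", "Bew", "Art", "docnr",
              "date", "time", "none", "Na", "K", "Cl", "Ca", "corCa")

# Assumption: identifier-like columns stay character, lab values become numeric.
# Columns not named here are left for read.table() to guess.
classes <- c(patnr = "character", birthday = "character", casenr = "character",
             Bew = "character", docnr = "character", date = "character",
             time = "character",
             Na = "numeric", K = "numeric", Cl = "numeric", Ca = "numeric")

ELYTE <- read.table(text = txt, sep = "|", comment.char = "", header = FALSE,
                    col.names = colnames, colClasses = classes)

# Inspect the column type and confirm that no digits were lost.
str(ELYTE$docnr)
stopifnot(is.character(ELYTE$docnr), all(nchar(ELYTE$docnr) == 25))
```

With this layout the empty lab fields are read as NA in the numeric columns, while the document numbers remain strings. Any later step that needs them as identifiers should keep them as character rather than convert them.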
