Convert raw data file to RData file
I am trying to make an RData file from a raw, space-delimited numeric text file, i.e.
11 33 55
22 33 45
25 78 00
44 87 99 ....
I have another R script which needs to load this new RData file and perform linear regression on the data using MapReduce (RHIPE). Thus, when I save this R object, I need to be able to read it back this way:
data <- strsplit(unlist(map.values)," ")
#so that I can run regression like:
y<- unlist(lapply(data,"[[",1))
x1<-unlist(lapply(data,"[[",2))
x2<-unlist(lapply(data,"[[",3))
lm(y~x1+x2)
I have tried many ways to save my data into the RData object, including table, list and as.character, but none of them succeeded in letting me read it back using the method above. How can I save my original file so that I can read it in the way shown above? Thank you.
(P.S. I cannot use the load / read.table functions since I am reading from an HDFS file inside the mapper.)
If I understand you correctly, you want your stored object to be a bunch of strings of the form "number space number space number". In that case, use sprintf:
foo <- sprintf('%d %d %d', my_data[1,1], my_data[1,2], my_data[1,3])
as an example of creating the first row. Run a loop or *apply to build the entire array, then save that character string array to an RData file. This should at least be close to what you want.
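A minimal sketch of the whole round trip, in case it helps. It assumes my_data is a numeric matrix of whole numbers (the matrix below is just a stand-in built from the sample rows in the question), and the names rows and rdata_path are made up for illustration. Because sprintf is vectorized over its arguments, passing whole columns builds every row at once, so no explicit loop is needed. Note that a value written as 00 in the raw file comes out as "0" once it has been read as a number.

```r
# Stand-in for the raw file contents (sample rows from the question);
# in practice this might come from read.table() run outside the mapper.
my_data <- matrix(c(11, 33, 55,
                    22, 33, 45,
                    25, 78,  0,
                    44, 87, 99),
                  ncol = 3, byrow = TRUE)

# sprintf is vectorized: passing whole columns gives one "a b c" string per row.
# %d accepts doubles as long as they are whole numbers.
rows <- sprintf("%d %d %d", my_data[, 1], my_data[, 2], my_data[, 3])

# Save the character vector to an RData file (path is an assumption).
rdata_path <- file.path(tempdir(), "rows.RData")
save(rows, file = rdata_path)

# Local round-trip check only; inside the mapper the strings would arrive
# some other way, but the strsplit pattern from the question is the same.
loaded <- new.env()
load(rdata_path, envir = loaded)
data <- strsplit(unlist(loaded$rows), " ")
y <- unlist(lapply(data, "[[", 1))
```

If my_data is a data frame rather than a matrix, the same column indexing should still work as long as the columns are numeric.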
Note: I suppose it's futile to suggest improving the far-end code which does the data sorting and regressions?
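For what it's worth, here is one possible shape for that improvement, offered only as a sketch. strsplit returns character strings, so the pieces need converting with as.numeric before lm can treat them as numbers. The map.values list below is a stand-in for what the mapper would receive, filled with the sample rows from the question; the names m, df and fit are made up for illustration.

```r
# Stand-in for the mapper input; in RHIPE this would be the real map.values.
map.values <- list("11 33 55", "22 33 45", "25 78 00", "44 87 99")

# Split each record on spaces, as in the question.
data <- strsplit(unlist(map.values), " ")

# Convert all fields to numeric in one step: one row per record, three columns.
m <- matrix(as.numeric(unlist(data)), ncol = 3, byrow = TRUE)
df <- data.frame(y = m[, 1], x1 = m[, 2], x2 = m[, 3])

# Fit the same model as in the question, now on numeric columns.
fit <- lm(y ~ x1 + x2, data = df)
```

Building the matrix once avoids three separate lapply passes over the records, and keeping the columns in a data frame makes the lm call self-describing.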