
Dataset that holds a list of pre-defined classes for each observation - in R

I'm new to R and need to keep a dataset that contains, for each observation (say, a user), a list of classes (say, events). For example, for each user_ID I hold a list of events, and every event class contains the fields name, time, and type.

My question is: what is the best way to hold such data in R? I have several million such observations, so I need to store them as compactly as possible.

In addition, once I decide how to hold it, I need to create it from within Python, as my original data is in a Python dict. What is the best way to do that?

Thanks!

You can save your dict as a .csv file using Python's csv module.

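import csv

mydict = {"a": 1, "b": 2, "c": 3}
# Python 3: open in text mode; newline="" lets the csv module handle line endings
with open("test.csv", "w", newline="") as myfile:
    w = csv.writer(myfile)
    w.writerows(mydict.items())  # one "key,value" row per dict entry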

Then just load it into R with read.csv (pass header = FALSE, since the file written above has no header row).

Of course, depending on what your Python dict looks like, you may need some more post-processing, but without a reproducible example it's hard to say what that would be.
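As one possible form of that post-processing, here is a minimal sketch for the case described in the question: a dict mapping each user ID to a list of events with name, time, and type fields. The sample data, the events_by_user name, the helper functions, and the events.csv file name are all assumptions made for illustration, since the actual dict was not posted. The idea is to flatten the nested structure into "long" format, one row per event, with the user ID repeated on each row.

```python
import csv

# Hypothetical input (assumed shape): user_id -> list of event dicts.
events_by_user = {
    "u1": [
        {"name": "login", "time": "2020-01-01T10:00:00", "type": "auth"},
        {"name": "click", "time": "2020-01-01T10:05:00", "type": "ui"},
    ],
    "u2": [
        {"name": "login", "time": "2020-01-02T09:30:00", "type": "auth"},
    ],
}

FIELDS = ["user_id", "name", "time", "type"]


def iter_event_rows(data):
    """Yield one flat row per (user, event) pair."""
    for user_id, events in data.items():
        for event in events:
            yield [user_id, event["name"], event["time"], event["type"]]


def write_events_csv(data, fileobj):
    """Write the nested dict to an open text file as long-format CSV."""
    writer = csv.writer(fileobj)
    writer.writerow(FIELDS)  # header row, so R can pick up column names
    writer.writerows(iter_event_rows(data))


# newline="" lets the csv module control line endings.
with open("events.csv", "w", newline="") as f:
    write_events_csv(events_by_user, f)
```

Because this file has a header row, read.csv("events.csv") in R should give a data frame with one row per event and columns user_id, name, time, and type. A flat data frame like this is usually more compact than a list of per-user objects, though whether it is the best layout depends on how the data will be queried.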
