How to create a sparse DataFrame from a list of dicts
I create a DataFrame from a list of dicts like this:
import pandas as pd

pd.DataFrame([{"id": "a", "v0": 3, "v2": "foo"},
              {"id": "b", "v1": 1, "v4": "ouch"}]).set_index("id", verify_integrity=True)
     v0   v2   v1    v4
id
a   3.0  foo  NaN   NaN
b   NaN  NaN  1.0  ouch
Alas, for some inputs I run out of RAM in the DataFrame constructor, and I wonder if there is a way to make pandas produce a sparse DataFrame from the list of dicts.
I suggest using dtype='Sparse' for this.
If all elements are numbers, you can use dtype='Sparse', dtype='Sparse[int]' or dtype='Sparse[float]':
data = [{"id":'a',"v0":3,"v2":6},
{"id":'b',"v1":1,"v4":7}]
index = [item.pop('id') for item in data]
pd.DataFrame(data, index=index, dtype="Sparse")
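To check that the result really is stored sparsely, one option is to look at the column dtypes and at the `.sparse` accessor. This is only a small sketch built on the numeric example above; the variable names `df` and `density` are mine, and the exact dtype strings and byte counts you see may differ between pandas versions.

```python
import pandas as pd

data = [{"id": "a", "v0": 3, "v2": 6},
        {"id": "b", "v1": 1, "v4": 7}]
# Pull the ids out of each dict (this mutates the dicts) and use them as the index.
index = [item.pop("id") for item in data]
df = pd.DataFrame(data, index=index, dtype="Sparse")

# Every column should report a Sparse dtype rather than a plain NumPy dtype.
print(df.dtypes)

# Fraction of cells that hold an explicitly stored value (not the fill value).
density = df.sparse.density
print(density)

# Bytes used per column, useful for comparing against a dense frame.
print(df.memory_usage(deep=True))
```

A low density on real data is what makes the sparse representation pay off; on a tiny frame like this one the per-column overhead dominates.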
If any value is a string, you have to use dtype='Sparse[str]':
data = [{"id":'a',"v0":3,"v2":'foo'},
{"id":'b',"v1":1,"v4":'ouch'}]
df = pd.DataFrame(data, dtype="Sparse[str]").set_index("id",verify_integrity=True)
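If the constructor still uses too much memory with a sparse dtype, a different technique you could try is to build the frame one column at a time from `pd.arrays.SparseArray` objects, so that at most one dense column exists at any moment. This is not what the code above does; it is a rough sketch under my own assumptions. `sparse_frame_from_records` and its `index_key` parameter are hypothetical names, and the helper keeps an extra per-column copy of the values while it runs.

```python
import numbers

import numpy as np
import pandas as pd


def sparse_frame_from_records(records, index_key="id"):
    """Hypothetical helper: build a sparse DataFrame column by column."""
    index = [rec[index_key] for rec in records]
    if len(set(index)) != len(index):
        # Mirrors what verify_integrity=True would complain about.
        raise ValueError("duplicate index values")

    # First pass: remember which rows hold a value for each column.
    cells = {}
    for row, rec in enumerate(records):
        for key, value in rec.items():
            if key != index_key:
                cells.setdefault(key, []).append((row, value))

    # Second pass: materialise one dense column, convert it, then drop it.
    columns = {}
    for key, entries in cells.items():
        numeric = all(
            isinstance(v, numbers.Real) and not isinstance(v, bool)
            for _, v in entries
        )
        # float64 for purely numeric columns, object otherwise (e.g. strings).
        dense = np.full(len(records), np.nan, dtype=float if numeric else object)
        for row, value in entries:
            dense[row] = value
        columns[key] = pd.arrays.SparseArray(dense, fill_value=np.nan)

    return pd.DataFrame(columns, index=index)


data = [{"id": "a", "v0": 3, "v2": "foo"},
        {"id": "b", "v1": 1, "v4": "ouch"}]
df = sparse_frame_from_records(data)
print(df.dtypes)
```

Whether this actually helps depends on how many rows and columns you have, so it is worth measuring on a sample of your data first.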