How to create a sparse DataFrame from a list of dicts

Question

I create a DataFrame from a list of dicts like this:

pd.DataFrame([{"id":"a","v0":3,"v2":"foo"},
              {"id":"b","v1":1,"v4":"ouch"}]).set_index(
                 "id",verify_integrity=True)
     v0   v2   v1    v4
id                    
a   3.0  foo  NaN   NaN
b   NaN  NaN  1.0  ouch

Alas, for some inputs I run out of RAM in the DataFrame constructor, and I wonder if there is a way to make pandas produce a sparse DataFrame from the list of dicts.

Answer 1

I suggest using dtype='Sparse' for this.

If all elements are numbers, you can use dtype='Sparse', dtype='Sparse[int]' or dtype='Sparse[float]':

data = [{"id":'a',"v0":3,"v2":6},
        {"id":'b',"v1":1,"v4":7}]
index = [item.pop('id') for item in data]
pd.DataFrame(data, index=index, dtype="Sparse")
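
As a rough illustration of what these dtype strings mean (a minimal sketch; the Series values below are made up for the example and are not from the question): each sparse dtype carries a fill value that is left out of storage. For 'Sparse[float]' that is NaN and for 'Sparse[int]' it is 0, so missing keys, which show up as NaN, fit the float variant more naturally.

```python
import numpy as np
import pandas as pd

# Illustrative values only; they are not from the question's data.
float_sparse = pd.Series([np.nan, np.nan, 1.0], dtype="Sparse[float]")
int_sparse = pd.Series([0, 0, 1], dtype="Sparse[int]")

# The fill value is the entry that is not stored explicitly.
print(float_sparse.dtype.fill_value)  # NaN for the float variant
print(int_sparse.dtype.fill_value)    # 0 for the int variant

# npoints counts the explicitly stored entries; density is their share.
print(float_sparse.sparse.npoints, float_sparse.sparse.density)
```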

If any value is a string, you have to use dtype='Sparse[str]':

data = [{"id":'a',"v0":3,"v2":'foo'},
        {"id":'b',"v1":1,"v4":'ouch'}]
df = pd.DataFrame(data, dtype="Sparse[str]").set_index("id",verify_integrity=True)
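
If the constructor itself is where memory runs out, another option may be to build the columns one at a time. The following is a minimal sketch, not part of the answer above: sparse_frame_from_records is a hypothetical helper name, and it assumes every value is numeric and every dict has the "id" key. It creates one pd.arrays.SparseArray per column, so only a single dense column exists at any moment.

```python
import numpy as np
import pandas as pd

def sparse_frame_from_records(records, index_key="id"):
    """Hypothetical helper: one SparseArray per column, numeric values only."""
    index = [rec[index_key] for rec in records]
    # Rough equivalent of verify_integrity=True in set_index.
    if len(set(index)) != len(index):
        raise ValueError("index has duplicate keys")

    # Collect column names in first-seen order, skipping the index key.
    columns = {}
    for rec in records:
        for key in rec:
            if key != index_key:
                columns.setdefault(key, None)

    data = {}
    for col in columns:
        # Dense only for this one column; missing keys become NaN.
        values = np.array([rec.get(col, np.nan) for rec in records], dtype=float)
        data[col] = pd.arrays.SparseArray(values, fill_value=np.nan)
    return pd.DataFrame(data, index=index)

records = [{"id": "a", "v0": 3, "v2": 6},
           {"id": "b", "v1": 1, "v4": 7}]
df = sparse_frame_from_records(records)
print(df.dtypes)
print(df.sparse.density)
```

Whether this actually helps depends on how sparse the input is; df.sparse.density gives the share of explicitly stored values and is a quick way to check.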

Answer 1
2 Accepted 2021-02-23 15:58:51
