Fastest way to store data from Pandas DataFrame
I was checking out "Fastest way to iterate through a pandas dataframe?" and I wasn't sure if it could be applied to my situation. I want to make a dictionary of the samples and features in the DataFrame:
# DF_gex is a DataFrame
D_sample_Data = {}

class Sample:
    def __init__(self, D_key_value):
        self.D_key_value = D_key_value

for i in range(DF_gex.shape[0]):
    D_key_value = {}
    sample = DF_gex.index[i]
    for j in range(DF_gex.shape[1]):
        key = DF_gex.columns[j]
        value = DF_gex.iloc[i, j]
        D_key_value[key] = value
    # Create the Sample for this row. Setting an attribute on
    # D_sample_Data[sample] would raise KeyError, since the dict starts empty.
    D_sample_Data[sample] = Sample(D_key_value)
I have a class called Sample, and each instance of it stores a dictionary (D_key_value). Right now I'm iterating through every row and every column.
Is there a quicker way of doing this? I know that Pandas is based on NumPy arrays, which have special features for indexing. Can one of those be used here?
In the end, I will have a dictionary object D_sample_Data where I input a sample name and get a class instance. That class instance will hold a dictionary object unique to that sample key.
If you simply want a dictionary of dictionaries, where the keys of the outer dictionary are the index labels, the keys of the inner dictionaries are the column names, and the values are the entries at that index-column pair (or a dictionary of class instances that each hold such a dictionary), then you don't need loops. You can simply use the DataFrame.to_dict() method. Example:
resultdict = df.T.to_dict()
Or, from Pandas version 0.17.0 onward, you can also use the keyword argument orient='index'. Example:
resultdict = df.to_dict(orient='index')
Demo:
In [73]: df
Out[73]:
   Col1  Col2  Col3
a     1     2     3
b     4     5     6
c     7     8     9
In [74]: df.T.to_dict()
Out[74]:
{'a': {'Col1': 1, 'Col2': 2, 'Col3': 3},
'b': {'Col1': 4, 'Col2': 5, 'Col3': 6},
'c': {'Col1': 7, 'Col2': 8, 'Col3': 9}}
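As a rough check that the loop-free calls give the same result as the double loop in the question, here is a minimal sketch that rebuilds the demo frame above. The variable names (looped, via_transpose, via_orient, via_numpy) are illustrative only, and the NumPy-based variant is an extra illustration, not part of the original answer; it assumes a pandas version that provides DataFrame.to_numpy().

```python
import pandas as pd

# Rebuild the small demo frame shown above.
df = pd.DataFrame(
    [[1, 2, 3], [4, 5, 6], [7, 8, 9]],
    index=["a", "b", "c"],
    columns=["Col1", "Col2", "Col3"],
)

# Explicit double loop, mirroring the approach in the question.
looped = {}
for i in range(df.shape[0]):
    row = {}
    for j in range(df.shape[1]):
        row[df.columns[j]] = df.iloc[i, j]
    looped[df.index[i]] = row

# Loop-free equivalents from the answer.
via_transpose = df.T.to_dict()
via_orient = df.to_dict(orient="index")

# Illustrative NumPy-based variant (an assumption, not from the answer):
# extract the values once, then pair each row with the column names.
columns = list(df.columns)
via_numpy = {
    label: dict(zip(columns, row))
    for label, row in zip(df.index, df.to_numpy().tolist())
}

print(via_orient == looped, via_transpose == looped, via_numpy == looped)
```

One design note: df.T builds a transposed copy first, and transposing a frame whose columns have different dtypes generally produces object-dtype columns, so orient='index' is probably the more direct choice when it is available.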
If you want the values of the outer dictionary to be instances of class Sample, though I doubt that is useful at all, then you can do:
class Sample:
    def __init__(self, D_key_value):
        self.D_key_value = D_key_value

resultdict = df.T.to_dict()
resultdict = {k: Sample(v) for k, v in resultdict.items()}
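To tie this back to the lookup described in the question (sample name in, class instance out), the following is a small self-contained sketch. It reuses the Sample class and the D_sample_Data name from the question, assumes the three-row demo frame as stand-in data, and uses orient='index', which requires Pandas 0.17.0 or later.

```python
import pandas as pd


class Sample:
    def __init__(self, D_key_value):
        self.D_key_value = D_key_value


# Stand-in data: the demo frame from the answer, not the asker's DF_gex.
df = pd.DataFrame(
    [[1, 2, 3], [4, 5, 6], [7, 8, 9]],
    index=["a", "b", "c"],
    columns=["Col1", "Col2", "Col3"],
)

# One Sample per row, keyed by the row's index label.
D_sample_Data = {
    name: Sample(features)
    for name, features in df.to_dict(orient="index").items()
}

# Look up a sample by name and read its feature dictionary.
sample_b = D_sample_Data["b"]
print(sample_b.D_key_value)
```

Each Sample holds its own inner dictionary, so changing one sample's D_key_value does not affect the others.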