I was checking out "Fastest way to iterate through a pandas dataframe?" and wasn't sure whether it could be applied to my situation. I want to make a dictionary of the samples and features in the DataFrame:
# DF_gex is a DataFrame
D_sample_Data = {}
class Sample:
    def __init__(self, D_key_value):
        self.D_key_value = D_key_value
for i in range(DF_gex.shape[0]):
    D_key_value = {}
    sample = DF_gex.index[i]
    for j in range(DF_gex.shape[1]):
        key = DF_gex.columns[j]
        value = DF_gex.iloc[i, j]
        D_key_value[key] = value
    # Create the instance here: the key does not exist in D_sample_Data yet,
    # so assigning to D_sample_Data[sample].D_key_value would raise a KeyError
    D_sample_Data[sample] = Sample(D_key_value)
I basically have a class called Sample, and each instance stores a dictionary (D_key_value). Right now I'm iterating through every row and every column.
Is there a quicker way of doing this? I know that pandas is built on NumPy arrays, which have special features for indexing. Can one of those be used here?
In the end, I will have a dictionary object D_sample_Data where I input a sample name and get a class instance. In that class instance, there will be a dictionary object unique to that sample key.
If you simply want a dictionary of dictionaries, where the keys of the outer dictionary are the index labels, the keys of the inner dictionaries are the column names, and the values are the entries at that index-column pair (or a dictionary of class instances that each hold such a dictionary), then you don't need loops. You can simply use the DataFrame.to_dict() method. Example -
resultdict = df.T.to_dict()
Or, from pandas version 0.17.0, you can also use the keyword argument orient='index'. Example -
resultdict = df.to_dict(orient='index')
Demo -
In [73]: df
Out[73]:
   Col1  Col2  Col3
a     1     2     3
b     4     5     6
c     7     8     9

In [74]: df.T.to_dict()
Out[74]:
{'a': {'Col1': 1, 'Col2': 2, 'Col3': 3},
 'b': {'Col1': 4, 'Col2': 5, 'Col3': 6},
 'c': {'Col1': 7, 'Col2': 8, 'Col3': 9}}
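To check that both calls really give the same result as the nested loops from the question, here is a minimal, self-contained sketch. It rebuilds the toy frame from the demo and compares the three results; the variable names looped, via_transpose and via_orient are only illustrative and are not part of the original code.

```python
import pandas as pd

# Toy frame matching the demo above
df = pd.DataFrame(
    [[1, 2, 3], [4, 5, 6], [7, 8, 9]],
    index=["a", "b", "c"],
    columns=["Col1", "Col2", "Col3"],
)

# Loop-based construction, as in the question
looped = {}
for i in range(df.shape[0]):
    row = {}
    for j in range(df.shape[1]):
        row[df.columns[j]] = df.iloc[i, j]
    looped[df.index[i]] = row

# The two loop-free alternatives from the answer
via_transpose = df.T.to_dict()
via_orient = df.to_dict(orient="index")

# All three dictionaries compare equal
print(looped == via_transpose == via_orient)
```

On a frame with mixed column dtypes, orient='index' is probably the safer of the two, since transposing first can change the dtypes of the values.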
If you want the values of the outer dictionary to be instances of the class Sample, though I highly doubt that is useful at all, then you can do -
class Sample:
    def __init__(self, D_key_value):
        self.D_key_value = D_key_value

resultdict = df.T.to_dict()
resultdict = {k: Sample(v) for k, v in resultdict.items()}
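Put together with the names from the question, the whole thing could look roughly like the sketch below. The sample names ("s1", "s2") and feature names ("geneA", "geneB") are made up for illustration, since the real contents of DF_gex are not shown.

```python
import pandas as pd

class Sample:
    def __init__(self, D_key_value):
        self.D_key_value = D_key_value

# Hypothetical stand-in for DF_gex: rows are samples, columns are features
DF_gex = pd.DataFrame(
    {"geneA": [0.5, 1.5], "geneB": [2.0, 3.0]},
    index=["s1", "s2"],
)

# One Sample per row, keyed by the row label, with no explicit row/column loops
D_sample_Data = {
    name: Sample(features)
    for name, features in DF_gex.to_dict(orient="index").items()
}

# Look up a sample by name and read its feature dictionary
print(D_sample_Data["s1"].D_key_value)
```

Each instance gets its own inner dictionary, so changing the features of one sample does not affect the others.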