I have a dictionary like this, with 'Start' positions as keys and a list of entries as each value; each entry contains several other values.
dict1 = {28878779:
[[0.63078648931418,'BRCA','Primary Blood Derived Cancer','chr16'],
[0.913319324289701, 'BRCA', 'Primary Blood Derived Cancer', 'chr16'],
[0.4291909025802871, 'BRCA', 'Primary Blood Derived Cancer', 'chr16'],
[0.7571498628201009, 'BRCA', 'Primary Blood Derived Cancer', 'chr16'],
[0.20053355013001398, 'BRCA', 'Primary Blood Derived Cancer', 'chr16'],
[0.47222708511173905, 'BRCA', 'Primary Blood Derived Cancer', 'chr16'],
[0.5421979810611359, 'BRCA', 'Primary Blood Derived Cancer', 'chr16'],
[0.517080694962231, 'BRCA', 'Primary Blood Derived Cancer', 'chr16'],
[0.354578922865826, 'BRCA', 'Primary Blood Derived Cancer', 'chr16'],
[0.47933127476003706, 'BRCA', 'Primary Blood Derived Cancer', 'chr16']],
116276795:
[[0.0295335249313507,'BRCA','Primary Blood Derived Cancer','chr12'],
[0.0225709542480921, 'BRCA', 'Primary Blood Derived Cancer', 'chr12'],
[0.0230930552162406, 'BRCA', 'Primary Blood Derived Cancer', 'chr12'],
[0.0226794373583645, 'BRCA', 'Primary Blood Derived Cancer', 'chr12'],
[0.0465238706721383, 'BRCA', 'Primary Blood Derived Cancer', 'chr12'],
[0.0308525159082739, 'BRCA', 'Primary Blood Derived Cancer', 'chr12'],
[0.0280263565564701, 'BRCA', 'Primary Blood Derived Cancer', 'chr12']]
...}
I want to convert the dictionary into a DataFrame like this, where each entry in a key's list of values becomes one row next to its key:
Start      Beta_value          Cancer  Stage                         Chromosome
28878779   0.63078648931418    BRCA    Primary Blood Derived Cancer  chr16
28878779   0.913319324289701   BRCA    Primary Blood Derived Cancer  chr16
.
.
116276795  0.029533524931350   BRCA    Primary Blood Derived Cancer  chr12
116276795  0.0225709542480921  BRCA    Primary Blood Derived Cancer  chr12
.
.
I tried this:
dlist = [[key,value[i][0],value[i][1],value[i][2],value[i][3]]
for key,value in dict1.items()
for i in value]
beta = pd.DataFrame(dlist, columns=
['Start','Beta_value','Cancer','Stage','Chromosome'])
It raises a TypeError:
TypeError: list indices must be integers or slices, not list
What am I supposed to do?
In the inner loop, the variable i is each inner list itself, not an integer position, so index i directly instead of value[i]:
dlist = [[key,i[0],i[1],i[2],i[3]] for key,value in dict1.items() for i in value]
Or prepend the key to each inner list:
dlist = [[key] + i for key,value in dict1.items() for i in value]
# alternative: unpack each inner list into a tuple
#dlist = [(key, *i) for key,value in dict1.items() for i in value]
beta = pd.DataFrame(dlist, columns=['Start','Beta_value','Cancer','Stage','Chromosome'])
print(beta)
Start Beta_value Cancer Stage Chromosome
0 28878779 0.630786 BRCA Primary Blood Derived Cancer chr16
1 28878779 0.913319 BRCA Primary Blood Derived Cancer chr16
2 28878779 0.429191 BRCA Primary Blood Derived Cancer chr16
3 28878779 0.757150 BRCA Primary Blood Derived Cancer chr16
4 28878779 0.200534 BRCA Primary Blood Derived Cancer chr16
5 28878779 0.472227 BRCA Primary Blood Derived Cancer chr16
6 28878779 0.542198 BRCA Primary Blood Derived Cancer chr16
7 28878779 0.517081 BRCA Primary Blood Derived Cancer chr16
8 28878779 0.354579 BRCA Primary Blood Derived Cancer chr16
9 28878779 0.479331 BRCA Primary Blood Derived Cancer chr16
10 116276795 0.029534 BRCA Primary Blood Derived Cancer chr12
11 116276795 0.022571 BRCA Primary Blood Derived Cancer chr12
12 116276795 0.023093 BRCA Primary Blood Derived Cancer chr12
13 116276795 0.022679 BRCA Primary Blood Derived Cancer chr12
14 116276795 0.046524 BRCA Primary Blood Derived Cancer chr12
15 116276795 0.030853 BRCA Primary Blood Derived Cancer chr12
16 116276795 0.028026 BRCA Primary Blood Derived Cancer chr12
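For reference, the approach above can be put together as a self-contained sketch. The dictionary here is a trimmed stand-in named sample (a hypothetical name, two entries per key) rather than the full dict1, and the names rows, entry, and error_message are likewise only illustrative. The last part deliberately reproduces the original error to show that it comes from using an inner list as an index.

```python
import pandas as pd

# Trimmed, illustrative stand-in for dict1 (two keys, two entries each).
sample = {
    28878779: [
        [0.63078648931418, 'BRCA', 'Primary Blood Derived Cancer', 'chr16'],
        [0.913319324289701, 'BRCA', 'Primary Blood Derived Cancer', 'chr16'],
    ],
    116276795: [
        [0.0295335249313507, 'BRCA', 'Primary Blood Derived Cancer', 'chr12'],
        [0.0225709542480921, 'BRCA', 'Primary Blood Derived Cancer', 'chr12'],
    ],
}

columns = ['Start', 'Beta_value', 'Cancer', 'Stage', 'Chromosome']

# "entry" is each inner list itself, not an integer position,
# so it is unpacked directly instead of being used as entries[entry].
rows = [(start, *entry) for start, entries in sample.items() for entry in entries]
beta = pd.DataFrame(rows, columns=columns)

# Reproduce the original mistake on purpose: indexing a list with a list.
try:
    [entries[entry] for entries in sample.values() for entry in entries]
except TypeError as exc:
    error_message = str(exc)

print(beta)
print(error_message)
```

Each key is repeated once per inner list, so the resulting frame has one row per entry. Dictionaries keep insertion order in Python 3.7 and later, so the rows follow the order of the keys.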