
Nested Python dictionary to Pandas dataframes

I have a nested Python dictionary:

d = {
    'CON-2': {'gene-ODF3': [2.0, 44474], 'gene-SCGB1C1': [0.184937, 36615], 'gene-TRNAN-GUU-19': [32.0, 443]},
    'CON-1': {'gene-ODF3': [10.00, 44474], 'gene-SCGB1C1': [0.184937, 36615], 'gene-TRNAN-GUU-19': [30.0, 443], 'gene-LOC103247846': [20.0, 22111]},
}

I would like to plot the FPKM of each gene (the first value) against its DNA transcript abundance (the second value) on a scatterplot. I have tried a few different things, such as:

CON_1=pd.DataFrame(d['CON-1'].items(),columns=['FPKM','Fraction-0'])
CON_2=pd.DataFrame(d['CON-2'].items(),columns=['FPKM','Fraction-0'])

df=pd.DataFrame.from_dict({(i,j): d[i][j]
                           for i in d.keys()
                           for j in d[i].keys()},
                           orient='index')

But I cannot separate the two values from each other. I would like to generate a separate data frame for each condition (CON-1 and CON-2), like this:

gene       FPKM    DNA-abundance
gene-ODF3  2.0     44474
# Select one condition, then expand each [FPKM, abundance] list into two columns
pd.DataFrame(d)['CON-1'].apply(pd.Series)\
                        .rename(columns={0:'FPKM',1:'DNA-abundance'})
#                        FPKM  DNA-abundance
#gene-ODF3          10.000000        44474.0
#gene-SCGB1C1        0.184937        36615.0
#gene-TRNAN-GUU-19  30.000000          443.0
#gene-LOC103247846  20.000000        22111.0

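The same works for the other condition. For CON-2, call .dropna() before .apply(pd.Series): gene-LOC103247846 is missing from that condition, so it would otherwise show up as a row of NaN.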
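If you want both frames at once, one alternative (a minimal sketch, not the approach used in the answer above) is to build each condition's frame directly from its inner dictionary with pd.DataFrame.from_dict. The names `columns` and `frames` are just illustrative choices here. Because each condition is converted on its own, genes missing from one condition never produce NaN rows.

```python
import pandas as pd

d = {
    'CON-2': {'gene-ODF3': [2.0, 44474], 'gene-SCGB1C1': [0.184937, 36615],
              'gene-TRNAN-GUU-19': [32.0, 443]},
    'CON-1': {'gene-ODF3': [10.00, 44474], 'gene-SCGB1C1': [0.184937, 36615],
              'gene-TRNAN-GUU-19': [30.0, 443], 'gene-LOC103247846': [20.0, 22111]},
}

# Column names for the two list positions (illustrative names)
columns = ['FPKM', 'DNA-abundance']

# One DataFrame per condition: each inner dict maps gene -> [FPKM, abundance],
# so orient='index' makes genes the rows and the list items the columns.
frames = {
    condition: pd.DataFrame.from_dict(genes, orient='index', columns=columns)
                 .rename_axis('gene')
    for condition, genes in d.items()
}

print(frames['CON-1'])
print(frames['CON-2'])
```

From there, something like frames['CON-1'].plot.scatter(x='DNA-abundance', y='FPKM') should give the scatterplot for one condition, assuming matplotlib is installed.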
