
python3 rpy2 ggplot2 combine multiple dataframes plot

I am struggling to make a plot that combines data from two different data frames with ggplot2 from rpy2.

I cannot make it work; it seems to accept only one data frame at a time.

I have two rpy2 data frames:

r_df1 = pandas2ri.py2rpy(df1)
r_df_int = pandas2ri.py2rpy(df_int)

The first data frame holds the chromosome, position, and characteristics of the variants:

df1.head()

 name chr pos status dp low
 31 1-3395085-C-T 1 3395085 T 88 0
 32 1-16202978-G-A 1 16202978 T 162 0
 5 1-11826252-C-T 1 11826252 T 296 0
 33 1-17257079-G-A 1 17257079 T 288 1
 71 1-33318561-T-C 1 33318561 T 10 0

The second data frame just holds the intervals to pass to geom_rect:

df_int

 chr starts ends
 0 1 0 5
 1 2 5 10
 2 3 10 16
 3 4 16 19
 4 5 19 24
 5 6 24 31
 6 7 31 36
 7 8 36 40
 8 9 40 42
 9 10 42 45
 10 11 45 50
 11 12 50 54
 12 13 54 55
 13 14 55 62
 14 15 62 64
 15 16 64 67
 16 17 67 74
 17 18 74 75
 18 19 75 82
 19 20 82 85
 20 22 85 88
 21 30 88 92

and I try to combine them in one plot:

pp2 = ggplot2.ggplot(r_df_int) + \
    ggplot2.geom_rect( ggplot2.aes_string(xmin = 'starts', xmax = 'ends', ymin = '0', ymax = '5', fill = 'factor(chr)'), alpha=0.5 ) + \
    ggplot2.geom_point( data = r_df1, ggplot2.aes_string(x='sort(order(pos))', y='log(dp)', col='factor(chr)', size='dp', shape = 'factor(low)') )  + \
    ggplot2.theme_minimal()


pp2.plot()

File "<stdin>", line 3
SyntaxError: positional argument follows keyword argument

With only one data frame it works.

Does anybody have a clue?

As the error message indicates, the error is on the third line of your last expression: Python does not allow unnamed (positional) arguments after named (keyword) arguments in a call. R allows this; Python does not.
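The rule can be checked without rpy2 or R. The following is a minimal sketch using Python's built-in `compile`; the name `f` is only a placeholder and is never called, because the error is raised while the source is parsed, before anything runs.

```python
# A positional argument after a keyword argument is rejected by the parser,
# before any function is looked up or executed.
bad_call = "f(data=1, 2)"
good_call = "f(2, data=1)"

try:
    compile(bad_call, "<example>", "eval")
    bad_call_error = None
except SyntaxError as exc:
    bad_call_error = exc.msg

# Positional arguments first, keyword arguments after: this parses fine.
compile(good_call, "<example>", "eval")

print(bad_call_error)
```

This is why the traceback in the question points at a line of the expression rather than at anything inside rpy2.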

Either move data=r_df1 to after aes_string, or give a name to the second argument:

ggplot2.geom_point(data=r_df1,
                   mapping=ggplot2.aes_string(x='sort(order(pos))',
                                              y='log(dp)', col='factor(chr)',
                                              size='dp', shape='factor(low)'))
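Both options should end up passing the same arguments. As a rough illustration, here is a sketch with a hypothetical stand-in function: it is not rpy2's `geom_point`, and its parameter order (`mapping` first, then `data`) is an assumption modeled on R's `geom_point(mapping, data, ...)`. The two string values are placeholders for the real aesthetics object and data frame.

```python
def geom_point(mapping=None, data=None):
    """Hypothetical stand-in that only records how its arguments were bound."""
    return {"mapping": mapping, "data": data}


aes = "aes placeholder"      # stands in for ggplot2.aes_string(...)
r_df1 = "r_df1 placeholder"  # stands in for the converted data frame

# Option 1: positional argument first, keyword argument after it.
fix_positional_first = geom_point(aes, data=r_df1)

# Option 2: name both arguments; their order then no longer matters.
fix_all_keywords = geom_point(data=r_df1, mapping=aes)

print(fix_positional_first == fix_all_keywords)
```

Naming every argument is slightly more verbose, but it keeps the call valid regardless of the order in which the arguments are written.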
