R eulerr package - Displays wrong euler Diagram

I am trying to create an Euler diagram with the R package eulerr. I am using the following code:

vd <- euler(
  c(A = 54, B = 22, C = 53, D = 26,
    "A&B" = 20, "A&C" = 29, "A&D" = 10, "B&C" = 16, "B&D" = 5, "C&D" = 7,
    "A&B&C" = 14, "A&B&D" = 5, "A&C&D" = 4, "B&C&D" = 3,
    "A&B&C&D" = 3),
  input = "union",
  shape = "ellipse"
)

plot(vd, labels = c("A", "B", "C","D"), main = "Databases",Count=TRUE, quantities = TRUE)

I am getting the following result: [image] But the resulting Euler plot is wrong:

  • Not all of B should be included in A
  • B should be 22 in total (in the picture it only shows a total count of 20)
  • C should be 53 in total (not 51)

How can I fix this or is this a package error?

error_plot shows the following. Region error: [image: region error] Residuals: [image: residuals]

Unfortunately, the residual plot doesn't show the residuals. Nonetheless, the missing cases are shown in the "normal" residual statistics below.

        original fitted residuals regionError
A             15     15         0       0.004
B              0      0         0       0.000
C             19     19         0       0.005
D             13     13         0       0.003
A&B            4      4         0       0.001
A&C           14     14         0       0.003
A&D            4      4         0       0.001
B&C            2      0         2       0.022
B&D            0      0         0       0.000
C&D            3      3         0       0.001
A&B&C         11     11         0       0.003
A&B&D          2      2         0       0.000
A&C&D          1      1         0       0.000
B&C&D          0      0         0       0.000
A&B&C&D        3      3         0       0.001

diagError: 0.022 
stress:    0.004 

The reason why some areas are left out is simple: the diagram is inexact and is missing some areas. The fitted diagram has no region for B&C alone (fitted 0 against an original 2), so there is no place to put its label, and that is why B and C are each missing 2 units. There likely isn't any way (or at least eulerr cannot find it) to perfectly represent your combination with an Euler diagram using ellipses. You either have to accept that it is inexact or try another solution.

Similarly, the residual plot cannot show the missing residuals graphically because there is no area representing them. I am, by the way, the author of this package and I do have something better in mind for the residual plot which would display missing areas as well, but I haven't had time to implement it yet.
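For anyone who wants to check the numbers by hand: the "original" column in the table above appears to be the exclusive (disjoint) size of each region, derived from the input = "union" counts by inclusion-exclusion. Below is a minimal base R sketch that uses only the counts from the question. The helpers to_disjoint and total_in are made up for illustration and are not part of eulerr.

```r
# Union-style counts from the question: each value is the full size of the
# set or intersection, including everything nested inside it.
union_counts <- c(
  A = 54, B = 22, C = 53, D = 26,
  "A&B" = 20, "A&C" = 29, "A&D" = 10, "B&C" = 16, "B&D" = 5, "C&D" = 7,
  "A&B&C" = 14, "A&B&D" = 5, "A&C&D" = 4, "B&C&D" = 3,
  "A&B&C&D" = 3
)

# Split "A&B" into c("A", "B") for every combination.
members <- strsplit(names(union_counts), "&", fixed = TRUE)

# Inclusion-exclusion (hypothetical helper): the exclusive size of region i
# is the signed sum of the union counts of every combination containing it.
to_disjoint <- function(i) {
  is_superset <- vapply(members, function(m) all(members[[i]] %in% m), logical(1))
  extra <- lengths(members[is_superset]) - length(members[[i]])
  sum((-1)^extra * union_counts[is_superset])
}

disjoint <- vapply(seq_along(members), to_disjoint, numeric(1))
names(disjoint) <- names(union_counts)
print(disjoint)  # A = 15, B = 0, C = 19, D = 13, A&B = 4, ..., B&C = 2, ...

# Total of a set = sum of all exclusive regions that contain it.
total_in <- function(set) {
  sum(disjoint[vapply(members, function(m) set %in% m, logical(1))])
}

# The fit could not draw the exclusive B&C region (2 units), so the totals
# visible in the plot are short by exactly that amount.
shown_B <- total_in("B") - disjoint[["B&C"]]  # 22 - 2 = 20
shown_C <- total_in("C") - disjoint[["B&C"]]  # 53 - 2 = 51
```

The exclusive B region comes out as 0, and dropping the 2 units of B&C leaves 20 for B and 51 for C, which matches the totals reported for the plot in the question.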

Regarding how to fix the issue, it depends on the level of precision you want. Based on the nVenn algorithm, I wrote the nVennR package to create quasi-proportional Euler diagrams. With the caveats mentioned in the link, you can represent larger numbers of sets and show the relative size of each region. In your example,

library(nVennR)
myV <- createVennObj(nSets = 4, sNames = c('A', 'B', 'C', 'D'), sSizes = c(0, 26, 53, 7, 22, 5, 16, 3, 54, 10, 29, 4, 20, 5, 14, 3))
myV <- plotVenn(nVennObj = myV)

And the result would be: [image: Euler diagram]

Depending on your requirements, this may not be satisfactory. The proportionality is in the area of the circles, not the regions (you can see that region 1, 2, 3, 4, i.e. A&B&C&D, has empty space). However, this strategy overcomes the limitations of regular shapes in these representations mentioned by Johan Larsson. If you are interested, there are more details in the vignette.
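The order of the numbers in sSizes is easy to get wrong by hand. Comparing the vector in the call above with the counts in the question, it follows a binary numbering of the regions (0000, 0001, ..., 1111) with the first set name as the leftmost bit. Here is a small base R sketch that builds the vector from named counts under that assumption. region_label is a hypothetical helper, not an nVennR function. It is also worth checking in the nVennR documentation whether the values are meant as exclusive region sizes or union-style totals, since the vector above carries the question's counts unchanged.

```r
# Counts exactly as given in the question.
counts <- c(
  A = 54, B = 22, C = 53, D = 26,
  "A&B" = 20, "A&C" = 29, "A&D" = 10, "B&C" = 16, "B&D" = 5, "C&D" = 7,
  "A&B&C" = 14, "A&B&D" = 5, "A&C&D" = 4, "B&C&D" = 3,
  "A&B&C&D" = 3
)
set_names <- c("A", "B", "C", "D")
n <- length(set_names)

# Hypothetical helper: turn region index k (0 .. 2^n - 1) into a label such
# as "B&C", reading k as a bit pattern whose leftmost bit is the first set.
region_label <- function(k) {
  bits <- (k %/% 2^((n - 1):0)) %% 2
  paste(set_names[bits == 1], collapse = "&")
}

labels <- vapply(0:(2^n - 1), region_label, character(1))

# Look up each region by label; index 0 (no set at all) has no entry.
s_sizes <- unname(counts[labels])
s_sizes[is.na(s_sizes)] <- 0

print(s_sizes)
# 0 26 53 7 22 5 16 3 54 10 29 4 20 5 14 3
```

The result reproduces the sSizes vector passed to createVennObj above, so the same helper could be reused for other count sets.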

eulerr can go wrong in a number of cases, for instance:

vd <- euler(c(A=23578,B=30492,C=63610,"A&B"=563,"A&C"=624,"B&C"=1600,"A&B&C"=308))
plot(vd, labels = c("1", "2", "3"), main = "overlap", cex=2)

displays a diagram with NO overlapping regions for the three categories.

I think this is simply an inaccurate tool to use.
