
Draw a Lorenz curve in R

I would like to draw a Lorenz curve and calculate a Gini index in order to determine how many parasites the 20% most heavily infected hosts support.

Here is my data set:

Number of parasites per host:

parasites = c(0,1,2,3,4,5,6,7,8,9,10)

Number of hosts associated with each number of parasites given above:

hosts = c(18,20,28,19,16,10,3,1,0,0,0)

To represent the Lorenz curve:

I manually calculated the cumulative percentage of parasites and hosts:

cumul_parasites <- cumsum(parasites)/max(cumsum(parasites))
cumul_hosts <- cumsum(hosts)/max(cumsum(hosts))
plot(cumul_hosts, cumul_parasites, type= "l")


I also tested the function Lc (package ineq):

library(ineq)
Lc.p <- Lc(parasites, n = hosts)
plot(Lc.p)


Why are the two curves (manual and function Lc) different?

The two graphs differ because, when you calculate the cumulative percentage of parasites, each parasite count must be multiplied by its frequency, that is, by the number of hosts carrying that count.
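To see why the frequency weighting matters, here is a minimal sketch in base R that expands the table to one value per host and compares the resulting cumulative shares with the weighted and unweighted shortcuts. The variable names (per_host, share_weighted and so on) are mine and only for illustration.

```r
parasites <- 0:10
hosts <- c(18, 20, 28, 19, 16, 10, 3, 1, 0, 0, 0)

# One entry per host (115 in total), already sorted by parasite load
per_host <- rep(parasites, times = hosts)

# Cumulative share of parasites after each individual host
share_per_host <- cumsum(per_host) / sum(per_host)

# Weighted shortcut computed directly on the frequency table
share_weighted <- cumsum(parasites * hosts) / sum(parasites * hosts)

# Unweighted version from the question: ignores how many hosts are in each class
share_unweighted <- cumsum(parasites) / sum(parasites)

# Index of the last host of each non-empty class in the expanded vector
last_host <- cumsum(hosts)[hosts > 0]

# The weighted shortcut matches the per-host calculation; the unweighted one does not
all.equal(share_per_host[last_host], share_weighted[hosts > 0])
```

The unweighted version treats the table as if there were exactly one host per parasite count, which is why its curve has a different shape.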

The right solution would be:

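library(ineq)  # provides Lc()
parasites <- c(0,1,2,3,4,5,6,7,8,9,10)
hosts <- c(18,20,28,19,16,10,3,1,0,0,0)
# Weight each parasite count by the number of hosts carrying it
cumul_parasites <- cumsum(parasites*hosts)/max(cumsum(parasites*hosts))
cumul_hosts <- cumsum(hosts)/max(cumsum(hosts))
Lc.p <- Lc(parasites, n = hosts)
# Manual calculation as a black line
plot(cumul_hosts, cumul_parasites, type = "l")
# Lc() result as red points on top of it
lines(Lc.p$p, Lc.p$L, col = 2, lwd = 2, type = "p")
legend("topleft", c("My calc", "LC"), col = 1:2, lty = c(1, NA), pch = c(NA, 1), box.col = 1)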

and this fits the Lc calculation exactly.

[Figure: comparison of Lc and my calculation]
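For the original goal (the share of parasites carried by the 20% most infected hosts, plus a Gini index), a rough sketch in base R could look like the following. It assumes linear interpolation along the Lorenz curve and the uncorrected sample formula for the Gini index; the names top20_share and gini are only illustrative.

```r
parasites <- 0:10
hosts <- c(18, 20, 28, 19, 16, 10, 3, 1, 0, 0, 0)

# One value per host, sorted from least to most infected
x <- sort(rep(parasites, times = hosts))
n <- length(x)

# Lorenz curve points, starting at the origin
p <- (0:n) / n
L <- c(0, cumsum(x)) / sum(x)

# Share of parasites carried by the 20% most infected hosts:
# 1 minus the Lorenz curve read at p = 0.8 (linear interpolation)
top20_share <- 1 - approx(p, L, xout = 0.8)$y

# Gini index (uncorrected sample formula on the sorted values)
gini <- sum((2 * seq_len(n) - n - 1) * x) / (n * sum(x))

top20_share
gini
```

With this data the top 20% of hosts carry about 41% of the parasites (111 of 272). If the ineq package is loaded, Gini(x) should give the same index, assuming its default uncorrected form.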
