Plotting weighted population densities in R

I'm analyzing a batch of data in R and have already plotted its population density. I would also like to generate a value density plot. For example:

      dog.breed    weight.lbs
[1]   Labrador     63
[2]   Maltese      6
[3]   Dalmatian    55
[4]   Poodle       51
[5]   Maltese      4
[6]   Dalmatian    48
[7]   Poodle       56

The standard density plot counts the number of occurrences of each breed and then outputs a nice curve, like so:

      dog.breed    x
[1]   Labrador     1
[2]   Maltese      2
[3]   Dalmatian    2
[4]   Poodle       2

However what I am trying to obtain is a similarly smooth curve tracing the sum of the weights for each breed, as such:

      dog.breed    x
[1]   Labrador     63
[2]   Maltese      10
[3]   Dalmatian    103
[4]   Poodle       107

I can do this by building a series of points, as in the final example, and then fitting a curve. But that's messy. I was hoping someone knew of a clean package that could do the heavy lifting.

Thanks for the help.

Some Clarification:

How about another example. Suppose I have 50 stores, and for every patron I know how much they spend each time they come to a store. A density plot of the patron population over the stores would reveal how many people visit each store. I'm looking for the equivalent plot, but for how much all people spend at each store. Meh?

If you are using base R, you want to look at aggregate:

data <- read.table(text="dog.breed    weight.lbs
Labrador     63
Maltese      6
Dalmatian    55
Poodle       51
Maltese      4
Dalmatian    48
Poodle       56", header=TRUE)

aggregate(. ~ dog.breed, data=data, sum)

#  dog.breed weight.lbs
#1 Dalmatian        103
#2  Labrador         63
#3   Maltese         10
#4    Poodle        107
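To compare the plain counts from the question with the summed weights while staying in base R, here is a minimal sketch. The names counts and totals are introduced here for illustration, and barplot is just one plausible way to draw the result, not something the original answer used.

```r
# Same data as above, built directly as a data frame
data <- data.frame(
  dog.breed  = c("Labrador", "Maltese", "Dalmatian", "Poodle",
                 "Maltese", "Dalmatian", "Poodle"),
  weight.lbs = c(63, 6, 55, 51, 4, 48, 56)
)

# Number of rows per breed (what an unweighted plot is based on)
counts <- table(data$dog.breed)

# Sum of weight.lbs per breed (the "weighted" version)
totals <- xtabs(weight.lbs ~ dog.breed, data = data)

# One bar per breed, with height equal to the summed weight
barplot(totals, xlab = "dog.breed", ylab = "total weight.lbs")
```

xtabs gives the same sums as the aggregate call, but as a named table that barplot accepts directly.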

If you want to plot directly from the raw data without aggregating it first, ggplot2 is your friend:

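library(ggplot2)

# One bar segment per row, stacked within each breed
ggplot(data, aes(x=dog.breed, y=weight.lbs)) +
  geom_bar(stat="identity")

# Count rows per breed, weighting each row by weight.lbs
ggplot(data, aes(x=dog.breed)) +
  geom_bar(aes(weight=weight.lbs))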

The first graph plots multiple y values for each x; geom_bar defaults to "stack" for its position argument, so the stacked segments add up to the sum for each x. The second graph works because geom_bar's default stat counts the rows in each category (stat_bin in older ggplot2 releases, stat_count since ggplot2 2.0.0), and the weight aesthetic makes each row contribute its weight.lbs instead of 1. Both produce equivalent output:
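One way to check that the two forms agree is to inspect what ggplot2 computes for each layer. This is a rough sketch, assuming a reasonably recent ggplot2 in which layer_data() is available; the variable names (p_stack, p_weight, stack_tops, bar_heights) are made up for this example.

```r
library(ggplot2)

data <- data.frame(
  dog.breed  = c("Labrador", "Maltese", "Dalmatian", "Poodle",
                 "Maltese", "Dalmatian", "Poodle"),
  weight.lbs = c(63, 6, 55, 51, 4, 48, 56)
)

p_stack <- ggplot(data, aes(x = dog.breed, y = weight.lbs)) +
  geom_bar(stat = "identity")
p_weight <- ggplot(data, aes(x = dog.breed)) +
  geom_bar(aes(weight = weight.lbs))

# layer_data() returns the data frame ggplot2 computed for a layer
stacked  <- layer_data(p_stack)
weighted <- layer_data(p_weight)

# Top of each stack: the highest ymax among the segments at each x position
stack_tops <- tapply(stacked$ymax, as.integer(stacked$x), max)

# Height of each weighted bar, ordered by x position
bar_heights <- weighted$y[order(as.integer(weighted$x))]

# TRUE when both plots draw bars of the same height
isTRUE(all.equal(as.numeric(stack_tops), as.numeric(bar_heights)))
```

The x positions follow the alphabetical order of the breeds, so both vectors are ordered Dalmatian, Labrador, Maltese, Poodle.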

[plot image]
