How do you find the maximum value of one column (column B) among the rows where column C equals some value X, while keeping the label from column A? Say my data is called my.data, where column A is the country name, column B is the number of children born, and column C is the year the child was born. How do I find the maximum number of children born in 2001 and keep the country name on the same row?
Thanks, and sorry, I'm new to R.
There are many options in R (and questions on SO) for doing this kind of operation. I will give a data.table solution because I like the easy syntax for this kind of query.
data.table is efficient for large data sets, and it also gives a very easy syntax for subsetting. (.SD references the subset of the data created by i and by.)
library(data.table)
DT <- data.table(my.data)
DT[year==2001, .SD[which.max(births)]]
Or, equivalently, without needing .SD:
DT[year==2001][which.max(births)]
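# Example data: birth counts are drawn at random (no seed is set),
# so the numbers shown below will differ between runs
my.data <- expand.grid(
  Country = c('Swaziland', 'Australia', 'Tuvalu', 'Turkmenistan'),
  year = 1990:2012)
my.data$births <- rpois(nrow(my.data), lambda = 500)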
DT <- data.table(my.data)
DT[year==2001, .SD[which.max(births)]]
## Country year births
## 1: Swaziland 2001 501
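The same pattern can be extended from a single year to every year at once by grouping with by. The following is only a sketch: the toy data and the name top_by_year are made up for illustration and are not part of the original answer.

```r
library(data.table)

# Hypothetical toy data, small enough to check by eye
DT <- data.table(
  Country = c("A", "B", "A", "B"),
  year    = c(2000, 2000, 2001, 2001),
  births  = c(10, 20, 40, 30)
)

# For each year, keep the whole row that has the most births.
# .SD is the per-group subset, so which.max() runs once per year.
top_by_year <- DT[, .SD[which.max(births)], by = year]
print(top_by_year)
```

Note that which.max returns only the first maximum, so a tie within a year yields a single row.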
# Base R equivalent: subset to 2001, then pick the row with the most births
births_2001 <- subset(my.data, year == 2001)
births_2001[which.max(births_2001$births), ]
## Country year births
## 45 Swaziland 2001 501
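One thing to be aware of with which.max is that it returns only the first position of the maximum. If two countries could tie, comparing against max() keeps all of them. A small sketch with made-up data (the countries "A", "B", "C" and the names first_only and all_ties are hypothetical):

```r
# Hypothetical data for one year, with a tie between B and C
births_2001 <- data.frame(
  Country = c("A", "B", "C"),
  year    = 2001,
  births  = c(500, 510, 510)
)

# which.max() returns the index of the first maximum only
first_only <- births_2001[which.max(births_2001$births), ]

# Comparing against max() keeps every row that reaches the maximum
all_ties <- births_2001[births_2001$births == max(births_2001$births), ]

print(first_only)  # one row: B
print(all_ties)    # two rows: B and C
```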
There are a number of ways to do this. I'll break it up so you can hopefully see what's going on better.
my.data <- data.frame(
country=c("Australia","France","Germany","Honduras","Nepal","Honduras"),
children=c(120000,354000,380000,540000,370000,670000),
year=c(2000,2001,2001,2002,2001,2003)
)
myd01 <- my.data[my.data$year==2001,] # pulls out the 2001 data
myd01[myd01$children==max(myd01$children),] # finds the rows with the maximum
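If you need this for more than one year, the two steps above could be wrapped in a small helper. This is just a sketch: the function name top_for_year is made up, and it assumes the data frame has columns called year and children, as in the example data.

```r
my.data <- data.frame(
  country  = c("Australia", "France", "Germany", "Honduras", "Nepal", "Honduras"),
  children = c(120000, 354000, 380000, 540000, 370000, 670000),
  year     = c(2000, 2001, 2001, 2002, 2001, 2003)
)

# Hypothetical helper: rows of df for year yr with the maximum children
top_for_year <- function(df, yr) {
  in_year <- df[df$year == yr, ]             # step 1: keep only that year
  if (nrow(in_year) == 0) return(in_year)    # no data for that year
  in_year[in_year$children == max(in_year$children), ]  # step 2: keep the max
}

print(top_for_year(my.data, 2001))
```

Because it compares against max() rather than using which.max(), tied rows are all returned.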
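aggregate(. ~ year, data = my.data, FUN = max)
This will also give the maximum for each year. Note, though, that max is computed for each column separately, so the country in the result is not necessarily the one that has the most children.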
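If the country has to stay on the same row as the maximum, one option is to aggregate only the numeric column and then merge the result back onto the original data. A sketch using the example data from the answer above (the names max_per_year and with_country are made up):

```r
my.data <- data.frame(
  country  = c("Australia", "France", "Germany", "Honduras", "Nepal", "Honduras"),
  children = c(120000, 354000, 380000, 540000, 370000, 670000),
  year     = c(2000, 2001, 2001, 2002, 2001, 2003)
)

# Maximum number of children per year (numeric column only)
max_per_year <- aggregate(children ~ year, data = my.data, FUN = max)

# merge() joins on the shared columns (year and children),
# which brings back the matching country for each maximum
with_country <- merge(max_per_year, my.data)
print(with_country)
```

For 2001 this should pair the maximum with Germany, since the join matches on both year and children.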