
R: Creating a table with the highest values by year

I hope I'm not asking a question that has already been asked, but I couldn't quite find what I was looking for. I am fairly new to R and have no experience with programming.

I want to make a table with the top 10 values of three sections for each year. My data looks something like this:

Year Country Test1 Test2 Test3
2000 ALB     500   497   501
2001 ALB     NA    NA    NA
...
2000 ARG     502   487   354
2001 ARG     NA    NA    NA
...

(My years go from 2000 to 2015, but I only have observations for every third year, and even in those years there are still a lot of NAs for some countries or tests.)

I would like to get a table in which I can see the top 10 values for each test for each year. So for the years 2000, 2003, 2006, ..., 2015, the top ten values and the countries that reached those values for tests 1, 2 and 3.

And then (I am not sure if this should be a separate question) I would like to get the table into LaTeX.

Easier to see top values this way.
You could use melt from the data.table package:

library(data.table)

# convert to data table
setDT(df)

# convert it to long format and select the columns to use
df1 <- melt(df, id.vars=1:2)
df1 <- df1[,c(1,2,4)]

# get top values by year and country
# (:= keeps every melted row, so each Year/Country pair appears once per test)
df1 <- df1[,top_value := .(list(sort(value, decreasing = TRUE))), .(Year, Country)][,.(Year, Country, top_value)]

print(df1)

   Year Country   top_value
 1: 2000     ALB 501,500,497
 2: 2001     ALB            
 3: 2000     ARG 502,487,354
 4: 2001     ARG            
 5: 2000     ALB 501,500,497
 6: 2001     ALB            
 7: 2000     ARG 502,487,354
 8: 2001     ARG            
 9: 2000     ALB 501,500,497
10: 2001     ALB            
11: 2000     ARG 502,487,354
12: 2001     ARG 
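
Note that the code above sorts the three test scores within each country. If what you are after is the ranking across countries (the top values per test per year, with the country attached), a minimal sketch could look like the following. The toy data, the name n_top and the Rank column are my own assumptions for illustration; set n_top to 10 for the real data. The LaTeX step assumes the xtable package is installed and is skipped otherwise.

```r
library(data.table)

# Toy data in the same shape as the question (values are made up for illustration)
df <- data.table(
  Year    = rep(c(2000, 2003), each = 3),
  Country = rep(c("ALB", "ARG", "AUT"), times = 2),
  Test1   = c(500, 502, NA, 510, 490, 505),
  Test2   = c(497, 487, 520, NA, 495, 515),
  Test3   = c(501, 354, 480, 470, NA, 460)
)

n_top <- 2  # hypothetical name; use 10 for the real data

# Long format: one row per Year/Country/Test, dropping missing scores
long <- melt(df, id.vars = c("Year", "Country"),
             variable.name = "Test", value.name = "Score", na.rm = TRUE)

# Sort scores in descending order within each Year/Test,
# then keep the first n_top rows of every group
setorder(long, Year, Test, -Score)
top <- long[, head(.SD, n_top), by = .(Year, Test)]
top[, Rank := seq_len(.N), by = .(Year, Test)]

print(top)

# Optional: write the result as a LaTeX table (only if xtable is available)
if (requireNamespace("xtable", quietly = TRUE)) {
  print(xtable::xtable(as.data.frame(top)), include.rownames = FALSE)
}
```

Years without any observations simply drop out because of na.rm = TRUE, and a Year/Test group with fewer than n_top non-missing scores returns however many rows it has.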
