How to apply 3 functions to create a new dataframe
I'm new here on StackOverflow. I would like to apply three functions to a dataframe in order to create a new dataframe.
emiscore$rank19<-rank(-emiscore$"2019")
emi_P_19<-filter(emiscore,rank19<31)
emi_P_19<-emi_P_19[order(emi_P_19$Name),]
The first rows of emi_P_19 look like this:
structure(list(Name = c("LA Z BOY", "1 800 FLOWERS.COM 'A'",
"AGEAS (EX FORTIS)", "AGFA GEVAERT", "AIR FRANCE KLM", "ANHEUSER BUSCH INBEV"
), DATATYPE = c("TRESGENERS", "TRESGENERS", "TRESGENERS", "TRESGENERS",
"TRESGENERS", "TRESGENERS"), `2019` = c(0, 0, NA, NA, NA, NA),
`2018` = c(8.33, 0, 22.15, 64.46, 97.92, 58.47), `2017` = c(0,
0, 0, 63.11, 97.83, 49.14), `2016` = c(0, 0, 0, 58.65, 95.83,
61.46), `2015` = c(NA, NA, 0, 64.89, 93.27, 67.71), `2014` = c(NA,
NA, 0, 60.26, 94.57, 59.78), `2013` = c(NA, NA, 0, 64.63,
96.74, 77.17), `2012` = c(NA, NA, 0, 67.86, 98.96, 75), `2011` = c(NA,
NA, 0, 67.07, 96.81, 70.93), `2010` = c(NA, NA, 17.05, 71.25,
98.98, 88.46), `2009` = c(NA, NA, 11.59, 68.92, 88.16, 92.65
), `2008` = c(NA, NA, 18.85, 71.21, 92.42, 77.59), `2007` = c(NA,
NA, 50.93, 79.69, 80.36, 78), delisted = c("NO", "NO", "NO",
"NO", "NO", "NO"), rank20 = c(535, 535, 646, 647, 648, 649
), rank19 = c(535, 535, 646, 647, 648, 649)), row.names = c(NA,
-6L), class = c("tbl_df", "tbl", "data.frame"))
So essentially, I want to rank the companies, keep the top 30, and order them alphabetically, creating a new dataframe with the company names (the column named "Name") for each year from 2007 to 2019. The end goal is a list for each year that shows the names of the companies ranked and filtered as above, in alphabetical order.
As @Parfait mentioned, data manipulation becomes much easier if you keep the data in long format. You could do something like this:
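library(dplyr)
result <- emiscore %>%
  # one row per company and year; the scores go into the default `value` column
  tidyr::pivot_longer(cols = `2019`:`2007`, names_to = 'year') %>%
  group_by(year) %>%
  # keep the 30 largest scores within each year
  top_n(30, value)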
This selects the top 30 values for each year.
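The snippet above stops at the top-30 filter, while the question also asks for alphabetical order and one list of names per year. Here is a minimal sketch of those remaining steps on a small made-up table. The `toy` data and the names `n_top`, `top_names` and `names_by_year` are hypothetical, not from the original post, and the sketch assumes that a company with a missing score for a year should simply be left out of that year's ranking.

```r
library(dplyr)
library(tidyr)

# Hypothetical stand-in for emiscore with two year columns and invented scores
toy <- tibble::tibble(
  Name   = c("D CO", "A CO", "C CO", "B CO"),
  `2019` = c(10, 40, NA, 30),
  `2018` = c(5, 1, 9, 7)
)

n_top <- 2  # would be 30 for the real data

top_names <- toy %>%
  # long format: one row per company and year; rows with NA scores are dropped (assumption)
  pivot_longer(cols = -Name, names_to = "year", values_to = "value",
               values_drop_na = TRUE) %>%
  group_by(year) %>%
  # keep the n_top largest scores within each year (ties are kept)
  slice_max(value, n = n_top) %>%
  ungroup() %>%
  # alphabetical order of company names within each year
  arrange(year, Name)

# named list with one character vector of company names per year
names_by_year <- split(top_names$Name, top_names$year)
```

With the real data, `cols = -Name` would be replaced by the year range used above (`2019`:`2007`) and `n_top` by 30. The names for a given year are then available as, for example, `names_by_year[["2019"]]`.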