I'm trying to extract the three materials whose prices increased the most and the three whose prices decreased the most, based on the pct_change column.
Data:
df <- structure(list(material = c("Copper", "Aluminum", "Iron", "Zinc",
"Nickel", "Silver", "Gold", "Tin"), price = c(17125, 8312, 2228.5,
2934, 4315, 8178, 4411, 680), pct_change = c(0.025449102, 0.007166746,
-0.024939838, 0.062470043, -0.043873255, -0.004625122, 0.045031392,
-0.037508846)), class = "data.frame", row.names = c(NA, -8L))
My expected result is a paragraph of text as follows:
text <- 'The top 3 commodities that price rise most are: Zinc (6.25%), Gold (4.5%), and Copper (2.54%),
the top 3 commodities that fall most are: Nickel (-4.39%), Tin (-3.75%) and Iron (-2.49%).'
My trial code works but is not concise. Could someone share a more efficient solution? Thanks.
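library(dplyr)
library(glue)

# Three largest increases, formatted as percentages
top3 <- df %>%
  arrange(desc(pct_change)) %>%
  mutate(pct_change = scales::percent(pct_change)) %>%
  slice_head(n = 3)
# Three largest decreases, formatted as percentages
tail3 <- df %>%
  arrange(pct_change) %>%
  mutate(pct_change = scales::percent(pct_change)) %>%
  slice_head(n = 3)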
com_name_up1 <- top3$material[1]
com_pct_up1 <- top3$pct_change[1]
com_name_up2 <- top3$material[2]
com_pct_up2 <- top3$pct_change[2]
com_name_up3 <- top3$material[3]
com_pct_up3 <- top3$pct_change[3]
com_name_down1 <- tail3$material[1]
com_pct_down1 <- tail3$pct_change[1]
com_name_down2 <- tail3$material[2]
com_pct_down2 <- tail3$pct_change[2]
com_name_down3 <- tail3$material[3]
com_pct_down3 <- tail3$pct_change[3]
text <- glue('The top 3 commodities that price rose most are: {com_name_up1} ({com_pct_up1}),
{com_name_up2} ({com_pct_up2}), and {com_name_up3} ({com_pct_up3}),
the top 3 commodities that fell most are: {com_name_down1} ({com_pct_down1}),
{com_name_down2} ({com_pct_down2}) and {com_name_down3} ({com_pct_down3}).')
To be honest, I'm not sure it's actually much shorter, but you could glue material and pct_change together within the table first.
I've then grouped the rows and collapsed the strings:
df %>%
  arrange(desc(pct_change)) %>%
  mutate(
    t1 = sprintf('%s (%.2f%%)', material, pct_change * 100),
    rank1 = case_when(
      row_number() <= 3 ~ 'Top',
      row_number() > n() - 3 ~ 'Bot'
    )
  ) %>%
  group_by(rank1) %>%
  summarise(
    t2 = paste(t1, collapse = ', ')
  )
  rank1 t2
  <chr> <chr>
1 Bot   Iron (-2.49%), Tin (-3.75%), Nickel (-4.39%)
2 Top   Zinc (6.25%), Gold (4.50%), Copper (2.54%)
3 NA    Aluminum (0.72%), Silver (-0.46%)
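The table above still has to be turned into the requested sentence. One way to finish the job is sketched below; it is only an illustration, not part of the original answer. The names ranked, top_txt, bot_txt and sentence are made up for this example, the price column is left out because it is not needed, and the bottom group is reversed so that the largest fall comes first, as in the expected text.

```r
library(dplyr)

# Same data as in the question, without the unused price column
df <- data.frame(
  material = c("Copper", "Aluminum", "Iron", "Zinc",
               "Nickel", "Silver", "Gold", "Tin"),
  pct_change = c(0.025449102, 0.007166746, -0.024939838, 0.062470043,
                 -0.043873255, -0.004625122, 0.045031392, -0.037508846)
)

# Label the first and last three rows after sorting, drop the rest
ranked <- df %>%
  arrange(desc(pct_change)) %>%
  mutate(
    t1 = sprintf("%s (%.2f%%)", material, pct_change * 100),
    rank1 = case_when(
      row_number() <= 3 ~ "Top",
      row_number() > n() - 3 ~ "Bot"
    )
  ) %>%
  filter(!is.na(rank1))

# Rows are sorted descending, so reverse the bottom group
top_txt <- paste(ranked$t1[ranked$rank1 == "Top"], collapse = ", ")
bot_txt <- paste(rev(ranked$t1[ranked$rank1 == "Bot"]), collapse = ", ")

sentence <- sprintf(
  "The top 3 commodities that rose most are: %s; the top 3 commodities that fell most are: %s.",
  top_txt, bot_txt
)
print(sentence)
```

Because sprintf uses '%.2f', every percentage is printed with two decimals, so Gold appears as 4.50% rather than 4.5%.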
Two general suggestions:
glue has a very flexible syntax that allows you to pass any valid R expression into the "{...}". Utilizing this feature will help shorten your code. Here is the code:
report3 <- function(df, f) {
  df |>
    f(pct_change, n = 3L) |>
    dplyr::mutate(pct_change = scales::percent(pct_change)) |>
    glue::glue_data("{material} ({pct_change})")
}
top3 <- report3(df, dplyr::slice_max)
bot3 <- report3(df, dplyr::slice_min)
text <- glue::glue('The top 3 commodities that price rose most are: \\
{top3[[1L]]}, {top3[[2L]]}, and {top3[[3L]]}; \\
the top 3 commodities that fell most are: \\
{bot3[[1L]]}, {bot3[[2L]]} and {bot3[[3L]]}.')
Output
> text
The top 3 commodities that price rose most are: Zinc (6.2%), Gold (4.5%), and Copper (2.5%); the top 3 commodities that fell most are: Nickel (-4.39%), Tin (-3.75%) and Iron (-2.49%).
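Note that the rises are printed with one decimal and the falls with two, because scales::percent chooses its precision separately for each vector it receives. If uniform formatting matters, a possible variation (a sketch only, not the answerer's code) is to pass accuracy = 0.01 explicitly and let glue::glue_collapse join the items. The helper name list3 and the variable report are made up for this example, and the price column is omitted.

```r
library(dplyr)
library(glue)

# Same data as in the question, without the unused price column
df <- data.frame(
  material = c("Copper", "Aluminum", "Iron", "Zinc",
               "Nickel", "Silver", "Gold", "Tin"),
  pct_change = c(0.025449102, 0.007166746, -0.024939838, 0.062470043,
                 -0.043873255, -0.004625122, 0.045031392, -0.037508846)
)

# Hypothetical helper: pick three rows with f (slice_max or slice_min),
# format each as "Name (x.xx%)" and join them into one string
list3 <- function(df, f) {
  picked <- f(df, pct_change, n = 3L)
  items <- glue_data(
    picked,
    "{material} ({scales::percent(pct_change, accuracy = 0.01)})"
  )
  glue_collapse(items, sep = ", ", last = " and ")
}

report <- glue(
  "The top 3 commodities that rose most are: {list3(df, slice_max)}; ",
  "the top 3 commodities that fell most are: {list3(df, slice_min)}."
)
print(report)
```

With a fixed accuracy every value gets two decimals, and glue_collapse removes the need to index the three elements by hand.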
Another possible solution (instead of stringr::str_c, you could use the more convenient stringr::str_glue, as @ekoam well suggests):
library(tidyverse)
df <- structure(list(material = c("Copper", "Aluminum", "Iron", "Zinc",
"Nickel", "Silver", "Gold", "Tin"), price = c(17125, 8312, 2228.5,
2934, 4315, 8178, 4411, 680), pct_change = c(0.025449102, 0.007166746,
-0.024939838, 0.062470043, -0.043873255, -0.004625122, 0.045031392,
-0.037508846)), class = "data.frame", row.names = c(NA, -8L))
top3 <- slice_max(df, pct_change, n = 3)
bottom3 <- slice_min(df, pct_change, n = 3)
# Round every percentage to 2 decimals so the third items match the expected text
str_c("The top 3 commodities that price rise most are: ",
      top3$material[1], " (", round(100 * top3$pct_change[1], 2), "%), ",
      top3$material[2], " (", round(100 * top3$pct_change[2], 2), "%), and ",
      top3$material[3], " (", round(100 * top3$pct_change[3], 2), "%), ",
      "the top 3 commodities that fall most are: ",
      bottom3$material[1], " (", round(100 * bottom3$pct_change[1], 2), "%), ",
      bottom3$material[2], " (", round(100 * bottom3$pct_change[2], 2), "%), and ",
      bottom3$material[3], " (", round(100 * bottom3$pct_change[3], 2), "%).")
#> [1] "The top 3 commodities that price rise most are: Zinc (6.25%), Gold (4.5%), and Copper (2.54%), the top 3 commodities that fall most are: Nickel (-4.39%), Tin (-3.75%), and Iron (-2.49%)."
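For reference, the str_glue variant mentioned at the top of this answer could look roughly like the sketch below. It is an illustration rather than the answerer's own code: the helper fmt and the variable msg are made up for this example, the price column is omitted, and every value is rounded to two decimals.

```r
library(dplyr)
library(stringr)

# Same data as in the question, without the unused price column
df <- data.frame(
  material = c("Copper", "Aluminum", "Iron", "Zinc",
               "Nickel", "Silver", "Gold", "Tin"),
  pct_change = c(0.025449102, 0.007166746, -0.024939838, 0.062470043,
                 -0.043873255, -0.004625122, 0.045031392, -0.037508846)
)

top3 <- slice_max(df, pct_change, n = 3)
bottom3 <- slice_min(df, pct_change, n = 3)

# Hypothetical helper: format row i of d as "Name (x.xx%)"
fmt <- function(d, i) {
  str_c(d$material[i], " (", round(100 * d$pct_change[i], 2), "%)")
}

# str_glue pastes its pieces together and evaluates the {...} expressions
msg <- str_glue(
  "The top 3 commodities that rose most are: ",
  "{fmt(top3, 1)}, {fmt(top3, 2)}, and {fmt(top3, 3)}, ",
  "the top 3 commodities that fell most are: ",
  "{fmt(bottom3, 1)}, {fmt(bottom3, 2)}, and {fmt(bottom3, 3)}."
)
print(msg)
```

Since round() drops trailing zeros, Gold is printed as 4.5% here, which matches the text requested in the question.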