
Extract values from dataframe and fill them into a template phrase in R

Given a dataset as follows:

df <- structure(list(type = structure(c(2L, 3L, 1L), .Label = c("negative", 
"positive", "zero"), class = "factor"), count = c(10L, 5L, 8L
), percent = c(43.5, 21.7, 34.8)), class = "data.frame", row.names = c(NA, 
-3L))

Out:

      type count percent
1 positive    10    43.5
2     zero     5    21.7
3 negative     8    34.8

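I would like to fill the values from the table into a template phrase, as follows:

In 2020, we have 10 cities have positive growth, which covers 43.5 % of all cities; 5 cities have zero growth, which covers 21.7 % of all cities; and 8 cities have negative growth, which covers 34.8 % of all cities.

Template:

In 2020, we have {} cities have {} growth, which covers {} % of all cities; {} cities have {} growth, which covers {} % of all cities; and {} cities have {} growth, which covers {} % of all cities.

How could I do that in R?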

You can build a simple sentence with paste0/sprintf and replace the placeholders with the corresponding values from the dataframe.
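As a minimal sketch of that explicit approach, the call below passes every value to sprintf one by one. The names template, type and sentence are chosen here for illustration, and the data frame is rebuilt with data.frame() instead of structure() for brevity.

```r
# Same data as in the question, built with data.frame() for brevity
df <- data.frame(
  type = factor(c("positive", "zero", "negative")),
  count = c(10L, 5L, 8L),
  percent = c(43.5, 21.7, 34.8)
)

# One placeholder per value; %% produces a literal percent sign
template <- paste0(
  "In 2020, we have %d cities have %s growth, which covers %.1f %% of all cities; ",
  "%d cities have %s growth, which covers %.1f %% of all cities; ",
  "and %d cities have %s growth, which covers %.1f %% of all cities."
)

# Convert the factor to character so %s receives plain strings
type <- as.character(df$type)

# Every value is listed by hand, row by row
sentence <- sprintf(
  template,
  df$count[1], type[1], df$percent[1],
  df$count[2], type[2], df$percent[2],
  df$count[3], type[3], df$percent[3]
)
print(sentence)
```

This works, but the argument list grows by three entries for every extra row.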

Here is another way that doesn't require listing each individual value from the dataframe.

string <- 'In 2020, we have %s cities have %s growth, which covers %s %% of all cities; %s cities have %s growth, which covers %s %% of all cities; and %s cities have %s growth, which covers %s %% of all cities'
do.call(sprintf, c(as.list(c(t(df[c(2, 1, 3)]))), fmt = string))

#[1] "In 2020, we have 10 cities have positive growth, which covers 43.5 % of all cities;  5 cities have zero growth, which covers 21.7 % of all cities; and  8 cities have negative growth, which covers 34.8 % of all cities"

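df[c(2, 1, 3)] reorders the columns so that count is the first column and type is the second. This is needed because your sentence always has the count value first, then type, and percent last. c(t(df[c(2, 1, 3)])) turns the dataframe into a vector row by row, and the elements of that vector are passed to sprintf as separate arguments.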
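To make the flattening step easier to follow, here is a small sketch that stores the intermediate vector before calling sprintf. The variable name args and the trimws() call are additions for illustration: converting the data frame to a character matrix pads the numbers to a common width, which is where the double spaces in the output above come from, and trimws() strips that padding.

```r
df <- data.frame(
  type = factor(c("positive", "zero", "negative")),
  count = c(10L, 5L, 8L),
  percent = c(43.5, 21.7, 34.8)
)

string <- 'In 2020, we have %s cities have %s growth, which covers %s %% of all cities; %s cities have %s growth, which covers %s %% of all cities; and %s cities have %s growth, which covers %s %% of all cities'

# Reorder to count, type, percent; t() yields a character matrix with one
# column per original row, so c() reads the values out row by row
args <- c(t(df[c(2, 1, 3)]))

# Remove the width padding added during the matrix conversion
args <- trimws(args)
print(args)

# Pass each element as a separate argument, with the template as fmt
sentence <- do.call(sprintf, c(as.list(args), fmt = string))
print(sentence)
```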

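I would suggest using the glue package over the base string functions because a) it is more readable, and b) the middle part of the template is the same phrase repeated for each row of the data frame, so we can use glue_data() to cut down on repetition: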

library(glue)

# Example data
df <- structure(list(type = structure(c(2L, 3L, 1L), .Label = c("negative", "positive", "zero"),
        class = "factor"), count = c(10L, 5L, 8L), percent = c(43.5, 21.7, 34.8)),
    class = "data.frame", row.names = c(NA, -3L))

growth <- glue_data(df, "{count} cities have {type} growth, which covers {percent}% of all cities")

# Add "and ..." to the last phrase:
growth[length(growth)] <- glue("and ", growth[length(growth)])

glue("In 2020, we have ", glue_collapse(growth, sep = "; "), ".")
#> In 2020, we have 10 cities have positive growth, which covers 43.5% of all cities; 5 cities have zero growth, which covers 21.7% of all cities; and 8 cities have negative growth, which covers 34.8% of all cities.

Created on 2021-02-24 by the reprex package (v1.0.0)

This also has the advantage of scaling to data frames with an arbitrary number of rows.
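For comparison, the same row-wise idea can be sketched in base R without glue, relying on sprintf being vectorised over its arguments. The helper name describe_growth and its year parameter are hypothetical and not part of either answer. It assumes the columns are named type, count and percent, and that count holds whole numbers.

```r
# Hypothetical helper: one phrase per row, then joined into a sentence
describe_growth <- function(data, year = 2020) {
  # sprintf is vectorised, so this returns one phrase per row
  phrases <- sprintf(
    "%d cities have %s growth, which covers %s%% of all cities",
    data$count, as.character(data$type), as.character(data$percent)
  )
  # Prefix the last phrase with "and" when there is more than one
  n <- length(phrases)
  if (n > 1) phrases[n] <- paste("and", phrases[n])
  paste0("In ", year, ", we have ", paste(phrases, collapse = "; "), ".")
}

df <- data.frame(
  type = factor(c("positive", "zero", "negative")),
  count = c(10L, 5L, 8L),
  percent = c(43.5, 21.7, 34.8)
)

print(describe_growth(df))
print(describe_growth(df[1:2, ]))  # works for fewer rows as well
```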
