Extract values from a dataframe and fill them into a template phrase in R

Question

Given a dataset as follows:

df <- structure(list(type = structure(c(2L, 3L, 1L), .Label = c("negative", 
"positive", "zero"), class = "factor"), count = c(10L, 5L, 8L
), percent = c(43.5, 21.7, 34.8)), class = "data.frame", row.names = c(NA, 
-3L))

Out:

      type count percent
1 positive    10    43.5
2     zero     5    21.7
3 negative     8    34.8

I would like to fill the values from the table into a template phrase as follows:

In 2020, we have 10 cities have positive growth, which covers 43.5 % of all cities; 5 cities have zero growth, which covers 21.7 % of all cities; and 8 cities have negative growth, which covers 34.8 % of all cities.

Template:

In 2020, we have {} cities have {} growth, which covers {} % of all cities; {} cities have {} growth, which covers {} % of all cities; and {} cities have {} growth, which covers {} % of all cities.

How could I do that in R?

Answer 1

You can create a simple sentence with paste0 / sprintf and replace the placeholders with the respective values from the dataframe.
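As a minimal sketch of that direct approach (the answer does not show it, so the variable names `template`, `type`, and `sentence` are illustrative assumptions), each placeholder can be filled by indexing into the dataframe explicitly:

```r
# Example data, equivalent to the structure() call in the question
df <- data.frame(
  type    = factor(c("positive", "zero", "negative")),
  count   = c(10L, 5L, 8L),
  percent = c(43.5, 21.7, 34.8)
)

# %d = integer, %s = string, %.1f = number with one decimal, %% = literal percent sign
template <- paste0(
  "In 2020, we have %d cities have %s growth, which covers %.1f %% of all cities; ",
  "%d cities have %s growth, which covers %.1f %% of all cities; ",
  "and %d cities have %s growth, which covers %.1f %% of all cities."
)

# Convert the factor to character so the labels, not the integer codes, are used
type <- as.character(df$type)

# List every value by hand, row by row
sentence <- sprintf(
  template,
  df$count[1], type[1], df$percent[1],
  df$count[2], type[2], df$percent[2],
  df$count[3], type[3], df$percent[3]
)
print(sentence)
```

This is easy to read for three rows, but every value has to be listed by hand.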

Here is another way, which does not require listing each individual value from the dataframe.

string <- 'In 2020, we have %s cities have %s growth, which covers %s %% of all cities; %s cities have %s growth, which covers %s %% of all cities; and %s cities have %s growth, which covers %s %% of all cities'
do.call(sprintf, c(as.list(c(t(df[c(2, 1, 3)]))), fmt = string))

#[1] "In 2020, we have 10 cities have positive growth, which covers 43.5 % of all cities;  5 cities have zero growth, which covers 21.7 % of all cities; and  8 cities have negative growth, which covers 34.8 % of all cities"

df[c(2, 1, 3)] reorders the columns so that count is the 1st column, type the 2nd, and percent the 3rd. This is needed because each clause of your sentence has the count value first, then type, and percent last. c(t(df[c(2, 1, 3)])) flattens the dataframe into a vector row by row, and do.call passes the elements of that vector to sprintf as separate arguments.
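To make the flattening step concrete, here is a small sketch that builds the vector separately before calling sprintf. The output above shows a doubled space before "5" and "8", which suggests the count column is padded to a common width when the dataframe is converted to a character matrix; wrapping the vector in trimws() is one possible way to remove that padding (this is an addition, not part of the original answer, and the name `values` is illustrative).

```r
# Example data, equivalent to the structure() call in the question
df <- data.frame(
  type    = factor(c("positive", "zero", "negative")),
  count   = c(10L, 5L, 8L),
  percent = c(43.5, 21.7, 34.8)
)

string <- 'In 2020, we have %s cities have %s growth, which covers %s %% of all cities; %s cities have %s growth, which covers %s %% of all cities; and %s cities have %s growth, which covers %s %% of all cities'

# Reorder columns to count, type, percent; t() yields a character matrix with one
# column per original row; c() flattens it column by column, i.e. row-wise for df.
# trimws() strips any padding introduced by the conversion to character.
values <- trimws(c(t(df[c(2, 1, 3)])))
print(values)

# Pass the nine values to sprintf as separate arguments
sentence <- do.call(sprintf, c(list(fmt = string), as.list(values)))
print(sentence)
```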

Answer 2

I'd recommend using the glue package over the base string functions because a) it's more readable and b) the middle part of your template is the same phrase repeated for each row of your data frame, so we can use glue_data() to reduce repetition:

library(glue)

# Example data
df <- structure(list(type = structure(c(2L, 3L, 1L), .Label = c("negative", "positive", "zero"),
        class = "factor"), count = c(10L, 5L, 8L), percent = c(43.5, 21.7, 34.8)),
    class = "data.frame", row.names = c(NA, -3L))

growth <- glue_data(df, "{count} cities have {type} growth, which covers {percent}% of all cities")

# Add "and ..." to the last phrase:
growth[length(growth)] <- glue("and ", growth[length(growth)])

glue("In 2020, we have ", glue_collapse(growth, sep = "; "), ".")
#> In 2020, we have 10 cities have positive growth, which covers 43.5% of all cities; 5 cities have zero growth, which covers 21.7% of all cities; and 8 cities have negative growth, which covers 34.8% of all cities.

Created on 2021-02-24 by the reprex package (v1.0.0)

This also has the advantage of scaling to a data frame with any number of rows.
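For readers who prefer to avoid an extra dependency, the same row-wise idea can be sketched in base R, relying on sprintf being vectorised over its arguments. The helper name `fill_template` and its `year` parameter are hypothetical, introduced only for this illustration:

```r
# Hypothetical helper: builds one phrase per row, then joins them into a sentence
fill_template <- function(df, year = 2020) {
  # sprintf is vectorised, so this returns one phrase per row of df
  phrases <- sprintf(
    "%d cities have %s growth, which covers %s%% of all cities",
    df$count, as.character(df$type), as.character(df$percent)
  )
  # Prefix the last phrase with "and" when there is more than one row
  n <- length(phrases)
  if (n > 1) phrases[n] <- paste("and", phrases[n])
  paste0("In ", year, ", we have ", paste(phrases, collapse = "; "), ".")
}

# Example data, equivalent to the structure() call in the question
df <- data.frame(
  type    = factor(c("positive", "zero", "negative")),
  count   = c(10L, 5L, 8L),
  percent = c(43.5, 21.7, 34.8)
)

print(fill_template(df))           # all three rows
print(fill_template(head(df, 2)))  # works unchanged with fewer rows
```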

Answer 1: 3, accepted, 2021-02-24 10:07:55
Answer 2: 2, 2021-02-24 10:47:57
