Extract values from a dataframe and fill them into a template phrase in R

Question

Given a dataset as follows:

df <- structure(list(type = structure(c(2L, 3L, 1L), .Label = c("negative", 
"positive", "zero"), class = "factor"), count = c(10L, 5L, 8L
), percent = c(43.5, 21.7, 34.8)), class = "data.frame", row.names = c(NA, 
-3L))

Out:

      type count percent
1 positive    10    43.5
2     zero     5    21.7
3 negative     8    34.8

I want to fill the values from the table into a template phrase, as follows:

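In 2020, we have 10 cities have positive growth, which covers 43.5 % of all cities; 5 cities have zero growth, which covers 21.7 % of all cities; and 8 cities have negative growth, which covers 34.8 % of all cities.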

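Template:

In 2020, we have {} cities have {} growth, which covers {} % of all cities; {} cities have {} growth, which covers {} % of all cities; and {} cities have {} growth, which covers {} % of all cities.

How can I do this in R?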

Answer 1

You can build a simple sentence with paste0 / sprintf and replace the placeholders with the corresponding values from the dataframe.
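As a minimal sketch of that direct approach, the block below passes each value to sprintf explicitly. The data frame is rebuilt with data.frame() for readability, and the names template, type and sentence are illustrative choices, not part of the original answer.

```r
# Same data as in the question, rebuilt with data.frame() for readability.
df <- data.frame(
  type    = factor(c("positive", "zero", "negative")),
  count   = c(10L, 5L, 8L),
  percent = c(43.5, 21.7, 34.8)
)

# One placeholder per value; %% produces a literal percent sign.
template <- paste0(
  "In 2020, we have %d cities have %s growth, which covers %s %% of all cities; ",
  "%d cities have %s growth, which covers %s %% of all cities; ",
  "and %d cities have %s growth, which covers %s %% of all cities"
)

# as.character() makes sure the factor labels (not the integer codes) are used.
type <- as.character(df$type)

# Every value is listed by hand, row by row.
sentence <- sprintf(
  template,
  df$count[1], type[1], df$percent[1],
  df$count[2], type[2], df$percent[2],
  df$count[3], type[3], df$percent[3]
)
print(sentence)
```

This works for a fixed three-row table, but every cell has to be written out, which becomes tedious as the table grows.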

Here is another approach that does not require listing every individual value in the dataframe.

string <- 'In 2020, we have %s cities have %s growth, which covers %s %% of all cities; %s cities have %s growth, which covers %s %% of all cities; and %s cities have %s growth, which covers %s %% of all cities'
do.call(sprintf, c(as.list(c(t(df[c(2, 1, 3)]))), fmt = string))

#[1] "In 2020, we have 10 cities have positive growth, which covers 43.5 % of all cities;  5 cities have zero growth, which covers 21.7 % of all cities; and  8 cities have negative growth, which covers 34.8 % of all cities"

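df[c(2, 1, 3)] reorders the columns so that count comes first and type second. This is needed because your sentence always has the count value first, then type, and percent last. c(t(df[c(2, 1, 3)])) flattens the dataframe row by row into a vector, whose elements are then passed to sprintf as separate arguments.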
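To make the flattening step concrete, here is a small sketch that inspects the intermediate vector. Because t() converts the data frame to a character matrix, the numeric columns are formatted to a common width, which is where the extra space before 5 and 8 in the output comes from. Wrapping the vector in trimws() is an optional cleanup that I am adding here, not part of the original answer; the names fmt, args and clean are illustrative.

```r
df <- data.frame(
  type    = factor(c("positive", "zero", "negative")),
  count   = c(10L, 5L, 8L),
  percent = c(43.5, 21.7, 34.8)
)

fmt <- paste0(
  "In 2020, we have %s cities have %s growth, which covers %s %% of all cities; ",
  "%s cities have %s growth, which covers %s %% of all cities; ",
  "and %s cities have %s growth, which covers %s %% of all cities"
)

# Reorder columns to count, type, percent, transpose, then flatten.
# c() on a matrix reads it column by column, i.e. row by row of the original.
args <- c(t(df[c(2, 1, 3)]))
print(args)  # count values are padded to equal width, e.g. " 5"

# Optional cleanup (an addition to the answer): strip the padding.
clean <- trimws(args)

# Pass the format string plus each element as a separate sprintf() argument.
sentence <- do.call(sprintf, c(list(fmt = fmt), as.list(clean)))
print(sentence)
```

Since every value is already a character string at this point, all placeholders have to be %s.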

Answer 2

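I would suggest using the glue package over the base string functions, because a) it is more readable, and b) the middle part of the template is the same phrase repeated for each row of the data frame, so we can use glue_data() to reduce the repetition: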

library(glue)

# Example data
df <- structure(list(type = structure(c(2L, 3L, 1L), .Label = c("negative", "positive", "zero"),
        class = "factor"), count = c(10L, 5L, 8L), percent = c(43.5, 21.7, 34.8)),
    class = "data.frame", row.names = c(NA, -3L))

growth <- glue_data(df, "{count} cities have {type} growth, which covers {percent}% of all cities")

# Add "and ..." to the last phrase:
growth[length(growth)] <- glue("and ", growth[length(growth)])

glue("In 2020, we have ", glue_collapse(growth, sep = "; "), ".")
#> In 2020, we have 10 cities have positive growth, which covers 43.5% of all cities; 5 cities have zero growth, which covers 21.7% of all cities; and 8 cities have negative growth, which covers 34.8% of all cities.

Created on 2021-02-24 by the reprex package (v1.0.0)

This also has the advantage of scaling to data frames with any number of rows.
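The same row-wise idea can also be sketched without glue, using vectorised sprintf and paste from base R. This is a swapped-in base R variant for comparison, not the answer's glue code; the function name growth_sentence and its year parameter are hypothetical.

```r
df <- data.frame(
  type    = factor(c("positive", "zero", "negative")),
  count   = c(10L, 5L, 8L),
  percent = c(43.5, 21.7, 34.8)
)

# Hypothetical helper: builds the sentence for any number of rows.
growth_sentence <- function(data, year = 2020) {
  # sprintf() is vectorised, so this yields one phrase per row.
  phrases <- sprintf(
    "%d cities have %s growth, which covers %s%% of all cities",
    data$count, as.character(data$type), data$percent
  )
  # Prefix the last phrase with "and" when there is more than one.
  n <- length(phrases)
  if (n > 1) phrases[n] <- paste("and", phrases[n])
  paste0("In ", year, ", we have ", paste(phrases, collapse = "; "), ".")
}

full  <- growth_sentence(df)
first <- growth_sentence(df[1:2, ])  # works on a subset with fewer rows
print(full)
print(first)
```

Keeping the per-row phrase in one place means the wording only has to be changed once, whichever of the two packages is used.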

2 solutions

Solution 1
3, accepted, 2021-02-24 10:07:55

Solution 2
2, 2021-02-24 10:47:57
