Extract values from dataframe and fill them into a template phrase in R

Given a dataset as follows:

df <- structure(list(type = structure(c(2L, 3L, 1L), .Label = c("negative", 
"positive", "zero"), class = "factor"), count = c(10L, 5L, 8L
), percent = c(43.5, 21.7, 34.8)), class = "data.frame", row.names = c(NA, 
-3L))

Out:

      type count percent
1 positive    10    43.5
2     zero     5    21.7
3 negative     8    34.8

I would like to fill the values from the table into a template phrase as follows:

In 2020, we have 10 cities have positive growth, which covers 43.5 % of all cities; 5 cities have zero growth, which covers 21.7 % of all cities; and 8 cities have negative growth, which covers 34.8 % of all cities.

Template:

In 2020, we have {} cities have {} growth, which covers {} % of all cities; {} cities have {} growth, which covers {} % of all cities; and {} cities have {} growth, which covers {} % of all cities.

How could I do that in R?

You can build the sentence with paste0 / sprintf and replace the placeholders with the respective values from the data frame.
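A minimal sketch of that direct approach, assuming the df from the question (rebuilt here with data.frame() so the block is self-contained). Each placeholder is filled by indexing one value explicitly; the variable name sentence is only illustrative.

```r
# Same data as in the question
df <- data.frame(
  type    = factor(c("positive", "zero", "negative")),
  count   = c(10L, 5L, 8L),
  percent = c(43.5, 21.7, 34.8)
)

# %d for the integer counts, %s for text and decimals, %% for a literal percent sign
sentence <- sprintf(
  paste0(
    "In 2020, we have %d cities have %s growth, which covers %s %% of all cities; ",
    "%d cities have %s growth, which covers %s %% of all cities; ",
    "and %d cities have %s growth, which covers %s %% of all cities."
  ),
  df$count[1], as.character(df$type[1]), df$percent[1],
  df$count[2], as.character(df$type[2]), df$percent[2],
  df$count[3], as.character(df$type[3]), df$percent[3]
)

print(sentence)
```

This is easy to read for three rows, but every value has to be listed by hand, so it does not adapt if the number of rows changes.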

Here is another way that does not require listing each individual value from the data frame:

string <- 'In 2020, we have %s cities have %s growth, which covers %s %% of all cities; %s cities have %s growth, which covers %s %% of all cities; and %s cities have %s growth, which covers %s %% of all cities'
do.call(sprintf, c(as.list(c(t(df[c(2, 1, 3)]))), fmt = string))

#[1] "In 2020, we have 10 cities have positive growth, which covers 43.5 % of all cities;  5 cities have zero growth, which covers 21.7 % of all cities; and  8 cities have negative growth, which covers 34.8 % of all cities"

df[c(2, 1, 3)] reorders the columns so that count is the 1st column and type the 2nd. This is needed because each clause of your sentence has the count value first, then type, and percent last. c(t(df[c(2, 1, 3)])) flattens the data frame into a vector row by row, and each element of that vector is passed to sprintf as a separate argument.
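One side effect of this flattening: t() first coerces the mixed-type data frame to a character matrix, which pads the numeric columns to a common width. That is where the double spaces before 5 and 8 in the output above come from. A small sketch, assuming base R's trimws() is acceptable here, strips the padding before the values reach sprintf (the names values and result are illustrative):

```r
# Same data as in the question
df <- data.frame(
  type    = factor(c("positive", "zero", "negative")),
  count   = c(10L, 5L, 8L),
  percent = c(43.5, 21.7, 34.8)
)

string <- 'In 2020, we have %s cities have %s growth, which covers %s %% of all cities; %s cities have %s growth, which covers %s %% of all cities; and %s cities have %s growth, which covers %s %% of all cities'

# Row-wise flattening yields "10", "positive", "43.5", " 5", "zero", ...
# trimws() removes the padding that the matrix coercion added to " 5" and " 8"
values <- trimws(c(t(df[c(2, 1, 3)])))

result <- do.call(sprintf, c(as.list(values), fmt = string))
print(result)
```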

I'd recommend using the glue package over base R string functions because a) it's more readable and b) the middle part of your template is the same phrase repeated for each row of your data frame, so we can use glue_data() to reduce repetition:

library(glue)

# Example data
df <- structure(list(type = structure(c(2L, 3L, 1L), .Label = c("negative", "positive", "zero"),
        class = "factor"), count = c(10L, 5L, 8L), percent = c(43.5, 21.7, 34.8)),
    class = "data.frame", row.names = c(NA, -3L))

growth <- glue_data(df, "{count} cities have {type} growth, which covers {percent}% of all cities")

# Add "and ..." to the last phrase:
growth[length(growth)] <- glue("and ", growth[length(growth)])

glue("In 2020, we have ", glue_collapse(growth, sep = "; "), ".")
#> In 2020, we have 10 cities have positive growth, which covers 43.5% of all cities; 5 cities have zero growth, which covers 21.7% of all cities; and 8 cities have negative growth, which covers 34.8% of all cities.

Created on 2021-02-24 by the reprex package (v1.0.0)

This also has the advantage of scaling to a data frame with any number of rows.
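To illustrate the scaling point without depending on an installed package, here is a rough base R counterpart of the same row-wise idea. It swaps glue_data() for the vectorised sprintf() and glue_collapse() for paste(collapse = ). The helper name describe_growth and its year argument are hypothetical, introduced only for this sketch.

```r
# Same data as in the question
df <- data.frame(
  type    = factor(c("positive", "zero", "negative")),
  count   = c(10L, 5L, 8L),
  percent = c(43.5, 21.7, 34.8)
)

# Hypothetical helper: builds the sentence for a data frame with any number of rows
describe_growth <- function(data, year = 2020) {
  # sprintf() is vectorised, so this yields one phrase per row
  phrases <- sprintf(
    "%d cities have %s growth, which covers %s%% of all cities",
    data$count, as.character(data$type), data$percent
  )
  n <- length(phrases)
  # Prefix the last phrase with "and " when there is more than one phrase
  if (n > 1) phrases[n] <- paste0("and ", phrases[n])
  paste0("In ", year, ", we have ", paste(phrases, collapse = "; "), ".")
}

full    <- describe_growth(df)        # all three rows
partial <- describe_growth(df[1:2, ]) # only the first two rows
print(full)
print(partial)
```

The same function call handles both the three-row and the two-row input, with "and" always attached to whichever phrase comes last.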
