
How to convert a stacked long dataframe to wide format

I'm trying to process a JSON file, and partway through I ended up with this:

data1

name                 value
data_id1             8538u40952
data_id2             40942094i2
data_text1           la la la
data_text2           we love pie eating and pie making
data_metrics_likes1  0
data_metrics_likes2  5
data_users_id1       284u94u20942
data_users_id2       094200220030

How can I get the data to look like this?

text id     text                               likes  userid
8538u40952  la la la                           0      284u94u20942
40942094i2  we love pie eating and pie making  5      094200220030

library(tidyverse)

data <- tribble(
  ~name, ~value,
  "data_id1", "8538u40952",
  "data_id2", "40942094i2",
  "data_text1", "la la la",
  "data_text2", "we love pie eating and pie making",
  "data_metrics_likes1", "0",
  "data_metrics_likes2", "5",
  "data_users_id1", "284u94u20942",
  "data_users_id2", "094200220030"
)

data %>%
  mutate(
    id2 = name %>% str_extract("[0-9]+$"), # ensure unique rows
    name = name %>% str_remove("[0-9]+$") %>% str_remove("^data_")
  ) %>%
  pivot_wider(names_from = name, values_from = value) %>%
  select(`text id` = id, text, likes=metrics_likes, userid=users_id) %>%
  type_convert()
#> 
#> ── Column specification ────────────────────────────────────────────────────────
#> cols(
#>   `text id` = col_character(),
#>   text = col_character(),
#>   likes = col_double(),
#>   userid = col_character()
#> )
#> # A tibble: 2 × 4
#>   `text id`  text                              likes userid      
#>   <chr>      <chr>                             <dbl> <chr>       
#> 1 8538u40952 la la la                              0 284u94u20942
#> 2 40942094i2 we love pie eating and pie making     5 094200220030

Created on 2022-05-05 by the reprex package (v2.0.0)

With tidyr, you can split the name column using extract() and then pivot the data to wide form.

library(tidyr)

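data %>%
  # split e.g. "data_users_id2" into name = "users_id" and row = "2";
  # the lazy (.+?) leaves all trailing digits to (\\d+)
  extract(name, c("name", "row"), "^data_(.+?)(\\d+)$") %>%
  # names_from = name and values_from = value are the defaults
  pivot_wider()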

# # A tibble: 2 x 5
#   row   id         text                              metrics_likes users_id    
#   <chr> <chr>      <chr>                             <chr>         <chr>
# 1 1     8538u40952 la la la                          0             284u94u20942
# 2 2     40942094i2 we love pie eating and pie making 5             094200220030
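
Note that in the output above every column is still character, including `metrics_likes`. A minimal sketch of one way to convert the columns afterwards, using base R's `type.convert()`; the data frame `wide` below is a hand-typed copy of the result above, built only for illustration:

```r
# Hand-typed copy of the wide result shown above (illustration only).
wide <- data.frame(
  row = c("1", "2"),
  id = c("8538u40952", "40942094i2"),
  text = c("la la la", "we love pie eating and pie making"),
  metrics_likes = c("0", "5"),
  users_id = c("284u94u20942", "094200220030"),
  stringsAsFactors = FALSE
)

# Guess a type for each column; as.is = TRUE keeps non-numeric
# columns as character instead of turning them into factors.
converted <- type.convert(wide, as.is = TRUE)

str(converted)
```

Columns that mix digits and letters, such as `users_id`, stay character. The first answer gets the same effect with `type_convert()` from readr.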

A base R option using reshape

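# df is the long data frame (columns name and value) from the question
reshape(
    transform(
        df,
        # keep the trailing digit as the row id
        id = gsub(".*(\\d+)$", "\\1", name),
        # drop the "data_" prefix and the trailing digit
        name = gsub(".*?_(.*)\\d+$", "\\1", name)
    ),
    direction = "wide",
    idvar = "id",
    timevar = "name"
)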

  id   value.id                        value.text value.metrics_likes
1  1 8538u40952                          la la la                   0
2  2 40942094i2 we love pie eating and pie making                   5
  value.users_id
1   284u94u20942
2   094200220030
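
If the numeric suffix can have more than one digit, a greedy `.*` or `.+` placed before `\\d+` leaves only the last digit for the number group, so `data_id10` would be split into `id1` and `0`. A minimal base R sketch of a split that avoids this, assuming every name has the form `data_<field><number>`; the helper name `split_name` and the sample names are made up for illustration:

```r
# Hypothetical helper: split "data_<field><number>" into its two parts.
split_name <- function(name) {
  data.frame(
    # drop the "data_" prefix and the trailing digits
    field = sub("[0-9]+$", "", sub("^data_", "", name)),
    # drop everything up to and including the last non-digit character
    row = sub("^.*[^0-9]", "", name),
    stringsAsFactors = FALSE
  )
}

parts <- split_name(c("data_id1", "data_metrics_likes2", "data_users_id10"))
print(parts)
```

The two resulting columns can stand in for the `name` and `id` columns that the reshaping step expects.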

