How do I convert my data frame from wide to long in R?

Question

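I have a problem transforming my data frame from wide to long. I am well aware that there are plenty of excellent vignettes out there that explain gather() or pivot_longer() very precisely (e.g. https://www.storybench.org/pivoting-data-from-columns-to-rows-and-back-in-the-tidyverse/). Nevertheless, I have been stuck for days now and it is driving me crazy. So I decided to ask the internet. You.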

I have a data frame that looks like this:

id     <- c(1,2,3)
year   <- c(2018,2003,2011)
lvl    <- c("A","B","C")
item.1 <- factor(c("A","A","C"),levels = lvl)
item.2 <- factor(c("C","B","A"),levels = lvl)
item.3 <- factor(c("B","B","C"),levels = lvl)
df     <- data.frame(id,year,item.1,item.2,item.3)

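So for each observation (e.g. a movie) we have an id variable. We have a year variable indicating when the observation took place (e.g. when the movie was released). And we have three factor variables that rate different characteristics of the observation (e.g. cast, storyline and film music). These three factor variables share the same factor levels "A", "B" and "C" (e.g. the cast of the movie was "excellent", "okay" or "lousy").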

But in my wildest dreams, the data looks more like this:

id.II     <- c(rep(1, 9), rep(2, 9), rep(3,9))
year.II   <- c(rep(2018, 9), rep(2003, 9), rep(2011,9))
item.II   <- rep(c(c(1,1,1),c(2,2,2),c(3,3,3)),3)
rating.II <- rep(c("A", "B", "C"), 9)
number.II  <- c(1,0,0,0,0,1,0,1,0,1,0,0,0,1,0,0,1,0,0,0,1,1,0,0,0,0,1)
df.II     <- data.frame(id.II,year.II,item.II,rating.II,number.II)

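In this shape the data frame would be much more usable for further analysis. For example, the next step would be to calculate, for each year, the number (or even better, the percentage) of movies that were rated "excellent".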

year.III   <- factor(c(rep(2018, 3), rep(2003, 3), rep(2011,3)))
item.III   <- factor(rep(c(1, 2, 3), 3))
number.A.III <- c(1,0,0,1,0,0,0,1,0)
df.III     <- data.frame(year.III,item.III,number.A.III)

library(ggplot2)

ggplot(data=df.III, aes(x=year.III, y=number.A.III, group=item.III)) +
  geom_line(aes(color=item.III))+
  geom_point(aes(color=item.III))+
  theme(panel.background = element_blank(),
        axis.title.y = element_blank(),
        axis.title.x = element_blank(),
        legend.position = "bottom")+
  labs(colour="Item")

Or, even more important to me, show for each item (cast, storytelling, film music) the percentage of ratings that were "excellent", "okay" and "lousy".

# 3 items x 3 ratings = 9 rows, matching the 9 counts in number.IV
item.IV   <- factor(rep(c(1, 2, 3), each = 3))
rating.IV <- factor(rep(c("A", "B", "C"), 3))
number.IV <- c(2,0,1,1,1,1,0,2,1)
df.IV     <- data.frame(item.IV,rating.IV,number.IV)
df.IV

ggplot(df.IV,aes(fill=rating.IV,y=number.IV,x=item.IV))+
  geom_bar(position= position_fill(reverse = TRUE), stat="identity")+
  theme(axis.title.y = element_text(size = rel(1.2), angle = 0),
        axis.title.x = element_blank(),
        panel.background = element_blank(),
        legend.title = element_blank(),
        legend.position = "bottom")+
  labs(x = "Item")+
  coord_flip()+
  scale_x_discrete(limits = rev(levels(df.IV$item.IV)))+
  scale_y_continuous(labels = scales::percent)

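My main question is: how do I transform the data frame df into df.II? That would make my day. Wrong. My weekend.

If you could also give a hint on how to proceed from df.II to df.III and df.IV, that would be absolutely mind-blowing. However, I don't want to burden you too much with my problems.

Best wishes, Jascha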

Answer 1

Does this achieve what you need?

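library(tidyverse)

# Reshape to long format: one row per id and item.
df_long <- df %>%
  pivot_longer(cols = item.1:item.3, names_to = "item", values_to = "rating") %>%
  mutate(
    # fixed() matches the literal prefix "item." instead of treating "." as a regex wildcard
    item = str_remove(item, fixed("item."))
  )

# Pair every row with every observed rating and flag the one that was given (as in df.II).
df2 <- crossing(
  df_long,
  rating_all = unique(df_long$rating)
) %>%
  mutate(n = rating_all == rating) %>%
  group_by(id, year, item, rating_all) %>%
  summarise(n = sum(n), .groups = "drop")

# Keep only the rows for item 3.
df3 <- df2 %>%
  filter(item == "3")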
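
The question also asks how to get from the long data to something like df.III and df.IV. The following is only a rough sketch of one possible way, not part of the original answer. It rebuilds the example data so that it runs on its own, and the names df_III, df_IV, number_A, number and share are made up for illustration.

```r
library(dplyr)
library(tidyr)

# Rebuild the example data from the question so the sketch is self-contained.
lvl <- c("A", "B", "C")
df <- data.frame(
  id     = c(1, 2, 3),
  year   = c(2018, 2003, 2011),
  item.1 = factor(c("A", "A", "C"), levels = lvl),
  item.2 = factor(c("C", "B", "A"), levels = lvl),
  item.3 = factor(c("B", "B", "C"), levels = lvl)
)

# Same reshaping idea as in the answer: one row per id and item.
df_long <- df %>%
  pivot_longer(cols = item.1:item.3, names_to = "item", values_to = "rating") %>%
  mutate(item = sub("item.", "", item, fixed = TRUE))

# df.III-style table: per year and item, how many movies were rated "A".
df_III <- df_long %>%
  group_by(year, item) %>%
  summarise(number_A = sum(rating == "A"), .groups = "drop")

# df.IV-style table: per item and rating, the count and the share within the item.
# complete() adds the item/rating combinations that never occur, with a count of 0.
df_IV <- df_long %>%
  count(item, rating, name = "number") %>%
  complete(item, rating, fill = list(number = 0L)) %>%
  group_by(item) %>%
  mutate(share = number / sum(number)) %>%
  ungroup()
```

With the example data, the number column of df_IV should reproduce number.IV from the question, and either number or share could be passed to the stacked bar plot in place of number.IV.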

Solution 1
0 2021-02-05 16:29:36
