简体   繁体   中英

Pivot table from multiple columns while grouping in R

I am struggling to convert a wide dataset into a long one with pivot_longer in R. For the example below, I have catch data for each species as a column. I'd like to output a dataframe where the FIRST column is Species, and each row is a datapoint, with Year and Country also as columns. I want to later group and average these so that I can plot them. Does anyone know how to achieve this with pivot_longer?

 Data <- data.frame(
 Country = c("a", "a", "c", "c", "a", "b"),
 Year = c("1990", "1990", "1991", "1992", "1990", "1990"),
 Tiger_Shark = c(0,1,4,7,5,6)  ,
 Whale_shark = c(0, 20, 14, 19, 2,7),
 White_shark = c(0, 0, 12, 29, 1,8))

I have tried the following:

data %>% 
pivot_longer( cols= Tiger_Shark:White_shark,  
          names_to = c(" Tiger", "Whale", "White"),
          values_to = "catch") 

But this does not retain the Year or Country columns, which I'd like to keep. I think the solution is simple but I am not familiar with pivot_longer. Thank you so much!!

If we want to use ggplot , reshape to 'long' format and then plot. The summarisation can be done within summarise after grouping and then do the reshaping

library(dplyr)
library(tidyr)
library(ggplot2)
Data %>%
   group_by(Year, Country) %>% 
   summarise(across(everything(), mean)) %>%
   pivot_longer(cols = Tiger_Shark:White_shark) %>%
   ggplot(aes(x = Country, y = value, fill = Year)) + 
     geom_col()

Or first reshape to 'long' format and then do a group by summarise

Data %>% 
 pivot_longer(cols = Tiger_Shark:White_shark) %>%
 group_by(Country, Year) %>% 
 summarise(value = mean(value))

The names_to should be a single element here as these columns seems to belong to a single group

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM