
R pivot to wide and back to long (multiple groups)

I've been using a wide table format to create a migration variable (year, municipality -> year, municipality, move) and was wondering whether I can flip it back into long format. However, I now have two groups of columns per year (municip and move) instead of one. I looked through the existing posts on SO but couldn't find anything similar.

Here's what I have done:

library(tidyverse)
library(rlang)

# sample data
mydata <- data.frame(id = sort(rep(1:10,3)),
                     year = rep(seq(2009,2011),10),
                     municip = sample(c(NA,1:3),30,replace=TRUE))

The data looks like this:

id year municip
1 2009 2
1 2010 1
1 2011 3
2 2009 1
2 2010 1
2 2011 3
3 2009 NA
3 2010 NA
3 2011 NA
# turn sideways
mydata.wide <- mydata %>%
  pivot_wider(names_from = year,
              names_prefix = "municip.",
              values_from = municip)

Now it looks like this:

id municip.2009 municip.2010 municip.2011
1 2 1 3
2 1 1 3
3 NA NA NA
4 1 NA 3
5 1 NA 2
6 3 2 2
7 2 NA 3
8 3 NA 3
9 NA 1 NA
10 1 NA 2

Then I'm adding a migration variable (in reality this is done for 12 years):

# create migration variable
for (i in 2009:2010){
  
  text.string <- paste0("mydata.wide <- mydata.wide %>%
          mutate(move.",i+1," = case_when(
            is.na(municip.",i,") & is.na(municip.",i+1,") ~ \"NA\",
            is.na(municip.",i,") & !is.na(municip.",i+1,") ~ \"1\",
            !is.na(municip.",i,") & !is.na(municip.",i+1,") 
               & municip.",i," != municip.",i+1," ~ \"3\",
            !is.na(municip.",i,") & is.na(municip.",i+1,") ~ \"4\",   
            TRUE ~ \"2\"
          ))")
  
  eval(parse_expr(text.string))                        
}

# NA: missing in both cases
# 1: moved into region
# 2: stayed in region
# 3: moved within region
# 4: moved out of region

Now the table looks like this:

id municip.2009 municip.2010 municip.2011 move.2010 move.2011
1 2 1 3 3 3
2 1 1 3 2 3
3 NA NA NA NA NA
4 1 NA 3 4 1
5 1 NA 2 4 1
6 3 2 2 3 2
7 2 NA 3 4 1
8 3 NA 3 4 1
9 NA 1 NA 1 4
10 1 NA 2 4 1

What I want to do is to flip it back to create something like this:

id year municip move
1 2009 2 NA
1 2010 1 3
1 2011 3 3
2 2009 1 NA
2 2010 1 2
2 2011 3 3
3 2009 NA NA
3 2010 NA NA
3 2011 NA NA

I'm not sure if this can be done with just pivot_longer on its own. I tried a couple of variations. Any ideas?

You can try this:

df <- tribble(~id, ~municip.2009, ~municip.2010, ~municip.2011, ~move.2010, ~move.2011,
1,  2,  1,  3,  3,  3,
2,  1,  1,  3,  2,  3,
3,  NA, NA, NA, NA, NA,
4,  1,  NA, 3,  4,  1,
5,  1,  NA, 2,  4,  1,
6,  3,  2,  2,  3,  2,
7,  2,  NA, 3,  4,  1,
8,  3,  NA, 3,  4,  1,
9,  NA, 1,  NA, 1,  4,
10, 1,  NA, 2,  4,  1
)


df %>%
  pivot_longer(cols = -1, names_to = "temp1", values_to = "count") %>% 
  separate(col = temp1, c("temp2", "year")) %>% 
  pivot_wider(names_from = temp2, values_from = count)

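pivot_longer() first collects all municip and move values into a single column; separate() then splits the old column names into the variable name (municip or move) and the year; finally, pivot_wider() spreads municip and move back into their own columns, giving one row per id and year.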
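As an alternative sketch, pivot_longer() can also do this in a single step by using the special ".value" entry in names_to, which tells tidyr that the first part of each column name (municip or move) should become its own output column. The small table below is made up for illustration and assumes, as in the question, that municip is integer and move is character; because each variable keeps its own column, the mixed types should not need a values_transform. names_transform requires tidyr 1.1.0 or later.

```r
library(tidyr)
library(tibble)

# Hypothetical three-person extract in the same shape as mydata.wide
wide <- tibble(
  id = 1:3,
  municip.2009 = c(2L, 1L, NA),
  municip.2010 = c(1L, 1L, NA),
  municip.2011 = c(3L, 3L, NA),
  move.2010 = c("3", "2", NA),
  move.2011 = c("3", "3", NA)
)

long <- pivot_longer(
  wide,
  cols = -id,
  # ".value": the part before the dot names the output column;
  # the part after the dot goes into a new "year" column
  names_to = c(".value", "year"),
  names_sep = "\\.",
  # column names are strings, so convert the year back to integer
  names_transform = list(year = as.integer)
)

print(long)
```

There is no move.2009 column, so move is filled with NA for 2009, which matches the layout asked for in the question.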

Don't think sideways, think longways!

Now, I cannot answer your question completely, because I don't really understand what you are calculating. Is it some sort of factor (1-4)? But I believe you can finish this yourself. Consider the following:

> mydata %>% group_by(id) %>% 
  arrange(year) %>% 
  mutate(last_year = lag(municip)) %>% 
  ungroup %>% 
  arrange(id) %>% as.data.frame # ignore this line, it is simply for the pleasure of seeing the data.frame
   id year municip last_year
1   1 2009       3        NA
2   1 2010       2         3
3   1 2011      NA         2
4   2 2009      NA        NA
5   2 2010      NA        NA
6   2 2011       1        NA
7   3 2009       3        NA
8   3 2010       2         3
9   3 2011       2         2
10  4 2009       2        NA
11  4 2010      NA         2
12  4 2011       1        NA
13  5 2009       3        NA
14  5 2010      NA         3
15  5 2011       2        NA
16  6 2009       1        NA
17  6 2010       3         1
18  6 2011       2         3
19  7 2009       3        NA
20  7 2010       2         3
21  7 2011       2         2
22  8 2009      NA        NA
23  8 2010      NA        NA
24  8 2011       3        NA
25  9 2009       1        NA
26  9 2010      NA         1
27  9 2011       1        NA
28 10 2009       3        NA
29 10 2010      NA         3
30 10 2011      NA        NA

You see? In long form, you can now simply continue with:

%>% mutate(move = case_when(
  is.na(.$municip) & is.na(.$last_year) ~ "NA",
  # etc.
  ))

Did you want to compare year i with the following year instead? Then use lead() instead of lag().

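Lastly, a note on your generated code: in current versions of dplyr, case_when() inside mutate() can refer to columns by their bare names. The .$ prefix was only needed in very old versions, and it is best avoided on grouped data because it bypasses the grouping.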
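To round off the long-format approach, here is a minimal sketch that computes the move variable with lag() and case_when(), without ever going wide. It assumes the coding from the question (1 = moved in, 2 = stayed, 3 = changed municipality, 4 = moved out). Unlike the question's code, it uses a real NA rather than the string "NA", and it sets the first year of each id to NA because there is no earlier year to compare with. The three-person sample data is made up to mirror the question's expected output.

```r
library(dplyr)

# Hypothetical sample: three people, three years each
mydata <- data.frame(
  id = rep(1:3, each = 3),
  year = rep(2009:2011, times = 3),
  municip = c(2, 1, 3, 1, 1, 3, NA, NA, NA)
)

result <- mydata %>%
  arrange(id, year) %>%
  group_by(id) %>%
  mutate(
    # municipality in the previous year, within the same person
    last_year = lag(municip),
    # the first matching condition wins, so order matters
    move = case_when(
      year == min(year)                 ~ NA_character_, # no earlier year
      is.na(last_year) & is.na(municip) ~ NA_character_, # missing in both years
      is.na(last_year)                  ~ "1",           # moved into region
      is.na(municip)                    ~ "4",           # moved out of region
      last_year != municip              ~ "3",           # moved within region
      TRUE                              ~ "2"            # stayed
    )
  ) %>%
  ungroup()

print(as.data.frame(result))
```

Columns are referenced by bare name here, so lag() and min() respect the grouping by id.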

Something like this?

mydata.wide %>% 
  pivot_longer(
    cols = -id,
    names_pattern = "([a-z]+?)\\.(\\d+)", 
    names_to = c("name", "year"),
    values_to = "val",
    values_transform = list(val = as.character)
    ) %>% 
  pivot_wider(
    names_from = name,
    values_from = val
  ) %>% 
  print(n=30)
# A tibble: 30 × 4
      id year  municip move 
   <int> <chr> <chr>   <chr>
 1     1 2009  2       NA   
 2     1 2010  3       3    
 3     1 2011  NA      4    
 4     2 2009  2       NA   
 5     2 2010  NA      4    
 6     2 2011  2       1    
 7     3 2009  1       NA   
 8     3 2010  2       3    
 9     3 2011  1       3    
10     4 2009  NA      NA   
11     4 2010  NA      NA   
12     4 2011  1       1    
13     5 2009  NA      NA   
14     5 2010  2       1    
15     5 2011  3       3    
16     6 2009  3       NA   
17     6 2010  3       2    
18     6 2011  3       2    
19     7 2009  NA      NA   
20     7 2010  NA      NA   
21     7 2011  NA      NA   
22     8 2009  NA      NA   
23     8 2010  2       1    
24     8 2011  NA      4    
25     9 2009  3       NA   
26     9 2010  2       3    
27     9 2011  NA      4    
28    10 2009  2       NA   
29    10 2010  3       3    
30    10 2011  1       3
