Rearranging data frame for a Heat map plot in R

I am trying to plot a heat map with ggplot, and to do this I want to rearrange my data frame, which looks like this:

 country    2012  2013  2014  2015  
   AUS        2    5     6     1    
   AUT        3    3     1     5    
   BEL        1    8     2     8    
   NED        5    3     0     5

into a data frame that looks like this:

country  year  value
  AUS    2012   2
  AUS    2013   5
  AUS    2014   6
  AUS    2015   1
  AUT    2012   3
  AUT    2013   3 
  AUT    2014   1
  AUT    2015   5
  BEL    2012   1
  BEL    2013   8 
  BEL    2014   2 
  BEL    2015   8
  NED    2012   5
  NED    2013   3
  NED    2014   0
  NED    2015   5

Namely, from a data frame with one column per year to a three-column data frame of country, year and the corresponding value.

THANKS

We can use pivot_longer from tidyr:

library(tidyr)
pivot_longer(df1, cols = -country, names_to = 'year')
#    country year value
#1      AUS 2012     2
#2      AUS 2013     5
#3      AUS 2014     6
#4      AUS 2015     1
#5      AUT 2012     3
#6      AUT 2013     3
#7      AUT 2014     1
#8      AUT 2015     5
#9      BEL 2012     1
#10     BEL 2013     8
#11     BEL 2014     2
#12     BEL 2015     8
#13     NED 2012     5
#14     NED 2013     3
#15     NED 2014     0
#16     NED 2015     5

data

df1 <- structure(list(country = c("AUS", "AUT", "BEL", "NED"), `2012` = c(2L, 
3L, 1L, 5L), `2013` = c(5L, 3L, 8L, 3L), `2014` = c(6L, 1L, 2L, 
0L), `2015` = c(1L, 5L, 8L, 5L)), class = "data.frame", row.names = c(NA, 
-4L))
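Since the goal is a heat map, here is a minimal sketch of how the long data could be passed to ggplot2. It assumes the tidyr and ggplot2 packages are installed; the names df1_long and p are illustrative only, and geom_tile is just one reasonable choice of layer for this kind of plot.

```r
library(tidyr)
library(ggplot2)

# Same data as df1 above, rebuilt so the example runs on its own
df1 <- data.frame(
  country = c("AUS", "AUT", "BEL", "NED"),
  `2012` = c(2L, 3L, 1L, 5L),
  `2013` = c(5L, 3L, 8L, 3L),
  `2014` = c(6L, 1L, 2L, 0L),
  `2015` = c(1L, 5L, 8L, 5L),
  check.names = FALSE  # keep "2012" etc. as column names
)

# Wide to long: one row per country/year pair
df1_long <- pivot_longer(df1, cols = -country, names_to = "year")

# One tile per country/year pair, coloured by value
p <- ggplot(df1_long, aes(x = year, y = country, fill = value)) +
  geom_tile()

print(p)
```

Here year stays a character column, so ggplot2 treats it as a discrete axis, which suits a grid of tiles.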

Using melt from data.table:

library(data.table)
setDT(df); melt(df, id.vars = "country", variable.name = "year")

 #       country year value
 #1:     AUS     2012     2
 #2:     AUT     2012     3
 #3:     BEL     2012     1
 #4:     NED     2012     5
 #5:     AUS     2013     5
 #6:     AUT     2013     3
 #7:     BEL     2013     8
 #8:     NED     2013     3
 #9:     AUS     2014     6
#10:     AUT     2014     1
#11:     BEL     2014     2
#12:     NED     2014     0
#13:     AUS     2015     1
#14:     AUT     2015     5
#15:     BEL     2015     8
#16:     NED     2015     5

data

df <- structure(list(country = structure(1:4, .Label = c("AUS", "AUT", "BEL", "NED"), class = "factor"), `2012` = c(2L, 3L, 1L, 5L), `2013` = c(5L, 3L, 8L, 3L), `2014` = c(6L, 1L, 2L, 0L), `2015` = c(1L,5L, 8L, 5L)), class = "data.frame", row.names = c(NA, -4L))

A base R solution is this (re-using @akrun's data):

First, unlist the values in df1[,2:5] and store them in a vector (unlist goes column by column, so all 2012 values come first, then all 2013 values, and so on):

values <- as.numeric(unlist(df1[,2:5]))

Next, repeat the whole country vector once per year column, and repeat each year name once per country so that the years line up with the column-wise order of values:

countries <- rep(df1$country, times = length(values)/length(df1$country))
years <- rep(names(df1)[2:5], each = length(df1$country))

Then combine all three vectors in a new data frame:

df1_long <- data.frame(countries, years, values)

Finally, order df1_long in (default) alphabetical order of df1_long$countries:

df1_long_ord <- df1_long[order(df1_long$countries),]

Result:

df1_long_ord
   countries years values
1        AUS  2012      2
5        AUS  2013      5
9        AUS  2014      6
13       AUS  2015      1
2        AUT  2012      3
6        AUT  2013      3
10       AUT  2014      1
14       AUT  2015      5
3        BEL  2012      1
7        BEL  2013      8
11       BEL  2014      2
15       BEL  2015      8
4        NED  2012      5
8        NED  2013      3
12       NED  2014      0
16       NED  2015      5
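The base R steps can also be collapsed into one short script. This is only a sketch, not the answerer's exact code: the names year_cols and df1_long are chosen for illustration, the columns are named country, year and value to match the layout asked for in the question, and the rows are sorted by country and then year.

```r
# Same data as df1 above, rebuilt so the example runs on its own
df1 <- data.frame(
  country = c("AUS", "AUT", "BEL", "NED"),
  `2012` = c(2L, 3L, 1L, 5L),
  `2013` = c(5L, 3L, 8L, 3L),
  `2014` = c(6L, 1L, 2L, 0L),
  `2015` = c(1L, 5L, 8L, 5L),
  check.names = FALSE  # keep "2012" etc. as column names
)

year_cols <- names(df1)[-1]  # every column except country

df1_long <- data.frame(
  # whole country vector repeated once per year column
  country = rep(df1$country, times = length(year_cols)),
  # each year name repeated once per country
  year    = rep(year_cols, each = nrow(df1)),
  # values read column by column, matching the two vectors above
  value   = unlist(df1[year_cols], use.names = FALSE)
)

# Sort by country, then year, and reset the row names
df1_long <- df1_long[order(df1_long$country, df1_long$year), ]
rownames(df1_long) <- NULL

print(df1_long)
```

The key point is the pairing of times = for countries with each = for years; using the same form of rep for both would attach the wrong year to most values.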
