
Group by a column value and then add each group as a row to a dataframe in R

I have a dataframe like below:

sample mu count
sample1 T 10
sample1 G 3
sample2 T 4
sample2 G 2

Now I want to group these data like below:

        T G
sample1 10 3
sample2 4 2

In the desired dataframe, the sample names are the row names, the mu values are the column names, and the count values are the cell values.

We can use xtabs from base R

xtabs(count ~ sample + mu, df1)

Output:

       mu
sample     G  T
  sample1  3 10
  sample2  2  4
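
The result above is a table object rather than a dataframe. If a plain dataframe with the sample names as row names is needed, as the question describes, one option is to convert the table. This is a minimal sketch, assuming the same df1 as in the data section (repeated here so the block runs on its own); the name wide is only illustrative.

```r
# Same data as df1 in this answer, repeated so the block is self-contained
df1 <- data.frame(
  sample = c("sample1", "sample1", "sample2", "sample2"),
  mu     = c("T", "G", "T", "G"),
  count  = c(10L, 3L, 4L, 2L)
)

# xtabs() returns a contingency table; as.data.frame.matrix() keeps the
# wide layout and turns the dimnames into row names and column names
wide <- as.data.frame.matrix(xtabs(count ~ sample + mu, df1))

# Reorder the columns to match the T, G layout asked for in the question
wide <- wide[, c("T", "G")]
wide
```

Calling plain as.data.frame() on the table would instead return it in long format, which is why the matrix method is used here.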

Or use tapply

with(df1, tapply(count, list(sample, mu), I))

Output:

        G  T
sample1 3 10
sample2 2  4
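
Note that I only yields one number per cell when every sample/mu pair occurs exactly once, as it does in the example data. If a pair can repeat, an aggregating function such as sum is probably what is wanted. This is a sketch on hypothetical data (df_dup is made up for illustration and is not part of the question):

```r
# Hypothetical data: the (sample1, T) pair appears twice
df_dup <- data.frame(
  sample = c("sample1", "sample1", "sample1", "sample2", "sample2"),
  mu     = c("T", "T", "G", "T", "G"),
  count  = c(10L, 5L, 3L, 4L, 2L)
)

# sum() collapses the repeated pair into a single cell value
res <- with(df_dup, tapply(count, list(sample, mu), sum))
res
```

xtabs sums the left-hand side of the formula by default, so it handles repeated pairs the same way without any change.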

data

df1 <- structure(list(sample = c("sample1", "sample1", "sample2", "sample2"
), mu = c("T", "G", "T", "G"), count = c(10L, 3L, 4L, 2L)), 
class = "data.frame", row.names = c(NA, 
-4L))

With the tidyverse, pivot_wider from tidyr gives the same layout:

library(tidyverse)

df <- read_table("sample mu count
sample1 T 10
sample1 G 3
sample2 T 4
sample2 G 2")

df %>%  
  pivot_wider(names_from = mu, 
              values_from = count)

# A tibble: 2 x 3
  sample      T     G
  <chr>   <dbl> <dbl>
1 sample1    10     3
2 sample2     4     2
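
The tibble keeps sample as an ordinary column. If the sample names should become row names, as the question asks, one possible follow-up is column_to_rownames from the tibble package. This is a sketch, assuming tidyr and tibble are installed; the data is rebuilt with data.frame() so the block runs on its own, and the name wide is only illustrative.

```r
library(tidyr)   # pivot_wider()
library(tibble)  # column_to_rownames()

# Same data as in the question
df <- data.frame(
  sample = c("sample1", "sample1", "sample2", "sample2"),
  mu     = c("T", "G", "T", "G"),
  count  = c(10L, 3L, 4L, 2L)
)

# Spread mu into columns, one row per sample
wide <- pivot_wider(df, names_from = mu, values_from = count)

# Tibbles do not carry row names, so convert to a plain data.frame
# and move the sample column into the row names
wide <- column_to_rownames(as.data.frame(wide), var = "sample")
wide
```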

You can use dcast from data.table:

library(data.table)
# setDT() converts df to a data.table by reference (df itself is modified)
dcast(setDT(df), sample ~ mu, value.var = "count")

Output:

    sample     G     T
    <char> <int> <int>
1: sample1     3    10
2: sample2     2     4

Input:

df = structure(list(sample = c("sample1", "sample1", "sample2", "sample2"
), mu = c("T", "G", "T", "G"), count = c(10L, 3L, 4L, 2L)), row.names = c(NA, 
-4L), class = "data.frame")
