R: Subtracting every other row from the row below

I searched through several other questions, but none of them were quite what I was looking for.

I have a data frame, with samples in columns, and conditions in rows. The data is arranged as below, except that there are around 200 rows and around 30000 columns:

donor_id time  stimulation         Gene_1         Gene_2         Gene_3         Gene_4         Gene_5         Gene_6         Gene_7         Gene_8
A        0.5h         U         80.56644               0        55.68308        3.567304        6.465864        1.095409        490.3318        2.322889
A        0.5h         Stim      79.37402               0        55.88619        4.394622        6.503430        1.190555        453.7305        0.169858
A        1h           U         62.73152               0        53.01435        3.596723        7.272073        0.736384        349.6818        1.307157
A        1h           Stim      54.82245               0        53.17697        3.445614        5.385228        1.520416        332.2109        1.378058
B        0.5h         U         69.89228               0        51.78394        2.410192        5.668343        1.482302        377.0095        0.589922
B        0.5h         Stim      64.42587               0        52.67998        1.085260        8.958538        0.977994        382.8479        0.312372
B        1h           U         56.47391        0.323123        52.93331        2.925232        5.650667        1.396532        356.9900        1.657515
B        1h           Stim      0.25548         0.085027        49.85429        1.355360        5.030664        2.175491        218.5442        0.290898

I want to subtract each "U" row from the matching "Stim" row, leaving me with half the number of rows I started with. Each U/Stim pair of rows in the full table has a unique combination of donor_id and time.

All the similar questions I can find by searching seem to want either to subtract one row from everything else, or to subtract every row from the row above it, rather than every other row. I am sure there must be some way using a for loop or lapply, but I can't figure out how to apply it across all rows and columns.

This is a base R option:

# diff() within each donor_id/time group gives Stim - U, provided the U row comes first
aggregate(df[4:11], by = list("donor_id" = df$donor_id, "time" = df$time), diff)

  donor_id time    Gene_1    Gene_2   Gene_3    Gene_4    Gene_5    Gene_6    Gene_7
1        A 0.5h  -1.19242  0.000000  0.20311  0.827318  0.037566  0.095146  -36.6013
2        B 0.5h  -5.46641  0.000000  0.89604 -1.324932  3.290195 -0.504308    5.8384
3        A   1h  -7.90907  0.000000  0.16262 -0.151109 -1.886845  0.784032  -17.4709
4        B   1h -56.21843 -0.238096 -3.07902 -1.569872 -0.620003  0.778959 -138.4458
     Gene_8
1 -2.153031
2 -0.277550
3  0.070901
4 -1.366617
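
The `diff` approach above depends on each "U" row sitting directly before its "Stim" row. As a hedged alternative, here is a minimal base R sketch that pairs the rows by donor_id and time explicitly, so row order does not matter. The small `df` below is a stand-in built from the first two gene columns of the question's table, and the names `gene_cols`, `u`, `stim` and `result` are illustrative only, not part of the original answer.

```r
# Stand-in for the question's data frame (first two genes only).
df <- data.frame(
  donor_id    = c("A", "A", "A", "A", "B", "B", "B", "B"),
  time        = c("0.5h", "0.5h", "1h", "1h", "0.5h", "0.5h", "1h", "1h"),
  stimulation = c("U", "Stim", "U", "Stim", "U", "Stim", "U", "Stim"),
  Gene_1      = c(80.56644, 79.37402, 62.73152, 54.82245,
                  69.89228, 64.42587, 56.47391, 0.25548),
  Gene_2      = c(0, 0, 0, 0, 0, 0, 0.323123, 0.085027)
)

# Select the gene columns by name rather than by position.
gene_cols <- grep("^Gene_", names(df), value = TRUE)

# Split by condition, then sort both halves by the same keys so rows pair up.
u    <- df[df$stimulation == "U", ]
stim <- df[df$stimulation == "Stim", ]
u    <- u[order(u$donor_id, u$time), ]
stim <- stim[order(stim$donor_id, stim$time), ]

# Guard: every Stim row must line up with a U row for the same donor and time.
stopifnot(identical(u$donor_id, stim$donor_id), identical(u$time, stim$time))

# Data-frame arithmetic is element-wise, so this is Stim - U for every gene.
result <- cbind(stim[c("donor_id", "time")], stim[gene_cols] - u[gene_cols])
rownames(result) <- NULL
print(result)
```

Because the subtraction is done on whole blocks of columns at once, the same sketch should carry over to the full table with around 30000 gene columns without any loop over columns.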

Or a dplyr solution:

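library(dplyr)

# Within each donor_id/time pair, diff() returns Stim - U (the U row must come first)
df %>%
  group_by(donor_id, time) %>%
  summarise_at(vars(starts_with("Gene")), diff)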

# Groups:   donor_id [2]
  donor_id time  Gene_1 Gene_2 Gene_3 Gene_4  Gene_5  Gene_6  Gene_7  Gene_8
  <fct>    <fct>  <dbl>  <dbl>  <dbl>  <dbl>   <dbl>   <dbl>   <dbl>   <dbl>
1 A        0.5h   -1.19  0      0.203  0.827  0.0376  0.0951  -36.6  -2.15  
2 A        1h     -7.91  0      0.163 -0.151 -1.89    0.784   -17.5   0.0709
3 B        0.5h   -5.47  0      0.896 -1.32   3.29   -0.504     5.84 -0.278 
4 B        1h    -56.2  -0.238 -3.08  -1.57  -0.620   0.779  -138.   -1.37 
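
If the order of the "U" and "Stim" rows within a group cannot be guaranteed, the subtraction can be written out explicitly instead of relying on `diff`. The following is a sketch only, assuming dplyr 1.0.0 or later (for `across()`) and exactly one "U" and one "Stim" row per donor_id/time group. The small `df` is a stand-in built from the first two gene columns of the question's table, and `result` is an illustrative name.

```r
library(dplyr)

# Stand-in for the question's data frame (first two genes only).
df <- data.frame(
  donor_id    = c("A", "A", "A", "A", "B", "B", "B", "B"),
  time        = c("0.5h", "0.5h", "1h", "1h", "0.5h", "0.5h", "1h", "1h"),
  stimulation = c("U", "Stim", "U", "Stim", "U", "Stim", "U", "Stim"),
  Gene_1      = c(80.56644, 79.37402, 62.73152, 54.82245,
                  69.89228, 64.42587, 56.47391, 0.25548),
  Gene_2      = c(0, 0, 0, 0, 0, 0, 0.323123, 0.085027)
)

result <- df %>%
  group_by(donor_id, time) %>%
  summarise(
    # For each gene column, pick the Stim value and subtract the U value.
    across(starts_with("Gene"),
           ~ .x[stimulation == "Stim"] - .x[stimulation == "U"]),
    .groups = "drop"
  )

print(result)
```

Indexing by the `stimulation` label makes the direction of the subtraction visible in the code, at the cost of being somewhat more verbose than `diff`.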
