
How to create a list of data.frame with matched rows and columns in R

Suppose I have two data frames, df1 and df2:

    set.seed(123)
    df1 <- data.frame(id = sample(letters[1:10], 10, replace = FALSE),
                      x = rnorm(10), y = rnorm(10), z = rnorm(10), u = rnorm(10))
    df1
       id           x           y           z           u
    1   f -1.02642090  0.91899661 -1.02412879 -0.07130809
    2   i -0.71040656 -0.57534696  0.11764660  1.44455086
    3   g  0.25688371  0.60796432 -0.94747461  0.45150405
    4   b -0.24669188 -1.61788271 -0.49055744  0.04123292
    5   e -0.34754260 -0.05556197 -0.25609219 -0.42249683
    6   c -0.95161857  0.51940720  1.84386201 -2.05324722
    7   d -0.04502772  0.30115336 -0.65194990  1.13133721
    8   j -0.78490447  0.10567619  0.23538657 -1.46064007
    9   a -1.66794194 -0.64070601  0.07796085  0.73994751
    10  h -0.38022652 -0.84970435 -0.96185663  1.90910357
    df2 <- data.frame(id = sample(letters[2:11], 10, replace = FALSE),
                      x = rnorm(10), y = rnorm(10), z = rnorm(10), v = rnorm(10))
    df2
       id           x           y           z           v
    1   j -1.27745077 -0.08868545 -0.56426954  1.84483867
    2   e  1.17719205 -1.59548490  0.97031123 -0.98191715
    3   c  0.90250583  0.85170932 -0.01863398  2.19600376
    4   h -1.26130418 -0.71356081  0.36237035 -0.20466767
    5   b  0.83745515  1.06643034  2.01130559  0.97514294
    6   i -2.34829031 -0.53624259 -1.17796750 -0.86756612
    7   k  0.61097114  0.53591706 -0.75517048 -0.50118759
    8   g -0.04786774 -1.82862663 -0.33128448  0.78559116
    9   f -2.39919771 -1.81353336 -0.28370270 -2.10224732
    10  d -0.01931896  1.37261371  0.31415290 -0.04220493

I would like to create a list (or, preferably, a single object) that keeps only the rows (matched by id) and the columns common to df1, df2, ..., such as:

df_lst
df1
  id          x          y          z
1  b -0.4456620 -0.4727914  1.2538149
2  c -1.2650612 -1.9666172  0.1533731
3  d  0.4978505  0.8377870  0.5539177
4  e  1.7869131 -1.6866933  0.6886403
5  f  0.3598138 -0.2179749 -0.2950715
6  g -0.5558411 -0.6250393  0.8215811
7  h  1.2240818 -1.0678237  0.4264642
8  i  0.4007715 -1.0260044  0.8951257
9  j -0.6868529  0.7013559 -1.1381369

df2
  id          x          y            z
1  b -1.0700682  0.4120223 -0.279333528
2  c -0.2416898 -0.1524106 -0.778997240
3  d  1.6232025  0.6343621 -0.685706846
4  e  1.2283928  2.1499193 -0.735026156
5  f  0.2760235 -1.3343536 -1.427685784
6  g -1.0489755  0.4958705  0.619283535
7  h -0.5208693  1.2339762 -0.006198262
8  i -0.7729782 -0.9007918 -0.319393809
9  j -0.4682005 -0.2288958 -0.374800093

We can use intersect to get the common column names and the common 'id' values of the two datasets. Then subset the rows with %in% and select the intersecting columns:

# Column names and ids shared by both data frames
nm1 <- intersect(names(df1), names(df2))
nm2 <- intersect(df1$id, df2$id)
# Keep only the shared rows and columns, then sort by id
df1new <- subset(df1, id %in% nm2, select = nm1)
df1new <- df1new[order(df1new$id), ]
df2new <- subset(df2, id %in% nm2, select = nm1)
df2new <- df2new[order(df2new$id), ]
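
To make the effect of the two intersect calls easier to see, here is a minimal sketch on toy data. The data frames a and b and the names common_cols and common_ids are made up for illustration and are not the question's df1 and df2; plain [ indexing is used in place of subset, which should select the same rows and columns.

```r
# Toy data (hypothetical, not the question's df1/df2)
a <- data.frame(id = c("c", "a", "b"), x = 1:3, y = 4:6, u = 7:9,
                stringsAsFactors = FALSE)
b <- data.frame(id = c("b", "d", "c"), x = 11:13, y = 14:16, v = 17:19,
                stringsAsFactors = FALSE)

# Columns present in both frames: id, x, y (u and v are dropped)
common_cols <- intersect(names(a), names(b))
# ids present in both frames: b and c ("a" and "d" are dropped)
common_ids <- intersect(a$id, b$id)

# Keep shared rows and shared columns, then sort each frame by id
a_new <- a[a$id %in% common_ids, common_cols]
a_new <- a_new[order(a_new$id), ]
b_new <- b[b$id %in% common_ids, common_cols]
b_new <- b_new[order(b_new$id), ]

print(a_new)
print(b_new)
```

After sorting, both frames list the same ids in the same order, so row i of a_new and row i of b_new refer to the same id.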

If there are many datasets, place them in a list and use Reduce to get the intersecting column names and 'id' values:

lst1 <- list(df1, df2)
nm1 <- Reduce(intersect, lapply(lst1, names))
nm2 <- Reduce(intersect, lapply(lst1, `[[`, "id"))

lst2 <- lapply(lst1, subset, subset = id %in% nm2, select = nm1)

If the result needs to be ordered by 'id':

lst2 <- lapply(lst1, function(x) {
  x1 <- subset(x, id %in% nm2, select = nm1)
  x1 <- x1[order(x1$id), ]
  row.names(x1) <- NULL
  x1
})
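
R's documentation for subset describes it as a convenience function for interactive use and recommends standard subsetting with [ when programming. If the steps above are to be wrapped in a function, one possible sketch uses [ indexing throughout. Here keep_common and its key argument are hypothetical names introduced for illustration, and the toy frames a and b are not the question's data.

```r
# keep_common() is a hypothetical helper, not part of the original answer.
# It keeps the columns and key values shared by every data frame in `dfs`,
# and returns the frames sorted by the key column.
keep_common <- function(dfs, key = "id") {
  cols <- Reduce(intersect, lapply(dfs, names))      # shared column names
  keys <- Reduce(intersect, lapply(dfs, `[[`, key))  # shared key values
  lapply(dfs, function(d) {
    out <- d[d[[key]] %in% keys, cols, drop = FALSE]
    out <- out[order(out[[key]]), , drop = FALSE]
    row.names(out) <- NULL                           # reset row names
    out
  })
}

# Toy data (hypothetical)
a <- data.frame(id = c("c", "a", "b"), x = 1:3, y = 4:6, u = 7:9,
                stringsAsFactors = FALSE)
b <- data.frame(id = c("b", "d", "c"), x = 11:13, y = 14:16, v = 17:19,
                stringsAsFactors = FALSE)

res <- keep_common(list(a, b))
print(res)
```

Because the helper only uses [, %in% and order, it does not depend on where nm1 and nm2 happen to be defined.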

-output

lst2
[[1]]
  id          x          y          z
1  b -0.4456620 -0.4727914  1.2538149
2  c -1.2650612 -1.9666172  0.1533731
3  d  0.4978505  0.8377870  0.5539177
4  e  1.7869131 -1.6866933  0.6886403
5  f  0.3598138 -0.2179749 -0.2950715
6  g -0.5558411 -0.6250393  0.8215811
7  h  1.2240818 -1.0678237  0.4264642
8  i  0.4007715 -1.0260044  0.8951257
9  j -0.6868529  0.7013559 -1.1381369

[[2]]
  id          x          y            z
1  b -1.0700682  0.4120223 -0.279333528
2  c -0.2416898 -0.1524106 -0.778997240
3  d  1.6232025  0.6343621 -0.685706846
4  e  1.2283928  2.1499193 -0.735026156
5  f  0.2760235 -1.3343536 -1.427685784
6  g -1.0489755  0.4958705  0.619283535
7  h -0.5208693  1.2339762 -0.006198262
8  i -0.7729782 -0.9007918 -0.319393809
9  j -0.4682005 -0.2288958 -0.374800093
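
The desired result in the question labels its elements df1 and df2, whereas the list above is unnamed. One way to get named elements is to start from a named list, since lapply carries the names through. The sketch below also swaps in match for %in% plus order, which picks the rows in sorted id order directly; this assumes each id occurs at most once per data frame. The toy frames and the names lst, cols and ids are illustrative only.

```r
# Toy data (hypothetical, not the question's df1/df2)
a <- data.frame(id = c("c", "a", "b"), x = 1:3, y = 4:6, u = 7:9,
                stringsAsFactors = FALSE)
b <- data.frame(id = c("b", "d", "c"), x = 11:13, y = 14:16, v = 17:19,
                stringsAsFactors = FALSE)

# Naming the list elements up front lets lapply() keep the names
lst <- list(df1 = a, df2 = b)

cols <- Reduce(intersect, lapply(lst, names))
ids  <- sort(Reduce(intersect, lapply(lst, `[[`, "id")))

# match() returns, for each shared id, its row position in d
# (assumption: ids are unique within each data frame)
df_lst <- lapply(lst, function(d) {
  out <- d[match(ids, d$id), cols, drop = FALSE]
  row.names(out) <- NULL
  out
})

names(df_lst)
df_lst$df1
df_lst$df2
```

With this layout, each element can be accessed by name, and identical(df_lst$df1$id, df_lst$df2$id) can be used as a quick check that the rows line up.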
