Filter and subset an R data frame based on selected variable/column names

Question

I am trying to subset a large data set with many variables (column names), say ax1, ax2, ax3, ax4, ax5, ..., ax20, bx1, ..., bx20, ..., zx1, ..., zx20. For example, suppose the subset I want consists of the variables ax3, ax5, ax11, ax19, bx3, bx5, bx11, bx19, ..., zx3, zx5, zx11, zx19.

I have tried the following code in R, but it is becoming very lengthy and cumbersome.

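setwd("")  # path to the working directory (left blank in the original post)
abc <- read.table("abc.txt", header = TRUE)
# Each wanted column is listed by hand, which is what makes this approach so long
new.abc <- data.frame(abc$ax3, abc$ax5, abc$ax11, abc$ax19,
                      abc$bx3, abc$bx5, abc$bx11, abc$bx19)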

The code keeps growing because I need to continue with cx3, cx5, cx11, cx19, ..., zx3, zx5, zx11, zx19. I am looking for an alternative approach that avoids this lengthy coding. Your help is greatly appreciated.

Answer 1

You could create the column names programmatically. If they follow the structure described in the question, we can do

cols <- c(outer(paste0(letters, "x"), c(3, 5, 11, 19), paste0))
cols
#[1] "ax3"  "bx3"  "cx3"  "dx3"  "ex3"  "fx3"  "gx3"  "hx3"  "ix3"  "jx3"  "kx3"...

and then use it to subset the data frame

new.abc <- abc[, cols]
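To see how the pieces fit together, here is a minimal self-contained sketch. The toy data frame `abc` (two rows of placeholder values, columns ax1 to zx20) is an assumption made for illustration and stands in for the data read from abc.txt in the question; the `missing_cols` check is likewise an optional addition, not part of the original answer.

```r
# Hypothetical stand-in for the question's data: columns ax1..ax20, bx1..bx20, ..., zx1..zx20
all_names <- c(t(outer(paste0(letters, "x"), 1:20, paste0)))
abc <- as.data.frame(matrix(seq_len(2 * length(all_names)), nrow = 2,
                            dimnames = list(NULL, all_names)))

# Build the wanted names: every prefix ax..zx combined with every wanted index
cols <- c(outer(paste0(letters, "x"), c(3, 5, 11, 19), paste0))

# Optional guard: stop early if any generated name is absent from the data
missing_cols <- setdiff(cols, names(abc))
stopifnot(length(missing_cols) == 0)

# Subset the original data frame by name
new.abc <- abc[, cols]
dim(new.abc)  # 2 rows, 104 columns (26 prefixes x 4 indices)
```

Subsetting by a character vector returns the columns in the order of `cols`, so here all the x3 columns come first, then x5, x11 and x19.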

If we also want to keep the columns in their natural order (ax3, ax5, ax11, ax19, bx3, ...), we can use gtools::mixedsort

new.abc <- abc[, gtools::mixedsort(cols)]
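If installing gtools is not an option, one possible base R alternative is to keep the data frame's own column order by filtering `names(abc)` against `cols`. This is a sketch rather than the answer's method, and it assumes the columns of `abc` are already stored in the desired order; the toy data frame below is hypothetical.

```r
# Hypothetical toy data: columns stored as ax1..ax20, bx1..bx20, ..., zx1..zx20
all_names <- c(t(outer(paste0(letters, "x"), 1:20, paste0)))
abc <- as.data.frame(matrix(0, nrow = 2, ncol = length(all_names),
                            dimnames = list(NULL, all_names)))

# Wanted names, generated index by index (ax3, bx3, ..., zx3, ax5, ...)
cols <- c(outer(paste0(letters, "x"), c(3, 5, 11, 19), paste0))

# Logical filtering of names(abc) keeps the names in the data frame's own order
ordered_cols <- names(abc)[names(abc) %in% cols]
new.abc <- abc[, ordered_cols]
head(names(new.abc), 8)
# [1] "ax3"  "ax5"  "ax11" "ax19" "bx3"  "bx5"  "bx11" "bx19"
```

Unlike `gtools::mixedsort`, this does not sort anything: if the columns in the file are in a different order, the result follows that order instead.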

Answer 1
2 2019-08-26 05:07:01
