
R - keep only columns with column names that match a string

I am relatively new to R and I've had a lot of luck finding answers on here, but this one has me stumped after 2 days of trying things.

I have a data frame with column names like this:

TargetID sample1.beta sample1.avg sample1.error sample1.pval sample2.beta sample2.avg sample2.error sample2.pval

This repeats for thousands of samples. I need to create a separate data frame for each piece of data: one for beta, one for avg, one for error, one for pval. I also need to keep the 1st column with the TargetID in all data frames. The resulting data frames would have column names like:

TargetID sample1.beta sample2.beta sample3.beta

TargetID sample1.pval sample2.pval sample3.pval

etc.

I have found answers for subsetting data frames, but they don't seem to apply to selecting all columns that contain a specific string (and keeping the 1st column).

I've also been exploring whether this is better done with awk on the txt file before I import it into R.

Use grepl or grep in the second position of "[", applied to names(dfrm), with a pattern that matches TargetID as well as the subset string:

 avg_sub <- dfrm[ , grepl("^TargetID|avg$", names(dfrm)) ]

The "^" pattern matches the beginning of a string, while the "$" pattern matches the end of a string, so "^TargetID|avg$" keeps every column whose name starts with TargetID or ends with avg.
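To get all four data frames at once, the same grepl idea can be wrapped in a loop over the suffixes. This is only a sketch: the toy dfrm below is made-up data standing in for the real file, and the names suffixes and subsets are my own, not from the question. The pattern escapes the dot so that ".beta" is matched literally at the end of the name.

```r
# Toy data frame standing in for the real one (all values are invented).
dfrm <- data.frame(
  TargetID      = c("cg01", "cg02"),
  sample1.beta  = c(0.1, 0.2),
  sample1.avg   = c(10, 20),
  sample1.error = c(0.01, 0.02),
  sample1.pval  = c(0.5, 0.6),
  sample2.beta  = c(0.3, 0.4),
  sample2.avg   = c(30, 40),
  sample2.error = c(0.03, 0.04),
  sample2.pval  = c(0.7, 0.8)
)

# Hypothetical list of the suffixes to split on.
suffixes <- c("beta", "avg", "error", "pval")

# Build one data frame per suffix; each keeps TargetID plus the matching columns.
subsets <- lapply(suffixes, function(sfx) {
  pattern <- paste0("^TargetID$|\\.", sfx, "$")
  dfrm[, grepl(pattern, names(dfrm)), drop = FALSE]
})
names(subsets) <- suffixes

names(subsets$beta)
# [1] "TargetID"     "sample1.beta" "sample2.beta"
```

Each element of subsets is then an ordinary data frame, e.g. subsets$pval. grepl returns a logical vector and grep returns integer positions; either works as a column index inside "[".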

You can try the following (using mtcars, as you don't supply example data):

library(dplyr)
# select the column mpg and all the columns containing an r
head(mtcars %>% select(mpg, contains("r")))
                   mpg drat gear carb
Mazda RX4         21.0 3.90    4    4
Mazda RX4 Wag     21.0 3.90    4    4
Datsun 710        22.8 3.85    4    1
Hornet 4 Drive    21.4 3.08    3    1
Hornet Sportabout 18.7 3.15    3    2
Valiant           18.1 2.76    3    1
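Applied to column names like those in the question, the same dplyr approach might look like the sketch below. The toy dfrm is invented data, and ends_with() is swapped in for contains() here because the question's suffixes sit at the end of each name; ends_with() matches its argument literally rather than as a regular expression.

```r
library(dplyr)

# Toy data frame standing in for the real one (all values are invented).
dfrm <- data.frame(
  TargetID     = c("cg01", "cg02"),
  sample1.beta = c(0.1, 0.2),
  sample1.pval = c(0.5, 0.6),
  sample2.beta = c(0.3, 0.4),
  sample2.pval = c(0.7, 0.8)
)

# Keep TargetID plus every column whose name ends in ".beta".
beta_sub <- dfrm %>% select(TargetID, ends_with(".beta"))

names(beta_sub)
# [1] "TargetID"     "sample1.beta" "sample2.beta"
```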
