
How can I create a new txt file in R by matching two txt files and keeping the values they share?

I have 2 text files: File A and File B.

I want to match the first column of File A against the first row (the header) of File B.

If a value in the first column of File A also appears in the first row of File B, I want to keep that entire column of File B, together with File B's first column.

File A:

"...1" "AZD5153" "I-BET-762" "I-BRD9" "JQ1" "OTX-015" "PFI-1" "RVX-208"
"1" "697" 0.155445 1.328728 7.6345 7.553337 0.496983 1.776878 24.540592
"2" "5637" 11.767517 66.561037 314.672133 3.891947 17.54448 10.27559 261.520227
"3" "22RV1" 2.144765 9.04165 193.4228 4.448654 19.315063 9.55938 72.036416
"4" "23132-87" 1.882177 41.26784 33.482054 10.959235 9.025218 19.621473 75.332425
"5" "42-MG-BA" 2.252297 26.56874 54.934795 7.92924 10.276993 7.937254 64.873664
"6" "639-V" 6.412568 16.979172 30.882936 12.444024 21.915518 6.449247 96.50391

File B:

"...1" "1321N1" "143B" "22RV1" "23132-87" "42-MG-BA"
"1" "100009676_at" 61161 62052 61249 66154 54236
"2" "10000_at" 81556 66152 45676 43519 66723
"3" "10001_at" 97864 99699 8872 91376 10029
"4" "10002_at" 37977 40304 38455 37085 36431
"5" "10003_at" 35458 38504 40458 39508 41589
"6" "100048912_at" 40034 37959 41465 39271 39157
"7" "100049716_at" 42744 46775 52087 47239 42522

Expected File:

"...1" "22RV1" "23132-87" "42-MG-BA"
"1" "100009676_at" 61249 66154 54236
"2" "10000_at" 45676 43519 66723
"3" "10001_at" 8872 91376 10029
"4" "10002_at" 38455 37085 36431
"5" "10003_at" 40458 39508 41589
"6" "100048912_at" 41465 39271 39157
"7" "100049716_at" 52087 47239 42522

First of all, ensure you have the correct paths to FILEA.txt and FILEB.txt, as well as the desired path to FILEC.txt. In my case, I did:

path_to_file_A <- path.expand("~/FILEA.txt")
path_to_file_B <- path.expand("~/FILEB.txt")
path_to_file_C <- path.expand("~/FILEC.txt")

Now the following code should work:

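# check.names = FALSE keeps column names such as "23132-87" unchanged
A <- read.table(path_to_file_A, header = TRUE, check.names = FALSE)
B <- read.table(path_to_file_B, header = TRUE, check.names = FALSE)

# match() gives the positions of B's columns named in A's first column;
# na.omit() drops the values of A that have no column in B.
# cbind() then puts B's first column in front of the matched columns.
result <- cbind(B[1], B[na.omit(match(A[[1]], names(B)))])
write.table(result, path_to_file_C)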

Which results in:

FILEC.txt

"...1" "22RV1" "23132-87" "42-MG-BA"
"1" "100009676_at" 61249 66154 54236
"2" "10000_at" 45676 43519 66723
"3" "10001_at" 8872 91376 10029
"4" "10002_at" 38455 37085 36431
"5" "10003_at" 40458 39508 41589
"6" "100048912_at" 41465 39271 39157
"7" "100049716_at" 52087 47239 42522
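
For anyone who wants to try this without creating files first, here is a minimal self-contained sketch of the same matching step. It assumes the data from the question is pasted inline and read with read.table(text = ...); the names a_text, b_text and keep are only illustrative. It also swaps match() for %in%, so the matched columns come out in File B's order rather than File A's (for this data the two orders are the same).

```r
# Inline copies of File A and File B from the question.
a_text <- '"...1" "AZD5153" "I-BET-762" "I-BRD9" "JQ1" "OTX-015" "PFI-1" "RVX-208"
"1" "697" 0.155445 1.328728 7.6345 7.553337 0.496983 1.776878 24.540592
"2" "5637" 11.767517 66.561037 314.672133 3.891947 17.54448 10.27559 261.520227
"3" "22RV1" 2.144765 9.04165 193.4228 4.448654 19.315063 9.55938 72.036416
"4" "23132-87" 1.882177 41.26784 33.482054 10.959235 9.025218 19.621473 75.332425
"5" "42-MG-BA" 2.252297 26.56874 54.934795 7.92924 10.276993 7.937254 64.873664
"6" "639-V" 6.412568 16.979172 30.882936 12.444024 21.915518 6.449247 96.50391'

b_text <- '"...1" "1321N1" "143B" "22RV1" "23132-87" "42-MG-BA"
"1" "100009676_at" 61161 62052 61249 66154 54236
"2" "10000_at" 81556 66152 45676 43519 66723
"3" "10001_at" 97864 99699 8872 91376 10029
"4" "10002_at" 37977 40304 38455 37085 36431
"5" "10003_at" 35458 38504 40458 39508 41589
"6" "100048912_at" 40034 37959 41465 39271 39157
"7" "100049716_at" 42744 46775 52087 47239 42522'

# The header line has one field fewer than the data lines, so read.table()
# treats the first field of each data line as the row name.
A <- read.table(text = a_text, header = TRUE, check.names = FALSE)
B <- read.table(text = b_text, header = TRUE, check.names = FALSE)

# TRUE for every column of B whose name occurs in A's first column.
keep <- names(B) %in% as.character(A[[1]])
keep[1] <- TRUE  # always keep B's first column (the probe IDs)

result <- B[keep]

# Print to the console; pass a file path instead of stdout() to save it.
write.table(result, stdout())
```

If no name matches at all, result simply contains File B's first column, so it may be worth checking sum(keep) before writing the file.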
