I have five different text files that each contain lines of 7-digit numbers. I want to create a table that shows, using 1s and 0s, whether each line is present in each file.
For example:
        file1.txt file2.txt file3.txt
xxxxxxx         1         0         1
xxxxxxx         0         1         1
xxxxxxx         1         1         1
I have little to no experience in R, or any kind of coding. Can someone help me? I can add more information if someone asks.
Assuming three files exist:
==> file1.txt <==
1111111
3333333
==> file2.txt <==
2222222
3333333
==> file3.txt <==
1111111
2222222
3333333
I would read the three files into separate dataframes:
file1=read.csv('file1.txt', header=FALSE)
file2=read.csv('file2.txt', header=FALSE)
file3=read.csv('file3.txt', header=FALSE)
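One caveat when reading the files this way: by default read.csv converts all-digit lines to numbers, so a value with a leading zero would lose it. The sketch below illustrates this with a hypothetical sample file (the temporary path and the names as_numbers and as_text are made up for the example) and shows that passing colClasses = "character" keeps the lines as text.

```r
# Hypothetical sample file containing one value with a leading zero.
path <- tempfile(fileext = ".txt")
writeLines(c("0123456", "1111111"), path)

# Default behaviour: the column is converted to integers.
as_numbers <- read.csv(path, header = FALSE)

# Forcing the column to character keeps each line exactly as written.
as_text <- read.csv(path, header = FALSE, colClasses = "character")

print(as_numbers$V1)  # the leading zero is dropped here
print(as_text$V1)     # the original 7-character strings are kept
```

If none of the values can start with 0, the plain read.csv calls work as they are.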
Add a flag to each:
file1$file1.txt <- rep(1,nrow(file1))
file2$file2.txt <- rep(1,nrow(file2))
file3$file3.txt <- rep(1,nrow(file3))
Perform an outer join using merge, replace NA values with 0, and name the rows with the values:
merged=merge(merge(file1,file2, all=TRUE), file3, all=TRUE)
merged[is.na(merged)] <- 0
rownames(merged) <- merged[,1]
merged[,1] <- NULL
merged is now:
        file1.txt file2.txt file3.txt
1111111         1         0         1
2222222         0         1         1
3333333         1         1         1
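Since the question mentions five files, the same steps can be written once and applied to any number of files. This is only a sketch under some assumptions: it recreates the three sample files in a temporary directory so it runs as-is, the names files and frames are chosen for the example, and duplicate lines within a file are dropped before joining.

```r
# Recreate the sample files in a temporary directory (for illustration only).
dir <- tempdir()
writeLines(c("1111111", "3333333"), file.path(dir, "file1.txt"))
writeLines(c("2222222", "3333333"), file.path(dir, "file2.txt"))
writeLines(c("1111111", "2222222", "3333333"), file.path(dir, "file3.txt"))

# List every file to compare; extend this vector to five names as needed.
files <- c("file1.txt", "file2.txt", "file3.txt")

# Read each file as text and add a flag column named after the file.
frames <- lapply(files, function(f) {
  df <- read.csv(file.path(dir, f), header = FALSE, colClasses = "character")
  df <- unique(df)  # assumption: a repeated line counts once
  df[[f]] <- 1
  df
})

# Outer-join all data frames pairwise on the shared column V1.
merged <- Reduce(function(a, b) merge(a, b, by = "V1", all = TRUE), frames)

# Missing flags mean the line is absent from that file.
merged[is.na(merged)] <- 0

# Use the values as row names and drop the value column.
rownames(merged) <- merged$V1
merged$V1 <- NULL

print(merged)
```

Reduce applies merge to the first two data frames, then merges that result with the third, and so on, which avoids nesting one merge call per file.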
See in particular:
https://stat.ethz.ch/R-manual/R-devel/library/utils/html/read.table.html
https://stat.ethz.ch/R-manual/R-devel/library/base/html/merge.html
And: How to join (merge) data frames (inner, outer, left, right)?