I have two data frames, Test and User.
Test has 100,000 rows and User has 1,400,000 rows. I want to pull specific columns from User and add them to Test. For example, I want Income and Cat from User for every row in Test. Names are repeated in Test, and I want to keep every row of Test without removing duplicates. Where a name matches several rows in User, I want the values from the first matching row.
For example, for Name A, Income is 100 and Cat is M and L. Since M occurs first, I need M.
> Test
Name Income Cat
A
B
C
D
...
User Cat Income
A M 100
B M 320
C U 400
D L 900
A L 100
..
I used a for loop, but it takes a lot of time. I do not want to use the merge function.
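for (i in seq_len(nrow(Test))) {
  # Rows of User whose Name matches the current Test row
  idx <- which(User$Name == Test[i, "Name"])
  # Keep only the value from the first matching row
  Test[i, "Cat"] <- User[idx, "Cat"][1]
  Test[i, "Income"] <- User[idx, "Income"][1]
}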
I also tried merge, but the result has more than 100k rows: it adds extra rows to Test.
I want a faster way that avoids both the for loop and merge. Can someone suggest a function from the apply family?
You can use match to find the first matching row, then vectorize the copying:
# Setup the data
User=data.frame(User=c('A','B','C','D','A'),Cat=c('M','M','U','L','L'),
Income=c(100,320,400,900,100))
Test=data.frame(Name=c('A','B','C','D'))
Test$Income<-NA
Test$Cat<-NA
> Test
Name Income Cat
1 A NA NA
2 B NA NA
3 C NA NA
4 D NA NA
## Copy only the first match from User to Test
Test[,c("Income","Cat")]<-User[match(Test$Name,User$User),c("Income","Cat")]
> Test
Name Income Cat
1 A 100 M
2 B 320 M
3 C 400 U
4 D 900 L
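The question mentions repeated names in Test, which the small example above does not exercise. Here is a minimal sketch of the same match approach with a repeated name and a name that is missing from User. The data are made up for illustration (the name "Z" in particular is hypothetical), and the key column in User is called User, as in the setup above.

```r
# Hypothetical data: Test repeats "A" and contains "Z", which is absent from User
User <- data.frame(User = c("A", "B", "C", "D", "A"),
                   Cat = c("M", "M", "U", "L", "L"),
                   Income = c(100, 320, 400, 900, 100),
                   stringsAsFactors = FALSE)
Test <- data.frame(Name = c("A", "B", "A", "Z"), stringsAsFactors = FALSE)

# match() returns, for each Test$Name, the index of its FIRST occurrence in
# User$User, or NA when there is no occurrence
idx <- match(Test$Name, User$User)

# Index the User columns once, for all rows of Test at the same time
Test$Income <- User$Income[idx]
Test$Cat <- User$Cat[idx]

# Test still has 4 rows; the row for "Z" holds NA in both new columns
print(Test)
```

Because idx has exactly one entry per row of Test, the row count cannot grow the way it can with merge when User holds several rows for the same name.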
Using the dplyr package, you can do something like this (here df is the User data frame):
library(dplyr)
df %>% group_by(Name) %>% slice(1)
For your example, you get:
Original data frame:
df
Name Cat Income
1 A M 100
2 B M 320
3 C U 400
4 D L 900
5 A L 100
Picking first occurrence:
df %>% group_by(Name) %>% slice(1)
Source: local data frame [4 x 3]
Groups: Name [4]
Name Cat Income
(chr) (chr) (int)
1 A M 100
2 B M 320
3 C U 400
4 D L 900
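The slice(1) step only reduces User to one row per name; it does not yet attach anything to Test. One possible way to finish the job, sketched here under the assumption that the key column is called Name in both data frames, is a left_join onto Test. This is still a join, so it may not suit someone who wants to avoid merge-style operations entirely, but with one row per name on the right-hand side it should not add rows to Test. The data below are made up for illustration.

```r
library(dplyr)

# Hypothetical data, with the key column named Name in both frames
User <- data.frame(Name = c("A", "B", "C", "D", "A"),
                   Cat = c("M", "M", "U", "L", "L"),
                   Income = c(100, 320, 400, 900, 100),
                   stringsAsFactors = FALSE)
Test <- data.frame(Name = c("A", "B", "A", "D"), stringsAsFactors = FALSE)

# Keep the first row per Name, then drop the grouping
first_rows <- User %>% group_by(Name) %>% slice(1) %>% ungroup()

# The right side has one row per Name, so every Test row is kept exactly once
result <- left_join(Test, first_rows, by = "Name")
print(result)
```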