Automating finding and converting values in R

I have a sample dataset with 45 rows, shown below.

 itemid                    title release_date
16    573          Body Snatchers          1993
17    670          Body Snatchers          1993
41   1645        Butcher Boy, The          1998
42   1650        Butcher Boy, The          1998
1     218               Cape Fear          1991
18    673               Cape Fear          1962
27   1234   Chairman of the Board          1998
43   1654   Chairman of the Board          1998
2     246             Chasing Amy          1997
5     268             Chasing Amy          1997
11    309                Deceiver          1997
37   1606                Deceiver          1997
28   1256 Designated Mourner, The          1997
29   1257 Designated Mourner, The          1997
12    329      Desperate Measures          1998
13    348      Desperate Measures          1998
9     304           Fly Away Home          1996
15    500           Fly Away Home          1996
26   1175               Hugo Pool          1997
39   1617               Hugo Pool          1997
31   1395       Hurricane Streets          1998
38   1607       Hurricane Streets          1998
10    305          Ice Storm, The          1997
21    865          Ice Storm, The          1997
4     266      Kull the Conqueror          1997
19    680      Kull the Conqueror          1997
22    876             Money Talks          1997
24    881             Money Talks          1997
35   1477              Nightwatch          1997
40   1625              Nightwatch          1997
6     274                 Sabrina          1995
14    486                 Sabrina          1954
33   1442     Scarlet Letter, The          1995
36   1542     Scarlet Letter, The          1926
3     251         Shall We Dance?          1996
30   1286         Shall We Dance?          1937
32   1429           Sliding Doors          1998
45   1680           Sliding Doors          1998
20    711  Substance of Fire, The          1996
44   1658  Substance of Fire, The          1996
23    878          That Darn Cat!          1997
25   1003          That Darn Cat!          1997
34   1444          That Darn Cat!          1965
7     297             Ulee's Gold          1997
8     303             Ulee's Gold          1997

What I am trying to do is convert the itemid based on the movie title, but only when the release date is the same. For example, the movie 'Ulee's Gold' has two item IDs, 297 and 303. I am trying to find a way to automate checking the release date of each movie and, if it is the same, replacing itemid[2] of that movie with itemid[1]. For the time being I have done it manually by extracting the item IDs into two vectors, x and y, and then replacing them in a loop. I want to know if there is a better way of getting this done, because this sample has only 18 movies with multiple IDs, but the full dataset has a few hundred. Finding and processing them manually would be very time-consuming.

I am providing the code that I have used to get this task done.

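# x holds the duplicate itemids to replace; y holds the itemid each one should become
x <- c(670,1650,1654,268,1606,1257,348,500,1617,1607,865,680,881,1625,1680,1658,1003,303)
y <- c(573,1645,1234,246,309,1256,329,304,1175,1395,305,266,876,1477,1429,711,878,297)

# Match on the itemid value rather than treating x[i] as a row position
for (i in seq_along(x)) {
  df$itemid[df$itemid == x[i]] <- y[i]
}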

Is there a better way to get this done?

I think you can do this straightforwardly with dplyr.

Using your comment above, a brief example:

itemid <- c(878,1003,1444,297,303)
title <- c(rep("That Darn Cat!", 3), rep("Ulee's Gold", 2))
year <- c(1997,1997,1965,1997,1997)

temp <- data.frame(itemid,title,year)
temp

library(dplyr)

temp %>% group_by(title,year) %>% mutate(itemid1 = min(itemid))

(I renamed 'release_date' to 'year'.) This groups the rows by title and year, finds the minimum itemid within each group, and mutate() stores that minimum in a new variable, itemid1.

which gives:

#  itemid          title year itemid1
#1    878 That Darn Cat! 1997     878
#2   1003 That Darn Cat! 1997     878
#3   1444 That Darn Cat! 1965    1444
#4    297    Ulee's Gold 1997     297
#5    303    Ulee's Gold 1997     297
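
If you would rather overwrite itemid in place and avoid the extra package, the same grouping idea can probably be done in base R with ave(). This is only a minimal sketch on the toy data above: the helper column name `key` is my own, and it assumes that pasting title and year together identifies each group uniquely.

```r
# Toy data copied from the example above
temp <- data.frame(
  itemid = c(878, 1003, 1444, 297, 303),
  title  = c(rep("That Darn Cat!", 3), rep("Ulee's Gold", 2)),
  year   = c(1997, 1997, 1965, 1997, 1997)
)

# Build one grouping key per title/year pair (hypothetical helper name)
key <- paste(temp$title, temp$year, sep = " | ")

# ave() applies FUN within each group and returns a vector aligned with
# the original rows, so it can overwrite itemid directly
temp$itemid <- ave(temp$itemid, key, FUN = min)

temp$itemid
# itemid is now 878, 878, 1444, 297, 297
```

If you want the first itemid in row order rather than the smallest one, swapping in FUN = function(v) v[1] should do that. In the data shown the two coincide, because the rows are already sorted by itemid within each title.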
