I'm trying to add ID codes to a dataframe (creel), which is a time series of shellfish landings from 2000 to 2015. The first few columns are given below:
head(creel,10)
week year boat fID
1 2000-W01 2000 Mousa NA
2 2000-W01 2000 Yell NA
3 2000-W01 2000 Foula NA
4 2000-W01 2000 Foula NA
5 2000-W02 2000 Foula NA
6 2000-W02 2000 Papa Stour NA
7 2000-W02 2000 Fetlar NA
8 2000-W02 2000 Unst NA
9 2000-W03 2000 Foula NA
10 2000-W03 2000 Fair Isle NA
...
The IDs are for boat owners, which change over time. I have details for who owned which boat and when, and have created unique codes to add to the "fID" column (which I created and filled with NAs). For the purposes of this, let's say "aa" is the ID for the Mousa, "ab" for Yell, "ac" for Foula etc. in the rows above. If the owner of Mousa then buys a new boat which is in the dataframe, the "aa" would go with them and be assigned to the new boat name.
The week vector was created from actual dates, using the ISOweek function. The week vector is an ordered factor, so R knows the correct order from start to finish in the time series:
creel$week <- as.ordered(creel$week)
#Levels:2000-W01 < 2000-W02 < 2000-W03 < 2000-W04 < 2000-W05<...<2015-W53
I have tried to add the unique fID codes for one boat to begin with, but it did not work:
creel$fID[which(creel$boat=="Mousa" & creel$week=>"2004-W53" & creel$week=<"2015-W53"),] <- as.factor("aa")
"aa" is the code I'd like to assign in the fID vector, only between weeks 2004-W53 and 2015-W53. I wasn't sure if R would recognise > or < when using with the weeks - I did find that unclass(creel$week) gives actual values for week numbers which could maybe be used instead.
I also tried using ifelse, but only for boats where the owner did not change throughout the entire dataset (in which case the week is irrelevant). Something like this, (also didn't work!):
creel$fID <- ifelse(creel$boat=="Unst", as.factor(creel$fID=="ad"), NA)
The dataset is very large, although I'm happy to do each person/boat combo individually if it's easier.
UPDATE: Here's an example of the other df I have, which details who owned which boat, and when:
Person code boat1 date_from date_to boat2 date_from2 date_to2
1 Bob aa Mousa 2002-W53 2005-W34 <NA> <NA> <NA>
2 Bill ab Yell 1999-W52 2010-W52 <NA> <NA> <NA>
3 James ac Foula 1999-W52 2005-W26 Mousa 2005-W35 2015-W53
4 Tom ad Unst 1999-W52 2015-W53 <NA> <NA> <NA>
5 Willie ae Fetlar 2007-W35 2015-W53 <NA> <NA> <NA>
6 Wayne af Yell 2011-W01 2013-W13 <NA> <NA> <NA>
You can see that James owned the "Mousa" after Bob, and that Wayne owned the "Yell" after Bill. I need James' ID to remain as "ac" for the weeks he owned both the Foula and the Mousa (ie so I can trace the fisherman through time, not necessarily just the boat).
Here is what I would do, though, there likely are better methods. I use dplyr
but only marginally to count to observations per week. I believe everything else is done in base R.
library(dplyr)
creel$ref.week<- rep(1:length(unique(creel$week)),
(creel %>% group_by(week) %>% summarise(n= n()))$n)
#add a reference column
creel.subset<-creel[creel$ref.week %in% c(1,2),]
#subset the weeks you want by that reference column. Obviously your
#reference weeks will be different.
creel.subset$fID<-with(creel.subset, ifelse(boat =="Mousa", "aa",
ifelse(boat == "Yell", "ab",
ifelse(boat == "Foula", "ac", NA))))
#name the fID's however you want. This is just example.
creel.subset
week year boat fID ref.week
1 2000-W01 2000 Mousa aa 1
2 2000-W01 2000 Yell ab 1
3 2000-W01 2000 Foula ac 1
4 2000-W01 2000 Foula ac 1
5 2000-W02 2000 Foula ac 2
6 2000-W02 2000 Papa_Stour <NA> 2
7 2000-W02 2000 Fetlar <NA> 2
8 2000-W02 2000 Unst <NA> 2
If you want to put it all back together into one big data.frame
:
creel.back_together<-rbind(creel.subset, creel[!creel$ref.week %in% c(1,2),])
creel.back_together
week year boat fID ref.week
1 2000-W01 2000 Mousa aa 1
2 2000-W01 2000 Yell ab 1
3 2000-W01 2000 Foula ac 1
4 2000-W01 2000 Foula ac 1
5 2000-W02 2000 Foula ac 2
6 2000-W02 2000 Papa_Stour <NA> 2
7 2000-W02 2000 Fetlar <NA> 2
8 2000-W02 2000 Unst <NA> 2
9 2000-W03 2000 Foula <NA> 3
10 2000-W03 2000 Fair_Isle <NA> 3
Edit: I spent over an hour trying to figure out how to make this work with the ISOweek
values, but no luck. I definitely think this would be easier to do dealing with regular date values. Here is my solution with the additional data.frame
you provided which i called mydata
, though it ended up becoming mydata3
. I do expect this to be fairly slow for large data sets, but I am pretty sure it does what you are wanting:
library(ISOweek)
library(lubridate)
library(data.table)
fullWeek<-function(x){
paste(x, "-1", sep = "")
}
creel$week<-as.character(creel$week)
creel$week<-fullWeek(creel$week)
creel$week<-ISOweek2date(creel$week)
creel$week<-as_date(ymd(creel$week))
mydata1<-mydata[,1:5]
mydata2<-mydata[,c(1:2,6:8)]
colnames(mydata2)<-colnames(mydata1)
mydata3<-na.omit(rbind(mydata1, mydata2))
mydata3[,4:5]<-sapply(mydata3[,4:5], fullWeek)
mydata3[,4:5]<-lapply(mydata3[,4:5], ISOweek2date)
mydata3[,4:5]<-lapply(mydata3[,4:5], function(x) as_date(ymd(x)))
## undoing all of the ISOweek nonsense
for(i in 1:nrow(mydata3)){
boat1<-mydata3[i,]$boat1
date_from<-mydata3[i,]$date_from
date_to<-mydata3[i,]$date_to
code<-mydata3[i,]$code
for(j in 1:nrow(creel)){
boat2<-creel[j,]$boat
date<-creel[j,]$week
if(boat1 == boat2 && date %between% c(date_from, date_to)) {
creel[j,]$fID<-code
}
}
}
creel
week year boat fID
2000-01-03 2000 Mousa <NA>
2000-01-03 2000 Yell ab
2000-01-03 2000 Foula ac
2000-01-03 2000 Foula ac
2000-01-10 2000 Foula ac
2000-01-10 2000 Papa_Stour <NA>
2000-01-10 2000 Fetlar <NA>
2000-01-10 2000 Unst ad
2000-01-17 2000 Foula ac
2000-01-17 2000 Fair_Isle <NA>
Now, if you decide you want to work with the ISOweek
dates out of convenience, then:
creel$week<-ISOweek(creel$week)
creel
week year boat fID
2000-W01 2000 Mousa <NA>
2000-W01 2000 Yell ab
2000-W01 2000 Foula ac
2000-W01 2000 Foula ac
2000-W02 2000 Foula ac
....
The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.