[英]Using apply with a user-defined function in R
我在r中定義了以下函數:
#A function that compares color and dates to determine if there is a match
getTagColor <- function(color, date){
for (i in (1:nrow(TwistTieFix))){
if ((color == TwistTieFix$color_match[i]) &
(date > TwistTieFix$color_match[i]) &
(date <= TwistTieFix$julian_cut_off_date[i])) {
Data$color_code <- TwistTieFix$color_code[i]
print(Data$color_code)
}
}
}
然后,我嘗試使用apply()將函數應用於每一行。
#Apply the above function to the data set
testData <- apply(Data, 1, getTagColor(Data$tag_color,Data$julian_date))`
該代碼的目標是在Data中使用兩個變量,並根據TwistTieFix中的信息在Data(color_code)中找到另一個值放入新列中。 當我運行代碼時,我得到警告列表,說
In if ((color == TwistTieFix$color_match[i]) & (date > ... :
the condition has length > 1 and only the first element will be used
我無法確定該函數為什么不使用每一行中的日期和顏色,而不能在函數中使用它(至少我認為這是錯誤的)。 謝謝!
以下是使用的數據幀的示例:
TwistTieFix
color_name date color_code cut_off_date color_match julian_start julian_cut_off_date
yellow 2013-08-12 y1 2001-07-02 yellow 75 389
blue 2000-09-28 b1 2001-08-12 blue 112 430
數據
coll_date julian_date tag_color
2013-08-13 76 yellow
2013-08-14 76 yellow
2000-09-29 112 blue
數據具有更多由不同變量組成的列,但不允許我包含所有列。 但是,我已經在函數中引用的數據中包含了列。 數據集使用read.csv加載到r中,並且來自Excel csv文件。
在我看來,您似乎想加入Data和TwistTieFix,其中tag_color=color_match
和julian_start <= julian_date <= julian_cut_off_date
。 這是dput
形式的樣本數據dput
TwistTieFix <- structure(list(color_name = structure(c(2L, 1L), .Label = c("blue",
"yellow"), class = "factor"), date = structure(c(2L, 1L), .Label = c("2000-09-28",
"2013-08-12"), class = "factor"), color_code = structure(c(2L,
1L), .Label = c("b1", "y1"), class = "factor"), cut_off_date = structure(1:2, .Label = c("2001-07-02",
"2001-08-12"), class = "factor"), color_match = structure(c(2L,
1L), .Label = c("blue", "yellow"), class = "factor"), julian_start = c(75L,
112L), julian_cut_off_date = c(389L, 430L)), .Names = c("color_name",
"date", "color_code", "cut_off_date", "color_match", "julian_start",
"julian_cut_off_date"), class = "data.frame", row.names = c(NA,
-2L))
Data <- structure(list(coll_date = structure(c(2L, 3L, 1L), .Label = c("2000-09-29",
"2013-08-13", "2013-08-14"), class = "factor"), julian_date = c(76L,
76L, 112L), tag_color = structure(c(2L, 2L, 1L), .Label = c("blue",
"yellow"), class = "factor")), .Names = c("coll_date", "julian_date",
"tag_color"), class = "data.frame", row.names = c(NA, -3L))
執行此合並的一種簡單方法是使用data.table
庫。 你可以做
#convert to data.table and set keys
ttf<-setDT(TwistTieFix)
setkey(ttf, color_match, julian_start)
dt<-setDT(Data)
setkey(dt, tag_color, julian_date)
#merge and extract columns
ttf[dt, roll=T][julian_start<julian_cut_off_date,list(coll_date,
julian_date=julian_start, tag_color=color_match, color_code)]
要得到
coll_date julian_date tag_color color_code
1: 2000-09-29 112 blue b1
2: 2013-08-13 76 yellow y1
3: 2013-08-14 76 yellow y1
聲明:本站的技術帖子網頁,遵循CC BY-SA 4.0協議,如果您需要轉載,請注明本站網址或者原文地址。任何問題請咨詢:yoyou2525@163.com.