
For data.frame in R, pulling data from one data frame based on values from another data frame
Append values to a data frame from another data frame, given a condition and improving efficiency of the code in R
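I have a dataset called train, and I want to append values from the total dataset to new columns in it whenever both the created_at attribute and the user_id attribute match between the two datasets. Below is the code I wrote.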
total = read.csv('Data.csv')
train = read.csv('train.csv', sep='\t')
train$lang=NA
train$tweet_lang=NA
train$time_zone=NA
train$instrumentalness=NA
train$liveness=NA
for (i in 1:nrow(train))
{
  train[i,'lang'] = total[which( total$created_at == as.character(train[i,'created_at']) && total$user_id == as.character(train[i,'user_id']) ),'lang']
  train[i,'tweet_lang'] = total[which( total$created_at == as.character(train[i,'created_at']) && total$user_id == as.character(train[i,'user_id']) ),'tweet_lang']
  train[i,'time_zone'] = total[which( total$created_at == as.character(train[i,'created_at']) && total$user_id == as.character(train[i,'user_id']) ),'time_zone']
  train[i,'instrumentalness'] = total[which( total$created_at == as.character(train[i,'created_at']) && total$user_id == as.character(train[i,'user_id']) ),'instrumentalness']
  train[i,'liveness'] = total[which( total$created_at == as.character(train[i,'created_at']) && total$user_id == as.character(train[i,'user_id']) ),'liveness']
}
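However, for i=3 I get the error: Error in x[...] <- m : replacement has length zero. How can I fill in the value in the train dataset even when there is nothing to fill in (for example, with an empty string)? Also, this implementation (using a loop) is very slow. Is there any way to vectorize or parallelize the code so that it runs faster?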
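One possible direction, offered as a sketch rather than a definitive answer: the error most likely appears when a row of train has no counterpart in total, because which() then returns integer(0) and the right-hand side of the assignment is empty. In addition, && is not an element-wise operator in R, so comparing whole columns needs &. A vectorized lookup with match() avoids both problems and leaves NA where nothing matches. The toy data frames below are hypothetical stand-ins for Data.csv and train.csv, and the "|" separator assumes that character never occurs in either key column.

```r
# Hypothetical stand-in for Data.csv
total <- data.frame(
  created_at       = c("2020-01-01", "2020-01-02", "2020-01-03"),
  user_id          = c("u1", "u2", "u3"),
  lang             = c("en", "fr", "de"),
  tweet_lang       = c("en", "fr", "de"),
  time_zone        = c("UTC", "Paris", "Berlin"),
  instrumentalness = c(0.1, 0.2, 0.3),
  liveness         = c(0.5, 0.6, 0.7),
  stringsAsFactors = FALSE
)

# Hypothetical stand-in for train.csv; the third row has no match in total
train <- data.frame(
  created_at       = c("2020-01-01", "2020-01-02", "2020-01-09"),
  user_id          = c("u1", "u2", "u9"),
  stringsAsFactors = FALSE
)

cols <- c("lang", "tweet_lang", "time_zone", "instrumentalness", "liveness")

# Build one composite key per row (assumes "|" does not appear in the keys)
train_key <- paste(as.character(train$created_at), as.character(train$user_id), sep = "|")
total_key <- paste(as.character(total$created_at), as.character(total$user_id), sep = "|")

# For each row of train, the index of the first matching row of total, or NA
idx <- match(train_key, total_key)

# Copy all five columns at once; rows with idx == NA are filled with NA
train[cols] <- total[idx, cols]

print(train)
```

Unmatched rows end up as NA, which can be replaced afterwards if an empty string is preferred, for example train$lang[is.na(train$lang)] <- "". A left join with merge(train, total, by = c("created_at", "user_id"), all.x = TRUE) is another common way to get a similar result, although merge() may reorder rows and will duplicate a train row if total contains several matches for it.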