Writing to a data frame using %dopar% in foreach

Question

I want to use a foreach loop with a doParallel backend to fetch tweets from a MySQL database using the RMySQL package.

I create a connection to the database for every user id I want to query, then fetch that user's tweets in batches of 200. If a batch comes back empty (so there are no further tweets), I move on to the next user id.

I want to store the information in a data frame called tweets, which has a column of dates and one column per hashtag count. For every tweet I want to find out how many hashtags it has and in which month it was created, then increase the matching cell in the data frame by 1.

So how can I write the result for every tweet into the data frame?

My data frame at the beginning:

| dates    | zero_ht | one_ht | two_ht | three_ht | four_ht | five_ht |
|----------|---------|--------|--------|----------|---------|---------|
| 01/01/13 | 0       | 0      | 0      | 0        | 0       | 0       |
| 01/02/13 | 0       | 0      | 0      | 0        | 0       | 0       |
| 01/03/13 | 0       | 0      | 0      | 0        | 0       | 0       |
| 01/04/13 | 0       | 0      | 0      | 0        | 0       | 0       |
| 01/05/13 | 0       | 0      | 0      | 0        | 0       | 0       |
| 01/06/13 | 0       | 0      | 0      | 0        | 0       | 0       |
| 01/07/13 | 0       | 0      | 0      | 0        | 0       | 0       |
| 01/08/13 | 0       | 0      | 0      | 0        | 0       | 0       |
| 01/09/13 | 0       | 0      | 0      | 0        | 0       | 0       |
| 01/10/13 | 0       | 0      | 0      | 0        | 0       | 0       |
| 01/11/13 | 0       | 0      | 0      | 0        | 0       | 0       |
| 01/12/13 | 0       | 0      | 0      | 0        | 0       | 0       |
| 01/01/14 | 0       | 0      | 0      | 0        | 0       | 0       |
| 01/02/14 | 0       | 0      | 0      | 0        | 0       | 0       |
| 01/03/14 | 0       | 0      | 0      | 0        | 0       | 0       |
| 01/04/14 | 0       | 0      | 0      | 0        | 0       | 0       |
| 01/05/14 | 0       | 0      | 0      | 0        | 0       | 0       |
| 01/06/14 | 0       | 0      | 0      | 0        | 0       | 0       |
| 01/07/14 | 0       | 0      | 0      | 0        | 0       | 0       |

My code:

x <- foreach(i = 1:nrow(ids), .packages = c("DBI", "RMySQL"), .combine = rbind) %dopar% {

    con <- dbConnect(MySQL(), *CREDENTIALS*)

    start <- 0
    length <- 1

    while (length > 0) {
        query <- *QUERY*
        data <- dbGetQuery(con, query)

        length <- nrow(data)

        #print(paste("Starting at ", start, sep = ""))

        for (j in 1:length) {
            if (length == 0) {
                # empty batch: nothing to count
            } else {
                # get the number of hashtags used
                number <- nchar(gsub("[^#]", "", data$message[j]))

                # get the date the tweet was created
                date <- paste(format(as.Date(data$created_at[j]), "%Y-%m"), "-01", sep = "")

                # only count tweets with fewer than 5 hashtags
                if (number < 5) {
                    if (number == 0) {
                        tweets[tweets$dates == date, 2] <- tweets[tweets$dates == date, 2] + 1
                    } else {
                        tweets[tweets$dates == date, number + 1] <- tweets[tweets$dates == date, number + 1] + 1
                    }
                }
            }
        }

        # increase the start by 200 to get the next 200 tweets
        start <- start + 200
    }

    data.frame(date = date, number = number)
    dbDisconnect(con)
}

Answer 1

Thanks to the comments I was able to solve the problem. The foreach loop returned a list containing nothing but TRUE values because the last command in the loop body was

dbDisconnect(con)

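When the database connection is closed successfully, dbDisconnect returns TRUE. Since foreach collects the value of the last expression evaluated in each iteration, that TRUE became the result of every iteration.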
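The effect can be reproduced without a database. The following is only a minimal sketch: close_connection() is a hypothetical stand-in for dbDisconnect(con), and registerDoSEQ() is used instead of doParallel purely so the snippet runs without a cluster.

```r
library(foreach)

# Sequential backend so the sketch runs anywhere (assumption for illustration;
# the original post registers a doParallel backend instead).
registerDoSEQ()

# Hypothetical stand-in for dbDisconnect(con), which returns TRUE on success.
close_connection <- function() TRUE

# The body ends with the "disconnect" call, so every iteration yields TRUE.
wrong <- foreach(i = 1:3) %dopar% {
    result <- data.frame(id = i, number = i * 2)
    close_connection()
}

# The body ends with the data frame, so every iteration yields one row,
# and .combine = rbind stacks the rows into a single data frame.
right <- foreach(i = 1:3, .combine = rbind) %dopar% {
    result <- data.frame(id = i, number = i * 2)
    close_connection()
    result
}
```

Here wrong is a list of three TRUE values, while right is a data frame with three rows.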

So I just had to swap the last two lines so that the loop body ends with

data.frame(date=date,number=number)

and everything worked fine.

Regards
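One caveat with the approach in the question: with a parallel backend each worker operates on its own copy of tweets, so increments made inside the loop body do not reach the data frame in the calling session. A possible alternative, sketched here under assumptions, is to return one row per tweet from every iteration and tabulate afterwards. fake_tweets is a hypothetical stand-in for the batched dbGetQuery() results, and registerDoSEQ() is used only so the snippet runs without a cluster.

```r
library(foreach)

registerDoSEQ()  # sequential backend, for illustration only

# Hypothetical stand-in for the query results: one data frame per user id.
fake_tweets <- list(
    data.frame(message = c("no tags", "#a #b"),
               created_at = c("2013-01-15", "2013-02-03"),
               stringsAsFactors = FALSE),
    data.frame(message = c("#x", "#y again", "plain"),
               created_at = c("2013-01-20", "2013-01-25", "2013-02-10"),
               stringsAsFactors = FALSE)
)

per_tweet <- foreach(i = seq_along(fake_tweets), .combine = rbind) %dopar% {
    data <- fake_tweets[[i]]
    # One row per tweet: first day of its month and its hashtag count.
    data.frame(
        date = paste0(format(as.Date(data$created_at), "%Y-%m"), "-01"),
        number = nchar(gsub("[^#]", "", data$message)),
        stringsAsFactors = FALSE
    )
}

# Tabulate in the calling session: rows are months, columns are hashtag counts.
counts <- table(per_tweet$date, per_tweet$number)
```

The resulting counts table plays the role of the tweets data frame from the question; tweets with five or more hashtags could be filtered out of per_tweet before tabulating.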

Answer 1
0 2014-07-23 09:24:10
