How do I extract all the tweets from a data frame given a character vector of "user_id_str" in R

Question

I collected tweets from the Twitter Streaming API in R over one week, restricted to certain time frames. The resulting data frame has 42 variables, one of which is user_id_str, of type character. I also have a character vector of user IDs. What I want to do is get all the tweets associated with the user IDs in that vector. I am certain that all the user IDs in the vector are present in the data frame as well.

timeframe_tue is a data frame containing all the tweets from Tuesday between 11:00 and 13:00.

common_users is a character vector with the user IDs I am interested in. It has a length of 93.

I tried running the following command and I got a data frame full of NAs with the same number of columns as timeframe_tue and 93 rows.

 com_tue <- timeframe_tue[timeframe_tue$user_id_str[common_user],]

 # this didn't work either
 timeframe_tue[,"user_id_str"][user_count]

 # neither did this
 timeframe_tue$user_id_str[timeframe_tue$user_id_str==user_count]

This is a sample of what my data frame looks like (image not included).

Can someone help me figure out the problem?

Answer 1

You can use a fast solution based on data.table:

# load library
 library(data.table)

# convert your data.frame to a data.table by reference (no copy is made)
 setDT(timeframe_tue)

# keep the rows whose user_id_str appears in common_user
 timeframe_tue[ user_id_str %in% common_user, ]

You can also make use of the %in% operator to correct your solution, like this:

 timeframe_tue[ timeframe_tue$user_id_str %in% common_user, ]
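To illustrate why the first attempt in the question returned a data frame of NAs, here is a minimal base R sketch. The toy data frame, its `text` column, and the ID values are made up for illustration; only the names `timeframe_tue`, `user_id_str`, `common_user` and `com_tue` come from the question.

```r
# Toy stand-in for timeframe_tue; the values are invented for illustration.
timeframe_tue <- data.frame(
  user_id_str = c("101", "202", "101", "303", "404"),
  text        = c("a", "b", "c", "d", "e"),
  stringsAsFactors = FALSE
)
common_user <- c("101", "303")

# Attempt from the question: a character index on an unnamed vector is a
# lookup by element name, so every requested element comes back as NA.
idx <- timeframe_tue$user_id_str[common_user]
print(idx)

# Using those NAs as a row index gives one all-NA row per element,
# which matches the 93 NA rows described in the question.
na_rows <- timeframe_tue[idx, ]
print(na_rows)

# %in% returns one TRUE/FALSE per row of the data frame, which is the
# kind of index that row subsetting needs.
keep <- timeframe_tue$user_id_str %in% common_user
com_tue <- timeframe_tue[keep, ]
print(com_tue)

# Optional sanity check: does every requested ID occur at least once?
all_present <- all(common_user %in% timeframe_tue$user_id_str)
print(all_present)
```

The `==` attempt fails for a related reason: `==` compares the two vectors position by position, recycling the shorter one, rather than testing membership, and the expression returns a vector of IDs instead of rows of the data frame.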

Answer 2

Here's a dplyr solution. You really are just looking for the correct "%in%" syntax.

library(dplyr)

# convert to a tibble; as_tibble() replaces the deprecated tbl_df()
timeframe_tue <- as_tibble(timeframe_tue)

timeframe_tue %>% filter(user_id_str %in% common_user)
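As an alternative to the `%in%` filter, the same selection can be written as a filtering join. The sketch below is not part of the original answer; it assumes dplyr is installed, and the toy data, the `text` column, and the helper names `wanted` and `per_user` are hypothetical.

```r
library(dplyr)

# Toy stand-in for timeframe_tue; the values are invented for illustration.
timeframe_tue <- tibble(
  user_id_str = c("101", "202", "101", "303", "404"),
  text        = c("a", "b", "c", "d", "e")
)
common_user <- c("101", "303")

# Wrap the IDs in a one-column table so they can be joined against.
wanted <- tibble(user_id_str = common_user)

# semi_join() keeps the rows of the first table that have a match in the
# second, without adding columns or duplicating rows.
com_tue <- semi_join(timeframe_tue, wanted, by = "user_id_str")
print(com_tue)

# Number of tweets per user of interest.
per_user <- count(com_tue, user_id_str)
print(per_user)
```

For a plain vector of IDs the `%in%` filter is the shorter option; the join form may be more convenient if the IDs already live in another data frame.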

2 answers

solution1
1 ACCEPTED 2016-05-02 20:37:07

solution2
0 2016-05-02 20:53:39
