I extracted tweets from the Twitter Streaming API in R for a week, restricted to certain timeframes. I have a data frame of 42 variables, one of which is user_id_str, of type character. I also have a character vector of user IDs. What I want to do is get all the tweets associated with the user IDs in that vector. I am certain that every user ID in the character vector is also present in the data frame.
timeframe_tue is a data frame containing all the tweets from Tuesday between 11:00 and 13:00.
common_user is a character vector with the user IDs I am interested in. It has a length of 93.
I tried running the following command and got a data frame full of NAs, with the same number of columns as timeframe_tue and 93 rows.
com_tue <- timeframe_tue[timeframe_tue$user_id_str[common_user],]
timeframe_tue[, "user_id_str"][user_count]  # this didn't work either
timeframe_tue$user_id_str[timeframe_tue$user_id_str == user_count]  # neither did this
This is a sample of what my data frame looks like (screenshot not reproduced here).
Can someone help me figure out the problem?
You can use a really fast solution based on data.table:
# load library
library(data.table)
# convert your data.frame to a data.table (by reference) to speed up the process
setDT(timeframe_tue)
# filter
timeframe_tue[ user_id_str %in% common_user, ]
You can also make use of the %in% operator to correct your solution, like this:
timeframe_tue[ timeframe_tue$user_id_str %in% common_user, ]
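To see why the original attempts returned NAs while %in% works, here is a minimal sketch on made-up data. The data frame tweets and the vector ids are hypothetical stand-ins for timeframe_tue and common_user, not the asker's real data.

```r
# Toy stand-ins for timeframe_tue and common_user (hypothetical data)
tweets <- data.frame(
  user_id_str = c("101", "102", "103", "101"),
  text        = c("a", "b", "c", "d"),
  stringsAsFactors = FALSE
)
ids <- c("101", "103")

# Indexing a character vector with a character vector looks elements up
# by *name*. user_id_str has no names, so every lookup returns NA.
lookup <- tweets$user_id_str[ids]

# Using those NAs as row indices yields one all-NA row per ID,
# which matches the 93 NA rows described in the question.
bad <- tweets[lookup, ]

# `==` compares element by element and recycles the shorter vector,
# so it tests positions, not set membership.
positional <- tweets$user_id_str == ids

# %in% returns one TRUE/FALSE per row: is this row's ID among `ids`?
keep <- tweets$user_id_str %in% ids
good <- tweets[keep, ]
```

Because keep has exactly one logical value per row of tweets, it can be used directly as a row index, and every tweet from a matching user is kept, including repeated users.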
Here's a dplyr solution. You really are just looking for the correct "%in%" syntax.
library(dplyr)
# optional: filter() also works on a plain data.frame; tbl_df() is deprecated, use as_tibble()
timeframe_tue <- as_tibble(timeframe_tue)
timeframe_tue %>% filter(user_id_str %in% common_user)
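As a self-contained illustration of the dplyr approach, the sketch below runs the same filter on made-up data. The names tweets and ids are hypothetical stand-ins for timeframe_tue and common_user. The semi_join() variant is an alternative I am adding for comparison, not something the answer above proposes.

```r
library(dplyr)

# Toy stand-ins for timeframe_tue and common_user (hypothetical data)
tweets <- tibble(
  user_id_str = c("101", "102", "103", "101"),
  text        = c("a", "b", "c", "d")
)
ids <- c("101", "103")

# Check the assumption that every ID of interest occurs in the data
all_present <- all(ids %in% tweets$user_id_str)

# Keep the rows whose user_id_str is one of the IDs
by_filter <- tweets %>% filter(user_id_str %in% ids)

# Alternative: a filtering join against a one-column lookup table.
# semi_join() keeps rows of `tweets` that have a match and adds no columns.
by_join <- tweets %>%
  semi_join(tibble(user_id_str = ids), by = "user_id_str")
```

Both forms should return the same rows here. The join version may be more convenient when the IDs already live in another data frame.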