The idea:
The sample tweets:
[
{
"tweet_id": 5675880,
"airline": "Delta",
"name": "JetBlueNews",
"text": "@JetBlue's new CEO seeks the right balance to please passengers and Wall ... - Greenfield Daily ",
"tweet_coord": [
null,
null
],
"tweet_created": "16-02-15 23:36",
"tweet_location": "USA",
"user_timezone": "Sydney"
},
{
"tweet_id": 5675881,
"airline": "Delta",
"name": "nesi_1992",
"text": "@JetBlue is REALLY getting on my nerves !! 😡😡 #nothappy",
"tweet_coord": [
null,
null
],
"tweet_created": "16-02-15 23:43",
"tweet_location": "undecided",
"user_timezone": "Pacific Time (US & Canada)"
},
1- I created a Python script to send the above tweets to Kinesis. This is the code and a sample of its output.
the code (note: I am just using one hard-coded value for now):
def put_to_stream(thing_id, property_value, property_timestamp):
    # payload = {
    #     'prop': str(property_value),
    #     'timestamp': str(property_timestamp),
    #     'thing_id': thing_id
    # }
    payload = {
        "tweet_id": 5676295,
        "airline": "US Airways",
        "name": "liquidfox1",
        "text": "@USAirways me too. In the future, have a better harsh weather preparedness plan. So much of your staff called out that everything snowballed",
        "tweet_coord": [
            'null',
            'null'
        ],
        "tweet_created": "17-02-15 11:13",
        "tweet_location": "This is an AD account. 18+",
        "user_timezone": ""
    },
    print(payload)
    put_response = kc.put_record(
        StreamName=my_stream_name,
        Data=json.dumps(payload),
        PartitionKey=thing_id)

while True:
    property_value = random.randint(40, 120)
    property_timestamp = calendar.timegm(datetime.utcnow().timetuple())
    thing_id = str(random.randint(40, 120))  # 'aa-bb'
    put_to_stream(thing_id, property_value, property_timestamp)
    # wait for 1 second
    time.sleep(1)
the output:
({'tweet_id': 5676295, 'airline': 'US Airways', 'name': 'liquidfox1', 'text': '@USAirways me too. In the future, have a better harsh weather preparedness plan. So much of your staff called out that everything snowballed', 'tweet_coord': ['null', 'null'], 'tweet_created': '17-02-15 11:13', 'tweet_location': 'This is an AD account. 18+', 'user_timezone': ''},)
The problem is now in my Lambda function's Python code: I want to extract the "text" field from the JSON and pass it to Comprehend to get the "sentiment" result.
This is the code I use to read just one file:
objectf = s3.Object(bucket_name, in_key_name).get()["Body"].read().decode('utf-8')
this is the output:
[{"tweet_id": 5676295, "airline": "US Airways", "name": "liquidfox1", "text": "@USAirways me too. In the future, have a better harsh weather preparedness plan. So much of your staff called out that everything snowballed", "tweet_coord": ["null", "null"], "tweet_created": "17-02-15 11:13", "tweet_location": "This is an AD account. 18+", "user_timezone": ""}][{"tweet_id": 5676295, "airline": "US Airways", "name": "liquidfox1", "text": "@USAirways me too. In the future, have a better harsh weather preparedness plan. So much of your staff called out that everything snowballed", "tweet_coord": ["null", "null"], "tweet_created": "17-02-15 11:13", "tweet_location": "This is an AD account. 18+", "user_timezone": ""}][{"tweet_id": 5676295, "airline": "US Airways", "name": "liquidfox1", "text": "@USAirways me too. In the future, have a better harsh weather preparedness plan. So much of your staff called out that everything snowballed", "tweet_coord": ["null", "null"], "tweet_created": "17-02-15 11:13", "tweet_location": "This is an AD account. 18+", "user_timezone": ""}][{"tweet_id": 5676295, "airline": "US Airways", "name": "liquidfox1", "text": "@USAirways me too. In the future, have a better harsh weather preparedness plan. So much of your staff called out that everything snowballed", "tweet_coord": ["null", "null"], "tweet_created": "17-02-15 11:13", "tweet_location": "This is an AD account. 18+", "user_timezone": ""}][{"tweet_id": 5676295, "airline": "US Airways", "name": "liquidfox1", "text": "@USAirways me too. In the future, have a better harsh weather preparedness plan. So much of your staff called out that everything snowballed", "tweet_coord": ["null", "null"], "tweet_created": "17-02-15 11:13", "tweet_location": "This is an AD account. 18+", "user_timezone": ""}][{"tweet_id": 5676295, "airline": "US Airways", "name": "liquidfox1", "text": "@USAirways me too. In the future, have a better harsh weather preparedness plan. 
So much of your staff called out that everything snowballed", "tweet_coord": ["null", "null"], "tweet_created": "17-02-15 11:13", "tweet_location": "This is an AD account. 18+", "user_timezone": ""}][{"tweet_id": 5676295, "airline": "US Airways", "name": "liquidfox1", "text": "@USAirways me too. In the future, have a better harsh weather preparedness plan. So much of your staff called out that everything snowballed", "tweet_coord": ["null", "null"], "tweet_created": "17-02-15 11:13", "tweet_location": "This is an AD account. 18+", "user_timezone": ""}][{"tweet_id": 5676295, "airline": "US Airways", "name": "liquidfox1", "text": "@USAirways me too. In the future, have a better harsh weather preparedness plan. So much of your staff called out that everything snowballed", "tweet_coord": ["null", "null"], "tweet_created": "17-02-15 11:13", "tweet_location": "This is an AD account. 18+", "user_timezone": ""}][{"tweet_id": 5676295, "airline": "US Airways", "name": "liquidfox1", "text": "@USAirways me too. In the future, have a better harsh weather preparedness plan. So much of your staff called out that everything snowballed", "tweet_coord": ["null", "null"], "tweet_created": "17-02-15 11:13", "tweet_location": "This is an AD account. 18+", "user_timezone": ""}][{"tweet_id": 5676295, "airline": "US Airways", "name": "liquidfox1", "text": "@USAirways me too. In the future, have a better harsh weather preparedness plan. So much of your staff called out that everything snowballed", "tweet_coord": ["null", "null"], "tweet_created": "17-02-15 11:13", "tweet_location": "This is an AD account. 18+", "user_timezone": ""}]
Then, when I try to convert it using "json.dumps", I get this output:
"[{\"tweet_id\": 5676295, \"airline\": \"US Airways\", \"name\": \"liquidfox1\", \"text\": \"@USAirways me too. In the future, have a better harsh weather preparedness plan. So much of your staff called out that everything snowballed\", \"tweet_coord\": [\"null\", \"null\"], \"tweet_created\": \"17-02-15 11:13\", \"tweet_location\": \"This is an AD account. 18+\", \"user_timezone\": \"\"}][{\"tweet_id\": 5676295, \"airline\": \"US Airways\", \"name\": \"liquidfox1\", \"text\": \"@USAirways me too. In the future, have a better harsh weather preparedness plan. So much of your staff called out that everything snowballed\", \"tweet_coord\": [\"null\", \"null\"], \"tweet_created\": \"17-02-15 11:13\", \"tweet_location\": \"This is an AD account. 18+\", \"user_timezone\": \"\"}][{\"tweet_id\": 5676295, \"airline\": \"US Airways\", \"name\": \"liquidfox1\", \"text\": \"@USAirways me too. In the future, have a better harsh weather preparedness plan. So much of your staff called out that everything snowballed\", \"tweet_coord\": [\"null\", \"null\"], \"tweet_created\": \"17-02-15 11:13\", \"tweet_location\": \"This is an AD account. 18+\", \"user_timezone\": \"\"}][{\"tweet_id\": 5676295, \"airline\": \"US Airways\", \"name\": \"liquidfox1\", \"text\": \"@USAirways me too. In the future, have a better harsh weather preparedness plan. So much of your staff called out that everything snowballed\", \"tweet_coord\": [\"null\", \"null\"], \"tweet_created\": \"17-02-15 11:13\", \"tweet_location\": \"This is an AD account. 18+\", \"user_timezone\": \"\"}][{\"tweet_id\": 5676295, \"airline\": \"US Airways\", \"name\": \"liquidfox1\", \"text\": \"@USAirways me too. In the future, have a better harsh weather preparedness plan. So much of your staff called out that everything snowballed\", \"tweet_coord\": [\"null\", \"null\"], \"tweet_created\": \"17-02-15 11:13\", \"tweet_location\": \"This is an AD account. 
18+\", \"user_timezone\": \"\"}][{\"tweet_id\": 5676295, \"airline\": \"US Airways\", \"name\": \"liquidfox1\", \"text\": \"@USAirways me too. In the future, have a better harsh weather preparedness plan. So much of your staff called out that everything snowballed\", \"tweet_coord\": [\"null\", \"null\"], \"tweet_created\": \"17-02-15 11:13\", \"tweet_location\": \"This is an AD account. 18+\", \"user_timezone\": \"\"}][{\"tweet_id\": 5676295, \"airline\": \"US Airways\", \"name\": \"liquidfox1\", \"text\": \"@USAirways me too. In the future, have a better harsh weather preparedness plan. So much of your staff called out that everything snowballed\", \"tweet_coord\": [\"null\", \"null\"], \"tweet_created\": \"17-02-15 11:13\", \"tweet_location\": \"This is an AD account. 18+\", \"user_timezone\": \"\"}][{\"tweet_id\": 5676295, \"airline\": \"US Airways\", \"name\": \"liquidfox1\", \"text\": \"@USAirways me too. In the future, have a better harsh weather preparedness plan. So much of your staff called out that everything snowballed\", \"tweet_coord\": [\"null\", \"null\"], \"tweet_created\": \"17-02-15 11:13\", \"tweet_location\": \"This is an AD account. 18+\", \"user_timezone\": \"\"}][{\"tweet_id\": 5676295, \"airline\": \"US Airways\", \"name\": \"liquidfox1\", \"text\": \"@USAirways me too. In the future, have a better harsh weather preparedness plan. So much of your staff called out that everything snowballed\", \"tweet_coord\": [\"null\", \"null\"], \"tweet_created\": \"17-02-15 11:13\", \"tweet_location\": \"This is an AD account. 18+\", \"user_timezone\": \"\"}][{\"tweet_id\": 5676295, \"airline\": \"US Airways\", \"name\": \"liquidfox1\", \"text\": \"@USAirways me too. In the future, have a better harsh weather preparedness plan. So much of your staff called out that everything snowballed\", \"tweet_coord\": [\"null\", \"null\"], \"tweet_created\": \"17-02-15 11:13\", \"tweet_location\": \"This is an AD account. 18+\", \"user_timezone\": \"\"}]"
I don't know what is wrong with this code.
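You have a trailing comma after the closing brace in the assignment of payload. That comma turns payload into a one-element tuple, which is why print shows ({...},) and why json.dumps writes every record as a one-element array [{...}]. Remove it and everything should work: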
import calendar
import json
import random
import time
from datetime import datetime

# kc (the Kinesis client) and my_stream_name are assumed to be defined
# as in the rest of the asker's script; they are not shown in the question.

def put_to_stream(thing_id, property_value, property_timestamp):
    # payload = {
    #     'prop': str(property_value),
    #     'timestamp': str(property_timestamp),
    #     'thing_id': thing_id
    # }
    payload = {
        "tweet_id": 5676295,
        "airline": "US Airways",
        "name": "liquidfox1",
        "text": "@USAirways me too. In the future, have a better harsh weather preparedness plan. So much of your staff called out that everything snowballed",
        "tweet_coord": [
            'null',
            'null'
        ],
        "tweet_created": "17-02-15 11:13",
        "tweet_location": "This is an AD account. 18+",
        "user_timezone": ""
    }  # no trailing comma here, so payload stays a dict
    print(payload)
    put_response = kc.put_record(
        StreamName=my_stream_name,
        Data=json.dumps(payload),
        PartitionKey=thing_id)

while True:
    property_value = random.randint(40, 120)
    property_timestamp = calendar.timegm(datetime.utcnow().timetuple())
    thing_id = str(random.randint(40, 120))  # 'aa-bb'
    put_to_stream(thing_id, property_value, property_timestamp)
    # wait for 1 second
    time.sleep(1)
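The effect of the trailing comma can be reproduced without Kinesis or S3. The sketch below first shows how the comma changes what json.dumps produces. It then shows one possible way to recover the "text" fields from an object that was already written as back-to-back arrays, like the S3 output in the question. The shortened payload and the helper name iter_records are illustrative assumptions, not part of the original script. The parser to use is json.loads (or a decoder), not json.dumps, which only serializes and therefore just escapes the string again.

```python
import json

# A trailing comma after the closing brace makes this a one-element tuple.
payload_with_comma = {"tweet_id": 5676295, "text": "@USAirways me too."},
# Without the comma it is a plain dict.
payload_fixed = {"tweet_id": 5676295, "text": "@USAirways me too."}

print(type(payload_with_comma).__name__)  # tuple
print(json.dumps(payload_with_comma))     # serialized as a JSON array
print(json.dumps(payload_fixed))          # serialized as a JSON object


def iter_records(blob):
    """Yield every record from a string of back-to-back JSON values.

    Hypothetical helper: it accepts both '[{...}][{...}]' (records written
    with the trailing-comma bug) and '{...}{...}' (records written after
    the fix).
    """
    decoder = json.JSONDecoder()
    pos = 0
    while True:
        # raw_decode does not skip leading whitespace, so do it by hand.
        while pos < len(blob) and blob[pos].isspace():
            pos += 1
        if pos >= len(blob):
            break
        value, pos = decoder.raw_decode(blob, pos)
        if isinstance(value, list):
            yield from value
        else:
            yield value


# Simulate three records concatenated with no separator, as in the S3 object.
blob = json.dumps(payload_with_comma) * 3
texts = [record["text"] for record in iter_records(blob)]
print(texts)
```

Each string in texts could then be passed to Comprehend, for example through the boto3 Comprehend client's detect_sentiment call with Text and LanguageCode arguments. That step is left out here so the sketch runs without AWS credentials.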