
Python. How do I make a list of small portions of a bigger list?

for word in list6:
    if word == "TRUMP":
        pass  # this is the part I need help with

So, I have a list of every word in a debate transcript. When Trump speaks, his turn starts with "TRUMP". I need to take his words and put them into a separate list. If the word in list6 is "TRUMP", I need to collect everything that follows into a list until another person's name appears. He speaks more than once.

I just need help completing this loop.

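list6 = ['TRUMP','I','am','good', 'HILLARY','I','am','good','too','TRUMP','But','How?']
person_words = {'TRUMP':[], 'HILLARY':[]}

person_names = person_words.keys()

this_person = None              # speaker of the turn being collected
one_person_onetime_words = []   # words of the current turn

for word in list6:
    if word in person_names:
        # A name starts a new turn, so store the previous turn first.
        if one_person_onetime_words:
            person_words[this_person].append(one_person_onetime_words)
            one_person_onetime_words = []
        this_person = word
    elif this_person is not None:
        # Words before the first name have no speaker and are skipped.
        one_person_onetime_words.append(word)

# Store the last turn, which no following name closed.
if one_person_onetime_words:
    person_words[this_person].append(one_person_onetime_words)

print(person_words)

Gives (key order can vary on Python versions before 3.7)

{'TRUMP': [['I', 'am', 'good'], ['But', 'How?']], 'HILLARY': [['I', 'am', 'good', 'too']]}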

So a single pass over the list collects every turn of every speaker, with each turn stored as its own list of words.
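If you need this for more than one transcript, the same idea can be wrapped in a function. The following is only a sketch: the name split_by_speaker and its parameters are my own, and it assumes every speaker marker is listed in speakers. Unlike the loop above, it records an empty turn when one name is immediately followed by another.

```python
from collections import defaultdict

def split_by_speaker(words, speakers):
    """Group words into turns per speaker (hypothetical helper)."""
    turns = defaultdict(list)   # speaker -> list of turns (each a list of words)
    current = None              # no speaker seen yet
    for word in words:
        if word in speakers:
            current = word
            turns[current].append([])        # a name opens a new, empty turn
        elif current is not None:
            turns[current][-1].append(word)  # add the word to the open turn
    return dict(turns)

list6 = ['TRUMP', 'I', 'am', 'good', 'HILLARY', 'I', 'am', 'good', 'too',
         'TRUMP', 'But', 'How?']
result = split_by_speaker(list6, {'TRUMP', 'HILLARY'})
print(result['TRUMP'])  # [['I', 'am', 'good'], ['But', 'How?']]
```

Using a defaultdict means the speakers do not have to be pre-registered as dictionary keys, only passed in as a set.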

As you mentioned in the comments on your question, if you only want one person's words, you can use the following:

from copy import copy

list6 = ['TRUMP','I','am','good', 'HILLARY','I','am','good','too','TRUMP','But','How?']
person_words = []
all_persons = ['TRUMP', 'HILLARY']
person_looking_for = 'TRUMP'

# Every speaker name except the one we want ends a capture.
filter_out_persons = copy(all_persons)
filter_out_persons.remove(person_looking_for)

person_onetime_words = []   # words of the turn being collected

capture_words = False
for word in list6:
    if word == person_looking_for:
        capture_words = True
        # Store the previous turn before starting a new one.
        if person_onetime_words:
            person_words.append(person_onetime_words)
            person_onetime_words = []
    elif word not in filter_out_persons and capture_words:
        person_onetime_words.append(word)
    else:
        # Another speaker's name, or a word spoken by someone else.
        capture_words = False

# Store the last turn, if there is one.
if person_onetime_words:
    person_words.append(person_onetime_words)
print("{}'s words".format(person_looking_for))
print(person_words)

That gives

TRUMP's words
[['I', 'am', 'good'], ['But', 'How?']]
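The question asks for his words in one separate list, so you may want to flatten the turns afterwards. A minimal sketch, assuming the result above is held in a variable I call trump_turns here:

```python
from itertools import chain

# Turns as produced above: one inner list per time the person speaks.
trump_turns = [['I', 'am', 'good'], ['But', 'How?']]

# chain.from_iterable joins the inner lists end to end, keeping word order.
trump_words = list(chain.from_iterable(trump_turns))
print(trump_words)  # ['I', 'am', 'good', 'But', 'How?']
```

Keeping the turns separate and flattening only when needed lets you use either form.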

The following builds a dictionary keyed by word, where each value is itself a dictionary giving how many times each person said that word.

import pprint

list6 = ['TRUMP','I','am','good', 'HILLARY','I','am','good','too','TRUMP','But','How?']

person_names = ['TRUMP','HILLARY']

word_frequency = {}   # word -> {person: count}
person = None         # current speaker
for word in list6:
    if word in person_names:
        person = word
    elif person is not None:
        # Lowercase so that 'But' and 'but' count as the same word.
        word = word.lower()
        if word in word_frequency:
            if person in word_frequency[word]:
                word_frequency[word][person] += 1
            else:
                word_frequency[word][person] = 1
        else:
            word_frequency[word] = {person: 1}

pprint.pprint(word_frequency)

Gives

{'am': {'HILLARY': 1, 'TRUMP': 1},
 'but': {'TRUMP': 1},
 'good': {'HILLARY': 1, 'TRUMP': 1},
 'how?': {'TRUMP': 1},
 'i': {'HILLARY': 1, 'TRUMP': 1},
 'too': {'HILLARY': 1}}
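The nested if/else bookkeeping can also be left to the standard library. This is a sketch of the same count using collections.Counter inside a defaultdict, not a change to the approach; the variable names follow the code above.

```python
from collections import Counter, defaultdict

list6 = ['TRUMP', 'I', 'am', 'good', 'HILLARY', 'I', 'am', 'good', 'too',
         'TRUMP', 'But', 'How?']
person_names = {'TRUMP', 'HILLARY'}

word_frequency = defaultdict(Counter)  # word -> Counter mapping person -> count
person = None                          # current speaker, none seen yet
for word in list6:
    if word in person_names:
        person = word
    elif person is not None:
        # Missing words and missing persons both start at zero automatically.
        word_frequency[word.lower()][person] += 1

print(dict(word_frequency['good']))  # {'TRUMP': 1, 'HILLARY': 1}
```

Note that punctuation is still attached to the words here ('how?'), exactly as in the version above.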
