
Python: list index out of range - while/for loop

I have a list:

abc = ['date1','sentence1','date2','sentence2'...]

I want to do sentiment analysis on the sentences. After that I want to store the results in a list that looks like:

xyz =[['date1','sentence1','sentiment1'],['date2','sentence2','sentiment2']...]

For this I have tried the following code:

def result(doc):
    x = 2
    i = 3
    for lijn in doc:
        sentiment = classifier.classify(word_feats_test(doc[i]))
        xyz.extend([doc[x],doc[i],sentiment])
        x = x + 2
        i = i + 2

The len(abc) is about 7500. I start out with x as 2 and i as 3, as I don't want to use the first two elements of the list.

I keep on getting the error 'list index out of range', no matter what I try (while, for loops...)

Can anybody help me out? Thank you!

As the comments mentioned, we can't help you find the error in your code without a stack trace. But your problem is easy to solve like this:

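xyz = []
def result(abc):
    # step through the list two items at a time: abc[item] is the date, abc[item + 1] the sentence
    for item in range(0, len(abc) - 1, 2):  # use xrange instead of range in Python 2
        #sentiment = classifier.classify(word_feats_test(abc[item + 1]))
        sentiment = "sentiment" + str(item // 2 + 1)  # placeholder label for this demo
        xyz.append([abc[item], abc[item + 1], sentiment])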

You might want to read up on the built-in functions that make a programmer's life easier. (Why worry about incrementing an index by hand when range already does it?)

#output
[['date1', 'sentence1', 'sentiment1'],
 ['date2', 'sentence2', 'sentiment2'],
 ['date3', 'sentence3', 'sentiment3'],
 ['date4', 'sentence4', 'sentiment4'],
 ['date5', 'sentence5', 'sentiment5']]
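
Another built-in that fits this problem is zip combined with slicing, which removes the index arithmetic entirely. The following is only a minimal sketch: fake_classify is a hypothetical stand-in for classifier.classify(word_feats_test(...)) from the question, so that the example runs on its own.

```python
# Hypothetical stand-in for classifier.classify(word_feats_test(sentence)),
# used only so the example runs without the real classifier.
def fake_classify(sentence):
    return "sentiment" + sentence[len("sentence"):]

abc = ['date1', 'sentence1', 'date2', 'sentence2', 'date3', 'sentence3']

# abc[0::2] -> every other element starting at index 0 (the dates)
# abc[1::2] -> every other element starting at index 1 (the sentences)
# zip stops at the shorter slice, so an odd-length list cannot raise IndexError.
xyz = [[date, sentence, fake_classify(sentence)]
       for date, sentence in zip(abc[0::2], abc[1::2])]

print(xyz)
```

To skip the first two elements as in the question, the slices would be abc[2::2] and abc[3::2].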

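Try this:

# step by 2 and stop before the last index; start at 2 instead of 0 to skip the first two elements
for i in range(0, len(doc) - 1, 2):  # use xrange in Python 2
    date = doc[i]
    sentence = doc[i + 1]
    sentiment = classifier.classify(word_feats_test(sentence))
    xyz.append([date, sentence, sentiment])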

Only need one index. The important thing is knowing when to stop.
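
To see why the loop in the question runs past the end: for lijn in doc makes one pass per element, but i grows by 2 on every pass, so it overtakes the length of the list about halfway through. A small sketch of that arithmetic, using a made-up six-element list rather than the real data:

```python
doc = ['skip', 'skip', 'date1', 'sentence1', 'date2', 'sentence2']

i = 3
passes = 0
failed_at = None
for _ in doc:              # one pass per element: 6 passes requested
    try:
        doc[i]             # same access pattern as doc[i] in the question
    except IndexError:
        failed_at = i      # first index that lies past the end of the list
        break
    passes += 1
    i += 2                 # the index moves twice as fast as the loop

print(passes, failed_at)
```

Only two of the six requested passes succeed before the index leaves the list, which is the same thing that happens near the middle of the 7500-element list.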

Also, check out the difference between extend and append.
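
A quick illustration of that difference, with made-up values: append adds its argument as a single element, while extend adds each item of its argument separately, which flattens the row into the outer list.

```python
row = ['date1', 'sentence1', 'sentiment1']

xyz_append = []
xyz_extend = []

xyz_append.append(row)   # adds the row itself as one element (list of lists)
xyz_extend.extend(row)   # adds each item of the row separately (flat list)

print(xyz_append)
print(xyz_extend)
```

For the list of lists the question asks for, append is the one that gives the nested shape.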

Finally, I would suggest you store your data as a list of dictionaries rather than a list of lists. That lets you access the items by field name rather than by index, which makes for cleaner code.
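
A rough sketch of what the list-of-dictionaries version could look like. The field names 'date', 'sentence' and 'sentiment' are just a suggestion, and fake_classify is a hypothetical stand-in for the real classifier call so the snippet runs by itself.

```python
# Hypothetical stand-in for classifier.classify(word_feats_test(sentence)).
def fake_classify(sentence):
    return "positive"

doc = ['date1', 'sentence1', 'date2', 'sentence2']

results = []
for i in range(0, len(doc) - 1, 2):
    results.append({
        'date': doc[i],
        'sentence': doc[i + 1],
        'sentiment': fake_classify(doc[i + 1]),
    })

# Fields are read by name instead of by position.
print(results[0]['sentence'])
```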

If you want two elements from your list at a time, you can use a generator and then pass the elements to your classifier:

abc = ["ignore", "ignore", 'date1', 'sentence1', 'date2', 'sentence2']

from itertools import islice


def iter_doc(doc, skip=0):
    it = iter(doc)
    if skip:  # if skip is set, start from doc[skip:]
        it = islice(it, skip, None)
    # default of "" avoids an error inside the generator when the list runs out
    date, sent = next(it, ""), next(it, "")
    while date and sent:
        yield date, sent
        date, sent = next(it, ""), next(it, "")


for d, sen in iter_doc(abc, 2):  # skip set to 2 so we ignore the first two elements
    print(d, sen)

date1 sentence1
date2 sentence2

So to create your list of lists xyz, you can use a list comprehension:

xyz = [[d, sen, classifier.classify(word_feats_test(sen))] for d, sen in iter_doc(abc, 2)]

It's simple. You can try it:

>>> abc = ['date1','sentence1','date2','sentence2'...]
>>> xyz = [[abc[i], abc[i+1], "sentiment" + str(i // 2 + 1)] for i in range(0, len(abc), 2)]
>>> xyz
[['date1', 'sentence1', 'sentiment1'], ['date2', 'sentence2', 'sentiment2'], .....]
