Beginner question about modifying a Python list

I'm a newbie to Python and am currently learning about lists. I'm trying to add every word from the file "words.txt" to a list. But when I tried

words += word
every character became an element of the list. I then tried
 words += [word]
and it worked. But I want to know why the first way makes every character an element, rather than every word?

fhand = open("words.txt")
words = list()
for line in fhand:
    for word in line.split():
        words += [word]
print(words)

When you want to add a word to a list as a single element, the usual way is to use .append():

fhand = open("words.txt")
words = list()
for line in fhand:
    for word in line.split():
        words.append(word)
print(words)

word is a string, which is itself a sequence of characters: word[0], for example, gives you the first character of the word. The += operator on a list adds each item of its right-hand operand individually, so words += word walks through the string and adds its characters one at a time, and you end up with a list of characters. In the second case you explicitly wrap the string in a list, [word], whose only item is the whole string, so it is the string that gets added and you get a list of strings. If that is still not clear, feel free to comment.

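You can only add a list (more precisely, with +=, an iterable) to a list. So when you add a string to a list, the string is treated as a sequence of characters, and the characters are added as elements. In the second way you state that you have a list whose element is the word itself, so the whole word is added as one element.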

In Python, a string is a sequence of Unicode characters; it is a different data type from a list, but it can be iterated over in the same way. So when you do words += word, each character is appended to the list separately. But when you do words += [word], [word] is a list containing one single string, so only one item is appended to the list.

The += operator on a list is equivalent to calling its extend method, which takes an iterable as an argument and appends each of its items to the list. With words += word, the right-hand operand of += is a string, which is an iterable, so it is equivalent to writing words.extend(word).
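To make the comparison concrete, here is a minimal sketch that runs the four variants side by side. The variable names a, b, c and d and the sample string "cat" are only illustrative and do not come from the question.

```python
word = "cat"

a = []
a += word        # iterates over the string, like a.extend(word)

b = []
b.extend(word)   # adds each character as a separate item

c = []
c += [word]      # the right-hand side is a one-item list

d = []
d.append(word)   # adds the string itself as a single item

print(a)  # ['c', 'a', 't']
print(b)  # ['c', 'a', 't']
print(c)  # ['cat']
print(d)  # ['cat']
```

The first two lists end up identical, and so do the last two, which is why words += [word] and words.append(word) both give the result the question was after.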

Let's go through your code.

Suppose words.txt consists of the following text:

hello, I am Solomon
Nice to meet you Solomon

So, you first open this file with fhand = open("words.txt"), then you initialize a list called words:

fhand = open("words.txt")
words = list()

Suggestion: It is advisable here to use the with context manager to open the file. That way, you don't have to close the file explicitly later. If you just use open() as above, you have to close the file at the end with fhand.close().

with open("words.txt", 'r') as fhand:
    #<--code--->
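As a rough illustration of what the context manager does, the sketch below first writes the two sample lines to a temporary words.txt and then reads them back. The temporary directory and the names tmp_dir, path, out and lines are assumptions made so the example is self-contained.

```python
import os
import tempfile

# Hypothetical setup: create a words.txt with the two sample lines
tmp_dir = tempfile.mkdtemp()
path = os.path.join(tmp_dir, "words.txt")
with open(path, "w") as out:
    out.write("hello, I am Solomon\nNice to meet you Solomon\n")

# The file is closed automatically when the with block ends
with open(path, "r") as fhand:
    lines = fhand.readlines()

print(fhand.closed)  # True
print(len(lines))    # 2
```

No call to fhand.close() is needed: the file is closed when the block is exited, even if an exception is raised inside it.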

In the next line, you iterate over each line in fhand. Let's print line, which shows each line of the text (the blank lines in the output appear because each line still ends with its own newline character, and print() adds another):

for line in fhand:
    print(line)
#Output:
hello, I am Solomon

Nice to meet you Solomon

Then you iterate over line.split(), which splits each of the above lines of text into a list of words. If we print line.split():

for line in fhand:
    print(line.split())
#Output:
['hello,', 'I', 'am', 'Solomon']
['Nice', 'to', 'meet', 'you', 'Solomon']
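One detail worth knowing: when called without arguments, split() splits on any run of whitespace and drops the trailing newline, which is why no empty strings or '\n' show up above. A small sketch, using a made-up line with extra spaces and a tab:

```python
line = "hello,   I am\tSolomon\n"

# No argument: any run of whitespace (spaces, tabs, newlines) is one separator
print(line.split())     # ['hello,', 'I', 'am', 'Solomon']

# Explicit " ": every single space is a separator, nothing else is
print(line.split(" "))  # ['hello,', '', '', 'I', 'am\tSolomon\n']
```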

Suggestion: You could also use splitlines() to break text at line boundaries into a list of lines. This is different from split(), as it does not break each line into words. It removes the line breaks but preserves other whitespace, so you will have to get rid of that with strip(' ') if your text has any spaces at the beginning or end. It does no harm here, so you can still use it:

for line_str in fhand:
    print(line_str.strip(' ').splitlines())
    # Output (one list per line of the file):
    # ['hello, I am Solomon']
    # ['Nice to meet you Solomon']
    for line in line_str.strip(' ').splitlines():  # watch the indentation
        print(line.split())
        # Output (one list per line of the file):
        # ['hello,', 'I', 'am', 'Solomon']
        # ['Nice', 'to', 'meet', 'you', 'Solomon']
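The difference between splitlines(), split() and the two forms of strip() is easier to see on in-memory strings. The strings text and padded below are made up for the illustration.

```python
text = "hello, I am Solomon\nNice to meet you Solomon\n"

# splitlines() cuts at line boundaries and drops the line breaks
print(text.splitlines())
# ['hello, I am Solomon', 'Nice to meet you Solomon']

# split() cuts at any whitespace, so it yields words, not lines
print(len(text.split()))  # 9

padded = "  hello  \n"
print(padded.splitlines())             # ['  hello  ']
print(padded.strip(' ').splitlines())  # ['hello  ']
print(padded.strip().splitlines())     # ['hello']
```

Note that strip(' ') removes only spaces at the very ends of the string, so spaces sitting in front of the final newline survive. strip() with no argument removes all leading and trailing whitespace, including the newline.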

In the next piece of code you iterate over each word in line.split() (which, as we saw above, is a list of words) and add it to words with +=. Because a string is itself iterable, += adds its characters one by one, so you end up with a list of letters:

for word in line.split():
    words+=word
#Output:
['h', 'e', 'l', 'l', 'o', ',', 'I', 'a', 'm', 'S', 'o', 'l', 'o', 'm', 'o', 'n', 'N', 'i', 'c', 'e', 't', 'o', 'm', 'e', 'e', 't', 'y', 'o', 'u', 'S', 'o', 'l', 'o', 'm', 'o', 'n']

But most likely you are expecting a single list, words, containing whole words. We can achieve this with the append() method, which takes each word in line.split() and simply adds it to the end of words:

for word in line.split():
    words.append(word)
#Output:
['hello,', 'I', 'am', 'Solomon', 'Nice', 'to', 'meet', 'you', 'Solomon']

Now let's look at the other variation, words += [word]:

for word in line.split():
    words += [word]
print(words)
#Output:
['hello,', 'I', 'am', 'Solomon', 'Nice', 'to', 'meet', 'you', 'Solomon']

This has the same effect as append(). Why is that so? Let's print [word], which is simply a one-item list containing the word. Since += adds the items of that list to words, exactly one item, the whole word, is added each time:

print([word])
#Output:
['hello,']
['I']
['am']
['Solomon']
['Nice']
['to']
['meet']
['you']
['Solomon']

words += [word] gives the same result as words = words + [word] (the difference being that += modifies the existing list in place, while + builds a new list). To see how this concatenation works, consider the following example:

words = list()
word = ["Hello"]
concat_words = words + word
print(concat_words)
#['Hello']
another_word = ["World"]
concat_some_more_words = words + another_word
print(concat_some_more_words)
#['World']
final_concatenation = concat_words + concat_some_more_words
print(final_concatenation)
#Output:
['Hello', 'World']

Let's try append() on this example:

words1 = list()
words_splitted = ["Hello", "World"]
for word in words_splitted:
  words1.append(word)
print(words1)
#['Hello', 'World']

This shows that concatenation produces the same list as appending, although append() is the recommended way to add a single item to a list:

print(words1==final_concatenation)
#True
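Two caveats about += and + are worth a quick sketch. First, += changes the existing list in place, whereas + creates a new one. Second, + only accepts another list, so it would not have silently split the string the way += did. The names alias and error are only illustrative.

```python
words = []
alias = words              # a second name for the same list object

words += ["Hello"]         # in place: alias sees the change
print(alias)               # ['Hello']

words = words + ["World"]  # builds a new list and rebinds the name words
print(words)               # ['Hello', 'World']
print(alias)               # ['Hello']

error = None
try:
    words = words + "!"    # list + str is not allowed
except TypeError as exc:
    error = str(exc)
print(error)
```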

Returning to the original question, let's make the whole code more compact using a list comprehension:

with open("words.txt", 'r') as fhand:
    words = [word for line in fhand for word in line.split()]
print(words)
#Output:
['hello,', 'I', 'am', 'Solomon', 'Nice', 'to', 'meet', 'you', 'Solomon']

You will notice I've used the with context manager, which leaves opening and closing the file to Python: the file is closed once the block is exited. Next, I've created the list words with the same two loops inside the brackets. This is called a list comprehension and is one of the most powerful features in Python. It makes the code more compact, easy to read and typically faster than calling append() in a loop.
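To check that the comprehension really is the nested loop written on one line, here is a small sketch that uses an in-memory list of lines instead of the file. The names lines, words_loop and words_comp are assumptions for the example.

```python
lines = ["hello, I am Solomon\n", "Nice to meet you Solomon\n"]

# Explicit nested loops with append()
words_loop = []
for line in lines:
    for word in line.split():
        words_loop.append(word)

# The same loops, in the same order, inside a list comprehension
words_comp = [word for line in lines for word in line.split()]

print(words_loop == words_comp)  # True
```

The for clauses in the comprehension appear in the same order as in the nested loops, outermost first.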

Finally, initializing with words = [] is cleaner than words = list(). It is also faster, since it avoids a name lookup and a function call.
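If you want to measure the difference yourself, a rough sketch with the standard timeit module could look like this. The absolute numbers depend on your machine and Python version, so no expected timings are shown.

```python
import timeit

# Time one million evaluations of each expression
t_literal = timeit.timeit("[]", number=1_000_000)
t_call = timeit.timeit("list()", number=1_000_000)

print(f"[]     : {t_literal:.3f} s")
print(f"list() : {t_call:.3f} s")
```

For a list that is created once, as in this question, the difference has no practical impact, so readability is the stronger argument.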
