Beginner question about modifying a Python list
I'm new to Python and am currently learning about lists. I am trying to add every word from the "words.txt" file to a list. But when I tried
words += word
every character became an element of the list. Then I tried
words += [word]
and it worked.
fhand = open("words.txt")
words = list()
for line in fhand:
    for word in line.split():
        words += [word]
print(words)
When you want to add a word to a list as a single element, you would usually use .append():
fhand = open("words.txt")
words = list()
for line in fhand:
    for word in line.split():
        words.append(word)
print(words)
word is a string, which is itself a sequence of characters: word[0], for example, gives you the first character of the word. When you write words += word, the list is extended with each item of that sequence, so you end up with a list of characters. In the second case you explicitly wrap the string in a list, [word], so the list is extended with a single item (the whole string) rather than its characters, and you end up with a list of strings. If that is still not clear, feel free to comment.
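With +=, a list can only be extended by another sequence of items. When the right-hand side is a string, the string is treated as a sequence of characters, so each character is added as an element. In the second form you supply a list whose only element is the word itself, so the whole word is added as a single element.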
In Python, a string is a sequence of Unicode characters, although it is a different data type from a list. So when you do
words += word
each character of the string is appended to the list as a separate item. But when you do
words += [word]
[word] is a list containing a single string, so only one item is appended to the list.
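A quick check makes this concrete. The sketch below uses a made-up sample value for word; it only illustrates that a string behaves as a sequence of characters without being a list.

```python
word = "hello"  # sample value, chosen only for illustration

# A string supports indexing and iteration like any other sequence...
first_char = word[0]
chars = list(word)

# ...but it is a distinct type, not a list.
is_list = isinstance(word, list)

print(first_char)  # h
print(chars)       # ['h', 'e', 'l', 'l', 'o']
print(is_list)     # False
```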
The += operator on a list is equivalent to calling its extend method, which takes an iterable as an argument and appends each item of it to the list. With words += word, the right-hand operand of += is a string, which is an iterable of its characters, so the statement is equivalent to writing words.extend(word).
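The equivalence can be checked side by side. This is a minimal sketch with an illustrative sample string; the variable names a, b, c and d are placeholders, not part of the original code. It also shows that the plain + operator is stricter than += and refuses a string on the right-hand side.

```python
word = "hi"  # sample value for illustration

a = []
a += word          # in-place add: iterates over the string
b = []
b.extend(word)     # same effect as the line above
c = []
c += [word]        # the iterable is a one-item list, so one item is added

print(a)  # ['h', 'i']
print(b)  # ['h', 'i']
print(c)  # ['hi']

# Plain + only concatenates a list with another list.
try:
    d = [] + word
except TypeError as exc:
    print("TypeError:", exc)
```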
Let's go through your code. Suppose words.txt consists of the following text:
hello, I am Solomon
Nice to meet you Solomon
You first open this file with fhand = open("words.txt"), then you initialize a list called words:
fhand = open("words.txt")
words = list()
Suggestion: it is advisable to open the file with the with context manager. That way, you don't have to close the file explicitly later. If you just use open() as above, you have to close the file yourself at the end with fhand.close().
with open("words.txt", 'r') as fhand:
    pass  #<--code--->
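To see the automatic close in action, here is a small self-contained sketch. It writes a throwaway copy of words.txt into a temporary directory (an assumption made only so the example runs anywhere) and then inspects the file object's closed attribute inside and outside the with block.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    # Create a throwaway words.txt so the example is self-contained.
    path = os.path.join(tmp_dir, "words.txt")
    with open(path, "w") as out:
        out.write("hello, I am Solomon\nNice to meet you Solomon\n")

    with open(path, "r") as fhand:
        first_line = fhand.readline()
        closed_inside = fhand.closed   # still open inside the block
    closed_after = fhand.closed        # closed automatically on exit

print(closed_inside)  # False
print(closed_after)   # True
```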
In the next line, you iterate over each line in fhand. Let's print line, which shows each line of the text:
for line in fhand:
    print(line)
#Output (each line keeps its trailing newline, so print() also emits a blank line after it, omitted here):
hello, I am Solomon
Nice to meet you Solomon
Then you iterate over line.split(), which splits each line of text into a list of words. If we print line.split():
for line in fhand:
    print(line.split())
#Output:
['hello,', 'I', 'am', 'Solomon']
['Nice', 'to', 'meet', 'you', 'Solomon']
Suggestion: you could also use splitlines(), which splits a string at line boundaries and returns a list of lines. Unlike split(), it does not break a line into words. It removes the line break but keeps any other leading or trailing whitespace, so you have to get rid of spaces with strip(' ') if your text has any at the beginning or end. Neither method modifies the original string, so you can still use it afterwards:
for line_str in fhand:
    print(line_str.strip(' ').splitlines())
#Output:
['hello, I am Solomon']
['Nice to meet you Solomon']
Nesting a second loop inside the first then splits each of those lines into words:
for line_str in fhand:
    for line in line_str.strip(' ').splitlines(): #watch the indentation
        print(line.split())
#Output:
['hello,', 'I', 'am', 'Solomon']
['Nice', 'to', 'meet', 'you', 'Solomon']
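The difference between the two methods is easiest to see on an in-memory string. The sample text below is invented for illustration (note the extra spaces around the first line). It also shows that split() called with no arguments discards surrounding whitespace on its own, so no strip() is needed before splitting into words.

```python
# Sample text, made up for illustration; the first line has padding spaces.
text = "  hello, I am Solomon  \nNice to meet you Solomon\n"

# splitlines() breaks at line boundaries and drops the line breaks,
# but leaves other whitespace untouched.
lines = text.splitlines()
print(lines)
# ['  hello, I am Solomon  ', 'Nice to meet you Solomon']

# keepends=True retains the line breaks as well.
with_breaks = text.splitlines(keepends=True)
print(with_breaks)
# ['  hello, I am Solomon  \n', 'Nice to meet you Solomon\n']

# split() with no arguments splits on any run of whitespace
# and ignores leading and trailing whitespace.
words_per_line = [line.split() for line in lines]
print(words_per_line)
# [['hello,', 'I', 'am', 'Solomon'], ['Nice', 'to', 'meet', 'you', 'Solomon']]
```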
In the next piece of code you iterate over each word in line.split() (which, as we saw, returns a list of words) and extend words with each word. Because += iterates over the string on its right-hand side, what gets added is the individual letters of each word, so you end up with a list of letters:
for word in line.split():
    words += word
#Output:
['h', 'e', 'l', 'l', 'o', ',', 'I', 'a', 'm', 'S', 'o', 'l', 'o', 'm', 'o', 'n', 'N', 'i', 'c', 'e', 't', 'o', 'm', 'e', 'e', 't', 'y', 'o', 'u', 'S', 'o', 'l', 'o', 'm', 'o', 'n']
But most likely you expect all the words to be collected in the single list words. We can achieve this with the append() method, which takes each word in line.split() and simply adds it to the end of words:
for word in line.split():
    words.append(word)
#Output:
['hello,', 'I', 'am', 'Solomon', 'Nice', 'to', 'meet', 'you', 'Solomon']
Now let's look at the other variation, words += [word]:
for word in line.split():
    words += [word]
print(words)
#Output:
['hello,', 'I', 'am', 'Solomon', 'Nice', 'to', 'meet', 'you', 'Solomon']
This has the same effect as append(). Why is that so? Let's print [word], which is simply a one-element list containing the current word. This is expected, because you take each word from line.split(), wrap it in a list, and then concatenate that list to words:
print([word])
#Output:
['hello,']
['I']
['am']
['Solomon']
['Nice']
['to']
['meet']
['you']
['Solomon']
words += [word] produces the same result as words = words + [word] (although += extends the existing list in place, while + builds a new list). To see how this concatenation works, consider the following example:
words = list()
word = ["Hello"]
concat_words = words + word
print(concat_words)
#['Hello']
another_word = ["World"]
concat_some_more_words = words + another_word
print(concat_some_more_words)
#['World']
final_concatenation = concat_words + concat_some_more_words
print(final_concatenation)
#Output:
['Hello', 'World']
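One subtlety is worth a short sketch: the two spellings give the same contents, but += modifies the existing list object, whereas + creates a new one. The name alias below is a hypothetical second reference, introduced only to make that difference visible.

```python
words = ["Hello"]
alias = words            # second name for the same list object

words += ["World"]       # extends the existing list in place
print(alias)             # ['Hello', 'World']
print(alias is words)    # True

words = words + ["!"]    # builds a new list and rebinds the name
print(words)             # ['Hello', 'World', '!']
print(alias)             # ['Hello', 'World']
print(alias is words)    # False
```

For the loop in the question this makes no visible difference, because nothing else refers to words.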
Let's try append() on this example:
words1 = list()
words_splitted = ["Hello", "World"]
for word in words_splitted:
    words1.append(word)
print(words1)
#['Hello', 'World']
This shows that concatenating a one-element list gives the same result as appending, but the recommended practice is to use append() to add a single item to a list:
print(words1==final_concatenation)
#True
Returning to the original question, let's make the whole code more compact using a list comprehension:
with open("words.txt", 'r') as fhand:
    words = [word for line in fhand for word in line.split()]
print(words)
#Output:
['hello,', 'I', 'am', 'Solomon', 'Nice', 'to', 'meet', 'you', 'Solomon']
You will notice I've used the with context manager, which leaves closing the file to Python once the block is exited. Next, I've created the list words with the same two loops inside the brackets. This is called a list comprehension and is one of the most powerful features in Python. It makes the code more compact and, for simple cases like this one, easier to read; it is also typically faster than appending in a loop.
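The correspondence between the nested loops and the comprehension can be verified directly. In this sketch io.StringIO stands in for the open file (an assumption made so the example runs without words.txt on disk); the names looped and comprehended are illustrative.

```python
import io

# In-memory stand-in for the contents of words.txt.
sample = "hello, I am Solomon\nNice to meet you Solomon\n"

# Explicit nested loops, as in the question.
looped = []
for line in io.StringIO(sample):
    for word in line.split():
        looped.append(word)

# The comprehension lists its for clauses in the same outer-to-inner order.
comprehended = [word for line in io.StringIO(sample) for word in line.split()]

print(looped == comprehended)  # True
print(comprehended)
# ['hello,', 'I', 'am', 'Solomon', 'Nice', 'to', 'meet', 'you', 'Solomon']
```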
Finally, initializing words = [] is cleaner than words = list(). It is also faster, since the literal avoids a name lookup and a function call.
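If you want to measure the difference yourself, the standard timeit module can time both expressions. This is only a rough sketch: the absolute numbers depend on your machine and Python version, so no expected output is shown.

```python
import timeit

# Time one million evaluations of each expression.
literal_time = timeit.timeit("[]", number=1_000_000)
call_time = timeit.timeit("list()", number=1_000_000)

print(f"[]     : {literal_time:.4f} s")
print(f"list() : {call_time:.4f} s")
```

For a list that is created once, as in the question, the difference is negligible in practice; the readability argument matters more.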