Python removing substrings from strings
I'm trying to remove some substrings from a string in a csv file.
import csv
import string
input_file = open('in.csv', 'r')
output_file = open('out.csv', 'w')
data = csv.reader(input_file)
writer = csv.writer(output_file, quoting=csv.QUOTE_ALL)  # dialect='excel')
specials = ("i'm", "hello", "bye")
for line in data:
    line = str(line)
    new_line = str.replace(line, specials, '')
    writer.writerow(new_line.split(','))
input_file.close()
output_file.close()
So for this example:
hello. I'm obviously over the moon. If I am being honest I didn't think I'd get picked, so to get picked is obviously a big thing. bye.
I'd want the output to be:
obviously over the moon. If I am being honest I didn't think I'd get picked, so to get picked is obviously a big thing.
However, this only works when I'm searching for a single word, for example specials = "I'm". Do I need to add my words to a list or an array?
It seems like you're already splitting the input via the csv.reader, but then you're throwing away all that goodness by turning the split line back into a string. It's best not to do this, but to keep working with the lists that are yielded from the csv reader. So, it becomes something like this:
for row in data:
    new_row = []  # A place to hold the processed row data.
    # look at each field in the row.
    for field in row:
        # remove all the special words.
        new_field = field
        for s in specials:
            new_field = new_field.replace(s, '')
        # add the sanitized field to the new "processed" row.
        new_row.append(new_field)
    # after all fields are processed, write it with the csv writer.
    writer.writerow(new_row)
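One thing to watch for: str.replace is case-sensitive, so the lowercase "i'm" in specials would not match the "I'm" in the sample sentence. A minimal sketch of one way around that, assuming case-insensitive matching is what's wanted, is below. The helper name remove_specials is made up for illustration and is not part of the original code.

```python
import re

def remove_specials(field, specials):
    """Return field with every special word removed, ignoring case."""
    # re.escape keeps characters in the words literal instead of
    # letting them be read as regex syntax.
    pattern = re.compile("|".join(re.escape(s) for s in specials), re.IGNORECASE)
    return pattern.sub("", field)

specials = ("i'm", "hello", "bye")
row = ["hello. I'm obviously over the moon.", "so to get picked is a big thing. bye."]
cleaned = [remove_specials(field, specials) for field in row]
print(cleaned)
```

Like the plain replace loop, this only removes the words themselves. Surrounding punctuation and spaces are left in place, and a word such as "bye" is also removed when it appears inside a longer word.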
It looks like you aren't iterating through specials. str.replace expects a single substring, not a tuple or list of them, so you need to loop over the words and replace each one in turn. Try this:
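specials = ["i'm", "hello", "bye"]
for line in data:
    # str(line) is the row list's repr, brackets and quotes included
    new_line = str(line)
    for word in specials:
        # replace one word at a time; str.replace takes a single substring
        new_line = new_line.replace(word, '')
    writer.writerow(new_line.split(','))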
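For comparison, here is a self-contained sketch of the same word-by-word loop applied to each parsed field rather than to str(line), so commas inside quoted fields survive. The function name clean_csv and the in-memory sample data are assumptions for illustration. With real files you would pass open('in.csv', newline='') and open('out.csv', 'w', newline='') instead of the StringIO objects.

```python
import csv
import io

def clean_csv(src, dst, specials):
    """Copy CSV rows from src to dst, removing each special word from every field."""
    writer = csv.writer(dst, quoting=csv.QUOTE_ALL)
    for row in csv.reader(src):
        new_row = []
        for field in row:
            # str.replace handles one substring per call, so loop over the words.
            for word in specials:
                field = field.replace(word, "")
            new_row.append(field)
        writer.writerow(new_row)

# In-memory demo standing in for in.csv and out.csv.
src = io.StringIO('"hello there","bye, now"\n')
dst = io.StringIO()
clean_csv(src, dst, ["i'm", "hello", "bye"])
result = dst.getvalue()
print(result)
```

Taking file objects rather than file names keeps the function easy to try out without touching the disk.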