Replacing multiple characters in a string
I have a CSV file that looks like this:
Mon-000101,100.27242,9.608597,11.082,10.034,0.39,I,0.39,I,31.1,31.1,,double with 1355,,,,,,,,
Mon-000171,100.2923,9.52286,14.834,14.385,0.45,I,0.45,I,33.7,33.7,,,,,,,,,,
Mon-000174,100.27621,9.563802,11.605,10.134,0.95,I,1.29,I,30.8,30.8,,,,,,,,,,
...it's a few hundred lines long.
I just want to grab the Mon-000101 items (not just that specific one, but all the Mon-###### IDs). I have this really, really ugly little script I threw together:
file_list1 = open(raw_input("Enter your list file: "))
file_lines = []
for line in file_list1:
    line.replace(' ','\n')
    for item in line.split('\n'):
        file_lines.append(item)
stringit = ''
for item in file_lines:
    stringit += item
IDs = re.findall('Mon-\d\d\d\d\d\d',stringit)
stringIDs = str(IDs)
new = stringIDs.replace(',','\n')
newer = new.replace('\'','')
newer2 = newer.replace('[\]','')
newer3 = newer2.replace(']','')
newer4 = newer3.replace('[','')
newer5 = newer4.replace(' ','')
file_write = open("Testit.txt","w+")
file_write.write(newer4)
print newer4
file_write.close()
I know it's ugly. Clearly I don't know what I'm doing with the regex stuff, but aside from that I want to know a more efficient way of replacing all the characters that I'm replacing. I know this isn't how it's done. I've tried something along the lines of
newer2 = newer.replace('([\',\[\] ])','')
which I sorta pieced together from various posts. That didn't work though; in fact, it didn't do anything.
I want to see what a more efficient way of doing this looks like.
Thanks.
I'm also aware that my variable naming is not sufficient/not up to the style guide. This is just something I quickly threw together.
Assuming the IDs are always the first part of the line, this is a simple way to do it:
import csv
with open('some_list_file.txt', 'rb') as list_file:  # 'rb' is what the Python 2 csv module expects
    reader = csv.reader(list_file)
    with open('Testit.txt', 'w+') as output_file:
        # the first field of every row is the ID
        output_file.writelines(line[0] + '\n' for line in reader)
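The snippet above targets Python 2, like the question. On Python 3 the csv module expects text-mode files opened with newline='', so a rough equivalent might look like the sketch below. The function name first_column is made up for illustration and is not part of the original answer.

```python
import csv

def first_column(in_path, out_path):
    """Copy the first field of every non-empty CSV row to out_path, one per line."""
    # Python 3: open CSV input in text mode with newline='' as the csv docs recommend.
    with open(in_path, newline='') as list_file, open(out_path, 'w') as output_file:
        reader = csv.reader(list_file)
        # Skip blank rows so row[0] cannot raise IndexError.
        output_file.writelines(row[0] + '\n' for row in reader if row)
```

Calling first_column('some_list_file.txt', 'Testit.txt') would then mirror the file names used in the answer.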
If the position varies, it gets just a little more complicated:
import csv
with open('some_list_file.txt', 'rb') as list_file:  # 'rb' is what the Python 2 csv module expects
    reader = csv.reader(list_file)
    with open('Testit.txt', 'w+') as output_file:
        for line in reader:
            # keep only the fields that look like an ID
            IDs = [part for part in line if part.startswith('Mon-')]
            if IDs:
                output_file.write(IDs[0] + '\n')  # or accept multiple ID values if that's a possibility
You can shorten that a little if you're sure there's a Mon- entry in every line:
    # replaces the inner block above; reader is the csv.reader created earlier
    with open('Testit.txt', 'w+') as output_file:
        output_file.writelines([part for part in line if part.startswith('Mon-')][0] + '\n' for line in reader)
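On the literal question of replacing several characters at once: str.replace looks for its first argument as plain text, not as a pattern, which is why the character-class attempt changed nothing. A minimal sketch of two alternatives, written for Python 3 and using a made-up sample string:

```python
import re

text = "['Mon-000101', 'Mon-000171']"  # sample str(list) output, for illustration only

# re.sub understands character classes: drop quotes, brackets, and spaces in one pass,
# then turn the remaining commas into newlines.
cleaned = re.sub(r"['\[\] ]", "", text).replace(",", "\n")

# str.translate does the same without a regex (Python 3 form of str.maketrans):
# map ',' to a newline and delete every character listed in the third argument.
table = str.maketrans(",", "\n", "'[] ")
cleaned_too = text.translate(table)

print(cleaned)
```

Joining the list directly with '\n'.join(IDs) would avoid the str(IDs) round trip altogether.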
Use the regex pattern ^Mon\\-\\d{6} with the m modifier.
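In Python the m modifier corresponds to the re.MULTILINE flag, and the doubled backslashes are not needed inside a raw string. A minimal sketch on Python 3, with a few shortened sample rows from the question inlined as a string (the variable names are illustrative):

```python
import re

data = """Mon-000101,100.27242,9.608597,11.082
Mon-000171,100.2923,9.52286,14.834
Mon-000174,100.27621,9.563802,11.605
"""

# With re.MULTILINE, ^ matches at the start of every line, not just the start of the string.
ids = re.findall(r"^Mon-\d{6}", data, flags=re.MULTILINE)

# findall already returns a list of strings, so no quote or bracket stripping is needed.
print("\n".join(ids))
```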