Replace a part of a Python list element
I have a CSV file like below:
CSV:
H1,H2,H3
A_B,C1,D
F_2j,G,p5
I'm trying to remove '_' and numbers from the first column. Here's what I've tried:
for i in range(len(max(cols, key=len))):
    transposed = [(c[i] if i < len(c) else '') for c in cols]
    str(transposed[0]).replace("_", ";").split()
It did replace '_', but the original transposed still prints the same CSV file. How can I replace the old column with the new one? Also, how can I remove digits just from column 1 to give the following output?
Desired output:
H1,H2,H3
A;B,C1,D
F;j,G,p5
The issue may be a basic misunderstanding of the behavior of replace: it returns a modified copy of the string, but does not modify the string in place. To have the replacement "take", you'd have to assign the result back to the original variable. Consider the following:
>>> text = 'blah_blah_blah'
>>> print(text.replace('_', ';'))
blah;blah;blah
>>> print(text)
blah_blah_blah
As you can see, the original text string is untouched by the replace call. To actually modify it:
>>> text = text.replace('_', ';')
>>> print(text)
blah;blah;blah
As for eliminating numbers, you can go with the regular expression-based approach in the answer from @Hackaholic (which will nicely handle the '_' to ';' conversion as well). I just thought there would be benefit in shedding light on the behavior of the replace method for strings.
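Applied to the list case from the question, the same rule holds: the result of replace has to be stored back into the list slot. A minimal sketch, assuming a hypothetical list named row that holds one parsed CSV line:

```python
# Hypothetical row, as it might look after splitting one CSV line.
row = ["A_B", "C1", "D"]

# Calling replace alone builds a new string and discards it.
row[0].replace("_", ";")
unchanged = list(row)

# Assigning the result back into the list slot makes the change stick.
row[0] = row[0].replace("_", ";")

print(unchanged)  # ['A_B', 'C1', 'D']
print(row)        # ['A;B', 'C1', 'D']
```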
You can try this:
import re

with open('file.csv') as f:
    for x in f:
        # Replace '_' and any digits right after it with ';'.
        # Store the result in a variable instead if you need to process it further.
        print(re.sub(r"_\d*", ';', x), end='')
Output:
H1,H2,H3
A;B,C1,D
F;j,G,p5
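Note that the pattern above is applied to the whole line, so an underscore in a later column would be rewritten too. If only the first column should change, one possible sketch is to split off the first field before substituting. The helper name clean_first_field is made up for illustration, and the split assumes plain comma-separated fields with no quoting:

```python
import re

def clean_first_field(line):
    """Replace '_' plus any digits right after it with ';', first field only."""
    # partition splits at the first comma; sep and rest are empty if there is none.
    first, sep, rest = line.partition(",")
    return re.sub(r"_\d*", ";", first) + sep + rest

print(clean_first_field("F_2j,G,p5"))  # F;j,G,p5
print(clean_first_field("A_B,C_1,D"))  # A;B,C_1,D
```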
I suggest using Python's csv module to both read and write. This may end up simplifying a lot of the logic you already have. Make sure you are actually writing the rows to a file (I don't see that in your example code). I also suggest using regular expressions for the substitution and deletion:
sub = re.sub(r"_\d*", ";", my_column)
# use sub as your new column
Edit: I misread what OP wanted regarding digit removal. The rules for when to wipe the digits are ambiguous (only after a _ character? All digits if there is a _?). I used OP's example output as the rule ("all digits after an _"):
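import csv
import re

# newline="" is what the csv module recommends for files it reads or writes
with open("in.csv", newline="") as f, open("out.csv", "w", newline="") as out:
    r = csv.reader(f, delimiter=",")
    w = csv.writer(out, lineterminator="\n")
    w.writerow(next(r))  # copy the header row unchanged
    for row in r:
        # replace '_' and any digits directly after it, in the first column only
        row[0] = re.sub(r"_\d*", ";", row[0])
        w.writerow(row)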
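To make the ambiguity concrete, here is a small sketch of the two readings side by side. Both function names are hypothetical, and the sample value F_2j3 is not from the question; it is chosen because the two rules disagree on it:

```python
import re

def digits_after_underscore(value):
    """Rule 1: replace '_' and the digits directly after it with ';'."""
    return re.sub(r"_\d*", ";", value)

def all_digits_if_underscore(value):
    """Rule 2: if the value contains '_', drop every digit, then turn '_' into ';'."""
    if "_" not in value:
        return value
    return re.sub(r"\d", "", value).replace("_", ";")

for sample in ["A_B", "F_2j", "F_2j3", "p5"]:
    print(sample, digits_after_underscore(sample), all_digits_if_underscore(sample))
```

Both rules give the desired output for the question's data; they differ only on values such as F_2j3, where a digit appears later in the field.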