Replace a part of a Python list element

I have a CSV file like the one below:

CSV:

H1,H2,H3
A_B,C1,D
F_2j,G,p5

I'm trying to remove '_' and numbers from the first column. Here's what I've tried:

for i in range(len(max(cols, key=len))):
        transposed = ([(c[i] if i<len(c) else '') for c in cols])
        str(transposed[0]).replace("_",";").split()

It did replace '_', but the original transposed still prints the same data as the CSV file. How can I replace the old column with the new one? Also, how can I remove digits from column 1 only, to get the following output?

Desired output:

H1,H2,H3
A;B,C1,D
F;j,G,p5

The issue may be a basic misunderstanding of how replace behaves: it returns a modified copy of the string and does not change the string in place. For the replacement to "take", you have to assign the result back to the original name. Consider the following:

>>> text = 'blah_blah_blah'
>>> print(text.replace('_', ';'))
blah;blah;blah
>>> print(text)
blah_blah_blah

As you can see, the original text string is untouched by the replace call. To actually modify it:

>>> text = text.replace('_', ';')
>>> print(text)
blah;blah;blah
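
The same rule applies to a string stored in a list: the result has to be assigned back to the list slot. Here is a minimal sketch, assuming the rows are lists of strings like those csv.reader produces. The rows name and the sample data are illustrative, copied from the question's CSV.

```python
# Sample rows mirroring the question's CSV (illustrative data).
rows = [["H1", "H2", "H3"], ["A_B", "C1", "D"], ["F_2j", "G", "p5"]]

for row in rows[1:]:  # skip the header row
    # str.replace returns a new string, so store it back in the list.
    row[0] = row[0].replace("_", ";")

print(rows)
```

This only handles the underscore; the digit in 'F_2j' is still there and needs a separate step.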

As for eliminating numbers, you can go with the regular expression-based approach in the answer from @Hackaholic, which will nicely handle the '_' to ';' conversion as well. I just thought there would be benefit in shedding light on the behavior of the replace method for strings.

You can try this:

import re

with open('file.csv') as f:
    for x in f:
        # each line keeps its trailing newline, so suppress print's own
        print(re.sub(r"_\d*", ";", x), end="")  # or store the result in a variable for further processing

Output:

H1,H2,H3
A;B,C1,D
F;j,G,p5 
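
Note that the pattern in this answer is applied to the whole line, so an underscore in a later column would be rewritten too. Here is a sketch that limits the substitution to the first column, assuming no field contains a quoted comma. The clean_first_column helper and the 'p_5' sample value are hypothetical, added only to show the difference.

```python
import re

def clean_first_column(line):
    """Replace '_' plus any digits right after it, in the first field only."""
    # partition splits at the first comma and leaves the rest untouched.
    first, sep, rest = line.partition(",")
    return re.sub(r"_\d*", ";", first) + sep + rest

lines = ["H1,H2,H3", "A_B,C1,D", "F_2j,G,p_5"]
cleaned = [clean_first_column(line) for line in lines]
print("\n".join(cleaned))
```

For files with quoted fields, the csv module is a safer way to split the columns.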

I suggest using Python's csv module for both reading and writing. This may end up simplifying a lot of the logic you already have. Make sure you are actually writing the rows to a file (I don't see that in your example code). I also suggest using regular expressions for the substitution and deletion:

sub = re.sub(r"_\d*", ";", my_column)  # my_column is the first field of a row
# use sub as your new column

Edit: I misread what the OP wanted regarding digit removal. The rule for when to wipe digits is ambiguous (only after a _ character? All digits if there is a _?). I used the OP's example output as the rule ("all digits after an _").

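import csv
import re

with open("in.csv", newline="") as f, open("out.csv", "w", newline="") as out:
    reader = csv.reader(f, delimiter=",")
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(next(reader))  # copy the header row unchanged
    for row in reader:
        # replace "_" plus any digits right after it, in the first column only
        row[0] = re.sub(r"_\d*", ";", row[0])
        writer.writerow(row)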
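
Since the digit rule is ambiguous, it may help to compare the possible readings on a few sample values. This sketch shows three interpretations. The function names and the 'x9_1y' test value are made up for illustration, and none of them is confirmed as what the OP intended.

```python
import re

def digits_after_underscore(s):
    # Replace '_' and any digits that directly follow it with ';'.
    return re.sub(r"_\d*", ";", s)

def all_digits(s):
    # Drop every digit, then convert underscores to ';'.
    return re.sub(r"\d", "", s).replace("_", ";")

def all_digits_if_underscore(s):
    # Drop every digit only when the value contains an underscore.
    return all_digits(s) if "_" in s else s

for value in ["A_B", "F_2j", "x9_1y", "p5"]:
    print(value, digits_after_underscore(value),
          all_digits(value), all_digits_if_underscore(value))
```

All three agree on the OP's sample data; they only diverge on values such as 'x9_1y' or 'p5', so the choice matters only if such values can appear in column 1.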
