
How to remove strings that contain letters in Python?

I'm working with a large set of CSV files (tables), and I need to remove the cells that contain letters and keep the numeric cells.

For example:

   p1     p2      p3       p4      p5
 dcf23e   2322   acc41   4212     cdefd

So in this case, I only want to remove dcf23e, acc41 and cdefd. After removing those strings, I want to keep them as empty cells.

How would I do this? Thanks in advance.

The code that I've tried is below. It removes the non-numeric characters from a string, but the problem is that a string like 23cdgf2 becomes 232, which is not what I want. Also, after removing all the characters, when I try to convert the strings to int for calculations, some of them turn out to be decimals, since a string like 123def.24 becomes 123.24.

temp = '23cdgf2'  # example cell value from the description above
temp = ''.join([c for c in temp if c in '1234567890.'])  # strip all non-numeric characters
# Convert the string to an integer for calculations; a helper function is used
# because int() cannot convert blank strings.
def mk_int(s):
    s = s.strip()
    return int(s) if s else 0
print(mk_int(temp))  # prints 232, not the empty cell that was wanted

Compile the regex for performance, and split the string for correctness:

import re
regex = re.compile(r'.*\D+.*')
def my_parse_fun(line):
    return [regex.sub('', emt) for emt in line.split()]
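
A possible variant of the same idea, sketched here as an assumption rather than the answerer's code: instead of matching tokens that contain a non-digit, match tokens that consist of digits only with `fullmatch`. The names `DIGITS` and `keep_digits_only` are made up for this example.

```python
import re

# Hypothetical name: a compiled pattern for one or more digits.
DIGITS = re.compile(r'\d+')

def keep_digits_only(line):
    # Split on whitespace; keep a token only if the whole token is digits,
    # otherwise put an empty string in its place.
    return [tok if DIGITS.fullmatch(tok) else '' for tok in line.split()]

row = 'dcf23e   2322   acc41   4212     cdefd'  # sample row from the question
print(keep_digits_only(row))                  # ['', '2322', '', '4212', '']
print(keep_digits_only('23cdgf2 123def.24'))  # ['', '']
```

The positive match reads a little more directly, and mixed tokens such as 23cdgf2 are blanked rather than reduced to 232.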

Following AbhiP's answer, you can also do:

[val if val.isdigit() else '' for val in line.split()]

Use a regex:

import re
def convert_string_to_blank(_str):
    # Blank out any token that contains at least one ASCII letter
    return ['' if re.search("[a-zA-Z]", c) else c for c in _str.split()]

Or use isalpha:

def convert_string_to_blank(_str):
    return ['' if any(c.isalpha() for c in s) else s for s in _str.split()]
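
The letter-based checks and the `isdigit` check do not treat every token the same way, which matters for the decimal case mentioned in the question. A small sketch of the difference follows. The helper names and the extra sample tokens are assumptions made for illustration.

```python
def blank_if_has_letter(token):
    # Blank the token only when it contains at least one letter.
    return '' if any(c.isalpha() for c in token) else token

def blank_unless_digits(token):
    # Keep the token only when every character is a digit.
    return token if token.isdigit() else ''

tokens = ['2322', '123.24', '-5', '23cdgf2']  # illustrative values
print([blank_if_has_letter(t) for t in tokens])  # ['2322', '123.24', '-5', '']
print([blank_unless_digits(t) for t in tokens])  # ['2322', '', '', '']
```

So the letter check keeps decimals and signed numbers, while `isdigit` keeps only plain non-negative integers. Which one fits depends on what should count as a numeric cell.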

I would use a simple setup for quick tests.

a = 'dcf23e   2322   acc41   4212     cdefd'
cleaned_val = lambda v: v if v.isdigit() else ''
[cleaned_val(val) for val in a.split()]

This keeps the strings that are valid numbers and puts an empty string in place of the others.

['', '2322', '', '4212', '']

However, this gives you strings only. If you want to convert the values into integers (replacing the invalid ones with 0 instead), change your lambda:

convert_to_int = lambda v: int(v) if v.isdigit() else 0

[convert_to_int(val) for val in a.split()]

Your new results will all be valid integers:

[0, 2322, 0, 4212, 0]
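
Since the question is about CSV files, the same check can be applied cell by cell with the standard `csv` module. This is only a sketch: it assumes comma-separated data with a header row, uses an in-memory string instead of a real file, and `clean_cell` is a made-up helper name.

```python
import csv
import io

def clean_cell(cell):
    # Keep cells made of digits only; everything else becomes an empty cell.
    cell = cell.strip()
    return cell if cell.isdigit() else ''

# Assumed sample data, standing in for a real CSV file.
sample = "p1,p2,p3,p4,p5\ndcf23e,2322,acc41,4212,cdefd\n"

reader = csv.reader(io.StringIO(sample))
header = next(reader)  # leave the header row untouched
rows = [[clean_cell(c) for c in row] for row in reader]

out = io.StringIO()
writer = csv.writer(out, lineterminator="\n")
writer.writerow(header)
writer.writerows(rows)
print(out.getvalue())
```

For real files, `io.StringIO(sample)` and `out` would be replaced by file objects opened with `newline=''`, as the `csv` documentation recommends.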

Have you tried a for loop with a try statement?

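temp = ['dcf23e', '2322', 'acc41', '4212', 'cdefd']
cleaned = []
for element in temp:
    try:
        int(element)  # raises ValueError for non-numeric strings
        cleaned.append(element)
    except ValueError:
        pass  # drop the non-numeric element
temp = cleaned
print(temp)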

Or, if you want to convert the values to int elements, you can write this:

temp = ['dcf23e', '2322', 'acc41', '4212', 'cdefd']
for index, element in enumerate(temp):
    try:
        temp[index] = int(element)  # convert numeric strings to int
    except ValueError:
        temp[index] = 0  # replace non-numeric strings with 0
print(temp)
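
The try-based approach can also be wrapped in a small reusable helper, which additionally covers the blank-string case from the question. A minimal sketch, with `to_int` and its `default` parameter being names invented for this example:

```python
def to_int(value, default=0):
    # int() raises ValueError for anything that is not an integer literal,
    # including empty strings, mixed tokens and decimals such as '123.24'.
    try:
        return int(value)
    except ValueError:
        return default

temp = ['dcf23e', '2322', 'acc41', '4212', 'cdefd', '', '123.24']
print([to_int(v) for v in temp])     # [0, 2322, 0, 4212, 0, 0, 0]
print(sum(to_int(v) for v in temp))  # 6534
```

Note that `int()` also accepts signed values such as '-5', so this helper is slightly more permissive than an `isdigit` check.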
