Filtering using a str array

I am trying to filter an ASCII list (which contains ASCII and other characters) by using an array that I have created. I want to remove every integer string from the list.

import pandas as pd
with open('ASCII.txt') as f:
    data = f.read().replace('\t', ',')
    print(data, file=open('my_file.csv', 'w'))
df = list(data)
test = ['0','1','2','3','4','5','6','7','8','9']

for x in df:
    try:
        df = int(df)
        for i in range(0,9):
            while any(test) in df:
                df.remove('i') 
        print(df)
    except:
        continue

print(df)

This is what I currently have; however, it does not work and outputs:

['3', '3', ',', '0', '4', '1', ',', '2', '1', ',', '!', ',', '\n', '3', '4', ',', '0', '4', ...]

The condition you use to detect numbers is broken. any checks whether at least one element of the passed iterable is truthy, i.e. in your case, whether at least one element is not an empty string.

test = ['0','1','2','3','4','5','6','7','8','9']
while any(test) in df:  # Condition always evaluates to False
    df.remove('i')  # Only removes the character 'i' from df

So any(test) evaluates to True, and you are then checking whether True is in df. It isn't, so the condition evaluates to False. The next error is that you try to remove the letter 'i' from your list with the remove call, rather than the value of the loop variable i. This can be fixed by casting the integer to a string:

# range(10) yields 0..9; range(9) would skip the digit 9
for i in range(10):
    # Cast integer to str
    while str(i) in df:
        # Remove the first remaining occurrence of str(i) from df
        df.remove(str(i))
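
To make the failure of the original condition concrete, here is a minimal sketch. The short df list is made-up sample data standing in for the real list, and the names collapsed, original_condition and intended_condition are only illustrative. It compares what any(test) in df actually computes with a per-element membership test, which is probably closer to what was intended.

```python
# Hypothetical sample data standing in for the asker's list of characters.
df = ['3', '3', ',', '0', '4', '1', ',', '!']
test = ['0', '1', '2', '3', '4', '5', '6', '7', '8', '9']

# any(test) collapses the whole list to a single bool before `in` runs.
collapsed = any(test)

# So the original condition is really `True in df`, and no string equals True.
original_condition = collapsed in df

# "Is any digit string an element of df?" needs one membership test per digit.
intended_condition = any(d in df for d in test)

print(collapsed, original_condition, intended_condition)
```

With this sample the three values should come out as True, False and True, which is why the original while loop body never runs.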

If you use the str list instead of the range function, you can iterate directly over the elements of the test list:

df = list(data)
test = ['0','1','2','3','4','5','6','7','8','9']

for num in test:
    # Loop as long as num appears in df
    while num in df:
        df.remove(num)  # removes only the first remaining occurrence of num

With this approach you need the inner while loop to remove every occurrence of the current num in df, because remove only removes the first occurrence of that value.
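
The first-occurrence behaviour of remove can be checked with a small sketch. The chars list and the name after_one_call are illustrative and not part of the original code.

```python
# Hypothetical sample list with the value '3' appearing three times.
chars = ['3', '3', ',', '0', '3']

# A single call drops only the first '3'.
chars.remove('3')
after_one_call = list(chars)

# Looping until the value is gone removes the remaining occurrences.
while '3' in chars:
    chars.remove('3')

print(after_one_call)  # ['3', ',', '0', '3']
print(chars)           # [',', '0']
```

The membership check in the while condition also matters for safety: calling remove with a value that is no longer in the list raises ValueError.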

Alternatively, you can check whether each element of df is a digit by using the str method isdigit. But because you modify the list in place, you need to iterate over a copy. Otherwise, elements get skipped as df shrinks during iteration:

# Use slice to create a copy of df
for el in df[:]:
    if el.isdigit():
        df.remove(el)

Because you visit every element of df, you don't need an inner loop to remove each occurrence of the value el.
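
The effect of iterating without a copy can be seen in the following sketch. The function names and the sample list are hypothetical. The last function uses a list comprehension, which is a different technique from the in-place removal described above: it builds a new list instead of mutating the old one, and is offered only as one possible alternative.

```python
def strip_digits_buggy(chars):
    """Removes while iterating over the same list; shown only to expose the skipping."""
    chars = list(chars)  # work on a private copy of the input
    for el in chars:
        if el.isdigit():
            chars.remove(el)
    return chars


def strip_digits_copy(chars):
    """Iterates over a slice copy, so removals do not disturb the iteration."""
    chars = list(chars)
    for el in chars[:]:
        if el.isdigit():
            chars.remove(el)
    return chars


def strip_digits_comprehension(chars):
    """Builds a new list that keeps only the non-digit elements."""
    return [el for el in chars if not el.isdigit()]


# Hypothetical sample data.
sample = ['3', '3', ',', '0', '4', '1', ',', '!']

print(strip_digits_buggy(sample))          # ['3', ',', '4', ',', '!']
print(strip_digits_copy(sample))           # [',', ',', '!']
print(strip_digits_comprehension(sample))  # [',', ',', '!']
```

In the buggy version, each removal shifts the later elements one position to the left, so the element that slides into the current index is never examined. That is how '3' and '4' survive in this sample.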
