
Python: checking strings in a list?

I am trying to iterate through a list checking every string in the list for a character.

test = [str(i) for i in range(100)]

for i in test:
    if '0' or '4' or '6' or '8' in str(i):
        test.remove(i)

I thought this would work, but afterwards the list looks like this:

[1, 3, 5, 7, 9, 11, 13, 15, 17, 19, 21, 23, 25, 27, 29, 31, 33, 35, 37, 39, 41, 43, 45, 47, 49, 51, 53, 55, 57, 59, 61, 63, 65, 67, 69, 71, 73, 75, 77, 79, 81, 83, 85, 87, 89, 91, 93, 95, 97, 99]

Why is '2' removed but '41' is not? I've noticed that only the even numbers are removed, but I don't know why.

There are two issues with your code. The first is that you are modifying the list while iterating over it. The second is that you are using the or operator in the wrong way: the condition in the if statement will always be True.

Here's a fixed version:

test = [i for i in range(100) if set("0468").isdisjoint(str(i))]

You have two problems. First, this if statement is always taken.

if '0' or '4' or '6' or '8' in str(i):

Because '0' is a non-empty string, it is truthy. '0' or '4' therefore evaluates to '0', which is likewise truthy. The rest of the statement doesn't matter.

I expect you actually wanted to test whether each of those digits was in the string representation of the integer. As written, only '8' is tested for membership in the string, and even that test never runs, because the expression short-circuits to a truthy value before it gets there. Something like this would work:

if any(x in i for x in '0468'):

By the way, str(i) is superfluous because your list is already a list of strings.
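As a minimal sketch of that condition in isolation (the helper name has_banned_digit is made up for illustration and does not appear in the question):

```python
def has_banned_digit(s, banned="0468"):
    """Return True if the string s contains any character from banned."""
    return any(ch in s for ch in banned)

print(has_banned_digit("41"))  # True, because of the '4'
print(has_banned_digit("13"))  # False
```

Each membership test is written out per character here, so nothing depends on how or happens to short-circuit.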

The other problem is that you are deleting items from the list you are iterating over. So, here's what happens:

  • The first item, '0', is removed, because your if statement is always taken.
  • The second item, '1', becomes the first item, because you removed '0'.
  • The for loop goes on to the second item, which is now '2'.

In other words, because you deleted '0', '1' is never tested. Thus every other item (all the even numbers) is removed.

The easiest way to avoid this in your code is to iterate over a copy of the list:

for i in test[:]:
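
A minimal sketch of the whole loop with both fixes applied, assuming the goal is to drop every string that contains 0, 4, 6 or 8 (the loop variable is renamed to s here purely for readability):

```python
test = [str(i) for i in range(100)]

# Iterate over a shallow copy so that removing from `test`
# does not shift the items the loop has yet to visit.
for s in test[:]:
    if any(ch in s for ch in "0468"):
        test.remove(s)

print(test[:8])   # ['1', '2', '3', '5', '7', '9', '11', '12']
print(len(test))  # 42
```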

Consider this:

(Pdb) i=2
(Pdb) i='2'
(Pdb) '0' or 'beer' in str(i)
'0'
(Pdb) bool('0')
True

Do you see why '0' or '4' or '6' or '8' in str(i) is not always a boolean value?

Now, consider what you're removing:

>>> l=[str(i) for i in range(10)]
>>> for i in l:
...   print(i)
...   print(l)
...   l.remove(i)
... 
0
['0', '1', '2', '3', '4', '5', '6', '7', '8', '9']
2
['1', '2', '3', '4', '5', '6', '7', '8', '9']
4
['1', '3', '4', '5', '6', '7', '8', '9']
6
['1', '3', '5', '6', '7', '8', '9']
8
['1', '3', '5', '7', '8', '9']

As you remove elements in-place while you're iterating over the list, you're shortening the list and skipping each next value.
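
The skipping can be reproduced with an explicit index. This is only a rough model of what the for loop does, assuming it tracks a position in the list and advances it by one on each pass; the names items, visited and index are illustrative:

```python
items = [str(i) for i in range(10)]
visited = []

index = 0  # rough stand-in for the position the for loop keeps track of
while index < len(items):
    current = items[index]
    visited.append(current)
    items.remove(current)  # everything after `current` shifts left by one
    index += 1             # so this step jumps over the element that moved in

print(visited)  # ['0', '2', '4', '6', '8']
print(items)    # ['1', '3', '5', '7', '9']
```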

Sven has the right Pythonic solution, but I felt a lengthier explanation might be worthwhile.

When you execute something like '0' or '4', the or operator checks whether its first operand ('0') is falsy and, if it is not, returns that first operand. Since '0' is not falsy (it is a non-empty string, which counts as True when used as a boolean), the expression returns '0' and the second operand is never even evaluated:

>>> '0' or '4'
'0'

If you repeat it, you will get the same result:

>>> '0' or '4' or '6' or '8'
'0'

Also, the in operator has higher precedence than or, so only the last operand, '8' in '88', is a membership test, and the expression as a whole still short-circuits to '0':

>>> '0' or '4' or '6' or '8' in '88'
'0'
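
To make the grouping explicit, here is a small sketch comparing the expression as written with its parenthesized equivalents (the variable names are illustrative only):

```python
s = "88"

as_written = '0' or '4' or '6' or '8' in s
explicit = '0' or '4' or '6' or ('8' in s)  # how Python groups the expression

print(repr(as_written))  # '0'
print(repr(explicit))    # '0'

# Parenthesizing the or-chain does not help either:
# it collapses to '0' first, so this only asks whether '0' is in '88'.
print(('0' or '4' or '6' or '8') in s)  # False
```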

What you want in your condition is to check whether any of the values is in the string:

>>> '0' in str(i) or '4' in str(i) or '6' in str(i) or '8' in str(i)
True
>>> i = 17
>>> '0' in str(i) or '4' in str(i) or '6' in str(i) or '8' in str(i)
False

This is not the most elegant solution, but it is the best translation of your intentions.

So, what would be an elegant solution? As suggested by @sven, you can create a set of the sought characters:

>>> sought = set("0468")
>>> sought
set(['0', '8', '4', '6'])

and then create a set of the digits in your number:

>>> i = 16
>>> set(str(i))
set(['1', '6'])

Now, just check whether they are disjoint:

>>> i = 16
>>> sought.isdisjoint(set(str(i)))
False
>>> i = 17
>>> sought.isdisjoint(set(str(i)))
True
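
As a side note, set.isdisjoint() accepts any iterable, so the inner set(...) call is optional. A short sketch comparing the two spellings:

```python
sought = set("0468")

for i in (16, 17):
    direct = sought.isdisjoint(str(i))        # pass the string itself
    wrapped = sought.isdisjoint(set(str(i)))  # wrap it in a set first
    print(i, direct, wrapped)
# 16 False False
# 17 True True
```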

In this case, if the sets are disjoint, the number contains none of the sought digits, so you want to keep it:

>>> found = []
>>> for i in range(100):
...     if sought.isdisjoint(set(str(i))):
...         found.append(i)
... 
>>> found
[1, 2, 3, 5, 7, 9, 11, 12, 13, 15, 17, 19, 21, 22, 23, 25, 27, 29, 31, 32, 33, 35, 37, 39, 51, 52, 53, 55, 57, 59, 71, 72, 73, 75, 77, 79, 91, 92, 93, 95, 97, 99]

Most of the time, when you find yourself writing a for loop to filter an iterable, what you really want is a list comprehension:

>>> [i for i in range(100) if sought.isdisjoint(set(str(i)))]
[1, 2, 3, 5, 7, 9, 11, 12, 13, 15, 17, 19, 21, 22, 23, 25, 27, 29, 31, 32, 33, 35, 37, 39, 51, 52, 53, 55, 57, 59, 71, 72, 73, 75, 77, 79, 91, 92, 93, 95, 97, 99]

Or, using a clumsier but more novice-friendly construct:

>>> [i for i in range(100) if not ( '0' in str(i) or '4' in str(i) or '6' in str(i) or '8' in str(i) )]
[1, 2, 3, 5, 7, 9, 11, 12, 13, 15, 17, 19, 21, 22, 23, 25, 27, 29, 31, 32, 33, 35, 37, 39, 51, 52, 53, 55, 57, 59, 71, 72, 73, 75, 77, 79, 91, 92, 93, 95, 97, 99]

The problem is with this line:

if '0' or '4' or '6' or '8' in str(i):

It is incorrect. It would be better written like this:

if any(x in i for x in '0468'):

The second problem (as Sven mentioned) is that you modify the list while iterating over it.

The best solution is:

[i for i in test if not any(x in i for x in '0468')]

itms = ['0','4','6','8']
test = [str(i) for i in range(100) if not any([x in str(i) for x in itms])]

Maybe... that's untested, but I think it will work.
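
One way to check it is to compare the result against the set-based filter. This is only a quick sanity check; the name expected is introduced here for illustration:

```python
itms = ['0', '4', '6', '8']
test = [str(i) for i in range(100) if not any([x in str(i) for x in itms])]

# Reference result computed with the set-based test.
expected = [str(i) for i in range(100) if set("0468").isdisjoint(str(i))]

print(test == expected)  # True
print(test[:6])          # ['1', '2', '3', '5', '7', '9']
```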


 