
Check if bullet point is in list

I am trying to check whether a bullet point is part of an item in a list by iterating through the list with a for loop. I know that, at least in regex, a bullet point is written as \u2022, but I don't know how to use this. What I currently have, which obviously doesn't work, is something like this:

list = ['changing. • 5.0 oz.', 'hello', 'dfd', 'df', 'changing. • 5.0 oz.']
for items in list:
    if "\u2022" in items:
        print('yay')

Thanks in advance!

You could use the re (regex) library for this. Something like this:

# import the regex library
import re

# Compile the pattern. The raw string (r"") passes the backslash through to
# the regex engine, which interprets \u2022 itself (Python 3.3 and later).
bullet_point = re.compile(r"\u2022")
list = ['changing. • 5.0 oz.', 'hello', 'dfd', 'df', 'changing. • 5.0 oz.']

# search each item in the list
for item in list:
    # look for the bullet point anywhere in the item
    result = bullet_point.search(item)
    if result:
        print('yay')

In Python 3 your code will work fine because string literals are Unicode and UTF-8 is the default source code encoding. If you're going to be working with Unicode a lot, consider switching to Python 3.
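As a rough sketch of the Python 3 case, the plain substring test from the question is enough on its own. The names BULLET, items and has_bullet below are only illustrative and are not from the original post.

```python
# Python 3: str literals are Unicode, so "\u2022" is the single character •.
BULLET = "\u2022"

# Same strings as in the question (the variable name is illustrative).
items = ['changing. • 5.0 oz.', 'hello', 'dfd', 'df', 'changing. • 5.0 oz.']

def has_bullet(text):
    """Return True if text contains U+2022 BULLET."""
    return BULLET in text

# Collect the positions of the items that contain a bullet.
matches = [i for i, text in enumerate(items) if has_bullet(text)]
print(matches)
```

No regex is needed here, since the check is for one fixed character rather than a pattern.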

In Python 2, the default is to treat literal strings as sequences of bytes, so you have to explicitly declare which strings are Unicode by prefixing them with u.

First, set your source code encoding as UTF-8.

# -*- coding: utf-8 -*-

Then tell Python to treat those strings as Unicode. Otherwise they'll be treated as individual bytes, which leads to odd things like Python thinking the first string has a length of 21 instead of 19.

print len(u'changing. • 5.0 oz.')    # 19 characters
print len('changing. • 5.0 oz.')     # 21 bytes

This is because the Unicode code point U+2022 BULLET is UTF-8 encoded as three bytes: e2 80 a2. The first line treats it as a single character, the second as three bytes.
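The byte count can be checked directly. This is a small sketch written for Python 3 (not the Python 2 used in the snippet above), and the variable names are just for illustration.

```python
# Python 3 sketch: inspect how U+2022 BULLET is encoded in UTF-8.
bullet = "\u2022"
encoded = bullet.encode("utf-8")

print(len(bullet))      # number of characters
print(len(encoded))     # number of bytes
print(encoded.hex())    # the bytes as hexadecimal digits

# The full string from the question: compare character and byte counts.
text = "changing. \u2022 5.0 oz."
print(len(text), len(text.encode("utf-8")))
```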

Finally, write the character you're searching for as a Unicode string. That's either u'\u2022' or u'•'.

#!/usr/bin/env python
# -*- coding: utf-8 -*-

list = [u'changing. • 5.0 oz.', u'hello', u'dfd', u'df', u'changing. • 5.0 oz.']
for item in list:
    if u'•' in item:
        print('yay')

Real code probably won't be using constant strings, so you have to make sure that whatever is in list is a Unicode string, decoding any byte strings (for example from UTF-8) as they come in.
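One way to handle non-constant input is to decode it at the boundary before testing it. The sketch below is written for Python 3 and assumes the incoming bytes are UTF-8. raw_items and to_text are hypothetical names, standing in for whatever your program reads from a file or socket.

```python
# Python 3 sketch: raw_items stands in for bytes read from a file or socket.
# Assumption: the bytes are UTF-8 encoded.
raw_items = ["changing. \u2022 5.0 oz.".encode("utf-8"), b"hello", b"dfd"]

def to_text(value, encoding="utf-8"):
    """Decode bytes to str; pass str through unchanged."""
    if isinstance(value, bytes):
        return value.decode(encoding)
    return value

# Decode first, then run the same substring test on real text.
flags = ["\u2022" in to_text(value) for value in raw_items]
print(flags)
```

If the data arrives in a different encoding, pass that encoding to to_text instead of relying on the UTF-8 default.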
