
How to properly split a specific string in Python

I have a list of strings, each encoding a user's id and age in the form id.age:

users = ["1.20", "2.35", "3", "4", "5.", "6.30", "7."]

How can I properly split it to get the id and age separately?

Note that some entries are missing the age (e.g. "3" and "4"), and, even worse, some contain only an id followed by a dot (e.g. "5." and "7.").

Sure, I can use the split function, for example:

>>> "1.2".split('.')
['1', '2']
>>> "2".split('.')
['2']
>>> "3.".split('.')
['3', '']

But then I would need to check each result, maybe with something like this:

res = "3.".split('.')
id = int(res[0])
age = None  # stays None when the age part is missing or empty
if len(res) > 1 and res[1] != "":
    age = int(res[1])

Another option is to use the rpartition function, for example:

>>> "1.2".rpartition('.')
('1', '.', '2')
>>> "2".rpartition('.')
('', '', '2')
>>> "3.".rpartition('.')
('3', '.', '')

But I still need to check the results manually, and in the second example the value that should be the id ends up in the age position: ('', '', '2').

Is there a built-in function that returns a result like this?

>>> "1.2".some_split_function('.')
('1', '.', '2')
>>> "2".some_split_function('.')
('2', None, None)
>>> "3.".some_split_function('.')
('3', '.', None)

So I can just call it in a loop like this:

for user_info in users:
    id, _, age = user_info.some_split_function('.')
    print(int(id))
    if age is not None:
        print(int(age))

Yup, just use partition instead of rpartition.

for user_info in users:
    id, _, age = user_info.partition('.')
    # age is '' when the dot or the age is missing, so isdigit() is False
    if age.isdigit():
        print(int(age))
partition returns empty strings, not None, for the parts it cannot find, so you'll want to change the conditional from an "is not None" check to a check that you actually pulled out a number. That takes care of empty strings and similar cases.
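To make the behaviour concrete, here is a minimal sketch of what partition returns for the three input shapes in the question, wrapped in a small helper. The name split_user is hypothetical (it is not a built-in), and the sketch assumes the id part is always a valid integer.

```python
# partition always returns a 3-tuple: (head, separator, tail).
# Missing pieces come back as empty strings, never None.
assert "1.2".partition(".") == ("1", ".", "2")
assert "2".partition(".") == ("2", "", "")
assert "3.".partition(".") == ("3", ".", "")


def split_user(user_info):
    """Hypothetical helper: return (id, age) as ints; age is None when absent."""
    user_id, _, age = user_info.partition(".")
    return int(user_id), (int(age) if age.isdigit() else None)


users = ["1.20", "2.35", "3", "4", "5.", "6.30", "7."]
parsed = [split_user(u) for u in users]
print(parsed)
```

Because the id always lands in the first slot, the loop body never has to guess which position holds which value.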

In general though, the way to avoid this problem is to not structure your data like that in the first place.

Seeing some of the other answers, there is no reason to do anything so complex. If you want a functional solution that maps id to age, I would advocate for something like this:

>>> {id: age or None for id, _, age in [user.partition(".") for user in users]}
{'1': '20', '3': None, '2': '35', '5': None, '4': None, '7': None, '6': '30'}
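If integer keys and values are preferred, the same comprehension can convert as it goes. This is only a sketch: it assumes every id is a valid integer and that any non-empty age is too, which holds for the sample list but is not guaranteed for other data. The name ages_by_id is made up for the example.

```python
users = ["1.20", "2.35", "3", "4", "5.", "6.30", "7."]

# An empty age string is falsy, so it maps to None instead of being passed to int().
ages_by_id = {
    int(user_id): int(age) if age else None
    for user_id, _, age in (user.partition(".") for user in users)
}
print(ages_by_id)
```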

Try the following: split u only if it contains a dot; otherwise u is the id and age is set to None.

users = ["1.20", "2.35", "3", "4", "5.", "6.30", "7."]
data = []

for u in users:
    id, age = u.split('.') if '.' in u else [u, None]
    age = None if age == '' else age
    data.append({id: age})

If you want your ids to be integers, call int() on id like this:

data.append({int(id): age})

Output (with the original string ids):

>>> data
[{'1': '20'}, {'2': '35'}, {'3': None}, {'4': None}, {'5': None}, {'6': '30'}, {'7': None}]
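One caveat with the loop above: unpacking u.split('.') into two names assumes at most one dot, so an entry with two dots would raise ValueError. No such entry appears in the sample data, so the following is a sketch under that hypothetical. It limits the split to a single cut and, as a separate design choice, collects everything into one dict instead of a list of one-item dicts.

```python
users = ["1.20", "2.35", "3", "4", "5.", "6.30", "7."]

data = {}
for u in users:
    # maxsplit=1 yields at most two parts, however many dots u contains.
    parts = u.split(".", 1)
    user_id = parts[0]
    # Treat both "no dot" and "dot with nothing after it" as a missing age.
    age = parts[1] if len(parts) == 2 and parts[1] != "" else None
    data[user_id] = age

print(data)
```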

