
Splitting the lines in a list of lines using a delimiter

I have a list of chemical reactions, and I want to split each reaction on its delimiters so that I end up with the species involved in the reaction. Is there a way to do this? For example:

H2 + O2 = 2H2O
Na2 + Cl2 = NaCl
Ag + Cl2 =  AgCl

I want to split the list of reactions above so that I end up with the following list: [['H2', 'O2', '2H2O'], ['Na2', 'Cl2', 'NaCl'], ['Ag', 'Cl2', 'AgCl']]

You could do this with re.split(), splitting the string on one or more non-word characters:

>>> import re
>>> re.split(r'\W+', 'H2 + O2 = 2H2O')
['H2', 'O2', '2H2O']

Alternatively, you could use re.findall() to find all 'words':

>>> re.findall(r'\w+', 'H2 + O2 = 2H2O')
['H2', 'O2', '2H2O']

And if you want to strip leading numbers from the words, you can use a pattern like this:

>>> re.findall(r'\b\d*(\w+)', 'H2 + O2 = 2H2O')
['H2', 'O2', 'H2O']
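
The examples above work on a single string, while the question asks for a list of lists. A minimal sketch of applying the same re.findall() pattern to every reaction follows; the variable names reactions and species are illustrative, not from the original answer.

```python
import re

# The three reactions from the question, one string per line
reactions = [
    "H2 + O2 = 2H2O",
    "Na2 + Cl2 = NaCl",
    "Ag + Cl2 =  AgCl",
]

# Collect the runs of word characters from each line; the extra space
# before 'AgCl' does not matter because findall skips all non-word runs
species = [re.findall(r"\w+", line) for line in reactions]
print(species)
# [['H2', 'O2', '2H2O'], ['Na2', 'Cl2', 'NaCl'], ['Ag', 'Cl2', 'AgCl']]
```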

import re
s = "H2 + O2 = 2H2O"
print(re.split(r"\W+", s))

# re.split takes a regular expression on which you can split the string.
# \W represents non-word character. For ASCII, word characters are [a-zA-Z0-9_]
# + represents one or more occurrences.

In your example, it splits the string at ' + ' and ' = '.
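
One caveat, offered as a sketch and not as part of the answer above: because \W matches any non-word character, parentheses inside a formula also act as separators. The reaction below is a hypothetical input that does not appear in the question. If such formulas can occur, matching runs of characters that are not whitespace, '+', or '=' may be a safer choice.

```python
import re

# Hypothetical reaction containing parentheses (not from the question)
reaction = "Ca(OH)2 + 2HCl = CaCl2 + 2H2O"

# \W+ also splits on '(' and ')', so the formula Ca(OH)2 is broken apart
naive = re.split(r"\W+", reaction)
print(naive)
# ['Ca', 'OH', '2', '2HCl', 'CaCl2', '2H2O']

# Inside a character class '+' is a literal, so this pattern matches
# every run of characters that is not whitespace, '+', or '='
kept = re.findall(r"[^\s+=]+", reaction)
print(kept)
# ['Ca(OH)2', '2HCl', 'CaCl2', '2H2O']
```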

str.split() accepts only a single separator string, so it can't split on both '+' and '=' in one call. You can split your string in these ways:

The first is to use re:

import re

# '+' is a regex quantifier, so it must be escaped to match a literal plus sign
re.split(r"\+|=", "H2 + O2 = 2H2O")

The second is to split manually:

mendeleev = []
cur = ""
for char in "H2 + O2 = 2H2O":
    if char in "+=":
        mendeleev.append(cur)
        cur = ""
    else:
        cur += char
# append the last piece, which is not followed by a delimiter
mendeleev.append(cur)

Remember to strip() your list elements (or call str.replace(" ", "") on the string first), because the pieces still contain the spaces that surrounded each delimiter.
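
Both clean-up options can be sketched as follows. The character class [+=] is used here as an equivalent way to write "plus or equals", and the names parts and parts_no_spaces are illustrative.

```python
import re

reaction = "H2 + O2 = 2H2O"

# Option 1: split on '+' or '=' first, then strip the spaces from each piece
parts = [piece.strip() for piece in re.split(r"[+=]", reaction)]
print(parts)
# ['H2', 'O2', '2H2O']

# Option 2: remove every space first, then split
parts_no_spaces = re.split(r"[+=]", reaction.replace(" ", ""))
print(parts_no_spaces)
# ['H2', 'O2', '2H2O']
```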
