I know questions like this one have been asked many times, but I haven't found one that answers mine (maybe I overlooked something, but I tried my best ;) ). Here's the problem: I have a pandas Series like this:
ingredssplit
0 MAGERMILCH 65%
1 Wasser
2 Keks gemahlen 6% (WEIZENMEHL
3 Traubensaftkonzentrat
4 Palmöl)
5 Stärke
6 Maiskeimöl
7 Zucker
8 Antioxidationsmittel Ascorbinsäure¹
9 Thiamin (Vitamin B1).
dtype: object
Now I want to remove everything before the bracket in row 2. But this part changes every time: sometimes it's "Keks gemahlen 6%", sometimes it's something completely different. The only thing that is constant in row 2 before the "(" is the "%", so another possible value would be "abc de% (". How can I remove that part? My research led me to regular expressions and then to this snippet:
for line in ingredssplit:
    print(re.sub())
But now I don't know how to fill in the arguments of re.sub() so that everything before "(WEIZENMEHL" is removed. Maybe there's another way? Also, how do I remove the superscript 1 after "Ascorbinsäure"? Thanks guys, have a nice weekend!
Try str.extract:
df.loc[[2], 'ingredssplit'] = (
    df.loc[[2], 'ingredssplit'].str.extract(r'.*\((.*)')[0]
)
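For anyone who wants to try this out, here is a minimal, self-contained sketch. It assumes the data sits in a DataFrame column named `ingredssplit` (as the snippet above implies) and rebuilds the sample values from the question; `pattern` and `extracted_all` are names I made up for illustration. It also shows why the assignment is restricted to row 2: on rows without a "(" the pattern does not match and `str.extract` returns NaN.

```python
import pandas as pd

# Sample data copied from the question; the DataFrame wrapper is an assumption.
df = pd.DataFrame({"ingredssplit": [
    "MAGERMILCH 65%",
    "Wasser",
    "Keks gemahlen 6% (WEIZENMEHL",
    "Traubensaftkonzentrat",
    "Palmöl)",
    "Stärke",
    "Maiskeimöl",
    "Zucker",
    "Antioxidationsmittel Ascorbinsäure¹",
    "Thiamin (Vitamin B1).",
]})

# Greedy ".*" runs up to the last "(", the group captures the rest.
pattern = r".*\((.*)"

# Applied to the whole column, rows without "(" come back as NaN.
extracted_all = df["ingredssplit"].str.extract(pattern)[0]

# Restricting the assignment to row 2 leaves the other rows untouched.
df.loc[[2], "ingredssplit"] = (
    df.loc[[2], "ingredssplit"].str.extract(pattern)[0]
)

print(df.loc[2, "ingredssplit"])
```

If every row should be processed, `str.replace(r".*\(", "", regex=True)` may be the safer choice, since it leaves non-matching rows unchanged instead of turning them into NaN.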
Okay, I found a solution. Thanks jcaliz, the '.*\(' part was golden. This is what I did:
import re

# Drop everything up to and including the last "(" in each entry.
item1 = []
for line in ingredssplit:
    line = re.sub(r'.*\(', '', line)
    item1.append(line)

def remove_punc(string):
    # "%" is not in punc, so percentages such as "65%" are kept.
    punc = '''!()-[]{};:'"\\,<>./?@#$^&*_~'''
    for ele in string:
        if ele in punc:
            string = string.replace(ele, "")
    return string

lis = [remove_punc(i) for i in item1]
lis = list(filter(None, lis))  # drop entries that ended up empty
lis = [i.strip() for i in lis]
lis
This gives me a list:
['MAGERMILCH 65%',
'Wasser',
'WEIZENMEHL',
'Traubensaftkonzentrat',
'Palmöl',
'Stärke',
'Maiskeimöl',
'Zucker',
'Antioxidationsmittel Ascorbinsäure¹',
'Vitamin B1']
which I can easily turn into a DataFrame, e.g.:
lis = pd.DataFrame(lis)
lis
0
0 MAGERMILCH 65%
1 Wasser
2 WEIZENMEHL
3 Traubensaftkonzentrat
4 Palmöl
5 Stärke
6 Maiskeimöl
7 Zucker
8 Antioxidationsmittel Ascorbinsäure¹
9 Vitamin B1
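The same cleanup can probably be done without the explicit loop by chaining pandas string methods. The following is only a sketch under the assumption that `ingredssplit` is the Series from the question; `cleaned` is a name chosen here for illustration, and `punc` is the character set from `remove_punc` above.

```python
import pandas as pd

# Series rebuilt from the sample in the question.
ingredssplit = pd.Series([
    "MAGERMILCH 65%",
    "Wasser",
    "Keks gemahlen 6% (WEIZENMEHL",
    "Traubensaftkonzentrat",
    "Palmöl)",
    "Stärke",
    "Maiskeimöl",
    "Zucker",
    "Antioxidationsmittel Ascorbinsäure¹",
    "Thiamin (Vitamin B1).",
])

# Same characters as in remove_punc; "%" is deliberately not included.
punc = '''!()-[]{};:'"\\,<>./?@#$^&*_~'''

cleaned = (
    ingredssplit
    .str.replace(r".*\(", "", regex=True)      # drop everything up to the last "("
    .str.translate(str.maketrans("", "", punc))  # delete the punctuation characters
    .str.strip()                                 # trim leading/trailing whitespace
)

# Drop entries that ended up empty, mirroring filter(None, ...).
cleaned = cleaned[cleaned != ""].reset_index(drop=True)
print(cleaned.tolist())
```

The result stays a Series, so the conversion from a list back to pandas is not needed; `cleaned.to_frame()` gives a DataFrame if one is required.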
Thanks, people! :)
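One thing the result above does not cover yet is the second part of the question: the superscript 1 after "Ascorbinsäure" is still there. A possible way to remove it, sketched here under the assumption that only superscript digits need to go, is a character class listing the Unicode superscript digits. The DataFrame is rebuilt from the output above, and `superscripts` is a name chosen for illustration.

```python
import pandas as pd

# Rebuilt from the DataFrame printed above (single column named 0).
lis = pd.DataFrame([
    "MAGERMILCH 65%",
    "Wasser",
    "WEIZENMEHL",
    "Traubensaftkonzentrat",
    "Palmöl",
    "Stärke",
    "Maiskeimöl",
    "Zucker",
    "Antioxidationsmittel Ascorbinsäure¹",
    "Vitamin B1",
])

# Superscript digits 0-9: U+2070, U+00B9, U+00B2, U+00B3, U+2074..U+2079.
superscripts = "[\u2070\u00b9\u00b2\u00b3\u2074-\u2079]"

# Regular digits such as the "1" in "Vitamin B1" are not affected.
lis[0] = lis[0].str.replace(superscripts, "", regex=True)
print(lis.loc[8, 0])
```

Other superscript characters (letters, signs) would need to be added to the class if they occur in the data.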