Could you please help me with the code below? I will try to be as straightforward and simple as I can.
df_split = df_excelsb_melt[df_excelsb_melt['SB'].str.len() > 7]
df_split['SB'].str.len().unique()
Output was:
array([14, 21, 28], dtype=int64)
What I've tried to do:
explode(df_split.assign(SB=df_split.SB.str.split(range(0,df_split.SB.str.len(),7)),'SB')
Output was: SyntaxError: unexpected EOF while parsing
That said, the code should have split the SB column into chunks of 7 characters.
Thanks in advance.
EDIT
A simple solution using regex:
import re
import pandas as pd
data = [{'MOD': 42334,
         'SB': '38-101138-3015',
         'AC': 'AAA',
         'COMPLIANCE': 'NOT INCORPORATED'},
        {'MOD': 43765,
         'SB': '49-300949-3012',
         'AC': 'AAA',
         'COMPLIANCE': 'NOT INCORPORATED'}]
df = pd.DataFrame(data)
df['SB'] = df['SB'].apply(lambda x : re.findall('.{1,7}', x))
df = df.explode('SB')
Output
| MOD | SB | AC | COMPLIANCE |
|------:|:--------|:-----|:-----------------|
| 42334 | 38-1011 | AAA | NOT INCORPORATED |
| 42334 | 38-3015 | AAA | NOT INCORPORATED |
| 43765 | 49-3009 | AAA | NOT INCORPORATED |
| 43765 | 49-3012 | AAA | NOT INCORPORATED |
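The same result can probably be reached without `apply` by using pandas' own string accessor. The sketch below assumes the sample data shown above and a pandas version that supports `ignore_index` in `explode` (1.1 or later, as far as I know); it is an alternative, not the approach used in the answer.

```python
import pandas as pd

# Same sample data as above, rebuilt so the snippet runs on its own
df = pd.DataFrame({
    "MOD": [42334, 43765],
    "SB": ["38-101138-3015", "49-300949-3012"],
    "AC": ["AAA", "AAA"],
    "COMPLIANCE": ["NOT INCORPORATED", "NOT INCORPORATED"],
})

# Series.str.findall runs the regex on every value and returns a list per row
df["SB"] = df["SB"].str.findall(r".{1,7}")

# explode gives each list element its own row; ignore_index renumbers the rows
result = df.explode("SB", ignore_index=True)
print(result)
```

Without `ignore_index=True` the exploded rows keep the index of the row they came from, so the index would contain duplicates.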
Original solution
With a combination of df.to_dict('records') and regex:
output = []
# Loop through the records
for record in df.to_dict('records'):
    # Find the SB codes with some regex logic
    for x in re.findall('.{1,7}', record['SB']):
        temp = record.copy()
        temp['SB'] = x
        # Append to the output list
        output.append(temp)
new_df = pd.DataFrame(output)
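The `range(0, len, 7)` idea from the question can also be made to work with plain string slicing instead of regex. `str.split` expects a separator, not a list of positions, so the slicing has to be done per value. This is a minimal sketch; `split_fixed_width` is a hypothetical helper name, not a pandas function.

```python
import pandas as pd

def split_fixed_width(text, width=7):
    """Return consecutive chunks of `text`, each at most `width` characters long."""
    return [text[i:i + width] for i in range(0, len(text), width)]

df = pd.DataFrame({
    "MOD": [42334, 43765],
    "SB": ["38-101138-3015", "49-300949-3012"],
})

# Replace each SB string with its list of 7-character chunks
df["SB"] = df["SB"].apply(split_fixed_width)

# One row per chunk; reset_index drops the duplicated index labels
new_df = df.explode("SB").reset_index(drop=True)
print(new_df)
```

If a string's length is not a multiple of 7, the last chunk is simply shorter, which matches what the `.{1,7}` pattern does.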