
Python: extracting regex groups into a DataFrame

I am having trouble with my regex pattern. What am I missing here?

pattern = r'^(?P<filename>Cycle Narrative(?P<nlookup>[A-Z0-9-]+).docx?$)' 
dfCycleNarratives = dftemp[dftemp.columns[0]].str.extract(pattern, expand=False, flags=re.IGNORECASE)

dftemp looks like this:

                                      0
0  Cycle Narrative - Louis Stevens.docx
1  Cycle Narrative - Steve Stevens.docx

I am trying to get dfCycleNarratives to look like this:

          filename        nlookup
0  Cycle Narrative  Louis Stevens
1  Cycle Narrative  Steve Stevens

My dfCycleNarratives currently looks like this:

  filename nlookup
0      NaN     NaN
1      NaN     NaN

Any help would be appreciated. Thanks.

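Your character class [A-Z0-9-] does not match spaces, so the " - Louis Stevens" part can never match and every row comes back as NaN. Try str.extractall with a pattern that accounts for the " - " separator: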

pattern = r'(?P<filename>.*)\s+-\s+(?P<nlookup>.*)\.docx'
dfCycleNarratives = dftemp[dftemp.columns[0]].str.extractall(pattern).reset_index(drop=True)
print(dfCycleNarratives)

# Output
          filename        nlookup
0  Cycle Narrative  Louis Stevens
1  Cycle Narrative  Steve Stevens
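
Since each row contains at most one match, plain str.extract should work as well and avoids the reset_index step. The following is only a sketch under a few assumptions: the sample dftemp is rebuilt by hand from the question, and the anchors, the lazy `.+?` and the `docx?` suffix (to also accept .doc) are my additions, not part of the answer above. It also checks that the original pattern really does fail on the sample strings.

```python
import re
import pandas as pd

# Sample data rebuilt from the question: a single column labelled 0.
dftemp = pd.DataFrame({0: ["Cycle Narrative - Louis Stevens.docx",
                           "Cycle Narrative - Steve Stevens.docx"]})

# The pattern from the question: [A-Z0-9-]+ cannot consume the space
# that follows "Cycle Narrative", so nothing matches.
original = r'^(?P<filename>Cycle Narrative(?P<nlookup>[A-Z0-9-]+).docx?$)'
original_matches = [re.match(original, s, flags=re.IGNORECASE)
                    for s in dftemp[0]]

# Anchored alternative (an assumption, not the answer's exact pattern):
# lazy filename up to " - ", then the name, then a literal .doc or .docx.
pattern = r'^(?P<filename>.+?)\s+-\s+(?P<nlookup>.+)\.docx?$'

# With named groups, str.extract returns a DataFrame whose columns
# are the group names, one row per input row.
dfCycleNarratives = dftemp[dftemp.columns[0]].str.extract(
    pattern, flags=re.IGNORECASE)
print(dfCycleNarratives)
```

Rows that do not match the pattern stay in the result as NaN with str.extract, whereas str.extractall drops them, so pick whichever behaviour suits the rest of your pipeline.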
