Code -
df['Expiry'], df['Symbol'] = None, None
index_Ticker = df.columns.get_loc('Ticker')
index_Expiry = df.columns.get_loc('Expiry')
index_Symbol = df.columns.get_loc('Symbol')
Expiry_Pattern = r'-([A-Z]{1,3})'
Symbol_Pattern = r'(.*?)-[A-Z]{1,3}'
for row in range(0, len(df)):
    Expiry = re.search(Expiry_Pattern, df.iat[row, index_Ticker]).group()
    df.iat[row, index_Expiry] = Expiry
    Symbol = re.search(Symbol_Pattern, df.iat[row, index_Ticker]).group()
    df.iat[row, index_Symbol] = Symbol
Here I'm using these regexes:
Expiry_Pattern = r'-([A-Z]{1,3})'
Symbol_Pattern = r'(.*?)-[A-Z]{1,3}'
And my output is: (Output Image)
My actual data is in this format:
ZEEL-III.NFO
RELIANCE-III.NFO
ADANIPORTS-I.NFO
ZEEL-II.
AARTIIND-III.NFO
But I want this output:
ZEEL III
RELIANCE III
ADANIPORTS I
ZEEL II
AARTIIND III
I don't understand how I can solve this issue.
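Judging from the code alone (the output image is not available in this text), one likely cause is that Match.group() with no argument returns the whole match, hyphen included, while group(1) returns only what the parentheses captured. A minimal sketch with the standard re module, using one ticker from the sample data; the lowercase variable names are made up for illustration:

```python
import re

# Patterns copied from the question.
expiry_pattern = r'-([A-Z]{1,3})'
symbol_pattern = r'(.*?)-[A-Z]{1,3}'

ticker = 'ZEEL-III.NFO'  # one value from the sample data

expiry_match = re.search(expiry_pattern, ticker)
symbol_match = re.search(symbol_pattern, ticker)

# group() is the same as group(0): the entire matched text.
whole_expiry = expiry_match.group()   # '-III'
whole_symbol = symbol_match.group()   # 'ZEEL-III'

# group(1) is only what the first pair of parentheses captured.
expiry = expiry_match.group(1)        # 'III'
symbol = symbol_match.group(1)        # 'ZEEL'

print(whole_expiry, whole_symbol, expiry, symbol)
```

If that is what happens in the loop, switching both calls to .group(1) should be enough to keep the patterns as they are.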
You can use the regex '-?(\\w+)(?=-|\\.)' to get the expected output for the sample data you have:
>>> df['col'].str.findall(r'-?(\w+)(?=-|\.)').apply(pd.Series)
            0    1
0        ZEEL  III
1    RELIANCE  III
2  ADANIPORTS    I
3        ZEEL   II
4    AARTIIND  III
Pattern Explanation:
'-?(\\w+)(?=-|\\.)'
-? will match zero or one occurrence of the hyphen - at the beginning
(\\w+) captures the word/substring
(?=-|\\.) is a positive lookahead to make sure it ends with - or .
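To see the pattern at work outside pandas, here is a small sketch that runs it with re.findall over the sample tickers from the question. The list name tickers and the variable parts are illustrative only:

```python
import re

# Same pattern as above, written as a raw string so the backslashes
# do not need doubling.
pattern = r'-?(\w+)(?=-|\.)'

# Sample values copied from the question.
tickers = [
    'ZEEL-III.NFO',
    'RELIANCE-III.NFO',
    'ADANIPORTS-I.NFO',
    'ZEEL-II.',
    'AARTIIND-III.NFO',
]

# With a single capture group, findall returns only the captured text,
# so the optional leading hyphen is dropped from each piece.
parts = [re.findall(pattern, t) for t in tickers]

for t, p in zip(tickers, parts):
    print(t, '->', p)
```

The trailing NFO is never returned because nothing after it satisfies the lookahead. Note that \w only covers letters, digits and underscore, so a symbol containing any other character (a hypothetical M&M-I.NFO, say) would not come back whole.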
The non-regex solution:
Right-split the string on . with maxsplit n=1, take the value at the first index, and split it on -:
df['col'].str.rsplit('.', n=1).str[:-1].str[0].str.split('-').apply(pd.Series)
            0    1
0        ZEEL  III
1    RELIANCE  III
2  ADANIPORTS    I
3        ZEEL   II
4    AARTIIND  III
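The same two steps can be sketched on a single string in plain Python. The helper name split_ticker is hypothetical, and str.rpartition is swapped in for split('-') here so that only the last hyphen is used as the separator:

```python
def split_ticker(ticker):
    """Return (symbol, expiry) for a ticker shaped like 'ZEEL-III.NFO'."""
    # Drop everything after the last dot: 'ZEEL-III.NFO' -> 'ZEEL-III'.
    base = ticker.rsplit('.', 1)[0]
    # Split on the last hyphen only; rpartition returns
    # (text before, separator, text after).
    symbol, _, expiry = base.rpartition('-')
    return symbol, expiry


# Sample values copied from the question.
tickers = ['ZEEL-III.NFO', 'RELIANCE-III.NFO', 'ADANIPORTS-I.NFO',
           'ZEEL-II.', 'AARTIIND-III.NFO']

pairs = [split_ticker(t) for t in tickers]
for pair in pairs:
    print(*pair)
```

If a ticker has no hyphen at all, rpartition puts the whole text in the second value and leaves the symbol empty, so such rows would need a separate check.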
I extract the values:
df["Symbol"] = df["Ticker"].str.extract(r'(.*?)-', expand=False)
df["Expiry"] = df["Ticker"].str.extract(r'-([A-Z]{1,3})', expand=False)
and create two columns.
Now my output is the same as I wanted. (Output Image)
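As a possible variation, both columns can come from a single str.extract call by using named groups, which pandas turns into column names. This is only a sketch: the combined pattern is an assumption built from the two patterns above, and the frame is rebuilt from the sample data in the question.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({'Ticker': ['ZEEL-III.NFO', 'RELIANCE-III.NFO',
                              'ADANIPORTS-I.NFO', 'ZEEL-II.',
                              'AARTIIND-III.NFO']})

# Named groups become the column names of the frame str.extract returns.
pattern = r'^(?P<Symbol>.*?)-(?P<Expiry>[A-Z]{1,3})'
extracted = df['Ticker'].str.extract(pattern)

# Attach the two new columns next to the original Ticker column.
df = df.join(extracted)

print(df)
```

Rows that do not match the pattern come back as NaN in both columns, which makes malformed tickers easy to spot.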