String split on digit and space
How can I split a long string at the first space, unless the second word is lowercase?
df col
0 Apple The fruit. 20 Banana tree A fruit. 30 Carrot A Vegetable. 40
Expected output:
df
fruit definition page
0 Apple The fruit. 20
1 Banana tree A fruit. 30
2 Carrot A Vegetable. 40
df.col.str.split('(\d+)').explode()
0 Apple The fruit.
0 20
0 Banana tree A fruit.
0 30
0 Carrot A Vegetable.
0 40
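The numbers stay in the result above because the pattern contains a capturing group, and Python's `re.split` keeps captured delimiters. A minimal sketch with plain `re`, pairing each text chunk with the number that follows it (the names `text`, `chunks`, and `pairs` are only illustrative):

```python
import re

text = "Apple The fruit. 20 Banana tree A fruit. 30 Carrot A Vegetable. 40"

# The capturing group makes re.split return the digits as separate items.
parts = re.split(r"(\d+)", text)

# Strip whitespace and drop the empty piece left after the final number.
chunks = [p.strip() for p in parts if p.strip()]

# Even positions hold the text, odd positions hold the page number.
pairs = list(zip(chunks[0::2], chunks[1::2]))
print(pairs)
```

This still leaves the fruit name and the definition joined in one string, which is the part the split on the first space is meant to solve.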
df.col.str.split(".", expand=True)
You can do it like this:
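import pandas as pd
new_df = pd.DataFrame()
# Split on the page numbers, drop the trailing empty piece, then separate the
# fruit name (first capital up to the next capital) from the definition.
new_df[["fruit", "definition"]] = df.col.str.split(r"\d+")\
    .str[:-1].explode()\
    .str.strip()\
    .str.extract(r'^([A-Z][^A-Z]*)(.*)')
# Collect the page numbers in the same order.
new_df["page"] = df.col.str.findall(r'\d+').explode()
new_df = new_df.reset_index(drop=True)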
new_df
fruit definition page
0 Apple The fruit. 20
1 Banana tree A fruit. 30
2 Carrot A Vegetable. 40
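An alternative sketch, not part of the original answer: a single `str.extractall` call with named groups. It assumes every record has the shape "name, then a definition starting with a capital letter, then a number", as in the sample data. The pattern and the name `result` are illustrative only.

```python
import pandas as pd

df = pd.DataFrame(
    {"col": ["Apple The fruit. 20 Banana tree A fruit. 30 Carrot A Vegetable. 40"]}
)

# fruit:      a capital letter followed by non-capitals (lazy)
# definition: starts at the next capital letter and runs up to the number
# page:       the run of digits that closes the record
pattern = (
    r"(?P<fruit>[A-Z][^A-Z]*?)\s+"
    r"(?P<definition>[A-Z][^\d]*?)\s+"
    r"(?P<page>\d+)"
)

# extractall returns one row per match; drop the (row, match) MultiIndex.
result = df["col"].str.extractall(pattern).reset_index(drop=True)
print(result)
```

The `page` column holds strings here; `result["page"].astype(int)` converts it if numeric pages are needed. Unlike the `extract` approach, the `\s+` between groups keeps trailing spaces out of the `fruit` column.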