
String split on digit and space

How can I split a long string at the first space, unless the second word is lowercase?

df                             col
0     Apple The fruit. 20 Banana tree A fruit. 30  Carrot A Vegetable. 40

Expected output:

df
  fruit          definition      page
0 Apple          The fruit.       20
1 Banana tree    A fruit.         30
2 Carrot         A Vegetable.     40

df.col.str.split(r'(\d+)').explode()

0 Apple The fruit.
0  20
0 Banana tree A fruit.
0  30
0 Carrot A Vegetable.
0  40

df.col.str.split(".", expand=True)

You can do it like this:

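import pandas as pd

new_df = pd.DataFrame()

# Split on each page number, drop the trailing empty piece, and give each
# entry its own row. Then capture the fruit (first capital letter up to the
# next capital) and the definition (everything after that).
new_df[["fruit", "definition"]] = (
    df.col.str.split(r"\d+")
    .str[:-1].explode()
    .str.strip()
    .str.extract(r'^([A-Z][^A-Z]*)(.*)')
)

# The page numbers are the digit runs themselves.
new_df["page"] = df.col.str.findall(r'\d+').explode()
new_df = new_df.reset_index(drop=True)
new_df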
          fruit    definition page
0        Apple     The fruit.   20
1  Banana tree       A fruit.   30
2       Carrot   A Vegetable.   40
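
As an alternative, the same three fields can be captured in a single pass with one regular expression. The sketch below uses only the standard `re` module so the pattern can be checked on its own. The name `parse_entries` and the pattern are illustrative assumptions, not part of the answer above. It assumes each entry looks like the sample: a capitalised fruit name (optionally followed by lowercase words), a definition that starts with a capital letter, then a page number.

```python
import re

# Assumed entry shape: "<Fruit name> <Definition starting with a capital> <page>"
ENTRY = re.compile(
    r"(?P<fruit>[A-Z][^A-Z]*?)\s+"   # capitalised word plus any lowercase words
    r"(?P<definition>[A-Z]\D*?)\s+"  # from the next capital up to the digits
    r"(?P<page>\d+)"                 # the page number
)

def parse_entries(text):
    """Return one dict per 'fruit definition page' entry found in text."""
    return [m.groupdict() for m in ENTRY.finditer(text)]

text = "Apple The fruit. 20 Banana tree A fruit. 30  Carrot A Vegetable. 40"
rows = parse_entries(text)
for row in rows:
    print(row)
```

The resulting list of dicts can be passed to `pd.DataFrame(rows)` to get the three columns. The same pattern should also work with `df.col.str.extractall(...)`, which turns named groups into column names. Unlike the answer above, this version leaves no trailing space in the fruit column.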

Documentation

  1. pandas.Series.str.split
  2. pandas.Series.explode
  3. pandas.Series.str.strip
  4. pandas.Series.str.extract
  5. pandas.Series.str.findall
  6. pandas.DataFrame.reset_index
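
Two details of the split step may be easier to see in isolation: why a pattern with a capture group, `(\d+)`, keeps the page numbers in the result, and why the answer drops the last piece with `.str[:-1]`. The sketch below uses Python's `re.split` on the sample string. It assumes that pandas' regex-based `str.split` behaves the same way, which is consistent with the output shown in the question.

```python
import re

text = "Apple The fruit. 20 Banana tree A fruit. 30  Carrot A Vegetable. 40"

# With a capture group, the matched digits are kept as separate pieces.
with_group = re.split(r"(\d+)", text)

# Without a capture group, the digits are discarded.
without_group = re.split(r"\d+", text)

print(with_group)
print(without_group)

# Because the string ends with a page number, both results end with an
# empty string; this is the piece that [:-1] removes.
print(repr(without_group[-1]))
```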

