
Extract last specific word/value from one column and move it to the next row

I have a DataFrame like the following:

|Animals        | Type         | Year |
|Penguin AVES   | Omnivore     | 2015 |
|Caiman REP     | Carnivore    | 2018 |
|Komodo.Rep     | Carnivore    | 2019 |
|Blue Jay.aves  | Omnivore     | 2015 |
|Iguana+rep     | Carnivore    | 2020 |

I want to extract the last specific word (e.g. AVES or REP) from each value in the "Animals" column and put it in a new row directly below, keeping the other values of the row. There are several specific words besides AVES and REP. The data is not very clean: the specific word may be preceded by whitespace, a dot, or a "+" sign. The expected new DataFrame would look like this:

| Animals        | Type         | Year |
| Penguin AVES   | Omnivore     | 2015 |
| AVES           | Omnivore     | 2015 |
| Caiman REP     | Carnivore    | 2018 |
| REP            | Carnivore    | 2018 |
| Komodo.Rep     | Carnivore    | 2019 |
| Rep            | Carnivore    | 2019 |
| Blue Jay.aves  | Omnivore     | 2015 |
| aves           | Omnivore     | 2015 |
| Iguana+rep     | Carnivore    | 2020 |
| rep            | Carnivore    | 2020 |

I was thinking of splitting the string and using negative indexing, but I got confused about the lambda function for this particular case. Any idea how I should approach this problem? Thanks in advance.

You can use str.extract to get the last word (the (\w+)$ regex; you can also use a specific list such as (?i)(aves|rep)$ if needed) and assign it to replace the column. Then concat the updated DataFrame to the original one, and sort_index with a stable method to interleave the rows:

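import pandas as pd

# extract the last word of each value, append those rows, then interleave by index
out = (pd.concat([df, df.assign(Animals=df['Animals'].str.extract(r'(\w+)$', expand=False))])
         .sort_index(kind='stable', ignore_index=True)
      )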

Output:

         Animals       Type  Year
0   Penguin AVES   Omnivore  2015
1           AVES   Omnivore  2015
2     Caiman REP  Carnivore  2018
3            REP  Carnivore  2018
4     Komodo.Rep  Carnivore  2019
5            Rep  Carnivore  2019
6  Blue Jay.aves   Omnivore  2015
7           aves   Omnivore  2015
8     Iguana+rep  Carnivore  2020
9            rep  Carnivore  2020
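
The variant with a specific word list mentioned above can be sketched as follows. This is only an illustration: the `suffixes` list and the `pattern` variable are assumptions made for the example, and the sample frame is rebuilt from the question so the snippet runs on its own.

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "Animals": ["Penguin AVES", "Caiman REP", "Komodo.Rep",
                "Blue Jay.aves", "Iguana+rep"],
    "Type": ["Omnivore", "Carnivore", "Carnivore", "Omnivore", "Carnivore"],
    "Year": [2015, 2018, 2019, 2015, 2020],
})

# Hypothetical list of known words; extend it with the other specific words
suffixes = ["aves", "rep"]
# (?i) makes the match case-insensitive, $ anchors it to the end of the value
pattern = rf"(?i)({'|'.join(suffixes)})$"

# expand=False returns a Series because the pattern has a single group
extracted = df["Animals"].str.extract(pattern, expand=False)

# Append the extracted rows, then interleave them with a stable sort on the index
out = (
    pd.concat([df, df.assign(Animals=extracted)])
      .sort_index(kind="stable", ignore_index=True)
)
print(out)
```

Values that do not end in one of the listed words come back as NaN from str.extract, so with real data you may want to drop or inspect those rows before concatenating.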
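Alternative using stack:

# every column except 'Animals' is carried along as the index
cols = df.columns.difference(['Animals']).tolist()

# add the extracted word as a second column, then stack so each row yields two values
out = (df.assign(Word=df['Animals'].str.extract(r'(\w+)$', expand=False))
         .set_index(cols).stack().reset_index(cols, name='Animals')
         .reset_index(drop=True)[df.columns]
      )
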
Alternative with indexing:

Duplicate all rows, then overwrite the odd rows with the extracted word:

out = df.loc[df.index.repeat(2)].reset_index(drop=True)

out.loc[1::2, 'Animals'] = out.loc[1::2, 'Animals'].str.extract(r'(\w+)$', expand=False)
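
The split-and-negative-index idea from the question can also be made to work without a lambda. The following is a rough sketch, not part of the answer above, and it assumes the only separators before the word are whitespace, "." and "+", as in the sample data. The `last_word` name is made up for the example.

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "Animals": ["Penguin AVES", "Caiman REP", "Komodo.Rep",
                "Blue Jay.aves", "Iguana+rep"],
    "Type": ["Omnivore", "Carnivore", "Carnivore", "Omnivore", "Carnivore"],
    "Year": [2015, 2018, 2019, 2015, 2020],
})

# Assumption: words are separated by whitespace, "." or "+".
# A multi-character pattern is treated as a regex by str.split;
# .str[-1] then picks the last piece of each split list.
last_word = df["Animals"].str.split(r"[\s.+]+").str[-1]

# Duplicate every row, then overwrite the odd rows with the last word
out = df.loc[df.index.repeat(2)].reset_index(drop=True)
out.loc[1::2, "Animals"] = last_word.to_numpy()
print(out)
```

Unlike the (\w+)$ regex, this depends on listing every separator that can occur, so it is more fragile on messy data.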


