Find substring in pandas dataframe and save in new column

I have a dataframe with approx. 10,000 rows and 10 columns. I also have a string, 'atmosphere', that I want to search for in the dataframe. This string occurs at most once in any given row. I want to keep only the cells that contain this string, with their whole content, and save them in a new column. I already found the following solution, but it only returns "True" (when the cell contains the string) or "False" (when it does not):

df.apply(lambda col: col.str.contains('atmosphere', case=False), axis=1)
Output:
  col_1  col_2  col_3  col_4 ...
1 True   False  False  False
2 False  True   False  False
3 True   False  False  False 
...

How can I get from that output to this?

   new_col
1 today**atmosphere**is
2 **atmosphere**humid
3 the**atmosphere**now

If you already have that boolean result, you can use it to mask the dataframe and then simply stack it:

import pandas as pd

df = pd.DataFrame({"a":["apple", "orange", "today atmosphere"],
                   "b":["pineapple", "atmosphere humid", "kiwi"],
                   "c":["the atmosphere now", "watermelon", "grapes"]})

                  a                 b                   c
0             apple         pineapple  the atmosphere now
1            orange  atmosphere humid          watermelon
2  today atmosphere              kiwi              grapes


print(df[df.apply(lambda col: col.str.contains('atmosphere', case=False), axis=1)].stack())

0  c    the atmosphere now
1  b      atmosphere humid
2  a      today atmosphere
dtype: object
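
The stacked result above is a Series with a (row, column) MultiIndex, not yet a column of the dataframe. A minimal sketch of how it could be turned into the `new_col` the question asks for, assuming each row contains the string at most once; the names `mask` and `matches` are only illustrative, and the explicit `dropna()` is there because, as far as I know, newer pandas versions may no longer drop missing values in `stack()` by default:

```python
import pandas as pd

df = pd.DataFrame({"a": ["apple", "orange", "today atmosphere"],
                   "b": ["pineapple", "atmosphere humid", "kiwi"],
                   "c": ["the atmosphere now", "watermelon", "grapes"]})

# Boolean mask: True where a cell contains the substring (case-insensitive).
# na=False treats missing cells as "no match" instead of propagating NaN.
mask = df.apply(lambda col: col.str.contains("atmosphere", case=False, na=False))

# Keep only the matching cells, stack them into one Series, drop the
# non-matching (NaN) entries, then drop the column level so that the
# index is just the original row label.
matches = df.where(mask).stack().dropna().droplevel(1)

# Assignment aligns on the row index; rows without a match would get NaN.
df["new_col"] = matches

print(df["new_col"])
```

If a row could contain the string in more than one cell, `matches` would have duplicate index labels and the assignment would fail, so this only fits the "at most once per row" case described in the question.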
