How to fill “column B” based on value in “column A” when the column has object dtype in Python pandas?

I have a CSV file which I imported as a pandas dataframe. I want to create and fill a column based on some specific terms in another column. The column that holds those values has object dtype. It has values like:

ABC|MNO - 2017 - Trial|1|Random|xyz|RUN|Google|1x1|A10001-21|SD|GH|PRIME - 2017 - Big - This is For Example 

The code I was using is:

def new(row):
  if row.str.contains("PRIME"):
      return 'A'
  if row.str.contains("Random"):
      return 'B'
  if row.str.contains("Google"):
      return 'C'

df['X'] = df['Y'].apply (lambda row: new (row))

This code gives me the following error:

AttributeError: 'str' object has no attribute 'str'

I think it is because Column X has Object dtype.

I tried converting it to a string using the code:

df['Y'] = df['Y'].astype('str')

but it doesn't work. Then I tried splitting it using the following code:

df['Y_new'] = df['Y'].str.split(r'([A-Z][^\.!?]*[\.!?])')

But it converted all the values to NaN. How should I do this?

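Try doing this. When you call apply on a single column, each row passed to the function is a plain Python string, not a Series, so it has no .str accessor (and Python strings have no .contains method either). Use the in operator instead:

    def new(row):
      # row is one string value taken from df['Y']
      if "PRIME" in row:
         return 'A'
      if "Random" in row:
         return 'B'
      if "Google" in row:
         return 'C'

    df['X'] = df['Y'].apply(new)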
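
If the dataframe is large, a vectorized version may be worth trying. The following is only a sketch, not part of the original answer: it assumes numpy is available, uses a small made-up dataframe in place of the CSV, and fills rows that match none of the terms with an empty string (the function above would return None there, so adjust the default to taste).

```python
import numpy as np
import pandas as pd

# Hypothetical sample data modelled on the value shown in the question.
df = pd.DataFrame({
    "Y": [
        "ABC|MNO - 2017 - Trial|1|Random|xyz|RUN|Google|1x1|A10001-21|SD|GH|PRIME - 2017 - Big",
        "ABC|Random|xyz",
        "ABC|Google|1x1",
        "ABC|none of the terms",
    ]
})

# Here .str.contains is valid because df["Y"] is a Series, not a single string.
# regex=False does a plain substring match; na=False treats missing values as no match.
conditions = [
    df["Y"].str.contains("PRIME", regex=False, na=False),
    df["Y"].str.contains("Random", regex=False, na=False),
    df["Y"].str.contains("Google", regex=False, na=False),
]
choices = ["A", "B", "C"]

# np.select takes the first condition that is True for each row,
# which mirrors the order of the if statements in the function.
# The empty-string default is an assumption for rows with no match.
df["X"] = np.select(conditions, choices, default="")

print(df["X"].tolist())
```

Because the conditions are checked in order, a value containing several of the terms gets the label of the first one listed, just as in the function with its sequence of if statements.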
