Create a new column in a DataFrame based on the respective values in two different columns

Question

I have the following dataframe:

import pandas as pd

cars = {'Brand': ['Honda Civic','Toyota Corolla','Ford Focus','No Brand'],
        'Country': ['Japan','No Country','United States','Germany']
        }

df = pd.DataFrame(cars, columns = ['Brand', 'Country'])
df.head(4)

            Brand        Country
0     Honda Civic          Japan
1  Toyota Corolla     No Country
2      Ford Focus  United States
3        No Brand        Germany

I would like to create a new column, Desc, that combines the values of the 'Brand' and 'Country' columns. If Brand is 'No Brand', Desc should take only the value from Country. If Country is 'No Country', Desc should take only the value from Brand. Desired output:

            Brand        Country    Desc
0     Honda Civic          Japan    Honda Civic Japan
1  Toyota Corolla     No Country    Toyota Corolla
2      Ford Focus  United States    Ford Focus United States
3        No Brand        Germany    Germany

I can do this when checking for a string in a single column, but I am not sure how to proceed with two columns. Right now I can only get a boolean for the condition I want:

df['Desc'] = df['Brand'].str.contains("No Brand") | df['Country'].str.contains("No Country")

            Brand        Country    Desc
0     Honda Civic          Japan    False
1  Toyota Corolla     No Country    True
2      Ford Focus  United States    False
3        No Brand        Germany    True

I have read that iterating over a dataframe is not recommended, so I would like to avoid doing so.

Answer 1

def get_desc(brand, country):
    # Drop each placeholder, then strip the leading space left when brand is empty
    return ((brand if brand != 'No Brand' else '') +
            (' ' + country if country != 'No Country' else '')).strip()


df['Desc'] = df['Brand'].combine(df['Country'], get_desc)

print(df.head(4))

Output:

            Brand        Country                      Desc
0     Honda Civic          Japan         Honda Civic Japan
1  Toyota Corolla     No Country            Toyota Corolla
2      Ford Focus  United States  Ford Focus United States
3        No Brand        Germany                   Germany

Answer 2

Let's concatenate the two columns, then use str.replace to replace the No Brand and No Country values with an empty string:

df['Desc'] = (df['Brand'] + ' ' + df['Country']).str.replace(r'No Brand\s*|\s*No Country', '', regex=True)

Result:

            Brand        Country                      Desc
0     Honda Civic          Japan         Honda Civic Japan
1  Toyota Corolla     No Country            Toyota Corolla
2      Ford Focus  United States  Ford Focus United States
3        No Brand        Germany                   Germany
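
Whether the pattern is treated as a regular expression depends on the pandas version: since pandas 2.0, str.replace defaults to regex=False. Below is a minimal self-contained sketch with the flag passed explicitly. It assumes the sample data from the question, and the variable name pattern is only illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    'Brand': ['Honda Civic', 'Toyota Corolla', 'Ford Focus', 'No Brand'],
    'Country': ['Japan', 'No Country', 'United States', 'Germany'],
})

# Join the two columns, then remove each placeholder together with the
# separating whitespace next to it.
pattern = r'No Brand\s*|\s*No Country'
df['Desc'] = (df['Brand'] + ' ' + df['Country']).str.replace(pattern, '', regex=True)

print(df['Desc'].tolist())
```

Because the pattern consumes the adjacent whitespace, no separate strip step should be needed for this data.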

Answer 3

In [2]: cars = {'Brand': ['Honda Civic','Toyota Corolla','Ford Focus','No Brand'],
   ...:         'Country': ['Japan','No Country','United States','Germany']
   ...:         }
   ...: 
   ...: df = pd.DataFrame(cars, columns = ['Brand', 'Country'])
   ...: df
Out[2]: 
            Brand        Country
0     Honda Civic          Japan
1  Toyota Corolla     No Country
2      Ford Focus  United States
3        No Brand        Germany

In [3]: df['New_col'] = (df.Brand + " " + df.Country).str.replace("No Brand", "").str.replace("No Country", "").str.strip()

In [4]: df
Out[4]: 
            Brand        Country                   New_col
0     Honda Civic          Japan         Honda Civic Japan
1  Toyota Corolla     No Country            Toyota Corolla
2      Ford Focus  United States  Ford Focus United States
3        No Brand        Germany                   Germany
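
The chained replace above removes "No Brand" and "No Country" wherever they occur as substrings. If the placeholders should only count when they are the entire cell value, a sketch along these lines may be closer to the intent. It assumes the question's sample data and the column name Desc, and it swaps in Series.mask with an exact comparison, which is not part of the answer itself.

```python
import pandas as pd

df = pd.DataFrame({
    'Brand': ['Honda Civic', 'Toyota Corolla', 'Ford Focus', 'No Brand'],
    'Country': ['Japan', 'No Country', 'United States', 'Germany'],
})

# Blank out a placeholder only where the whole cell equals it.
brand = df['Brand'].mask(df['Brand'] == 'No Brand', '')
country = df['Country'].mask(df['Country'] == 'No Country', '')

# Join the two parts and trim the stray space left when one side is empty.
df['Desc'] = (brand + ' ' + country).str.strip()

print(df)
```

Like the answers above, this stays vectorized and avoids iterating over rows.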

3 answers

Answer 1
0 2020-11-28 15:15:26

Answer 2
0 2020-11-28 15:17:18

Answer 3
0 2020-11-28 23:39:37
