
Create a new column from two existing text columns in a DataFrame using pandas/python

I have a DataFrame with two columns, "Start_location" and "End_location". I want to create a new column called "Location" from these two columns, subject to the following conditions.

If "Start_location" equals "End_location", then "Location" should take that shared value. Otherwise, if the two values differ, "Location" should be "Start_location"-"End_location", that is, the two values joined by a hyphen.

Here is an example of the input.

+---+--------------------+-----------------------+
|   |  Start_location    |      End_location     |
+---+--------------------+-----------------------+
| 1 | Stratford          |      Stratford        |
| 2 | Bromley            |      Stratford        |
| 3 | Brighton           |      Manchester       |
| 4 | Delaware           |      Delaware         |
+---+--------------------+-----------------------+
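
A minimal sketch for reproducing the input table above, assuming pandas is imported as pd and the frame is named df (both names are assumptions, not stated in the question). The 1-based index is copied from the table.

```python
import pandas as pd

# Sample data copied from the table above; the index starts at 1 as shown there.
df = pd.DataFrame(
    {
        "Start_location": ["Stratford", "Bromley", "Brighton", "Delaware"],
        "End_location": ["Stratford", "Stratford", "Manchester", "Delaware"],
    },
    index=[1, 2, 3, 4],
)

print(df)
```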
   

The result I want is this.

+---+--------------------+-----------------------+--------------------+
|   |  Start_location    |      End_location     |   Location         |
+---+--------------------+-----------------------+--------------------+
| 1 | Stratford          |      Stratford        |   Stratford        |
| 2 | Bromley            |      Stratford        | Bromley-Stratford  |
| 3 | Brighton           |      Manchester       | Brighton-Manchester|
| 4 | Delaware           |      Delaware         |    Delaware        |
+---+--------------------+-----------------------+--------------------+
   

I would be grateful if anyone can help.

PS: forgive me if this is a very basic question. I have gone through some similar questions on this topic but couldn't make any headway.

You can write your own function that does this, then call it through apply with a lambda function:

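def get_location(start, end):
    # Identical start and end: keep a single value.
    if start == end:
        return start
    # Otherwise join the two with a hyphen, as in the desired output.
    return start + '-' + end

# Apply the function row by row (axis=1).
df['Location'] = df.apply(lambda x: get_location(x.Start_location, x.End_location), axis=1)

The same logic also fits in a single expression:

df['Location'] = df[['Start_location', 'End_location']].apply(lambda x: x.iloc[0] if x.iloc[0] == x.iloc[1] else x.iloc[0] + '-' + x.iloc[1], axis=1)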

Use np.select(condition, choice). To join the start and end values, use the .str.cat() method.

import numpy as np

# Two mutually exclusive conditions, each paired with the value to use when it holds.
condition = [df['Start_location'] == df['End_location'], df['Start_location'] != df['End_location']]
choice = [df['Start_location'], df['Start_location'].str.cat(df['End_location'], sep='-')]
df['Location'] = np.select(condition, choice)

df
  Start_location End_location             Location
1      Stratford    Stratford            Stratford
2        Bromley    Stratford    Bromley-Stratford
3       Brighton   Manchester  Brighton-Manchester
4       Delaware     Delaware             Delaware

You can use NumPy to compare the two columns with np.where:


import numpy as np

df["Location"] =  np.where((df['Start_location'] == df['End_location'])
                           , df['Start_location'],df['Start_location']+"-"+ df['End_location'])

df

Output:

    Start_location  End_location    Location
0   Stratford        Stratford      Stratford
1   Bromley          Stratford  Bromley-Stratford
2   Brighton         Manchester Brighton-Manchester
3   Delaware         Delaware        Delaware
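
As a quick sanity check, the sketch below rebuilds the sample data and confirms that the row-wise apply approach and the vectorized np.where and np.select approaches produce the same column. The sample frame and the names same, joined, via_apply, via_where and via_select are illustrative assumptions, not part of the original answers.

```python
import numpy as np
import pandas as pd

# Sample data from the question (variable names are illustrative).
df = pd.DataFrame(
    {
        "Start_location": ["Stratford", "Bromley", "Brighton", "Delaware"],
        "End_location": ["Stratford", "Stratford", "Manchester", "Delaware"],
    }
)

# Shared building blocks: the row-wise condition and the hyphen-joined text.
same = df["Start_location"] == df["End_location"]
joined = df["Start_location"] + "-" + df["End_location"]

# Row-wise: runs a Python expression once per row.
via_apply = df.apply(
    lambda row: row["Start_location"]
    if row["Start_location"] == row["End_location"]
    else row["Start_location"] + "-" + row["End_location"],
    axis=1,
)

# Vectorized: np.where picks from two whole columns based on the condition.
via_where = pd.Series(np.where(same, df["Start_location"], joined), index=df.index)

# Vectorized: np.select pairs each condition with the value to use.
via_select = pd.Series(
    np.select([same, ~same], [df["Start_location"], joined]), index=df.index
)

print(via_apply.tolist())
print(via_where.tolist() == via_apply.tolist())
print(via_select.tolist() == via_apply.tolist())
```

On a frame this small the difference is negligible; the vectorized forms generally scale better on large frames because apply with axis=1 runs Python code for every row.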
