
Add a new column to a pandas dataframe with converted values from another column?

I have a pandas dataframe loaded from a CSV file, and I want to add 3 columns in Python 3.8.

  1. Add a column that converts meters to miles (length is in meters; the new column will be length_miles).

  2. Add a column that converts meters to feet (elevation_gain is in meters; the new column will be elevation_gain_feet).

  3. Add a column that computes a difficulty rating as follows: NPS difficulty rating = elevation gain (feet) x 2 x distance (miles). The square root of the product is the numerical rating.
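To make the formula in step 3 concrete, here is a minimal sketch for a single trail in plain Python. The function name nps_numeric_rating and the sample trail (5000 m long, 300 m of gain) are made up for illustration; the conversion factors are the usual meters-to-miles and meters-to-feet constants.

```python
import math

METERS_TO_MILES = 0.000621371
METERS_TO_FEET = 3.28084


def nps_numeric_rating(length_m, elevation_gain_m):
    """Square root of (elevation gain in feet x 2 x distance in miles)."""
    length_miles = length_m * METERS_TO_MILES
    gain_feet = elevation_gain_m * METERS_TO_FEET
    return math.sqrt(gain_feet * 2 * length_miles)


# Hypothetical trail: 5000 m long with 300 m of elevation gain.
rating = nps_numeric_rating(5000, 300)
print(round(rating, 1))  # 78.2
```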

This needs to be broken down further into a difficulty rating of 1-5. The current difficulty rating in the data set is not informative, so I want to use the National Park Service rating.

If the numerical difficulty rating is:

  - under 50, the difficulty rating is 1
  - 50-100, the difficulty rating is 2
  - 101-150, the difficulty rating is 3
  - 151-200, the difficulty rating is 4
  - above 200, the difficulty rating is 5

Ideally this would compute the result and put just the number 1-5 in the column, but having 2 new columns for #3 would be fine as well.

Here are the columns from my dataframe and values from a couple of rows. I have not yet thought about how to produce the NPS 1-5 ratings in the dataframe; I am not sure whether I can, or whether I need to do it outside the dataframe in a function. Unfortunately, the code does not seem to be adding the columns the way I want, so I think I must be doing something wrong.

Code I have so far:

df = pd.read_csv('data.csv')
df.assign(length_miles = lambda x: x['length'] * 0.00062137, axis = 1)
df.assign(elevation_gain_ft = lambda x: x['elevation_gain'] * 3.28084, axis = 1)
df.assign(num_dif_rating = lambda x: np.sqrt( x['length_miles'] * 2 * x['elevation_gain_ft'], axis = 1))
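A likely reason the columns never appear: assign returns a new DataFrame and leaves df unchanged, so each result above is thrown away, and the third call then cannot find length_miles. The axis = 1 argument does not belong in these calls either; assign treats every keyword as a column to create. Below is a minimal sketch of the same three steps with the result kept. The sample rows are made up to stand in for data.csv.

```python
import numpy as np
import pandas as pd

# Hypothetical sample rows standing in for data.csv (values are made up).
df = pd.DataFrame({
    "length": [5000.0, 16093.44],
    "elevation_gain": [300.0, 1000.0],
})

# assign returns a new DataFrame, so the result has to be kept.
# Later keyword arguments can use columns created by earlier ones.
df = df.assign(
    length_miles=lambda x: x["length"] * 0.000621371,
    elevation_gain_ft=lambda x: x["elevation_gain"] * 3.28084,
    num_dif_rating=lambda x: np.sqrt(x["length_miles"] * 2 * x["elevation_gain_ft"]),
)
print(df)
```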

You need to use the assign method and keep the DataFrame it returns, because assign does not modify df in place:

df = df.assign(YourColumn = lambda x: conversion_formula(x['Meters']))

Here's the link to the documentation:

https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.assign.html

Good luck!

I got it to work like this. It cleaned up the data just the way I need.

import numpy as np
import pandas as pd

def data_cleanup():
    df = pd.read_csv('AllTrails data.csv')
    # convert meters to miles and feet and add columns
    df['length_miles'] = df['length'].apply(lambda x: x * 0.000621371)
    df['elevation_gain_feet'] = df['elevation_gain'].apply(lambda x: x * 3.28084)

    def difficulty_rating(x, y):
        # x: length in miles, y: elevation gain in feet
        res = np.sqrt(x * y * 2)
        # each upper bound is inclusive, so a value such as 100.5 lands in the next band
        if res < 50:
            return 1
        elif res <= 100:
            return 2
        elif res <= 150:
            return 3
        elif res <= 200:
            return 4
        else:
            return 5

    # apply the rating row by row
    df['nps_difficulty_rating'] = df.apply(lambda x: difficulty_rating(x.length_miles, x.elevation_gain_feet), axis=1)

    df.to_csv('np trails.csv')

data_cleanup()
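For larger files, the row-by-row apply can be swapped for column arithmetic plus np.select. This is only a sketch of that alternative, not the code from the answer above: the helper name add_nps_rating and the sample trails are made up, and it assumes the bands are meant to be contiguous (up to 100, up to 150, up to 200).

```python
import numpy as np
import pandas as pd

METERS_TO_MILES = 0.000621371
METERS_TO_FEET = 3.28084


def add_nps_rating(df):
    """Return a copy of df with the converted columns and a 1-5 rating."""
    out = df.copy()
    out["length_miles"] = out["length"] * METERS_TO_MILES
    out["elevation_gain_feet"] = out["elevation_gain"] * METERS_TO_FEET
    score = np.sqrt(out["elevation_gain_feet"] * 2 * out["length_miles"])
    # Conditions are checked in order and the first match wins.
    # Caveat: a missing (NaN) score matches nothing and gets the default of 5.
    out["nps_difficulty_rating"] = np.select(
        [score < 50, score <= 100, score <= 150, score <= 200],
        [1, 2, 3, 4],
        default=5,
    )
    return out


# Hypothetical trails (meters), chosen so that each band appears once.
trails = pd.DataFrame({
    "length": [1000.0, 5000.0, 8000.0, 10000.0, 20000.0],
    "elevation_gain": [50.0, 300.0, 480.0, 750.0, 1200.0],
})
result = add_nps_rating(trails)
print(result[["length_miles", "elevation_gain_feet", "nps_difficulty_rating"]])
```

With these sample rows the ratings come out as 1 through 5 in order. Writing the result to disk works the same way as before, for example result.to_csv('np trails.csv').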
