
Python beginner

I have an existing dataset imported from an Excel file. Within the imported data I want to:

  1. Create a new column, then
  2. Fill it with values derived from an existing column. Example: if the age in column x is >17 and <23, assign "generation Z" in column y; otherwise, if the age in column x is >24 and <40, assign "millennial" in column y.

My code was:

if df['age'] > 17 and df['age'] < 23:
    df['generation'] = "genZ"
elif df['age'] > 24 and df['age'] < 40:
    df['generation'] = "Millenial"

You should use pd.cut() for this. Note that its bins include the right edge by default, so the bins below are (17, 24] and (24, 40].

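import pandas as pd

data = [18, 19, 17, 27, 25, 24, 39]
columns = ['age']
df = pd.DataFrame(data=data, columns=columns)

# Bins are right-closed by default: (17, 24] -> 'Genz', (24, 40] -> 'Millenial'
df['generation'] = pd.cut(df['age'], [17, 24, 40], labels=['Genz', 'Millenial'])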

Output:

   age generation
0   18       Genz
1   19       Genz
2   17        NaN
3   27  Millenial
4   25  Millenial
5   24       Genz
6   39  Millenial
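
The boundary behaviour in the output above (17 is unlabelled, 24 falls into the first bin) comes from pd.cut() treating bins as right-closed by default. As a minimal sketch, assuming you want to see how the right parameter shifts those edges, the following compares both settings on a few boundary ages. The sample ages and the variable names are illustrative, not from the original post.

```python
import pandas as pd

# Illustrative ages chosen to sit on or next to the bin edges.
ages = pd.Series([17, 18, 24, 25, 40, 41], name="age")
bins = [17, 24, 40]
labels = ["GenZ", "Millenial"]

# Default right=True: bins are (17, 24] and (24, 40].
right_closed = pd.cut(ages, bins, labels=labels)

# right=False: bins become [17, 24) and [24, 40).
left_closed = pd.cut(ages, bins, labels=labels, right=False)

comparison = pd.DataFrame(
    {"age": ages, "right_closed": right_closed, "left_closed": left_closed}
)
print(comparison)
```

Neither setting reproduces the strict bounds in the question exactly (>17 and <23, >24 and <40), so if those exact gaps matter, explicit boolean conditions may be a better fit than bins.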

I would create a custom function and use map().

def generation(age):
    # Chained comparisons keep the strict bounds; age 24 matches neither branch.
    if 17 < age < 24:
        return "GenZ"
    elif 24 < age < 40:
        return "Millenial"
    else:
        return "Not classified"

Then we apply it to the age column of our dataframe:

df['generation'] = df['age'].map(generation)
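
Put together, the map() approach might look like the sketch below. It reuses the sample ages from the pd.cut() answer and assumes the column is named 'age' in lowercase, as in the question.

```python
import pandas as pd


def generation(age):
    """Return a generation label for a single age value."""
    if 17 < age < 24:
        return "GenZ"
    elif 24 < age < 40:
        return "Millenial"
    else:
        return "Not classified"


# Sample data borrowed from the pd.cut() answer.
df = pd.DataFrame({"age": [18, 19, 17, 27, 25, 24, 39]})

# map() calls generation() once per row of the 'age' column.
df["generation"] = df["age"].map(generation)
print(df)
```

Because the function runs once per row in Python, this is easy to read but tends to be slower than vectorised options on large frames.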

An option via np.select:

import numpy as np
import pandas as pd

df = pd.DataFrame([17, 18, 19, 23, 24, 25, 39, 40, 41], columns=['age'])

df['generation'] = np.select(
    [(df['age'] > 17) & (df['age'] < 23),
     (df['age'] > 24) & (df['age'] < 40)],
    ['genZ', 'Millenial'],
    default=None)

print(df)

df:

   age generation
0   17       None
1   18       genZ
2   19       genZ
3   23       None
4   24       None
5   25  Millenial
6   39  Millenial
7   40       None
8   41       None
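
The conditions passed to np.select combine comparisons with & and parentheses rather than with and. A small sketch of why, assuming a two-row frame made up for illustration: Python's and asks a whole Series for a single truth value, which pandas refuses to give, while & compares element by element.

```python
import pandas as pd

# Two illustrative ages: one inside the 18-22 range, one outside.
df = pd.DataFrame({"age": [18, 30]})

# `and` forces bool(Series), which pandas rejects with a ValueError.
try:
    if df["age"] > 17 and df["age"] < 23:
        pass
except ValueError as exc:
    error_message = str(exc)
    print(error_message)

# `&` works element-wise and yields one boolean per row.
mask = (df["age"] > 17) & (df["age"] < 23)
print(mask.tolist())
```

The parentheses are needed because & binds more tightly than the comparison operators.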
