Python beginner
I have an existing dataset imported as an Excel file. Within the imported file I want to
My code was:
if df['age'] > 17 and df['age'] < 23:
    df['generation'] = "genZ"
elif df['age'] > 24 and df['age'] < 40:
    df['generation'] = "Millenial"
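For reference, a minimal sketch of why this pattern fails: `and` needs a single True/False value, but comparing a whole column produces a Series of booleans, so pandas raises a ValueError. The sample ages below are made up for illustration and are not the asker's actual data.

```python
import pandas as pd

# Hypothetical sample data, not the asker's Excel file.
df = pd.DataFrame({'age': [18, 30]})

try:
    # df['age'] > 17 is a Series of booleans; `and` tries to reduce it to one bool.
    if df['age'] > 17 and df['age'] < 23:
        df['generation'] = "genZ"
    error_message = None
except ValueError as exc:
    error_message = str(exc)

print(error_message)
```

The answers below avoid the problem by applying the conditions element-wise instead of inside a plain `if`.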
You should use pd.cut() for this.
import pandas as pd

data = [18, 19, 17, 27, 25, 24, 39]
columns = ['age']
df = pd.DataFrame(data=data, columns=columns)
# Bins are right-inclusive by default: (17, 24] -> Genz, (24, 40] -> Millenial
df['generation'] = pd.cut(df['age'], [17, 24, 40], labels=['Genz', 'Millenial'])
Output:
   age generation
0   18       Genz
1   19       Genz
2   17        NaN
3   27  Millenial
4   25  Millenial
5   24       Genz
6   39  Millenial
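Note that pd.cut() treats bins as right-inclusive by default, i.e. (17, 24] and (24, 40], which is why 17 comes out as NaN and 24 as Genz in the output above. If the lower edge should be included instead, a sketch using the `right=False` parameter could look like this (the choice of edges is an assumption; adjust it to the ranges you actually need):

```python
import pandas as pd

df = pd.DataFrame({'age': [18, 19, 17, 27, 25, 24, 39]})

# right=False makes each bin closed on the left and open on the right:
# [17, 24) -> Genz, [24, 40) -> Millenial
df['generation'] = pd.cut(
    df['age'],
    bins=[17, 24, 40],
    labels=['Genz', 'Millenial'],
    right=False,
)
print(df)
```

With these edges, 17 is labelled Genz and 24 is labelled Millenial.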
I would create a custom function and use map().
def generation(age):
    # Both ranges exclude their endpoints, so an age of exactly 24 is "Not classified"
    if 17 < age < 24:
        return "GenZ"
    elif 24 < age < 40:
        return "Millenial"
    else:
        return "Not classified"
Then we pass it to our dataframe:
df['generation'] = df['age'].map(generation)
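Put together, the map() approach might look like the following self-contained sketch. The sample ages are made up for illustration, and the column is assumed to be named 'age' as in the question.

```python
import pandas as pd

def generation(age):
    # Strict inequalities: 17, 24 and 40 themselves fall into no group.
    if 17 < age < 24:
        return "GenZ"
    elif 24 < age < 40:
        return "Millenial"
    return "Not classified"

# Hypothetical sample data covering the boundary values.
df = pd.DataFrame({'age': [17, 18, 23, 24, 25, 39, 40]})
df['generation'] = df['age'].map(generation)
print(df)
```

This calls the function once per row, which is easy to read but slower than the vectorized options on large datasets.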
An option via np.select:
import numpy as np
import pandas as pd

df = pd.DataFrame([17, 18, 19, 23, 24, 25, 39, 40, 41], columns=['age'])
# Conditions are checked in order; rows matching none get the default
df['generation'] = np.select(
    [(df['age'] > 17) & (df['age'] < 23),
     (df['age'] > 24) & (df['age'] < 40)],
    ['genZ', 'Millenial'],
    default=None)
print(df)
df:
   age generation
0   17       None
1   18       genZ
2   19       genZ
3   23       None
4   24       None
5   25  Millenial
6   39  Millenial
7   40       None
8   41       None
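As a variation, the conditions can also be written with Series.between(), which is inclusive on both ends by default. This is only a sketch: for whole-number ages, between(18, 22) matches the same rows as (age > 17) & (age < 23). The 'Not classified' default label is an assumption borrowed from the map() answer.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'age': [17, 18, 22, 23, 24, 25, 39, 40]})

# between() includes both endpoints, so these cover 18..22 and 25..39.
conditions = [
    df['age'].between(18, 22),
    df['age'].between(25, 39),
]
df['generation'] = np.select(
    conditions,
    ['genZ', 'Millenial'],
    default='Not classified',
)
print(df)
```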