python pandas binning numerical range
I have a requirement where I want to bin a numerical value.
If the student's marks are
between 0 and 50 (incl. 50), then assign the level column value "L"
between 50 and 75 (incl. 75), then assign the level column value "M"
greater than 75, then assign the level column value "H"
This is what I have:
raw_data = {'student':['A','B','C'],'marks_maths':[75,90,99]}
df = pd.DataFrame(raw_data, columns = ['student','marks_maths'])
bins = [0,50,75,>75]
groups = ['L','M','H']
df['maths_level'] = pd.cut(df['marks_maths'], bins, labels=groups)
I get a syntax error:
File "<ipython-input-25-f0b9dd609c63>", line 3
bins = [0,50,75,>75]
^
SyntaxError: invalid syntax
How can I express a "greater than a specific value" cutoff?
Try this:
bins = [0,50,75,101] or bins = [0,50,75,np.inf]
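As a minimal sketch of how that fix slots into the code from the question (same data and column names as the question; only the last bin edge changes), using np.inf so the top bin has no upper limit:

```python
import numpy as np
import pandas as pd

raw_data = {'student': ['A', 'B', 'C'], 'marks_maths': [75, 90, 99]}
df = pd.DataFrame(raw_data, columns=['student', 'marks_maths'])

# Bin edges must be plain numbers; np.inf makes the last bin open-ended: (75, inf]
bins = [0, 50, 75, np.inf]
groups = ['L', 'M', 'H']

# By default each bin includes its right edge, so 75 falls in (50, 75] -> 'M'
df['maths_level'] = pd.cut(df['marks_maths'], bins, labels=groups)
print(df)
```

Note that pd.cut needs one more edge than it has labels: four edges here for three labels.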
Hope this helps.
import numpy as np
import pandas as pd
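# 20 random integers from 0 to 99 (the upper bound of randint is exclusive)
scores = np.random.randint(0, 100, 20)
df = pd.DataFrame(scores, columns=['scores'])
bins = [0, 50, 75, np.inf]
# right=True closes each bin on the right: (0, 50], (50, 75], (75, inf]
# include_lowest=False leaves the first bin open on the left, so a score of 0 becomes NaN
df['binned_scores'] = pd.cut(df.scores, bins=bins, include_lowest=False, right=True)
df['bin_labels'] = pd.cut(df.scores, bins=bins, include_lowest=False, right=True, labels=['L', 'M', 'H'])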
The include_lowest and right parameters let you control whether the bin edges are inclusive.
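To see what those two parameters change, here is a small sketch that bins the edge values themselves. The sample marks (0, 50, 75, 76) and the variable names are made up for illustration; the bins and labels are the ones used above.

```python
import numpy as np
import pandas as pd

marks = pd.Series([0, 50, 75, 76])  # hypothetical marks sitting on the bin edges
bins = [0, 50, 75, np.inf]
labels = ['L', 'M', 'H']

# Defaults (right=True, include_lowest=False): (0, 50], (50, 75], (75, inf]
default = pd.cut(marks, bins, labels=labels)

# include_lowest=True also closes the first bin on the left, so 0 is binned too
with_lowest = pd.cut(marks, bins, labels=labels, include_lowest=True)

# right=False closes bins on the left instead: [0, 50), [50, 75), [75, inf)
left_closed = pd.cut(marks, bins, labels=labels, right=False)

print(pd.DataFrame({'marks': marks, 'default': default,
                    'with_lowest': with_lowest, 'left_closed': left_closed}))
```

With the defaults, a mark of exactly 0 is left unbinned (NaN), so include_lowest=True is probably what the question wants if 0 is a possible mark. With right=False, 50 and 75 move up into "M" and "H", which does not match the rules in the question.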
Just set the upper limit to the highest possible mark:
bins = [0, 50, 75, 100]
The result is what you wanted:
  student  marks_maths maths_level
0       A           75           M
1       B           90           H
2       C           99           H