
python pandas binning numerical range

I have a requirement where I want to bin a numeric value:

If the student's marks are
between 0 and 50 (incl. 50), then assign the level column value = "L"
between 50 and 75 (incl. 75), then assign the level column value = "M"
above 75, then assign the level column value = "H"

This is what I have:

raw_data = {'student':['A','B','C'],'marks_maths':[75,90,99]}
df = pd.DataFrame(raw_data, columns = ['student','marks_maths'])
bins = [0,50,75,>75]
groups = ['L','M','H']
df['maths_level'] = pd.cut(df['marks_maths'], bins, labels=groups)

I get a syntax error:

File "<ipython-input-25-f0b9dd609c63>", line 3
    bins = [0,50,75,>75]
                    ^
SyntaxError: invalid syntax

How can I specify a cutoff for "greater than a specific value"?

Try this:

 bins = [0,50,75,101] or bins = [0,50,75,np.inf]
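
For reference, here is a minimal sketch of the np.inf variant applied to the DataFrame from the question. It assumes pd.cut's default right=True, so each bin includes its right edge and a mark of exactly 75 lands in "M".

```python
import numpy as np
import pandas as pd

raw_data = {'student': ['A', 'B', 'C'], 'marks_maths': [75, 90, 99]}
df = pd.DataFrame(raw_data, columns=['student', 'marks_maths'])

# np.inf as the last edge stands in for "anything above 75"
bins = [0, 50, 75, np.inf]
groups = ['L', 'M', 'H']

# Default right=True: bins are (0, 50], (50, 75], (75, inf]
df['maths_level'] = pd.cut(df['marks_maths'], bins, labels=groups)
print(df)
```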

Hope this helps:

import numpy as np
import pandas as pd

# 20 random integer scores from 0 to 99 (randint's upper bound is exclusive)
scores = np.random.randint(0, 100, 20)
df = pd.DataFrame(scores, columns=['scores'])

# np.inf as the last edge leaves the top bin open-ended
bins = [0, 50, 75, np.inf]

# include_lowest=True keeps a score of exactly 0 in the first bin instead of NaN
df['binned_scores'] = pd.cut(df['scores'], bins=bins, include_lowest=True, right=True)
df['bin_labels'] = pd.cut(df['scores'], bins=bins, include_lowest=True, right=True, labels=['L', 'M', 'H'])

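The include_lowest and right parameters let you control whether the bin edges are inclusive: right=True closes each bin on its right edge, and include_lowest=True also closes the first bin on its left edge.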
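
To make the edge behaviour concrete, here is a small sketch that runs the same boundary marks through three parameter combinations. The marks values and variable names are made up for illustration and are not from the question.

```python
import numpy as np
import pandas as pd

# Illustrative marks sitting exactly on or just past the bin edges
marks = pd.Series([0, 50, 75, 76])
bins = [0, 50, 75, np.inf]
labels = ['L', 'M', 'H']

# Defaults (right=True, include_lowest=False): bins are (0, 50], (50, 75], (75, inf]
# so a mark of exactly 0 falls outside every bin and becomes NaN
default = pd.cut(marks, bins=bins, labels=labels)

# include_lowest=True also closes the first bin on the left, so 0 becomes 'L'
with_lowest = pd.cut(marks, bins=bins, labels=labels, include_lowest=True)

# right=False flips the closed side: bins are [0, 50), [50, 75), [75, inf)
left_closed = pd.cut(marks, bins=bins, labels=labels, right=False)

print(pd.DataFrame({'marks': marks, 'default': default,
                    'with_lowest': with_lowest, 'left_closed': left_closed}))
```

Note that with right=False a mark of 50 moves to "M" and 75 moves to "H", which does not match the rules in the question, so right=True is the setting that fits here.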

Just define the upper bound as the highest possible mark:

bins = [0, 50, 75, 100]

The result is as you wanted:

  student  marks_maths maths_level
0       A           75           M
1       B           90           H
2       C           99           H
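
One thing to keep in mind with a fixed upper bound: it only works if no mark can exceed it. A quick sketch comparing the two choices, where the mark of 105 is a made-up value (say, bonus points) used purely to show the difference:

```python
import numpy as np
import pandas as pd

# Hypothetical marks; 105 is an assumed out-of-range value for illustration
marks = pd.Series([75, 100, 105])
groups = ['L', 'M', 'H']

# Top bin is (75, 100]: 100 is still 'H', but 105 falls outside and becomes NaN
capped = pd.cut(marks, bins=[0, 50, 75, 100], labels=groups)

# Top bin is (75, inf]: every mark above 75 is 'H'
open_ended = pd.cut(marks, bins=[0, 50, 75, np.inf], labels=groups)

print(pd.DataFrame({'marks': marks, 'capped': capped, 'open_ended': open_ended}))
```

If marks are guaranteed to stay within 0-100, both versions give the same labels.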
