Z-Score Calculation

How do I calculate the z-score of a particular column (the x variable)? The formula is (x - mu) / sigma. What should the value of x be here? My x column contains 24 values (rows). Should I take the sum of all the x values and then use it in the formula above? Please let me know.

Thanks

A z-score centers each value on the mean and scales it by the standard deviation. So x in the formula is each individual value, not the sum of the column: you compute the mean and standard deviation of your variable once, then convert every value to its own z-score. For example:

import pandas as pd
import numpy as np
df = pd.DataFrame({'x':np.random.uniform(0,1,100)})
df['z_score_1'] = (df['x'] - df['x'].mean())/df['x'].std(ddof=1)
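
To make the role of x concrete, here is a small hand-checkable sketch of the same computation. The five values are made up for illustration (the question has 24, but the steps are identical); only the formula comes from the question.

```python
import numpy as np

# Hypothetical sample of 5 values, chosen so the mean is a round number.
x = np.array([2.0, 4.0, 4.0, 6.0, 9.0])

mu = x.mean()          # one mean for the whole column
sigma = x.std(ddof=1)  # one sample standard deviation for the whole column

# x in (x - mu) / sigma is each individual value, never the column sum,
# so the result has one z-score per row.
z = (x - mu) / sigma

print(len(x), len(z))
print(z)
```

The output has as many z-scores as there are input rows, and the z-scores themselves have mean 0 and sample standard deviation 1.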

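You can also use scipy. Note that I pass ddof=1 in both the pandas and the scipy call, so the standard deviation is the sample standard deviation (dividing by n - 1 rather than n). This matters because pandas' std defaults to ddof=1 while scipy's zscore defaults to ddof=0: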

from scipy.stats import zscore
df['z_score_2'] = zscore(df['x'],ddof=1)

df[:5]

    x           z_score_1   z_score_2
0   0.543405    0.244382    0.244382
1   0.278369    -0.667379   -0.667379
2   0.424518    -0.164608   -0.164608
3   0.844776    1.281142    1.281142
4   0.004719    -1.608777   -1.608777
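
If you leave ddof unset, the two approaches above would not agree. The sketch below illustrates this with made-up values, using NumPy's std (which defaults to ddof=0, as scipy's zscore does) in place of scipy; the variable names are my own.

```python
import numpy as np
import pandas as pd

x = pd.Series([2.0, 4.0, 4.0, 6.0, 9.0])  # hypothetical values
n = len(x)

# pandas default is ddof=1 (divide by n - 1)
z_sample = (x - x.mean()) / x.std()

# NumPy default is ddof=0 (divide by n)
z_population = (x - x.mean()) / np.std(x.to_numpy())

# The two sets of z-scores differ by a constant factor sqrt(n / (n - 1)).
ratio = z_population / z_sample
print(ratio.unique(), np.sqrt(n / (n - 1)))
```

With 24 rows the factor is sqrt(24/23), about a 2% difference, so it is worth deciding on one convention and setting ddof explicitly.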

