Z-Score Calculation
How do I calculate the z-score of a particular column (the x variable)? The formula is (x - mu)/sigma. What should the value of x be here? My x column contains 24 values (rows). Should I take the sum of all the x values and use that in the formula above? Please let me know.
Thanks
A z-score centers each value on the mean and scales it by the standard deviation. So x in the formula is each individual value in the column, not the sum of the values. Compute the mean and standard deviation of your variable once, then convert every value to its own z-score. For example:
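import numpy as np
import pandas as pd

# Sample data: 100 values drawn uniformly from [0, 1)
df = pd.DataFrame({'x': np.random.uniform(0, 1, 100)})
# Subtract the column mean from each value, then divide by the sample standard deviation
df['z_score_1'] = (df['x'] - df['x'].mean()) / df['x'].std(ddof=1)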
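To tie this back to a 24-row column like the one in the question, here is a minimal sketch. The 24 values are invented for illustration (they are not the asker's data), and the names `x`, `mu`, `sigma` and `z` are just labels chosen to mirror the formula. It shows that the mean and standard deviation are each computed once for the whole column, while the formula is applied to every value separately, so 24 inputs give 24 z-scores.

```python
import numpy as np
import pandas as pd

# Hypothetical 24-row column; these values are made up for illustration.
x = pd.Series(np.arange(1.0, 25.0), name="x")

mu = x.mean()          # a single mean for the whole column
sigma = x.std(ddof=1)  # a single sample standard deviation for the whole column

# x in (x - mu) / sigma is each individual value, never the sum of the column.
z = (x - mu) / sigma

print(len(z))                    # one z-score per row
print(z.mean(), z.std(ddof=1))   # approximately 0 and 1
```

After standardizing, the z-scores have a mean of approximately 0 and a sample standard deviation of 1, which is a quick sanity check on the calculation.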
You can also use scipy. Note that I set ddof (delta degrees of freedom) to 1 in both cases, so the standard deviation is computed with an n - 1 denominator, which corresponds to the unbiased estimator of the variance:
from scipy.stats import zscore

# Same calculation via scipy, again with ddof=1
df['z_score_2'] = zscore(df['x'], ddof=1)
df[:5]
          x  z_score_1  z_score_2
0  0.543405   0.244382   0.244382
1  0.278369  -0.667379  -0.667379
2  0.424518  -0.164608  -0.164608
3  0.844776   1.281142   1.281142
4  0.004719  -1.608777  -1.608777
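The two columns match because ddof=1 was passed explicitly in both places. As a hedged sketch of why that matters: pandas `Series.std` defaults to ddof=1, whereas `scipy.stats.zscore` defaults to ddof=0, so leaving the argument out gives slightly different z-scores. The small data set below is hypothetical and only there to make the difference visible.

```python
import numpy as np
import pandas as pd
from scipy.stats import zscore

# Hypothetical values, chosen only for illustration.
x = pd.Series([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])

z_pandas_default = (x - x.mean()) / x.std()   # pandas default: ddof=1
z_scipy_default = zscore(x.to_numpy())        # scipy default: ddof=0
z_scipy_ddof1 = zscore(x.to_numpy(), ddof=1)  # explicit ddof=1

print(np.allclose(z_pandas_default, z_scipy_ddof1))    # True
print(np.allclose(z_pandas_default, z_scipy_default))  # False
```

The two conventions differ only by a constant factor of sqrt(n / (n - 1)), so the gap shrinks as the number of rows grows, but with 24 rows it is still noticeable.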