Normalize each column of a pandas DataFrame

Each column of the DataFrame needs its values normalized by the value of the first element in that column.

for timestamp, prices in data.items():   # iteritems() was removed in pandas 2.0
    normalizedPrices = prices / prices.iloc[0]   # first element by position
    print(normalizedPrices)     # how do we update the DataFrame with this Series?

However, how do we update the DataFrame once we have created the normalized column of data? I believe that if we do prices = normalizedPrices we are merely acting on a copy/view of the DataFrame rather than on the original DataFrame itself.
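The concern can be checked directly. The following is a minimal sketch, assuming a made-up frame named `data` with illustrative column names `x` and `y`; it shows that rebinding the loop variable leaves the DataFrame as it was.

```python
import pandas as pd

# Hypothetical sample data; the column names are made up for illustration.
data = pd.DataFrame({"x": [2.0, 4.0, 5.0], "y": [3.0, 9.0, 4.0]})
before = data.copy()

for label, prices in data.items():
    # Rebinding only changes what the local name `prices` refers to;
    # the division builds a new Series and nothing is written back to `data`.
    prices = prices / prices.iloc[0]

# The DataFrame is untouched by the loop above.
unchanged = data.equals(before)
print(unchanged)  # True
```

So the loop computes each normalized Series and then discards it; something has to be assigned back to the DataFrame for the change to stick.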

It might be simplest to normalize the entire DataFrame in one go (and avoid looping over rows/columns altogether):

>>> import pandas as pd
>>> df = pd.DataFrame({'a': [2, 4, 5], 'b': [3, 9, 4]}, dtype=float) # a DataFrame (np.float was removed from NumPy)
>>> df
     a    b
0  2.0  3.0
1  4.0  9.0
2  5.0  4.0

>>> df = df.div(df.loc[0]) # normalise DataFrame and bind back to df
>>> df
     a         b
0  1.0  1.000000
1  2.0  3.000000
2  2.5  1.333333
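Note that `df.loc[0]` looks up the row labelled 0, which works here because the frame has the default integer index. If the index holds something else (dates, for instance), a positional lookup with `iloc[0]` may be the safer choice. A minimal sketch, assuming a made-up date-indexed frame named `prices`:

```python
import pandas as pd

# Hypothetical price table indexed by date; the values are illustrative only.
prices = pd.DataFrame(
    {"a": [2.0, 4.0, 5.0], "b": [3.0, 9.0, 4.0]},
    index=pd.to_datetime(["2020-01-01", "2020-01-02", "2020-01-03"]),
)

# iloc[0] selects the first row by position, whatever the index labels are.
# div() aligns that row with the column labels and divides every row by it.
normalized = prices.div(prices.iloc[0])
print(normalized)
```

The first row of `normalized` is all ones, and the remaining rows are expressed relative to it.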

Assign to data[col]:

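for col in data:
    data[col] /= data[col].iloc[0]   # divide each column by its first value

# Another option: divide the underlying NumPy arrays.
# data[0:1].values keeps the first row as a 2-D array, so it broadcasts over all rows.
data[0:] = data[0:].values / data[0:1].values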
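The last line above relies on NumPy broadcasting: an array of shape (n, m) divided by one of shape (1, m) divides every row by that single row. The sketch below, using a made-up float frame named `data`, spells out the shapes and checks that the result agrees with the `div` approach.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data, float-typed so the division does not change the dtype.
data = pd.DataFrame({"a": [2.0, 4.0, 5.0], "b": [3.0, 9.0, 4.0]})

values = data.to_numpy()      # shape (3, 2)
first_row = values[0:1]       # shape (1, 2): slicing keeps the row dimension

# Broadcasting stretches the (1, 2) row across all 3 rows of the (3, 2) array.
by_broadcast = values / first_row

# The same computation in pandas, aligning the first row with the column labels.
by_div = data.div(data.iloc[0])

same = np.allclose(by_broadcast, by_div.to_numpy())
print(same)  # True
```

Indexing with `values[0]` instead of `values[0:1]` would give shape (2,), which also broadcasts correctly here; the slice form just makes the row dimension explicit.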
