
Iterating Through Pandas Dataframe Values and Indices

I've got the following dataframe:
[image: dataframe]
and, for every column in the dataframe, I'm attempting to calculate the following equations relating to simple linear regression: [image: equations]
These formulas require me to be able to reference both the values in each column (x_i) and their row index/name (y_i), along with the mean of both the row indices and the actual column values. How can I do this? I know that it's a rule of thumb never to iterate over a dataframe, but, given the small size of this frame, I really don't care how it's done.

You could do it via an apply() function. With axis=1, the function receives one row at a time, so you can access the 'x' values and the 'y' value (the index, available as the row's name) for each row.

Sample DataFrame:

import pandas as pd

df = pd.DataFrame(index=range(1920, 1931), data={'AUS': range(15, 26), 'CAN': range(12, 23)})
df

      AUS  CAN
1920   15   12
1921   16   13
1922   17   14
1923   18   15
1924   19   16
1925   20   17
1926   21   18
1927   22   19
1928   23   20
1929   24   21
1930   25   22

Then access the values for each row in an apply() function:

df.apply(lambda x: [x.name, x['AUS'], x['CAN']], axis=1)

1920    [1920, 15, 12]
1921    [1921, 16, 13]
1922    [1922, 17, 14]
1923    [1923, 18, 15]
1924    [1924, 19, 16]
1925    [1925, 20, 17]
1926    [1926, 21, 18]
1927    [1927, 22, 19]
1928    [1928, 23, 20]
1929    [1929, 24, 21]
1930    [1930, 25, 22]
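
Since the question asks for per-column calculations, a column-wise loop may be more convenient than the row-wise apply() above. The following is only a minimal sketch on the same sample frame: DataFrame.items() yields each column name together with its Series, and the names `summary`, `index_mean` and `value_mean` are made up for illustration.

```python
import pandas as pd

df = pd.DataFrame(index=range(1920, 1931),
                  data={'AUS': range(15, 26), 'CAN': range(12, 23)})

# For each column, collect the mean of the row labels and the mean of the values.
summary = {}
for name, col in df.items():
    years = col.index.to_numpy()   # row index (the question's y_i)
    values = col.to_numpy()        # column values (the question's x_i)
    summary[name] = {
        'index_mean': years.mean(),   # hypothetical key name
        'value_mean': values.mean(),  # hypothetical key name
    }

print(summary)
```

On a frame this small the explicit loop costs nothing, and it keeps both the index and the values in scope for whatever formula comes next.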

Knocking out those equations will be up to you.
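
As a starting point, here is one way that might look. The equations in the question were posted as an image that is not reproduced here, so this sketch assumes they are the usual least-squares slope and intercept, with x_i taken from the column values and y_i from the row index as the question describes. The helper name `simple_linreg` is hypothetical.

```python
import pandas as pd

df = pd.DataFrame(index=range(1920, 1931),
                  data={'AUS': range(15, 26), 'CAN': range(12, 23)})

def simple_linreg(col):
    """Least-squares slope and intercept for one column (hypothetical helper).

    Assumption: x_i are the column values and y_i the row index labels.
    """
    x = col.to_numpy(dtype=float)
    y = col.index.to_numpy(dtype=float)
    x_bar, y_bar = x.mean(), y.mean()
    # slope = sum((x_i - x_bar) * (y_i - y_bar)) / sum((x_i - x_bar)^2)
    slope = ((x - x_bar) * (y - y_bar)).sum() / ((x - x_bar) ** 2).sum()
    # intercept = y_bar - slope * x_bar
    intercept = y_bar - slope * x_bar
    return pd.Series({'slope': slope, 'intercept': intercept})

# With the default axis=0, apply() calls the helper once per column.
coeffs = df.apply(simple_linreg)
print(coeffs)
```

If your equations treat the year as x and the column value as y instead, swap the two assignments at the top of the helper.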


 