Need help doing analysis on a stock data set
I created a DataFrame of stock information such as the "Open", "High", "Close", etc. I now need to calculate the performance for each bar of the stock (each row in the DataFrame). I would like to make a new column in the DataFrame that is equal to the "Close" value of the next row minus the "Close" value of the row before it (that is, the current row).
I tried splitting the "Close" column's values by every second row and putting these values into their own column, then making a new column that subtracts the first column from this second one. However, there was an issue dealing with the NaN values.
df['performance'] = df.Close[2] - df.Close[1]
This made the performance for each of the 52767 rows equal to 2.5.
I would like to make a 'performance' column that is computed row by row. For example, if row 0's close value is 5 and row 1's close value is 7, then row 0's performance value should be 2, and this should be done for all 52767 rows.
pandas.Series.diff()
You can use .diff() with periods=-1 to calculate the difference from the subsequent row (as opposed to the default behavior, which takes the difference from the previous row). Note that this computes the current row minus the next row, so negate the result if you want the next row minus the current one. For example:
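import pandas as pd

# Example data: first 10 rows of the goog200 series
df = pd.read_csv("https://vincentarelbundock.github.io/Rdatasets/csv/fpp2/goog200.csv", index_col=0).head(10)
# Calculate the difference between each row and the next one (current minus next)
df['performance'] = df['value'].diff(-1)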
yields
    time       value  performance
1      1  392.830017     0.317932
2      2  392.512085    -4.793823
3      3  397.305908    -0.705414
4      4  398.011322    -2.478882
5      5  400.490204    -7.605530
6      6  408.095734    -8.494751
7      7  416.590485     3.586670
8      8  413.003815    -0.606048
9      9  413.609863     0.536499
10    10  413.073364          NaN
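In the output above, each value is the current row minus the next row, which is the opposite sign of what the question asks for (row 0 with a close of 5 followed by a close of 7 should give 2, not -2). The following is a minimal sketch of two equivalent ways to get "next minus current", assuming a column named "Close" as in the question. The sample values are made up for illustration and are not from the asker's data.

```python
import pandas as pd

# Hypothetical data: the first two values match the example in the question
df = pd.DataFrame({"Close": [5.0, 7.0, 6.5, 9.0]})

# diff(-1) gives the current row minus the next row
df["diff_minus_1"] = df["Close"].diff(-1)

# Next row minus current row, written out explicitly with shift(-1)
df["performance"] = df["Close"].shift(-1) - df["Close"]

# Equivalent: negate the result of diff(-1)
df["performance_alt"] = -df["Close"].diff(-1)

print(df)
```

The last row has no subsequent row, so its value is NaN in all three columns. Whether to leave it, drop it, or fill it depends on how the column is used afterwards.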