Using the code below, I'm attempting to compute the percent change of the numeric columns:
import pandas as pd
df = pd.read_csv('./data.txt')
df.pct_change(1)
data.txt:
,AAPL,MSFT,^GSPC
2000-01-03,3.625643,39.33463,1455.219971
2000-01-04,3.319964,38.0059,1399.420044
2000-01-05,3.3685480000000005,38.406628000000005,1402.109985
2000-01-06,3.077039,37.12008,1403.449951
But the above code returns an error:
/opt/conda/lib/python3.5/site-packages/pandas/core/ops.py in na_op(x, y)
1187 if np.prod(xrav.shape) and np.prod(yrav.shape):
1188 with np.errstate(all='ignore'):
-> 1189 result[mask] = op(xrav, yrav)
1190 elif hasattr(x, 'size'):
1191 result = np.empty(x.size, dtype=x.dtype)
TypeError: unsupported operand type(s) for /: 'str' and 'str'
How should I use the pct_change method? Should I remove the non-numeric column (in this case the date column), re-run pct_change, and then re-combine the date column?
The first column of dates is read in as strings. df.pct_change(1) raises a TypeError when it tries to perform division on these strings.
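A quick way to confirm this is to check which columns pandas treats as numeric. The sketch below is only illustrative: it reads an inline copy of data.txt through io.StringIO (an assumption made so the snippet runs on its own), and the names csv_text and numeric_flags are hypothetical.

```python
import io
import pandas as pd

# Inline copy of data.txt so the sketch is self-contained.
csv_text = """,AAPL,MSFT,^GSPC
2000-01-03,3.625643,39.33463,1455.219971
2000-01-04,3.319964,38.0059,1399.420044
2000-01-05,3.3685480000000005,38.406628000000005,1402.109985
2000-01-06,3.077039,37.12008,1403.449951
"""

df = pd.read_csv(io.StringIO(csv_text))

# Map each column name to whether pandas considers it numeric.
numeric_flags = {col: pd.api.types.is_numeric_dtype(df[col]) for col in df.columns}
print(numeric_flags)
```

The first column (the one with the empty header) should be the only one flagged as non-numeric, which is why the division inside pct_change fails.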
One way to avoid the error is to make the dates the index when parsing the CSV:
import pandas as pd
df = pd.read_csv('./data.txt', index_col=[0])
print(df.pct_change(1))
yields
                AAPL      MSFT     ^GSPC
2000-01-03       NaN       NaN       NaN
2000-01-04 -0.084310 -0.033780 -0.038345
2000-01-05  0.014634  0.010544  0.001922
2000-01-06 -0.086538 -0.033498  0.000956
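The approach proposed in the question (set the non-numeric column aside, compute the change, then recombine) can also work if you prefer to keep the dates as a regular column. This is a minimal sketch, assuming the same data as data.txt supplied inline; csv_text, numeric, changes, and result are illustrative names, not part of the original code.

```python
import io
import pandas as pd

# Inline copy of data.txt so the sketch is self-contained.
csv_text = """,AAPL,MSFT,^GSPC
2000-01-03,3.625643,39.33463,1455.219971
2000-01-04,3.319964,38.0059,1399.420044
2000-01-05,3.3685480000000005,38.406628000000005,1402.109985
2000-01-06,3.077039,37.12008,1403.449951
"""

df = pd.read_csv(io.StringIO(csv_text))

# Keep only the numeric columns and compute the row-over-row change.
numeric = df.select_dtypes(include="number")
changes = numeric.pct_change(1)

# Put the untouched non-numeric column(s) back in front of the results.
result = pd.concat([df.drop(columns=numeric.columns), changes], axis=1)
print(result)
```

Setting the dates as the index is usually simpler, since the index is never part of the arithmetic and stays attached to the result automatically.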
You might also want to parse the date strings as dates:
df = pd.read_csv('./data.txt', index_col=[0], parse_dates=[0])
Then the index will be a DatetimeIndex instead of a plain Index (of strings). This will allow you to do datetime arithmetic on the index and interpolate values based on time.
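As a rough illustration of both points, the sketch below parses the dates, subtracts two index entries, and fills an artificially created gap with time-based interpolation. The inline csv_text, the choice to treat 2000-01-05 as missing, and the names span, days, gappy, and filled are all assumptions made for the example.

```python
import io
import pandas as pd

# Inline copy of data.txt so the sketch is self-contained.
csv_text = """,AAPL,MSFT,^GSPC
2000-01-03,3.625643,39.33463,1455.219971
2000-01-04,3.319964,38.0059,1399.420044
2000-01-05,3.3685480000000005,38.406628000000005,1402.109985
2000-01-06,3.077039,37.12008,1403.449951
"""

df = pd.read_csv(io.StringIO(csv_text), index_col=[0], parse_dates=[0])

# Datetime arithmetic: subtracting two index entries gives a Timedelta.
span = df.index[-1] - df.index[0]
print(span)

# Pretend 2000-01-05 is missing, then restore it as an empty (NaN) row.
days = pd.date_range(df.index[0], df.index[-1], freq="D")
gappy = df.drop(pd.Timestamp("2000-01-05")).reindex(days)

# Fill the gap, weighting by the elapsed time between known rows.
filled = gappy.interpolate(method="time")
print(filled)
```

Neither operation is available on a plain Index of strings: subtraction of two strings is undefined, and interpolate(method="time") needs a DatetimeIndex to measure the spacing between rows.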