PYTHON: df.drop does not work in function

I have some problems with a pandas DataFrame and hope someone can help me. I downloaded some data from cryptocompare and wrote it to a CSV file. My goal is to update this CSV file on a daily basis.

After downloading the new data into a separate DataFrame, I want to merge it with the existing data. To do that, I wrote a function (read_dataset) that reads the existing data from the CSV file into a DataFrame. The next step should be to merge the new data with the existing data. I tried pd.merge and pd.concat, but neither works.

My DataFrames look like this:

           open        time  volumefrom      volumeto  Timestamp
0        0.04951  1279324800       20.00  9.902000e-01 2010-07-17
1        0.04951  1279411200       75.01  5.090000e+00 2010-07-18
2        0.08584  1279497600      574.00  4.966000e+01 2010-07-19
3        0.08080  1279584000      262.00  2.059000e+01 2010-07-20
4        0.07474  1279670400      575.00  4.226000e+01 2010-07-21
5        0.07921  1279756800     2160.00  1.297800e+02 2010-07-22
6        0.05050  1279843200     2402.50  1.410700e+02 2010-07-23
7        0.06262  1279929600      496.32  2.673000e+01 2010-07-24
8        0.05454  1280016000     1551.48  8.506000e+01 2010-07-25
9        0.05050  1280102400      877.00  4.691000e+01 2010-07-26
10       0.05600  1280188800     3373.69  1.969200e+02 2010-07-27
11       0.06000  1280275200     4390.29  2.557600e+02 2010-07-28
12       0.05890  1280361600     8058.49  5.283200e+02 2010-07-29
13       0.06990  1280448000     3020.85  1.985300e+02 2010-07-30
14       0.06270  1280534400     4022.25  2.439000e+02 2010-07-31
15       0.06785  1280620800     2601.00  1.626500e+02 2010-08-01
16       0.06110  1280707200     3599.00  2.212000e+02 2010-08-02
17       0.06000  1280793600     9821.46  6.060500e+02 2010-08-03
18       0.06000  1280880000     3494.00  2.107700e+02 2010-08-04
19       0.05700  1280966400     5034.07  3.036100e+02 2010-08-05
20       0.06100  1281052800     1395.00  8.591000e+01 2010-08-06
21       0.06230  1281139200     2619.00  1.573400e+02 2010-08-07
22       0.05900  1281225600     2201.00  1.326000e+02 2010-08-08
23       0.06090  1281312000    13631.09  8.869300e+02 2010-08-09
24       0.07100  1281398400     1310.39  8.887000e+01 2010-08-10
25       0.07000  1281484800    14061.18  1.015640e+03 2010-08-11
26       0.06700  1281571200     2062.31  1.344900e+02 2010-08-12
27       0.07000  1281657600     3591.77  2.338000e+02 2010-08-13
28       0.06450  1281744000     4404.20  2.953100e+02 2010-08-14
29       0.06700  1281830400     4462.87  2.949500e+02 2010-08-15
          ...         ...         ...           ...        ...
2791  9928.56000  1520467200   154879.22  1.492236e+09 2018-03-08
2792  9316.77000  1520553600   233598.15  2.081621e+09 2018-03-09
2793  9252.76000  1520640000   117409.38  1.084926e+09 2018-03-10
2794  8797.27000  1520726400   149877.66  1.374815e+09 2018-03-11
2795  9543.98000  1520812800   152959.80  1.435404e+09 2018-03-12
2796  9142.27000  1520899200   133768.47  1.228556e+09 2018-03-13
2797  9160.12000  1520985600   161775.05  1.385573e+09 2018-03-14
2798  8216.22000  1521072000   187365.71  1.519850e+09 2018-03-15
2799  8267.95000  1521158400   129688.11  1.082790e+09 2018-03-16
2800  8283.23000  1521244800   111641.32  9.019394e+08 2018-03-17
2801  7882.67000  1521331200   198796.34  1.535519e+09 2018-03-18
2802  8215.50000  1521417600   171829.52  1.447813e+09 2018-03-19
2803  8623.14000  1521504000   131959.66  1.150462e+09 2018-03-20
2804  8920.53000  1521590400   109985.22  9.913764e+08 2018-03-21
2805  8911.37000  1521676800   116522.98  1.023287e+09 2018-03-22
2806  8724.98000  1521763200   109649.39  9.399973e+08 2018-03-23
2807  8935.51000  1521849600    93296.24  8.276632e+08 2018-03-24
2808  8548.39000  1521936000    76775.64  6.576435e+08 2018-03-25
2809  8472.56000  1522022400   131859.97  1.079039e+09 2018-03-26
2810  8152.18000  1522108800   116523.10  9.307550e+08 2018-03-27
2811  7808.42000  1522195200    82590.62  6.577121e+08 2018-03-28
2812  7959.78000  1522281600   185805.88  1.379180e+09 2018-03-29
2813  7106.62000  1522368000   229837.79  1.584675e+09 2018-03-30
2814  6853.75000  1522454400   129526.48  9.154006e+08 2018-03-31
2815  6943.77000  1522540800   131344.01  8.898877e+08 2018-04-01
2816  6835.58000  1522627200   106513.22  7.488614e+08 2018-04-02
2817  7074.65000  1522713600   122807.02  9.053268e+08 2018-04-03
2818  7434.30000  1522800000   123910.33  8.771998e+08 2018-04-04
2819  6815.50000  1522886400   114426.84  7.771452e+08 2018-04-05
2820  6790.45000  1522972800    72568.93  4.848647e+08 2018-04-06

The existing and the new DataFrame should be merged on the key 'time', which is a Unix timestamp.

# Read the old data
df_old = read_dataset('BTC_historical_data_daily')
# Download the new data
df_new = download_historical_data('BTC', 'USD', 'CCCAGG', 'day')
# Merge the two DataFrames on 'time'
df_merged_inner = pd.merge(left=df_old, right=df_new, how='left', left_on='time', right_on='time')
# Convert Unix Timestamp into a readable format
df_merged_inner['Timestamp'] = pd.to_datetime(df_merged_inner['time'], unit='s')
# Drop the Unix Timestamp
df_merged_inner = df_merged_inner.drop('time', axis=1)
# Save the new DataFrame as a CSV file
df_merged_inner.to_csv('BTC_historical_data_daily_' + current_datetime)

This code returns a DataFrame that contains no updated data, only doubled values for each key.

pd.concat raises the following error:

d = pd.concat(df_old,df_new)
Traceback (most recent call last):
  File "/Users/audiodeep/anaconda/lib/python3.6/site-packages/IPython/core/interactiveshell.py", line 2881, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
  File "<ipython-input-50-891cefa897e1>", line 1, in <module>
    d = pd.concat(df_old,df_new)
  File "/Users/audiodeep/anaconda/lib/python3.6/site-packages/pandas/core/reshape/concat.py", line 212, in concat
    copy=copy)
  File "/Users/audiodeep/anaconda/lib/python3.6/site-packages/pandas/core/reshape/concat.py", line 227, in __init__
    '"{name}"'.format(name=type(objs).__name__))
TypeError: first argument must be an iterable of pandas objects, you passed an object of type "DataFrame"

Does anyone have a solution for me? Thanks a lot :D

pd.concat([df_old, df_new])

The error message basically says that your group of DataFrames has to be passed in an iterable object, such as a list.

As czr mentioned in a comment, pd.concat should work for your example when you supply it with a tuple (df_old, df_new). That is because it expects an iterable, for example a tuple or a list. The way you supplied df_old and df_new does not work because you passed each one as an individual positional argument, i.e. pd.concat(df_old, df_new). Either of the following should work:

d = pd.concat((df_old, df_new))

d = pd.concat([df_old, df_new])

The official documentation refers to this iterable as objs.
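To make the difference concrete, here is a minimal sketch. The two small frames are made-up stand-ins for df_old and df_new (only the 'time' and 'open' columns, with a few values taken from the table in the question), and ignore_index=True is an optional extra that is not part of the original suggestion; it simply renumbers the rows of the result.

```python
import pandas as pd

# Hypothetical miniature frames standing in for df_old and df_new
df_old = pd.DataFrame({"time": [1522800000, 1522886400], "open": [7434.30, 6815.50]})
df_new = pd.DataFrame({"time": [1522886400, 1522972800], "open": [6815.50, 6790.45]})

# Passing the frames as separate positional arguments raises a TypeError
try:
    pd.concat(df_old, df_new)
    raised = False
except TypeError:
    raised = True

# Wrapping them in a list (or a tuple) works; ignore_index=True renumbers the rows
d = pd.concat([df_old, df_new], ignore_index=True)
print(raised, len(d))
```

Note that the overlapping timestamp 1522886400 now appears twice in d, since pd.concat only stacks the rows and does not deduplicate them.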

Additionally, you might want to keep only one row for each time point that appears in multiple rows. You can do this as follows:

d = d.drop_duplicates('time')
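By default drop_duplicates keeps the first occurrence, which after the concat is the row from df_old. If the freshly downloaded values should win instead, keep="last" is one option. The sketch below assumes that preference and again uses made-up miniature frames; the deliberately different 'open' value for the overlapping day is only there to show which row survives.

```python
import pandas as pd

# Hypothetical miniature frames; the overlapping day has a stale value in df_old
df_old = pd.DataFrame({"time": [1522800000, 1522886400], "open": [7434.30, 6815.00]})
df_new = pd.DataFrame({"time": [1522886400, 1522972800], "open": [6815.50, 6790.45]})

d = pd.concat([df_old, df_new], ignore_index=True)

# keep="last" keeps the df_new row whenever a timestamp occurs in both frames
d = (
    d.drop_duplicates(subset="time", keep="last")
     .sort_values("time")
     .reset_index(drop=True)
)

# Convert the Unix timestamp into a readable date, as in the question
d["Timestamp"] = pd.to_datetime(d["time"], unit="s")
print(d)
```

Sorting by 'time' and resetting the index afterwards is optional; it just keeps the saved file in chronological order.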
