Pandas: Create a matrix of price differences?
I am trying to compute the price differences of each coin across exchanges. For example, I have this dataframe:
    Exchange  coin  lastUpdate           price    volume
0   Bitfinex  BTC   2019-06-23 06:23:27  10646    24299.4
1   Bitfinex  ETH   2019-06-23 06:23:13  308.47   225945
2   Bitfinex  LTC   2019-06-23 06:23:18  140.41   215698
3   Bitstamp  BTC   2019-06-23 06:23:21  10546.4  9620.04
4   Bitstamp  ETH   2019-06-23 06:22:48  305.15   46062.6
5   Bitstamp  LTC   2019-06-23 06:22:46  139.22   85160.5
6   CCCAGG    BTC   2019-06-23 06:23:23  10580.4  79049.8
7   CCCAGG    ETH   2019-06-23 06:23:20  306.74   681056
8   CCCAGG    LTC   2019-06-23 06:23:24  139.71   752875
9   Coinbase  BTC   2019-06-23 06:23:17  10557.5  23731.2
10  Coinbase  ETH   2019-06-23 06:23:11  306.09   247213
11  Coinbase  LTC   2019-06-23 06:23:13  139.49   381421
I am trying to get all of the price differences for each coin between all the exchanges it trades on.
I want it to look like this:
price_combos diff
Price Diff: BTC - Bitfinex-Bitstamp 14.06
Price Diff: BTC - Bitfinex-CCCAGG 14.32
Price Diff: BTC - Bitstamp-CCCAGG 0.26
Price Diff: BTC - Coinbase-Bitfinex -17.99
Price Diff: BTC - Coinbase-Bitstamp -3.93
Price Diff: BTC - Coinbase-CCCAGG -3.67
And then repeat for each coin.
Edit: Added price to the combinations. Note that the diff is from a different set of data, so it won't match the actual diff from the first dataframe.
We can approach this problem as follows:
1. Perform an outer merge of the dataframe with itself on coin, so it gives us all the combinations back.
2. Filter with ne (not equal) to drop the rows where the exchange is the same (we don't want to compare those).
3. Create the Price Diff column by subtracting the prices.
# Step 1 outer merge
df2 = df[['Exchange', 'coin', 'price']].merge(df[['Exchange', 'coin', 'price']],
                                              on='coin',
                                              how='outer',
                                              suffixes=('', '_2'))
# Step 2 filter out same exchange
df2 = df2[df2['Exchange'].ne(df2['Exchange_2'])]
# Step 3 create Price Diff column (subtract, do not chain-assign)
df2['Price Diff'] = df2['price'] - df2['price_2']

    Exchange coin     price Exchange_2   price_2  Price Diff
1   Bitfinex  BTC  10646.00   Bitstamp  10546.40       99.60
2   Bitfinex  BTC  10646.00     CCCAGG  10580.40       65.60
3   Bitfinex  BTC  10646.00   Coinbase  10557.50       88.50
4   Bitstamp  BTC  10546.40   Bitfinex  10646.00      -99.60
6   Bitstamp  BTC  10546.40     CCCAGG  10580.40      -34.00
7   Bitstamp  BTC  10546.40   Coinbase  10557.50      -11.10
8     CCCAGG  BTC  10580.40   Bitfinex  10646.00      -65.60
9     CCCAGG  BTC  10580.40   Bitstamp  10546.40       34.00
11    CCCAGG  BTC  10580.40   Coinbase  10557.50       22.90
12  Coinbase  BTC  10557.50   Bitfinex  10646.00      -88.50
13  Coinbase  BTC  10557.50   Bitstamp  10546.40       11.10
14  Coinbase  BTC  10557.50     CCCAGG  10580.40      -22.90
17  Bitfinex  ETH    308.47   Bitstamp    305.15        3.32
18  Bitfinex  ETH    308.47     CCCAGG    306.74        1.73
19  Bitfinex  ETH    308.47   Coinbase    306.09        2.38
20  Bitstamp  ETH    305.15   Bitfinex    308.47       -3.32
22  Bitstamp  ETH    305.15     CCCAGG    306.74       -1.59
23  Bitstamp  ETH    305.15   Coinbase    306.09       -0.94
24    CCCAGG  ETH    306.74   Bitfinex    308.47       -1.73
25    CCCAGG  ETH    306.74   Bitstamp    305.15        1.59
27    CCCAGG  ETH    306.74   Coinbase    306.09        0.65
28  Coinbase  ETH    306.09   Bitfinex    308.47       -2.38
29  Coinbase  ETH    306.09   Bitstamp    305.15        0.94
30  Coinbase  ETH    306.09     CCCAGG    306.74       -0.65
33  Bitfinex  LTC    140.41   Bitstamp    139.22        1.19
34  Bitfinex  LTC    140.41     CCCAGG    139.71        0.70
35  Bitfinex  LTC    140.41   Coinbase    139.49        0.92
36  Bitstamp  LTC    139.22   Bitfinex    140.41       -1.19
38  Bitstamp  LTC    139.22     CCCAGG    139.71       -0.49
39  Bitstamp  LTC    139.22   Coinbase    139.49       -0.27
40    CCCAGG  LTC    139.71   Bitfinex    140.41       -0.70
41    CCCAGG  LTC    139.71   Bitstamp    139.22        0.49
43    CCCAGG  LTC    139.71   Coinbase    139.49        0.22
44  Coinbase  LTC    139.49   Bitfinex    140.41       -0.92
45  Coinbase  LTC    139.49   Bitstamp    139.22        0.27
46  Coinbase  LTC    139.49     CCCAGG    139.71       -0.22
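Since the title asks for a matrix, here is a minimal sketch of how the long result of a self-merge on coin could be reshaped into one square table per coin. It is only an illustration under a few assumptions: the data is a trimmed copy of the question's dataframe (BTC and ETH only), the names pairs and matrices are made up for this example, and the same-exchange rows are kept so that the diagonal is zero.

```python
import pandas as pd

# Trimmed copy of the question's data (BTC and ETH only) so the sketch runs on its own.
df = pd.DataFrame(
    [
        ["Bitfinex", "BTC", 10646.0],
        ["Bitstamp", "BTC", 10546.4],
        ["CCCAGG", "BTC", 10580.4],
        ["Coinbase", "BTC", 10557.5],
        ["Bitfinex", "ETH", 308.47],
        ["Bitstamp", "ETH", 305.15],
        ["CCCAGG", "ETH", 306.74],
        ["Coinbase", "ETH", 306.09],
    ],
    columns=["Exchange", "coin", "price"],
)

# Self-merge on coin: every exchange paired with every exchange, per coin.
pairs = df.merge(df, on="coin", suffixes=("", "_2"))
pairs["Price Diff"] = pairs["price"] - pairs["price_2"]

# One square table per coin: rows are the first exchange, columns the second.
# Same-exchange pairs are kept here on purpose, so the diagonal is 0.
matrices = {
    coin: grp.pivot(index="Exchange", columns="Exchange_2", values="Price Diff").round(2)
    for coin, grp in pairs.groupby("coin")
}

print(matrices["BTC"])
```

Each matrix is antisymmetric: the entry for (Bitfinex, Bitstamp) is the negative of the entry for (Bitstamp, Bitfinex), so the upper triangle alone already holds every distinct pair.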
You should have a look at the itertools module (doc). It has a lot of nice functions for iteration.
Here you're looking for exactly the combinations function.
Once you have the combinations, the rest becomes simple:
# Import modules
import pandas as pd
import itertools as iter
# Your data
df = pd.DataFrame([
["Bitfinex", "BTC", "2019-06-23 06:23:27", 10646, 24299.4],
["Bitfinex", "ETH", "2019-06-23 06:23:13", 308.47, 225945],
["Bitfinex", "LTC", "2019-06-23 06:23:18", 140.41, 215698],
["Bitstamp", "BTC", "2019-06-23 06:23:21", 10546.4, 9620.04],
["Bitstamp", "ETH", "2019-06-23 06:22:48", 305.15, 46062.6],
["Bitstamp", "LTC", "2019-06-23 06:22:46", 139.22, 85160.5],
["CCCAGG", "BTC", "2019-06-23 06:23:23", 10580.4, 79049.8],
["CCCAGG", "ETH", "2019-06-23 06:23:20", 306.74, 681056],
["CCCAGG", "LTC", "2019-06-23 06:23:24", 139.71, 752875],
["Coinbase", "BTC", "2019-06-23 06:23:17", 10557.5, 23731.2],
["Coinbase", "ETH", "2019-06-23 06:23:11", 306.09, 247213],
["Coinbase", "LTC", "2019-06-23 06:23:13", 139.49, 381421],
], columns=["Exchange", "coin", "lastUpdate", "price", "volume"])
# Print all combinations for one coin
def print_combi(df, coin):
    # subset dataframe with matching rows
    sub_df = df[df["coin"] == coin]
    # Create all combinations of two exchanges for this coin
    list_combi = list(iter.combinations(sub_df.Exchange, 2))
    # Print the expected output
    for combi in list_combi:
        print("Price diff: {0} - {1}-{2}".format(coin, combi[0], combi[1]))

print_combi(df, 'BTC')
# Price diff: BTC - Bitfinex-Bitstamp
# Price diff: BTC - Bitfinex-CCCAGG
# Price diff: BTC - Bitfinex-Coinbase
# Price diff: BTC - Bitstamp-CCCAGG
# Price diff: BTC - Bitstamp-Coinbase
# Price diff: BTC - CCCAGG-Coinbase
EDIT1:
Return a dataframe. The diff column is computed from the data used in the snippet above.
def combo_money_df(df, coin):
    # subset the dataframe
    sub_df = df[df["coin"] == coin]
    new_data = []
    # For each pair of row labels
    for combi in iter.combinations(sub_df.index, 2):
        # Select corresponding rows
        row_1 = sub_df.loc[combi[0]]
        row_2 = sub_df.loc[combi[1]]
        # Create new row: "ExchangeA-ExchangeB" and the price difference
        new_data.append([row_1.Exchange + "-" + row_2.Exchange, row_1.price - row_2.price])
    # Return a dataframe object
    return pd.DataFrame(new_data, columns=["price_combo", "diff"])

print(combo_money_df(df, "BTC"))
# price_combo diff
# 0 Bitfinex-Bitstamp 99.6
# 1 Bitfinex-CCCAGG 65.6
# 2 Bitfinex-Coinbase 88.5
# 3 Bitstamp-CCCAGG -34.0
# 4 Bitstamp-Coinbase -11.1
# 5 CCCAGG-Coinbase 22.9
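To repeat this for every coin, as the question asks, one possible sketch groups the dataframe by coin and collects the combinations of each group into a single dataframe. The function name all_coin_diffs and the trimmed data (BTC and ETH only) are assumptions made for this example, and the differences are rounded to two decimals to hide floating-point noise.

```python
import itertools

import pandas as pd

# Trimmed copy of the question's data (BTC and ETH only) so the sketch runs on its own.
df = pd.DataFrame(
    [
        ["Bitfinex", "BTC", 10646.0],
        ["Bitstamp", "BTC", 10546.4],
        ["CCCAGG", "BTC", 10580.4],
        ["Coinbase", "BTC", 10557.5],
        ["Bitfinex", "ETH", 308.47],
        ["Bitstamp", "ETH", 305.15],
        ["CCCAGG", "ETH", 306.74],
        ["Coinbase", "ETH", 306.09],
    ],
    columns=["Exchange", "coin", "price"],
)


def all_coin_diffs(df):
    """Return one row per (coin, exchange pair) with the price difference."""
    rows = []
    # groupby yields (coin, sub-dataframe) pairs, one per distinct coin.
    for coin, group in df.groupby("coin"):
        # Each unordered pair of exchange rows appears exactly once.
        for a, b in itertools.combinations(group.itertuples(index=False), 2):
            rows.append([coin, a.Exchange + "-" + b.Exchange, round(a.price - b.price, 2)])
    return pd.DataFrame(rows, columns=["coin", "price_combo", "diff"])


all_diffs = all_coin_diffs(df)
print(all_diffs)
```

With four exchanges per coin this gives six rows per coin. The pair order follows the row order of the input, so sort the dataframe first if a specific orientation of the difference is needed.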