Create ID column in a pandas dataframe
I have a dataframe containing a trading log. My problem is that I do not have any ID to match the buy and sell of a stock. A stock can be traded many times, and I would like an ID that matches each completed trade. My original dataframe is a sequential time series with timestamps. The example below illustrates the problem: I need to match traded stocks and assign them IDs in sequential order. A very simplified example:
df1 = pd.DataFrame({'stock': ['A', 'B', 'C', 'A','C', 'A', 'A'],
'deal': ['buy', 'buy', 'buy', 'sell','sell', 'buy', 'sell']})
df1
Out[84]:
stock deal
0 A buy
1 B buy
2 C buy
3 A sell
4 C sell
5 A buy
6 A sell
Here is my desired output:
df1 = pd.DataFrame({'stock': ['A', 'B', 'C', 'A','C', 'A', 'A'],
'deal': ['buy', 'buy', 'buy', 'sell','sell', 'buy', 'sell'],
'ID': [1, 2, 3, 1,3, 4, 4]})
df1
Out[82]:
stock deal ID
0 A buy 1
1 B buy 2
2 C buy 3
3 A sell 1
4 C sell 3
5 A buy 4
6 A sell 4
Any ideas?
Try this:
m = df1['deal'] == 'buy'
df1['ID'] = m.cumsum().where(m)
df1['ID'] = df1.groupby('stock')['ID'].ffill()
df1
Output:
stock deal ID
0 A buy 1.0
1 B buy 2.0
2 C buy 3.0
3 A sell 1.0
4 C sell 3.0
5 A buy 4.0
6 A sell 4.0
Details:
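As a rough step-by-step sketch of what the three lines above do (the intermediate names `buy_id` and `filled`, and the final cast to the nullable `Int64` dtype, are my additions for illustration and not part of the original answer):

```python
import pandas as pd

df1 = pd.DataFrame({
    "stock": ["A", "B", "C", "A", "C", "A", "A"],
    "deal": ["buy", "buy", "buy", "sell", "sell", "buy", "sell"],
})

# Step 1: boolean mask that is True on every buy row.
m = df1["deal"] == "buy"

# Step 2: running count of buys; where(m) keeps it on buy rows
# and leaves NaN on sell rows.
buy_id = m.cumsum().where(m)

# Step 3: within each stock, forward-fill so that a sell inherits
# the ID of the most recent buy of that stock.
filled = buy_id.groupby(df1["stock"]).ffill()

# Optional (assumption): cast to a nullable integer dtype so the IDs
# print as 1, 2, 3 instead of 1.0, 2.0, 3.0.
df1["ID"] = filled.astype("Int64")
print(df1)
```

Because a sell takes the ID of the most recent buy, this approach assumes each stock is sold before it is bought again. If the same stock is bought twice before any sell (as in the "Another Example" input further down), both sells of A would receive ID 4 rather than 1 and 4.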
Try this:
import pandas as pd

df1 = pd.DataFrame({'stock': ['A', 'B', 'C', 'A', 'C', 'A', 'A'],
                    'deal': ['buy', 'buy', 'buy', 'sell', 'sell', 'buy', 'sell']})

def sequential_buy_sell_id_generator(df1):
    column_length = len(df1["stock"])
    found = [0] * column_length  # 1 once a row has been assigned an ID
    ids = [0] * column_length    # rows that are never matched keep ID 0
    counter = 0
    for row_pointer_head in range(column_length):
        if df1["deal"][row_pointer_head] == "buy":
            # Every buy opens a new trade with the next ID.
            counter += 1
            found[row_pointer_head] = 1
            ids[row_pointer_head] = counter
            # Give the same ID to the first later, still unmatched sell of this stock.
            for row_pointer_tail in range(row_pointer_head + 1, column_length):
                if (df1["stock"][row_pointer_head] == df1["stock"][row_pointer_tail]
                        and df1["deal"][row_pointer_tail] == "sell"
                        and found[row_pointer_tail] == 0):
                    found[row_pointer_tail] = 1
                    ids[row_pointer_tail] = counter
                    break
    return df1.assign(id=ids)

print(sequential_buy_sell_id_generator(df1))
Output:
stock deal id
0 A buy 1
1 B buy 2
2 C buy 3
3 A sell 1
4 C sell 3
5 A buy 4
6 A sell 4
Another example, where stock A is bought twice before it is sold:
df1 = pd.DataFrame({'stock': ['A', 'B', 'C', 'A','C', 'A', 'A'],
'deal': ['buy', 'buy', 'buy', 'buy','sell', 'sell', 'sell']})
Output:
stock deal ID
0 A buy 1
1 B buy 2
2 C buy 3
3 A buy 4
4 C sell 3
5 A sell 1
6 A sell 4
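The nested loop above rescans the rest of the frame for every buy, which can get slow on a long trading log. A possible single-pass variant of the same first-in, first-out matching keeps a queue of open buy IDs per stock. This is only a sketch: the function name `fifo_trade_ids` and its `stock_col` / `deal_col` parameters are hypothetical, and unlike the loop above it marks a sell that has no open buy as missing (`pd.NA`) instead of 0.

```python
from collections import defaultdict, deque

import pandas as pd


def fifo_trade_ids(df, stock_col="stock", deal_col="deal"):
    """Give each buy a new ID and each sell the ID of the oldest
    still-open buy of the same stock (FIFO matching)."""
    open_buys = defaultdict(deque)  # stock -> IDs of buys not yet sold
    ids = []
    next_id = 0
    for stock, deal in zip(df[stock_col], df[deal_col]):
        if deal == "buy":
            next_id += 1
            open_buys[stock].append(next_id)
            ids.append(next_id)
        elif open_buys[stock]:
            # Close the oldest open buy of this stock.
            ids.append(open_buys[stock].popleft())
        else:
            # Assumption: a sell without a preceding open buy gets no ID.
            ids.append(pd.NA)
    return df.assign(ID=pd.array(ids, dtype="Int64"))


example = pd.DataFrame({
    "stock": ["A", "B", "C", "A", "C", "A", "A"],
    "deal": ["buy", "buy", "buy", "buy", "sell", "sell", "sell"],
})
print(fifo_trade_ids(example))
```

On both example inputs in this thread it should assign the same IDs as the loop version, while visiting each row only once.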