Loop over each column, match the value, then create another dataframe
I have a dataset as follows:
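import pandas as pd

# One row per day from 2016-03-10 to 2019-03-10 inclusive (1096 rows), all zeros to start
a = pd.DataFrame({'time': pd.date_range(start='2016-03-10', end='2019-03-10'),
                  'a': [0 for _ in range(1096)],
                  'b': [0 for _ in range(1096)]})
indices_a = [0, 1, 3, 6, 10, 15, 20, 40, 50, 70, 100, 400, 700]
indices_b = [0, 1, 3, 6, 10, 15, 20, 40, 50, 70, 100, 400, 700]
# Flag the chosen rows with 1 in each column
a.loc[indices_a, 'a'] = 1
a.loc[indices_b, 'b'] = 1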
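The above creates a dataframe in which columns a and b are 0 everywhere except at the listed indices, where they are 1.
What I want to do is use pandas library functions to go through each column, find the rows where the value is 1, and then create another dataframe like the example below: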
time | category
2018-03-10 | a
2018-02-10 | a
2018-04-10 | a
2018-05-10 | a
2018-06-10 | b
2018-07-10 | b
2018-08-10 | b
2018-09-10 | b
2018-10-10 | b
My attempt:
output = pd.DataFrame()
for col in a.columns[1:]:
    temp = pd.DataFrame({'category': [col for _ in range(len(a[a[col] == 1]))],
                         'time': a[a[col] == 1]['time'].values})
    # DataFrame.append was removed in pandas 2.0; pd.concat does the same job here
    output = pd.concat([output, temp], ignore_index=True)
Although my attempt produces the correct output, it is not the idiomatic pandas way of doing things. I would like to learn the pandas way of handling dataframes, so please use pandas functions.
IIUC, you need melt and .query:
b = a.melt(id_vars='time', var_name='category').query('value == 1')\
     .drop('value', axis=1)
print(b)
time category
0 2016-03-10 a
1 2016-03-11 a
3 2016-03-13 a
6 2016-03-16 a
10 2016-03-20 a
15 2016-03-25 a
20 2016-03-30 a
40 2016-04-19 a
50 2016-04-29 a
70 2016-05-19 a
100 2016-06-18 a
400 2017-04-14 a
700 2018-02-08 a
1096 2016-03-10 b
1097 2016-03-11 b
1099 2016-03-13 b
1102 2016-03-16 b
1106 2016-03-20 b
1111 2016-03-25 b
1116 2016-03-30 b
1136 2016-04-19 b
1146 2016-04-29 b
1166 2016-05-19 b
1196 2016-06-18 b
1496 2017-04-14 b
1796 2018-02-08 b
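To make the chain above easier to follow, here is a minimal sketch that runs the same three steps one at a time. The 5-row frame `small` and the names `long_form`, `matches` and `result` are made up for illustration and are not part of the original question. It also uses a boolean mask instead of `.query` and adds `reset_index(drop=True)`, which is an optional extra for when you want the index renumbered from 0 rather than carrying over the melted positions (such as 1096, 1097, ... in the output above).

```python
import pandas as pd

# Hypothetical 5-row frame, small enough to print at every step
small = pd.DataFrame({
    'time': pd.date_range(start='2016-03-10', periods=5),
    'a': [1, 0, 1, 0, 0],
    'b': [0, 0, 1, 1, 0],
})

# Step 1: melt stacks columns a and b into long form.
# The old column names go into 'category', the cell values into 'value'.
long_form = small.melt(id_vars='time', var_name='category')

# Step 2: keep only the rows whose value is 1
# (same filter as .query('value == 1'), written as a boolean mask)
matches = long_form[long_form['value'] == 1]

# Step 3: drop the helper column and renumber the index from 0
result = matches.drop(columns='value').reset_index(drop=True)

print(long_form)
print(result)
```

If the original row positions matter to you, leave out the `reset_index` call; the filtered index then still points back into the melted frame.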