pandas: from a two-column DataFrame to a (time series) multi-column DataFrame
Suppose we have a DataFrame that looks like this:
import pandas as pd

df = pd.DataFrame(columns=['A', 'B','C'])
df.loc[0]=[1,2,3]
df.loc[1]=[4,5,6]
df.loc[2]=[7,8,9]
df.loc[3]=[10,11,12]
df.loc[4]=[13,14,15]
df.loc[5]=[16,17,18]
df.loc[6]=[19,20,21]
df
    A   B   C
0   1   2   3
1   4   5   6
2   7   8   9
3  10  11  12
4  13  14  15
5  16  17  18
6  19  20  21
I want to transform df into df2:
df2 = pd.DataFrame(columns=['first', 'second','third','fourth','fifth','sixth'])
df2.loc[0]=[1,2,4,5,7,8]
df2.loc[1]=[4,5,7,8,10,11]
df2.loc[2]=[7,8,10,11,13,14]
df2.loc[3]=[10,11,13,14,16,17]
df2.loc[4]=[13,14,16,17,19,20]
df2
   first  second  third  fourth  fifth  sixth
0      1       2      4       5      7      8
1      4       5      7       8     10     11
2      7       8     10      11     13     14
3     10      11     13      14     16     17
4     13      14     16      17     19     20
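That is, I want to fill the first row of df2 with the first three rows of the first two columns of df. The second row of df2 is then filled with the next three rows of those two columns, where the window moves down by one row each time (rows 1 to 3, then rows 2 to 4), and so on.
How do I get from df to df2? I can manage some basic, simple operations, but this one is still too hard for me.
Can anyone help me?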
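Use strides: convert the first two columns to a 1-D array with ravel, build rolling windows of length 6 over it, and select every second window by indexing with [::2], so that each window starts at the beginning of a row pair.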
import numpy as np

def rolling_window(a, window):
    # View of `a` with one extra trailing axis holding each length-`window` slice
    shape = a.shape[:-1] + (a.shape[-1] - window + 1, window)
    strides = a.strides + (a.strides[-1],)
    return np.lib.stride_tricks.as_strided(a, shape=shape, strides=strides)
a = rolling_window(df[['A','B']].to_numpy().ravel(), 6)[::2]
print (a)
[[1 2 4 5 7 8]
[4 5 7 8 10 11]
[7 8 10 11 13 14]
[10 11 13 14 16 17]
[13 14 16 17 19 20]]
df2 = pd.DataFrame(a, columns=['first', 'second','third','fourth','fifth','sixth'])
print (df2)
   first  second  third  fourth  fifth  sixth
0      1       2      4       5      7      8
1      4       5      7       8     10     11
2      7       8     10      11     13     14
3     10      11     13      14     16     17
4     13      14     16      17     19     20
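The same windows can also be built without hand-written stride arithmetic. The following is only a sketch, assuming NumPy 1.20 or newer (where `numpy.lib.stride_tricks.sliding_window_view` is available) and assuming df is built with integer columns rather than row by row as in the question:

```python
import numpy as np
import pandas as pd
from numpy.lib.stride_tricks import sliding_window_view

# Assumption: same data as in the question, built in one call with integer dtype
df = pd.DataFrame({'A': [1, 4, 7, 10, 13, 16, 19],
                   'B': [2, 5, 8, 11, 14, 17, 20],
                   'C': [3, 6, 9, 12, 15, 18, 21]})

# Flatten columns A and B row by row: 1, 2, 4, 5, 7, 8, ...
flat = df[['A', 'B']].to_numpy().ravel()

# Every length-6 window over the flat array; keep the ones that start on an 'A' value
windows = sliding_window_view(flat, 6)[::2]

# Copy so the DataFrame does not wrap a read-only view
df2 = pd.DataFrame(windows.copy(),
                   columns=['first', 'second', 'third', 'fourth', 'fifth', 'sixth'])
print(df2)
```

Unlike `as_strided`, `sliding_window_view` computes the shape and strides itself and returns a read-only view, which makes it harder to read past the end of the buffer by accident.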
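Using NumPy:
import numpy as np
# Flatten the first two columns row by row into a 1-D array
new = df.values[:, :2].reshape(-1)
# Each output row is 6 consecutive values, starting at every second position
l = [new[2*i:2*i+6] for i in range(new.shape[0] // 2 - 2)]
l = np.array(l)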
df2 = pd.DataFrame(l, columns=['first', 'second','third','fourth','fifth','sixth'])
print(df2)
'''
Output:
   first  second  third  fourth  fifth  sixth
0      1       2      4       5      7      8
1      4       5      7       8     10     11
2      7       8     10      11     13     14
3     10      11     13      14     16     17
4     13      14     16      17     19     20
'''
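The hard-coded 2 and 6 in the slicing correspond to the number of columns and to three rows of two columns. A possible generalisation is sketched below; `stack_windows` is a hypothetical helper name, not part of pandas or NumPy, and df is assumed to be built with integer columns:

```python
import numpy as np
import pandas as pd

def stack_windows(frame, cols, window):
    """Hypothetical helper: one output row per run of `window` consecutive rows."""
    values = frame[cols].to_numpy()
    n_out = len(values) - window + 1
    if n_out <= 0:
        # Not enough rows for a single window
        return np.empty((0, window * len(cols)), dtype=values.dtype)
    # Row i of idx holds the source row numbers i, i+1, ..., i+window-1
    idx = np.arange(n_out)[:, None] + np.arange(window)
    # values[idx] has shape (n_out, window, len(cols)); flatten each window
    return values[idx].reshape(n_out, -1)

# Assumption: same data as in the question, built in one call
df = pd.DataFrame({'A': [1, 4, 7, 10, 13, 16, 19],
                   'B': [2, 5, 8, 11, 14, 17, 20],
                   'C': [3, 6, 9, 12, 15, 18, 21]})

result = stack_windows(df, ['A', 'B'], 3)
df2 = pd.DataFrame(result,
                   columns=['first', 'second', 'third', 'fourth', 'fifth', 'sixth'])
print(df2)
```

Changing `window` or `cols` changes the width of the output, so the column names passed to `pd.DataFrame` would need to be adjusted to match.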
A simpler solution may be to drop column "C" and then join the lists of three consecutive rows to form each row of df2.
The code is as follows:
df.drop(['C'] ,axis = 1 , inplace = True)
df2 = pd.DataFrame(columns=['first', 'second','third','fourth','fifth','sixth'])
# Each df2 row is rows i, i+1 and i+2 of df laid end to end
for i in range(0, len(df.A) - 2):
    df2.loc[i] = list(df.loc[i]) + list(df.loc[i+1]) + list(df.loc[i+2])
print(df2)
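The row-by-row loop can also be expressed with pandas operations only. This is a sketch of a different technique (shifting and concatenating), not the method of the answer above, and it assumes df is built with integer columns:

```python
import pandas as pd

# Assumption: same data as in the question, built in one call
df = pd.DataFrame({'A': [1, 4, 7, 10, 13, 16, 19],
                   'B': [2, 5, 8, 11, 14, 17, 20],
                   'C': [3, 6, 9, 12, 15, 18, 21]})

pairs = df[['A', 'B']]

# shift(-k) moves row i+k up into position i; suffixes keep column names unique
shifted = [pairs.shift(-k).add_suffix(f'_{k}') for k in range(3)]

# Side by side, row i now holds rows i, i+1, i+2 of the original pairs
wide = pd.concat(shifted, axis=1)

# The last two rows have no complete window, so they contain NaN and are dropped;
# shifting turned the integers into floats, so cast back
df2 = wide.dropna().astype(int)
df2.columns = ['first', 'second', 'third', 'fourth', 'fifth', 'sixth']
print(df2)
```

This leaves the original df untouched, whereas `drop(..., inplace=True)` removes column "C" from df itself.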