pandas: convert the unique values of a column into column names and place all related server names under them
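I have the code below, and I would like the server names to be placed under ENC1001 and ENC1002 according to the enclosure they belong to. These are just two enclosures, ENC1001 and ENC1002, but I have hundreds of them.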
#!/usr/bin/python3
import pandas as pd
df = pd.read_excel("C7000_Report_Servers_Report.xlsx", sheet_name=0, usecols=[0, 1, 2, 3])
df = df[df['Enclosure Hardware'].str.contains('C7000')]
print(df)
Server Enclosure Hardware Bay Server Name
0 ENC1001 C7000 bay 1 dpcfl1001.example.com
1 ENC1001 C7000 bay 2 dpcfl1002.example.com
2 ENC1001 C7000 bay 3 dpcfl1003.example.com
3 ENC1001 C7000 bay 4 dpcfl1004.example.com
4 ENC1001 C7000 bay 5 dpcfl1005.example.com
5 ENC1001 C7000 bay 6 dpcfl1006.example.com
6 ENC1001 C7000 bay 7 dpcfl1007.example.com
7 ENC1001 C7000 bay 8 dpcfl1008.example.com
8 ENC1001 C7000 bay 9 dpcfl1009.example.com
9 ENC1001 C7000 bay 10 dpcfl1010.example.com
10 ENC1001 C7000 bay 11 dpcfl1011.example.com
11 ENC1001 C7000 bay 12 inc1001
12 ENC1001 C7000 bay 13 inc1003
13 ENC1001 C7000 bay 14 dpcfl2313.example.com
14 ENC1001 C7000 bay 15 lic1002
15 ENC1002 C7000 bay 1 dpcfl1012.example.com
16 ENC1002 C7000 bay 2 dpcfl1013.example.com
17 ENC1002 C7000 bay 3 dpcfl1014.example.com
18 ENC1002 C7000 bay 4 dpcfl1015.example.com
19 ENC1002 C7000 bay 5 dpcfl1016.example.com
20 ENC1002 C7000 bay 6 dpcfl1017.example.com
21 ENC1002 C7000 bay 7 dpcfl1018.example.com
Desired output:
ENC1001 ENC1002
dpcfl1001.example.com dpcfl1012.example.com
dpcfl1002.example.com dpcfl1013.example.com
dpcfl1003.example.com dpcfl1014.example.com
dpcfl1004.example.com dpcfl1015.example.com
dpcfl1005.example.com dpcfl1016.example.com
dpcfl1006.example.com dpcfl1017.example.com
dpcfl1007.example.com dpcfl1018.example.com
dpcfl1008.example.com None
dpcfl1009.example.com None
dpcfl1010.example.com None
dpcfl1011.example.com None
inc1001 None
inc1003 None
dpcfl2313.example.com None
lic1002 None
Thanks for your help.
print(df.head(9))
Server Enclosure Hardware Bay Server Name
0 ENC1001 C7000 bay1 dpcfl1001.example.com
1 ENC1001 C7000 bay2 dpcfl1002.example.com
2 ENC1001 C7000 bay3 dpcfl1003.example.com
3 ENC1001 C7000 bay4 dpcfl1004.example.com
4 ENC1001 C7000 bay5 dpcfl1005.example.com
5 ENC1001 C7000 bay6 dpcfl1006.example.com
6 ENC1001 C7000 bay7 dpcfl1007.example.com
7 ENC1001 C7000 bay8 dpcfl1008.example.com
8 ENC1001 C7000 bay9 dpcfl1009.example.com
df2 = (
    df.groupby(['Server', 'Enclosure Hardware', 'Bay'])['Server Name']
      .apply(lambda x: pd.Series(x.tolist()))
      .unstack('Server')
      .fillna('None')
      .reset_index()
      .drop(columns=['level_2'])
)
Server Enclosure Hardware Bay ENC1001 ENC1002
0 C7000 bay1 dpcfl1001.example.com dpcfl1012.example.com
1 C7000 bay10 dpcfl1010.example.com None
2 C7000 bay11 dpcfl1011.example.com None
3 C7000 bay12 inc1001 None
4 C7000 bay13 inc1003 None
5 C7000 bay14 dpcfl2313.example.com None
6 C7000 bay15 lic1002 None
7 C7000 bay2 dpcfl1002.example.com dpcfl1013.example.com
8 C7000 bay3 dpcfl1003.example.com dpcfl1014.example.com
9 C7000 bay4 dpcfl1004.example.com dpcfl1015.example.com
10 C7000 bay5 dpcfl1005.example.com dpcfl1016.example.com
11 C7000 bay6 dpcfl1006.example.com dpcfl1017.example.com
12 C7000 bay7 dpcfl1007.example.com dpcfl1018.example.com
13 C7000 bay8 dpcfl1008.example.com None
14 C7000 bay9 dpcfl1009.example.com None
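In a DataFrame, the values in different columns of the same row are related to one another. So I don't think a DataFrame is the best way to get what you want.
I suggest using a dictionary in which the key is the Server and the value is its Server Name entries, as follows: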
from collections import defaultdict

dd = defaultdict(list)
# Store the Server Name values of each enclosure under its Server key
for name, group in df.groupby("Server"):
    dd[name] = group["Server Name"].values
Now you can print all the Server Name values of a given Server like this:
>>> dd["ENC1001"]
['dpcfl1001.example.com' 'dpcfl1002.example.com' 'dpcfl1003.example.com'
'dpcfl1004.example.com' 'dpcfl1005.example.com' 'dpcfl1006.example.com'
'dpcfl1007.example.com' 'dpcfl1008.example.com' 'dpcfl1009.example.com'
'dpcfl1010.example.com' 'dpcfl1011.example.com' 'inc1001' 'inc1003'
'dpcfl2313.example.com' 'lic1002']
>>> dd["ENC1002"]
['dpcfl1012.example.com' 'dpcfl1013.example.com' 'dpcfl1014.example.com'
'dpcfl1015.example.com' 'dpcfl1016.example.com' 'dpcfl1017.example.com'
'dpcfl1018.example.com']
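The same mapping can probably be built without an explicit loop. The following is a minimal sketch, not part of the original answer: it assumes a small stand-in DataFrame with the `Server` and `Server Name` columns from the question, and the name `mapping` is hypothetical. It produces plain Python lists rather than NumPy arrays.

```python
import pandas as pd

# Stand-in for the filtered report; values copied from the sample data above.
df = pd.DataFrame({
    "Server": ["ENC1001", "ENC1001", "ENC1002", "ENC1002"],
    "Server Name": [
        "dpcfl1001.example.com",
        "dpcfl1002.example.com",
        "dpcfl1012.example.com",
        "dpcfl1013.example.com",
    ],
})

# Aggregate each enclosure's names into a list, then convert to a plain dict.
mapping = df.groupby("Server")["Server Name"].agg(list).to_dict()

print(mapping["ENC1002"])
```

Because the values are ordinary lists, they can be extended or serialized (for example to JSON) without converting from NumPy arrays first.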
If you want to convert it into a pandas.DataFrame, you can simply run:
>>> new_df = pd.DataFrame.from_dict(dd, orient='index').T
>>> new_df
ENC1001 ENC1002
0 dpcfl1001.example.com dpcfl1012.example.com
1 dpcfl1002.example.com dpcfl1013.example.com
2 dpcfl1003.example.com dpcfl1014.example.com
3 dpcfl1004.example.com dpcfl1015.example.com
4 dpcfl1005.example.com dpcfl1016.example.com
5 dpcfl1006.example.com dpcfl1017.example.com
6 dpcfl1007.example.com dpcfl1018.example.com
7 dpcfl1008.example.com None
8 dpcfl1009.example.com None
9 dpcfl1010.example.com None
10 dpcfl1011.example.com None
11 inc1001 None
12 inc1003 None
13 dpcfl2313.example.com None
14 lic1002 None
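A similar table can also be produced without the intermediate dictionary. This sketch swaps in a different technique from the one in the answer, `groupby(...).cumcount()` followed by `pivot`, and runs on a small stand-in DataFrame. The names `position` and `wide` are hypothetical. Unlike the output above, missing cells come out as NaN rather than None.

```python
import pandas as pd

# Stand-in for the filtered report; values copied from the sample data above.
df = pd.DataFrame({
    "Server": ["ENC1001", "ENC1001", "ENC1001", "ENC1002", "ENC1002"],
    "Server Name": [
        "dpcfl1001.example.com",
        "dpcfl1002.example.com",
        "dpcfl1003.example.com",
        "dpcfl1012.example.com",
        "dpcfl1013.example.com",
    ],
})

# Number the rows inside each enclosure: 0, 1, 2, ... restarting per Server.
position = df.groupby("Server").cumcount()

# Use that position as the row label and the enclosure as the column label.
wide = (
    df.assign(position=position)
      .pivot(index="position", columns="Server", values="Server Name")
      .rename_axis(index=None, columns=None)
)

print(wide)
```

Since every enclosure becomes its own column, this should scale to hundreds of enclosures without code changes. Shorter enclosures are simply padded with NaN at the bottom.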
You can do it like this:
pd.concat(
    [g.set_index('Bay').add_suffix(f'_{n}') for n, g in df.groupby('Server')],
    axis=1, sort=False,
).filter(like='Server Name')
Output:
Server Name_ENC1001 Server Name_ENC1002
bay 1 dpcfl1001.example.com dpcfl1012.example.com
bay 2 dpcfl1002.example.com dpcfl1013.example.com
bay 3 dpcfl1003.example.com dpcfl1014.example.com
bay 4 dpcfl1004.example.com dpcfl1015.example.com
bay 5 dpcfl1005.example.com dpcfl1016.example.com
bay 6 dpcfl1006.example.com dpcfl1017.example.com
bay 7 dpcfl1007.example.com dpcfl1018.example.com
bay 8 dpcfl1008.example.com NaN
bay 9 dpcfl1009.example.com NaN
bay 10 dpcfl1010.example.com NaN
bay 11 dpcfl1011.example.com NaN
bay 12 inc1001 NaN
bay 13 inc1003 NaN
bay 14 dpcfl2313.example.com NaN
bay 15 lic1002 NaN
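If the rows should stay aligned by bay, a plain `pivot` on `Bay` is another option. This is a sketch under stated assumptions, not the answerer's method: it uses a small stand-in DataFrame and assumes each (`Bay`, `Server`) pair occurs at most once, since `pivot` raises an error on duplicates. Because `pivot` sorts the bay labels as strings (so "bay 10" would land before "bay 2"), the sketch reindexes to the order in which the bays first appear. The names `bay_order` and `wide` are hypothetical.

```python
import pandas as pd

# Stand-in for the filtered report; values copied from the sample data above.
df = pd.DataFrame({
    "Server": ["ENC1001", "ENC1001", "ENC1001", "ENC1002", "ENC1002"],
    "Bay": ["bay 1", "bay 2", "bay 10", "bay 1", "bay 2"],
    "Server Name": [
        "dpcfl1001.example.com",
        "dpcfl1002.example.com",
        "dpcfl1010.example.com",
        "dpcfl1012.example.com",
        "dpcfl1013.example.com",
    ],
})

# One row per bay, one column per enclosure.
wide = df.pivot(index="Bay", columns="Server", values="Server Name")

# pivot sorts the index lexicographically; restore first-seen bay order.
bay_order = df["Bay"].drop_duplicates().tolist()
wide = wide.reindex(bay_order)

print(wide)
```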