pandas DataFrame: convert 2 columns (value, value) into 2 columns (value, type)

Question

Suppose I have the following DataFrame "A":

         utilization  utilization_billable
service                                   
1               10.0                   5.0
2               30.0                  20.0
3               40.0                  30.0
4               40.0                  32.0

I need to convert it into the following DataFrame "B":

         utilization      type
service                       
1               10.0     total
2               30.0     total
3               40.0     total
4               40.0     total
1                5.0  billable
2               20.0  billable
3               30.0  billable
4               32.0  billable

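So the values from both columns end up in a single utilization column, and a new type column labels each row as either total or billable.

The two frames can be built like this: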

import pandas as pd

# DataFrame "A" (input)
data = {
    'utilization': [10.0, 30.0, 40.0, 40.0],
    'utilization_billable': [5.0, 20.0, 30.0, 32.0],
    'service': [1, 2, 3, 4]
}
df = pd.DataFrame.from_dict(data).set_index('service')
print(df)

data = {
    'utilization': [10.0, 30.0, 40.0, 40.0, 5.0, 20.0, 30.0, 32.0],
    'service': [1, 2, 3, 4, 1, 2, 3, 4],
    'type': [
        'total',
        'total',
        'total',
        'total',
        'billable',
        'billable',
        'billable',
        'billable',
    ]
}
df = pd.DataFrame.from_dict(data).set_index('service')
print(df)

Is there a way to transform the DataFrame and perform this kind of categorization?

Answer 1

You can use pd.melt:

import pandas as pd
data = {
    'utilization': [10.0, 30.0, 40.0, 40.0],
    'utilization_billable': [5.0, 20.0, 30.0, 32.0],
    'service': [1, 2, 3, 4]}

df = pd.DataFrame(data)
result = pd.melt(df, var_name='type', value_name='utilization', id_vars='service')
print(result)

Output:

   service                  type  utilization
0        1           utilization         10.0
1        2           utilization         30.0
2        3           utilization         40.0
3        4           utilization         40.0
4        1  utilization_billable          5.0
5        2  utilization_billable         20.0
6        3  utilization_billable         30.0
7        4  utilization_billable         32.0

After that, result.set_index('service') would make service the index, but I'd recommend avoiding it, because the service values are not unique.
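Note that the melt output above labels the rows utilization and utilization_billable rather than total and billable. One way to get the labels from "B" is to rename the columns before melting. This is only a sketch: the labels dict and the final column reordering are my own additions, not part of the original answer.

```python
import pandas as pd

data = {
    'utilization': [10.0, 30.0, 40.0, 40.0],
    'utilization_billable': [5.0, 20.0, 30.0, 32.0],
    'service': [1, 2, 3, 4],
}
df = pd.DataFrame(data)

# Hypothetical mapping from the original column names to the desired labels.
labels = {'utilization': 'total', 'utilization_billable': 'billable'}

# Rename first, so melt writes 'total' / 'billable' into the type column.
result = pd.melt(
    df.rename(columns=labels),
    id_vars='service',
    var_name='type',
    value_name='utilization',
)

# Reorder the columns to match "B"; service stays a regular column here.
result = result[['service', 'utilization', 'type']]
print(result)
```

Renaming before the melt keeps everything in one pass; replacing the values in the type column afterwards would presumably work just as well.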

Answer 2

This looks like a job for df.stack() combined with a couple of DataFrame.rename() calls:

df.rename(index=str, columns={"utilization": "total", "utilization_billable": "billable"})\
  .stack().reset_index(1).rename(index=str, columns={"level_1": "type", 0: "utilization"})\
  .sort_values(by='type', ascending = False)

Output:

             type  utilization
service                       
1           total         10.0
2           total         30.0
3           total         40.0
4           total         40.0
1        billable          5.0
2        billable         20.0
3        billable         30.0
4        billable         32.0
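One caveat with the sort step: as far as I know, sort_values uses quicksort by default, which is not guaranteed to be stable, so the 1, 2, 3, 4 order within each type is not strictly guaranteed on larger frames. Below is a sketch of the same stack idea with a stable sort. The rename_axis and Series.rename calls are my own substitution for the second DataFrame.rename, and the index is left as integers instead of being converted to strings.

```python
import pandas as pd

# DataFrame "A" from the question, indexed by service.
df = pd.DataFrame(
    {
        'utilization': [10.0, 30.0, 40.0, 40.0],
        'utilization_billable': [5.0, 20.0, 30.0, 32.0],
    },
    index=pd.Index([1, 2, 3, 4], name='service'),
)

# Hypothetical mapping from the original column names to the desired labels.
labels = {'utilization': 'total', 'utilization_billable': 'billable'}

stacked = (
    df.rename(columns=labels)
      .rename_axis(columns='type')   # name the column level so stack() labels it
      .stack()                       # Series indexed by (service, type)
      .rename('utilization')         # name of the resulting value column
      .reset_index('type')           # move type out of the index
)

# mergesort is stable, so rows keep their service order within each type.
out = stacked.sort_values(by='type', ascending=False, kind='mergesort')
print(out)
```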

Answer 3

After adding a suffix to the first column, this can be done with pd.wide_to_long.

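import pandas as pd

# df is DataFrame "A" from the question, indexed by service.
# Give the first column a suffix so both columns share the stub 'utilization'.
df = df.rename(columns={'utilization': 'utilization_total'})

pd.wide_to_long(df.reset_index(), stubnames='utilization', sep='_',
                i='service', j='type', suffix='.*').reset_index(1)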

Output:

             type  utilization
service                       
1           total         10.0
2           total         30.0
3           total         40.0
4           total         40.0
1        billable          5.0
2        billable         20.0
3        billable         30.0
4        billable         32.0
