Pandas - Reshape / Transform DataFrame with Multiple Columns into a Single Column of Values
I have a pandas DataFrame with years as columns and countries as row names:
Country | 1960 | 1961 | 1962 | 1963
-----------------------------------------
United States | 1000 | 2000 | 3000 | 4000
-----------------------------------------
Argentina | 1000 | 2000 | 3000 | 4000
-----------------------------------------
I would like to transform it into:
Country | Year | Value
-----------------------------
United States | 1960 | 1000
United States | 1961 | 2000
United States | 1962 | 3000
United States | 1963 | 4000
Argentina | 1960 | 1000
Argentina | 1961 | 2000
Argentina | 1962 | 3000
Argentina | 1963 | 4000
I'm not sure which split, sort, or group-by operations I need to achieve this.
Thanks!
You can use the stack method:
>>> import pandas as pd
>>> df=pd.DataFrame({"country":["United States","Argentina"],
...     1960:[1000,1000],
...     1961:[2000,2000],
...     1962:[3000,3000],
...     1963:[4000,4000]} )
>>> df
1960 1961 country 1963 1962
0 1000 2000 United States 4000 3000
1 1000 2000 Argentina 4000 3000
>>> df.set_index("country").stack()
country
United States 1960 1000
1961 2000
1963 4000
1962 3000
Argentina 1960 1000
1961 2000
1963 4000
1962 3000
dtype: int64
>>> df.set_index("country").stack().reset_index()
country level_1 0
0 United States 1960 1000
1 United States 1961 2000
2 United States 1963 4000
3 United States 1962 3000
4 Argentina 1960 1000
5 Argentina 1961 2000
6 Argentina 1963 4000
7 Argentina 1962 3000
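The `level_1` and `0` headers above are only default names. As a minimal sketch, and assuming a pandas version recent enough to support `rename_axis(columns=...)`, the same stack approach can name the columns in one chain. The names `year`, `value`, and `long_df` are illustrative choices, not part of the original answer.

```python
import pandas as pd

df = pd.DataFrame({
    "country": ["United States", "Argentina"],
    1960: [1000, 1000],
    1961: [2000, 2000],
    1962: [3000, 3000],
    1963: [4000, 4000],
})

# Name the column axis "year" so the stacked index level carries that name,
# then turn both index levels back into columns and call the values "value".
long_df = (
    df.set_index("country")
      .rename_axis(columns="year")
      .stack()
      .reset_index(name="value")
)
print(long_df)
```

On current Python and pandas versions the dictionary keeps its insertion order, so the years should come out in ascending order rather than in the shuffled order shown in the transcript above.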
Hope this helps.
Here is a complete example:
In [1]: df = pd.DataFrame([['United States', 1000, 2000, 3000, 4000],
['Argentina', 1000, 2000, 3000, 4000]],
columns=['Country', 1960, 1961, 1962, 1963])
In [2]: df.set_index('Country', inplace=True)
In [3]: df = df.stack().reset_index()
In [4]: df.columns = ['Country', 'Year', 'Value']
which yields
Country Year Value
0 United States 1960 1000
1 United States 1961 2000
2 United States 1962 3000
3 United States 1963 4000
4 Argentina 1960 1000
5 Argentina 1961 2000
6 Argentina 1962 3000
7 Argentina 1963 4000
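As an alternative that this answer does not use, `DataFrame.melt` can produce the same three columns in a single call. This is only a sketch: `long_df` is an illustrative name, and the final sort is an assumption about the desired row order, since `melt` lists all rows for the first year before moving on to the next.

```python
import pandas as pd

df = pd.DataFrame(
    [["United States", 1000, 2000, 3000, 4000],
     ["Argentina", 1000, 2000, 3000, 4000]],
    columns=["Country", 1960, 1961, 1962, 1963],
)

# Keep "Country" as an identifier column and unpivot every other column
# into (Year, Value) pairs.
long_df = df.melt(id_vars="Country", var_name="Year", value_name="Value")

# Group each country's rows together. Note that this sorts countries
# alphabetically, so Argentina comes before United States.
long_df = long_df.sort_values(["Country", "Year"], ignore_index=True)
print(long_df)
```

If the original row order of the countries matters, the `set_index` and `stack` version above preserves it, whereas the sort here does not.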
To get rid of the integer index column and use the Country column as the index, you can instead use
In [3]: df = df.stack().reset_index(1)
In [4]: df.columns = ['Year', 'Value']
which produces
Year Value
Country
United States 1960 1000
United States 1961 2000
United States 1962 3000
United States 1963 4000
Argentina 1960 1000
Argentina 1961 2000
Argentina 1962 3000
Argentina 1963 4000
This isn't exactly what you want, but using df.stack()
you can get the following:
0 Country United States
1960 1000
1961 2000
1962 3000
1963 4000
1 Country Argentina
1960 1000
1961 2000
1962 3000
1963 4000
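The output above appears to come from calling `stack()` while Country is still an ordinary column, so the country name is stacked as one more value in each row. The sketch below contrasts that with setting the index first; the names `mixed` and `numeric` are illustrative only.

```python
import pandas as pd

df = pd.DataFrame(
    [["United States", 1000, 2000, 3000, 4000],
     ["Argentina", 1000, 2000, 3000, 4000]],
    columns=["Country", 1960, 1961, 1962, 1963],
)

# Country is a regular column here, so it is stacked along with the years
# and the resulting Series mixes strings and integers.
mixed = df.stack()

# Moving Country into the index first leaves only the numeric year columns
# to be stacked.
numeric = df.set_index("Country").stack()

print(mixed.dtype, len(mixed))
print(numeric.dtype, len(numeric))
```

The first Series holds ten entries (five per row, including the country name), while the second holds only the eight year values, which is why the other answers call `set_index` before stacking.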