Pandas - Pivot Table generated from 2 columns
I am trying to generate a pivot table structure, but from only two columns of data. What I have is this general DataFrame:
import pandas as pd
df = pd.DataFrame({'name': ['Australia', 'Japan', 'Brazil'], 'code': ['R1', 'R2', 'R3']})
What I am trying to achieve is to have the name field in both the columns and the index, with the values being the concatenated strings from the code field. This will result in a DataFrame with shape (3, 3). Essentially, the Australia row and Japan column would hold the value R1-R2, and the Brazil row and Australia column would hold the value R3-R1.
I tried using this function, but I am not sure that pivot_table can take the same field in both the index and the columns.
pd.pivot_table(df, values='code', index=['name'], columns=['name'], aggfunc=lambda x: '-'.join(x))
Essentially, the output should be of this form (although maybe with index and column names), but not manually generated:
data = {'Australia': ['R1-R1', 'R2-R1', 'R3-R1'],
'Japan': ['R1-R2', 'R2-R2', 'R3-R2'],
'Brazil': ['R1-R3', 'R2-R3', 'R3-R3']}
df_result = pd.DataFrame(data, columns=['Australia', 'Japan', 'Brazil'], index=['Australia', 'Japan', 'Brazil'])
Thanks!
One way you could do this:
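# Cross join df with itself through a constant helper key
df1 = df.assign(key=1).merge(df.assign(key=1), how='outer', on='key', suffixes=('', '_c'))
df1 = df1.drop('key', axis=1)
# Concatenate the row code and the column code for every pair
df1['value'] = df1['code'] + '-' + df1['code_c']
df2 = df1.drop(['code', 'code_c'], axis=1)
# Move name_c into the columns, then drop the leftover 'value' column level
df_result = df2.set_index(['name', 'name_c']).unstack()
df_result.columns = df_result.columns.droplevel()
print(df_result)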
Output:
name_c    Australia Brazil  Japan
name                             
Australia     R1-R1  R1-R3  R1-R2
Brazil        R3-R1  R3-R3  R3-R2
Japan         R2-R1  R2-R3  R2-R2
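A possible variation on the same idea, offered as a sketch rather than a definitive answer: assuming pandas 1.2 or newer, merge accepts how='cross', which removes the need for the helper key column, and pivot can then build the grid directly because every (name, name_c) pair occurs exactly once. The names pairs, name_c, code_c and value are illustrative choices, not anything required by pandas.

```python
import pandas as pd

df = pd.DataFrame({'name': ['Australia', 'Japan', 'Brazil'],
                   'code': ['R1', 'R2', 'R3']})

# Cross join: pair every row with every row (assumes pandas >= 1.2).
pairs = df.merge(df, how='cross', suffixes=('', '_c'))

# Row code first, column code second, joined with a hyphen.
pairs['value'] = pairs['code'] + '-' + pairs['code_c']

# Each (name, name_c) pair is unique, so pivot needs no aggregation function.
df_result = pairs.pivot(index='name', columns='name_c', values='value')

# pivot sorts the labels alphabetically; reindex to restore the original order.
df_result = df_result.reindex(index=df['name'], columns=df['name'])
print(df_result)
```

The final reindex is optional. It only matters if you want the rows and columns in the order they appear in df (Australia, Japan, Brazil) instead of sorted order.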