
Create relative entropy matrix from pandas dataframe

I have a dataframe of values, say:

df = pd.DataFrame(np.array([[0.2, 0.5, 0.3], [0.1, 0.2, 0.5], [0.4, 0.3, 0.3]]),
                   columns=['a', 'b', 'c'])

in which every row is a vector of probabilities. I want to compute something like the correlation matrix returned by df.corr(), but with relative entropy in place of correlation.

What is the best way to do this? I can't find a way to get inside the .corr() method and simply change the function it uses.

IIUC, use .corr as follows:

import pandas as pd
import numpy as np

from scipy.stats import entropy

df = pd.DataFrame(np.array([[0.2, 0.5, 0.3], [0.1, 0.2, 0.5], [0.4, 0.3, 0.3]]),
                   columns=['a', 'b', 'c'])

res = df.corr(method=entropy)
print(res)

Output

          a         b         c
a  1.000000  0.160246  0.270608
b  0.160246  1.000000  0.167465
c  0.270608  0.167465  1.000000
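To make clear what each off-diagonal entry above represents, here is a small sketch that recomputes the (a, b) entry directly. It relies on two facts: .corr(method=callable) hands pairs of columns (not rows) to the callable, and scipy.stats.entropy rescales each input so it sums to 1 before computing the relative entropy. The variable names direct_ab and direct_ba are only for illustration.

```python
import numpy as np
import pandas as pd
from scipy.stats import entropy

df = pd.DataFrame(np.array([[0.2, 0.5, 0.3], [0.1, 0.2, 0.5], [0.4, 0.3, 0.3]]),
                  columns=['a', 'b', 'c'])

# .corr with a callable evaluates it on pairs of COLUMNS of df.
res = df.corr(method=entropy)

# Direct computation for columns a and b.
# entropy(pk, qk) normalizes pk and qk to sum to 1, then returns sum(pk * log(pk / qk)).
direct_ab = entropy(df['a'].to_numpy(), df['b'].to_numpy())

# Relative entropy is not symmetric, so swapping the arguments gives a different value,
# even though res shows the same number at (a, b) and (b, a).
direct_ba = entropy(df['b'].to_numpy(), df['a'].to_numpy())

print(res.loc['a', 'b'], direct_ab, direct_ba)
```

So the matrix above holds one direction of the column-wise relative entropy, mirrored across the diagonal.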

From the documentation:

callable: callable with input two 1d ndarrays and returning a float. Note that the returned matrix from corr will have 1 along the diagonals and will be symmetric regardless of the callable's behavior.
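Because of that note, .corr cannot return a true relative entropy matrix: relative entropy is asymmetric and is 0 (not 1) between a distribution and itself. If the row-wise, unsymmetrized matrix is what is needed, one option is a plain double loop over the rows. This is only a minimal sketch; relative_entropy_matrix is a hypothetical helper, not part of pandas or SciPy. Note also that scipy.stats.entropy normalizes its inputs, so the second row of the example (which sums to 0.8) is rescaled before the comparison.

```python
import numpy as np
import pandas as pd
from scipy.stats import entropy

df = pd.DataFrame(np.array([[0.2, 0.5, 0.3], [0.1, 0.2, 0.5], [0.4, 0.3, 0.3]]),
                  columns=['a', 'b', 'c'])


def relative_entropy_matrix(frame):
    """Hypothetical helper: entry (i, j) is the relative entropy D(row_i || row_j)."""
    values = frame.to_numpy(dtype=float)
    n = len(values)
    out = np.empty((n, n))
    for i in range(n):
        for j in range(n):
            # entropy(pk, qk) normalizes both inputs, then sums pk * log(pk / qk).
            out[i, j] = entropy(values[i], values[j])
    # Rows and columns of the result are both labeled by the original row index.
    return pd.DataFrame(out, index=frame.index, columns=frame.index)


kl = relative_entropy_matrix(df)
print(kl)
```

The result has zeros on the diagonal and is generally not symmetric. The loop is quadratic in the number of rows, which is fine for small frames but may need vectorizing for large ones.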
