
Using torch log_prob to calculate the probability of each selected value under its own distribution

I am trying to use log_prob to get the log-probability of a value sampled from a normal distribution. I get dist from a neural network and action from dist.sample().

In the learning phase, I give 5 tensors to the neural network, it returns 5 distributions, and I sample 5 actions from them. The problem is that I want the log-probability of each action under its own distribution, but log_prob gives me the log-probability of every action under every distribution. The values on the diagonal of the output matrix are the ones I want. Is there an easy way to get just those?

I use this block of code:

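import torch as T

# Convert batch b of the stored data to tensors on the agent's device
states = T.tensor(state[b], dtype=T.float).to(agent.device)
old_probs = T.tensor(log_prob[b]).to(agent.device)
actions = T.tensor(action[b]).to(agent.device)
values = T.tensor(value[b]).to(agent.device)

# One distribution per state, then the log-probability of the stored actions
dist = actor(states)
new_probs = dist.log_prob(actions)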

and the output is:

tensor([[-1.1823, -0.9680, -3.6280, -1.1112, -1.9610],
        [-1.5279, -1.1463, -2.5806, -1.0561, -1.4768],
        [-1.6258, -1.1618, -2.5027, -1.0100, -1.3882],
        [-1.6125, -1.1576, -2.5169, -1.0133, -1.3989],
        [-1.3384, -1.0965, -2.9404, -1.1370, -1.7129]], device='cuda:0',
       dtype=torch.float64, grad_fn=<SubBackward0>)

but the output should look like this:

tensor([-1.1823, -1.1463, -2.5027, -1.0133, -1.7129], device='cuda:0',
       grad_fn=<SqueezeBackward1>)

You can select the diagonal of your matrix with torch.diag:

>>> new_probs.diag()
tensor([-1.1823, -1.1463, -2.5027, -1.0133, -1.7129], 
  device='cuda:0', grad_fn=<DiagBackward>)
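
The question does not show the tensor shapes, so the following is only a sketch under an assumption: the 5×5 result would arise if the distribution has batch shape (5, 1) while actions has shape (5,), because log_prob then broadcasts the two against each other. The means, standard deviations, and actions below are made-up stand-ins for the actor's output, not the asker's data. Under that assumption, reshaping the actions to match the batch shape gives the same values as the diagonal without building the full matrix.

```python
import torch
from torch.distributions import Normal

torch.manual_seed(0)

# Hypothetical stand-in for the actor's output: 5 states, one action
# dimension each, so the distribution has batch shape (5, 1).
mean = torch.randn(5, 1)
std = torch.ones(5, 1)
dist = Normal(mean, std)

# Hypothetical stored actions, kept flat with shape (5,).
actions = torch.randn(5)

# (5,) broadcast against (5, 1) gives a (5, 5) matrix: entry [i, j] is the
# log-probability of action j under distribution i.
all_pairs = dist.log_prob(actions)

# Option 1: keep only the matching (i, i) pairs, as in the answer above.
matched_diag = all_pairs.diag()

# Option 2: give the actions the same shape as the batch, so each action is
# scored only under its own distribution and no 5x5 matrix is built.
matched_direct = dist.log_prob(actions.unsqueeze(-1)).squeeze(-1)

print(all_pairs.shape)       # shape of the broadcast result
print(matched_direct.shape)  # shape after matching the batch shape
```

If the shape assumption holds, the second option also avoids computing the off-diagonal entries, which matters more as the batch grows.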
