
Implementing Mahalanobis Distance from scratch in Python

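I am implementing the Mahalanobis distance from scratch, but I get an error. The formula for the Mahalanobis distance is D^2 = (x - m)^T C^(-1) (x - m), where m is the mean vector and C is the covariance matrix (shown as an image in the original post). My code and the error are below.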

from math import*
from decimal import Decimal
import numpy as np

def mahalanobis(x, y, cov=None):
    x_mean = np.mean(x)
    y_mean = np.mean(y)
    y_minus_mn = y - y_mean
    x_minus_mn_with_transpose =np.transpose(x- x_mean)
    Covariance = covar(x, y)
    inv_covmat = np.linalg.inv(Covariance)
    x_minus_mn = x - x_mean
    D_square = np.dot( x_minus_mn_with_transpose, inv_covmat, x_minus_mn)
    return D_square

def covar(x, y):
    x_mean = np.mean(x)
    y_mean = np.mean(y)
    Cov_numerator = sum(((a - x_mean)*(b - y_mean)) for a, b in zip(x, y))
    Cov_denomerator = len(x) - 1
    Covariance = (Cov_numerator / Cov_denomerator)
    return  Covariance

import pandas as pd

filepath = 'https://raw.githubusercontent.com/selva86/datasets/master/diamonds.csv'
df = pd.read_csv(filepath).iloc[:, [0,4,6]]
df.head()

X = df[['carat', 'depth', 'price']].head(500).values.tolist
Y =df[['carat', 'depth', 'price']].values.tolist

mahalanobis(X, Y)

The error is shown in the image below. [image: error message, not available]

Please help. Could someone check and correct my code?

X = df[['carat', 'depth', 'price']].head(500).values.tolist
Y =df[['carat', 'depth', 'price']].values.tolist

.tolist

is a method, so it has to be called. I think you need:

.tolist()
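To see why the missing parentheses matter, here is a small sketch on a made-up array (the name `arr` and its values are hypothetical stand-ins for `df[[...]].values`, not taken from the question). Without the call you get the bound method object itself rather than any data, so numeric code that receives it cannot work as intended.

```python
import numpy as np

# Hypothetical stand-in for df[['carat', 'depth', 'price']].values
arr = np.array([[0.23, 61.5, 326.0],
                [0.21, 59.8, 326.0]])

not_called = arr.tolist    # the method object itself; nothing is converted
called = arr.tolist()      # a nested Python list of the array's rows

print(callable(not_called))  # True
print(type(called))          # <class 'list'>
```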

I should point out that your code contains a number of errors:

  1. Use np.cov to compute the covariance when working with numpy arrays; don't reimplement everything.

  2. The third argument of np.dot is the output array, so you should write D_square = np.dot(np.dot(x_minus_mn, inv_covmat), np.transpose(x_minus_mn)) instead of D_square = np.dot( x_minus_mn_with_transpose, inv_covmat, x_minus_mn).

  3. Instead of X = df[['carat', 'depth', 'price']].head(500).values.tolist use X = np.asarray(df[['carat', 'depth', 'price']].head(500).values). If you use numpy, stick to numpy arrays rather than lists.
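The first two points can be checked on a small made-up dataset. This is only a sketch: it assumes rows are observations and columns are variables (hence `rowvar=False`, which is equivalent to transposing first), and the array `data` and the helper `manual_cov` are illustrative names that do not appear in the question.

```python
import numpy as np

# Hypothetical data: 5 observations (rows) of 3 variables (columns)
data = np.array([[1.0, 2.0, 0.5],
                 [2.0, 1.0, 1.5],
                 [3.0, 4.0, 2.0],
                 [4.0, 3.0, 3.5],
                 [5.0, 6.0, 3.0]])

def manual_cov(a):
    """Sample covariance matrix written out by hand, for comparison only."""
    centered = a - a.mean(axis=0)
    return centered.T @ centered / (a.shape[0] - 1)

cov = np.cov(data, rowvar=False)            # 3x3 covariance matrix
same = np.allclose(cov, manual_cov(data))   # np.cov matches the hand-written version

# Chain two np.dot calls; a third positional argument would be `out`,
# not another matrix to multiply.
inv_cov = np.linalg.inv(cov)
centered = data - data.mean(axis=0)
d_square = np.dot(np.dot(centered, inv_cov), centered.T)  # 5x5 matrix
```

With n observations the chained product is an n x n matrix; the squared distance of observation i is the entry at position (i, i).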

Here is a modified version of the code you provided:

import numpy as np

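def mahalanobis(x, y, cov=None):
    # Column-wise mean; without axis=0, np.mean averages every element into one scalar
    x_mean = np.mean(x, axis=0)
    # np.cov treats rows as variables, so transpose the (observations x variables) array
    Covariance = np.cov(np.transpose(y))
    inv_covmat = np.linalg.inv(Covariance)
    x_minus_mn = x - x_mean
    # The result is an n x n matrix; its diagonal holds the squared distance of each row
    D_square = np.dot(np.dot(x_minus_mn, inv_covmat), np.transpose(x_minus_mn))
    return D_square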

import pandas as pd

filepath = 'https://raw.githubusercontent.com/selva86/datasets/master/diamonds.csv'
df = pd.read_csv(filepath).iloc[:, [0,4,6]]
df.head()

X = np.asarray(df[['carat', 'depth', 'price']].head(500).values)
Y =np.asarray(df[['carat', 'depth', 'price']].values)

mahalanobis(X, Y)
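Note that with 500 rows in X, the nested np.dot returns a 500 x 500 matrix, and only its diagonal contains per-row squared distances. Below is a rough sketch of a row-wise variant, assuming the goal is one squared distance per row of `x` measured against the distribution of `y`. The function name `mahalanobis_rows`, the synthetic data, and the choice to centre on the mean of `y` are all assumptions made for illustration and are not part of the answer above.

```python
import numpy as np

def mahalanobis_rows(x, y):
    """Squared Mahalanobis distance of each row of x from the distribution of y."""
    mean = y.mean(axis=0)                              # per-column mean of the reference data
    inv_cov = np.linalg.inv(np.cov(y, rowvar=False))   # inverse covariance of the reference data
    diff = x - mean
    # Row-wise quadratic form diff_i^T * inv_cov * diff_i, without building the n x n matrix
    return np.einsum('ij,jk,ik->i', diff, inv_cov, diff)

# Synthetic stand-in for the diamonds columns (three variables on different scales)
rng = np.random.default_rng(0)
y = rng.normal(size=(200, 3)) * [1.0, 5.0, 100.0]
x = y[:10]

d2 = mahalanobis_rows(x, y)

# Cross-check: same values as the diagonal of the full matrix product
diff = x - y.mean(axis=0)
full = diff @ np.linalg.inv(np.cov(y, rowvar=False)) @ diff.T
```

Taking the square root of `d2` gives the distances themselves. For large inputs the einsum form avoids allocating the full n x n matrix.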

