ELKI: LOF score as infinite

What is the generally accepted way to handle infinite LOF scores in ELKI that are caused by duplicate points? If ELKI's LOF scores are to be used, should such scores be treated as maximum scores, as zeros, or as inliers?

The LOF score of a point is infinite if at least one of its neighbors has an average reachability distance of 0 (because that neighbor sits on duplicate points), which makes the neighbor's local reachability density (lrd) infinite.

If the point itself has a non-zero reachability distance, its own lrd is finite, so the lrd of the neighbors is infinitely higher than its own (or in terms of density: the point is infinitely less dense than its neighbors), so it is an outlier.

The proper way of handling this is to increase k (minpts) to be larger than the maximum number of duplicates of any single point. If you have too many duplicate points, this usually indicates that LOF may not be a good choice for this data set. LOF requires that a nearest-neighbor density estimate makes sense on the data, and if you run into this kind of problem, the cause is usually the input data, not the algorithm.
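
To make the effect concrete, here is a minimal sketch of plain LOF on 1-D data in pure Python. It is not ELKI's implementation. The function name `lof_scores`, the toy data, and the choice to report 1.0 for a point whose own lrd is infinite are all assumptions made for this illustration.

```python
import math

def lof_scores(points, k):
    """Plain LOF on 1-D points (illustrative sketch, not ELKI's code)."""
    n = len(points)
    dist = [[abs(p - q) for q in points] for p in points]

    # k-distance and k-neighborhood of each point; the point itself is
    # excluded, but its duplicates count as neighbors at distance 0.
    neighbors, k_dist = [], []
    for i in range(n):
        others = sorted((dist[i][j], j) for j in range(n) if j != i)
        kd = others[k - 1][0]
        k_dist.append(kd)
        neighbors.append([j for d, j in others if d <= kd])  # keep ties

    # lrd = 1 / mean reachability distance; infinite when that mean is 0.
    lrd = []
    for i in range(n):
        reach = [max(k_dist[j], dist[i][j]) for j in neighbors[i]]
        mean_reach = sum(reach) / len(reach)
        lrd.append(math.inf if mean_reach == 0 else 1.0 / mean_reach)

    # LOF = mean of (neighbor lrd / own lrd).
    scores = []
    for i in range(n):
        if math.isinf(lrd[i]):
            # Assumption of this sketch: a point inside a block of
            # duplicates is reported as an inlier (score 1.0) to avoid inf/inf.
            scores.append(1.0)
        else:
            mean_lrd = sum(lrd[j] for j in neighbors[i]) / len(neighbors[i])
            scores.append(mean_lrd / lrd[i])
    return scores

# Three copies of 0.0, one point right next to them, and a separate group.
points = [0.0, 0.0, 0.0, 0.1, 1.0, 1.1, 1.3]
print(lof_scores(points, 2))  # k smaller than the number of duplicates
print(lof_scores(points, 4))  # k larger than the number of duplicates
```

With k = 2 the three copies of 0.0 only see each other, so their lrd is infinite and the point at 0.1 gets an infinite LOF. With k = 4 every k-distance is positive and all scores are finite.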
