Error in svd(x, nu = 0) : infinite or missing values in 'x' (checked no negative values exist)
I know this is a common error for PCA, but I went through the solutions provided and they are not working.
I followed: Error in svd(x, nu = 0) : 0 extent dimensions
Below is my code extract:
require(class)
set.seed(2095)
# dataset source:https://kdd.ics.uci.edu/databases/kddcup99/kddcup99.html
normalize <- function(x) {
  return((x - min(x)) / (max(x) - min(x)))
}
dataset <- read.csv("data/kdd_data_10pc.csv", header = FALSE, sep = ",")
names <- read.csv("data/kdd_names.csv", header = FALSE , sep = ";")
names(dataset) <- sapply((1:nrow(names)),function(i) toString(names[i, 1]))
# extracting relevant features
dataset_extracted <- dataset[, c("src_bytes", "dest_bytes", "count", "dst_host_count", "dst_host_same_srv_rate", "dst_host_serror_rate", "label")]
head(dataset_extracted, 3)
log.kdd <- log(dataset_extracted[, 1:6])
kdd.label <- dataset_extracted[, 7]
kdd.pca <- prcomp(log.kdd,
                  center = TRUE,
                  scale. = TRUE)
The summary(dataset_extracted) output is as follows:
summary(dataset_extracted)
src_bytes          dest_bytes       count          dst_host_count  dst_host_same_srv_rate  dst_host_serror_rate  label
Min.   :        0  Min.   :      0  Min.   :  0.0  Min.   :  0.0   Min.   :0.0000          Min.   :0.0000        smurf.  :280790
1st Qu.:       45  1st Qu.:      0  1st Qu.:117.0  1st Qu.:255.0   1st Qu.:0.4100          1st Qu.:0.0000        neptune.:107201
Median :      520  Median :      0  Median :510.0  Median :255.0   Median :1.0000          Median :0.0000        normal. : 97278
Mean   :     3026  Mean   :    869  Mean   :332.3  Mean   :232.5   Mean   :0.7538          Mean   :0.1768        back.   :  2203
3rd Qu.:     1032  3rd Qu.:      0  3rd Qu.:511.0  3rd Qu.:255.0   3rd Qu.:1.0000          3rd Qu.:0.0000        satan.  :  1589
Max.   :693375640  Max.   :5155468  Max.   :511.0  Max.   :255.0   Max.   :1.0000          Max.   :1.0000        ipsweep.:  1247
                                                                                                                 (Other) :  3713
Based on the summary, none of the extracted columns has a negative minimum value.
I'm new to machine learning and would appreciate any help.
The exact error shown was:
Error in svd(x, nu = 0) : infinite or missing values in 'x'
You apply a log transformation to an object (dataset_extracted[, 1:6]) containing zero values. Since log(0) is -Inf, this produces elements of negative infinity, which is what svd() is complaining about. Try using log1p() instead.
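A minimal sketch of the difference, using a small made-up data frame (the column names mirror the question, but the values are invented for illustration and `toy` is a hypothetical name):

```r
# Toy data containing zeros, standing in for the KDD columns (values invented)
toy <- data.frame(
  src_bytes  = c(0, 45, 520, 1032, 3000),
  dest_bytes = c(0, 0, 10, 869, 5000),
  count      = c(0, 117, 510, 511, 300)
)

# log(0) is -Inf, so the transformed data contains non-finite values
log_toy <- log(toy)
colSums(!sapply(log_toy, is.finite))  # count of non-finite entries per column

# log1p(x) computes log(1 + x), so zeros map to 0 and every value stays finite
log1p_toy <- log1p(toy)
all(sapply(log1p_toy, is.finite))

# With finite input, prcomp() no longer hits the svd() error
toy_pca <- prcomp(log1p_toy, center = TRUE, scale. = TRUE)
summary(toy_pca)
```

The same per-column check with is.finite() can be run on log.kdd from the question to confirm which columns are affected.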
Also, don't forget to apply the standardisation you encoded in the function normalize().
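One way this could look, as a sketch only: normalize() is copied from the question, while the toy data and the names `log1p_toy` and `scaled_toy` are assumptions made for illustration.

```r
# normalize() as defined in the question: rescales a numeric vector to [0, 1]
normalize <- function(x) {
  (x - min(x)) / (max(x) - min(x))
}

# Toy stand-in for the log1p-transformed columns (values invented)
log1p_toy <- log1p(data.frame(
  src_bytes = c(0, 45, 520, 1032),
  count     = c(0, 117, 510, 511)
))

# Apply normalize() column by column, keeping the data frame structure
scaled_toy <- as.data.frame(lapply(log1p_toy, normalize))
sapply(scaled_toy, range)  # minimum and maximum of each rescaled column
```

Note that normalize() returns NaN for a constant column, because max(x) - min(x) is then zero, so such columns would need to be dropped before PCA.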
Finally, given the magnitude of some of the outliers (for example, src_bytes has a maximum of 693375640 against a median of 520), I'm not sure a log transformation will be sufficient; you may need to consider excluding some observations.
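If you do decide to exclude observations, one possible approach is a simple quantile cutoff. The sketch below uses simulated data, and the 99th percentile threshold is an arbitrary assumption rather than a recommendation; the names `toy`, `cutoff`, and `trimmed` are hypothetical.

```r
set.seed(1)

# Simulated data with one extreme value, loosely imitating src_bytes
toy <- data.frame(
  src_bytes = c(rexp(999, rate = 1 / 500), 693375640),
  count     = sample(0:511, 1000, replace = TRUE)
)

# Keep only rows at or below the 99th percentile of src_bytes
# (the 0.99 threshold is an assumption chosen for illustration)
cutoff  <- quantile(toy$src_bytes, probs = 0.99)
trimmed <- toy[toy$src_bytes <= cutoff, ]

nrow(toy) - nrow(trimmed)  # number of rows removed
max(trimmed$src_bytes)     # largest remaining value
```

Whether trimming is appropriate depends on the task: in intrusion-detection data the extreme rows may be exactly the ones of interest, so it is worth inspecting them before dropping anything.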