AWS SageMaker Random Cut Forest

I have AWS CPU-utilization data (the dataset used by NAB) and I am using it for anomaly detection with AWS SageMaker Random Cut Forest. I am able to run it, but I need a deeper understanding of hyperparameter tuning. I have gone through the AWS documentation, but I still do not understand how the hyperparameters should be selected. Are the parameters an educated guess, or do we need to calculate the mean and standard deviation of co_disp in order to infer them?

Thanks in advance.

I have tried 100 trees and a tree_size of 512 and 256 to detect anomalies, but how should these parameters be inferred?

    import rrcf

    # Set tree parameters
    num_trees = 50
    shingle_size = 48
    tree_size = 512

    # Create a forest of empty trees
    forest = []
    for _ in range(num_trees):
        tree = rrcf.RCTree()
        forest.append(tree)

    # Use the "shingle" generator to create a rolling window
    # temp_data represents my AWS CPU-utilization data
    points = rrcf.shingle(temp_data, size=shingle_size)

    # Create a dict to store the anomaly score of each point
    avg_codisp = {}

    # For each shingle...
    for index, point in enumerate(points):
        # For each tree in the forest...
        for tree in forest:
            # If the tree is above the permitted size, drop the oldest point (FIFO)
            if len(tree.leaves) > tree_size:
                tree.forget_point(index - tree_size)
            # Insert the new point into the tree
            tree.insert_point(point, index=index)
            # Compute CoDisp on the new point and average it over all trees
            if index not in avg_codisp:
                avg_codisp[index] = 0
            avg_codisp[index] += tree.codisp(index) / num_trees

    # Collect the averaged scores in insertion order
    values = list(avg_codisp.values())

Thanks for your interest in RandomCutForest. If you have labeled anomalies, we recommend you use SageMaker Automatic Model Tuning ( https://docs.aws.amazon.com/sagemaker/latest/dg/automatic-model-tuning.html ) and let SageMaker find the combination that works best.
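To tune against labels, each candidate setting has to be scored by how well its flagged points match the labeled anomalies; for Random Cut Forest, SageMaker's tuning objective is an F1 score on a labeled test set. The sketch below only illustrates that comparison locally and is not the SageMaker API: `f1_score`, the setting names, and the toy index sets are all made up for illustration.

```python
def f1_score(predicted, actual):
    """F1 score between two collections of anomaly indices."""
    predicted, actual = set(predicted), set(actual)
    true_pos = len(predicted & actual)
    if true_pos == 0:
        return 0.0
    precision = true_pos / len(predicted)
    recall = true_pos / len(actual)
    return 2 * precision * recall / (precision + recall)


# Toy values only: labeled anomaly indices, and the indices flagged
# under two hypothetical hyperparameter settings.
labeled = {10, 42, 77}
flagged_by_setting = {
    "setting_a": {10, 42, 90},
    "setting_b": {10, 42, 77, 90},
}

# Keep the setting whose flagged points agree best with the labels.
best_setting = max(
    flagged_by_setting,
    key=lambda name: f1_score(flagged_by_setting[name], labeled),
)
```

With real data, each entry would hold the indices flagged after training with that setting, which is the search Automatic Model Tuning runs for you.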

Heuristically, if you know that about 0.4% of your data is anomalous, for example, you would set the number of samples per tree to N = 1 / (0.4 / 100) = 250. The idea is that each tree represents a sample of your data, and each data point in a tree is considered "normal". If your trees have too few points, e.g. 10, then most points will look different from these "normal" ones, i.e. they will have a high anomaly score.
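The arithmetic of this heuristic can be written as a small helper. This is a minimal sketch: the function name `samples_per_tree` is hypothetical, and the upper bound of 2048 is an assumption based on the documented range of SageMaker's `num_samples_per_tree` hyperparameter, so check it against the current documentation.

```python
def samples_per_tree(anomaly_percent, max_samples=2048):
    """Heuristic N = 1 / (anomaly_percent / 100), clamped to [1, max_samples].

    max_samples=2048 is an assumed upper bound, not a value from the answer.
    """
    if anomaly_percent <= 0:
        raise ValueError("anomaly_percent must be positive")
    n = round(100.0 / anomaly_percent)
    return max(1, min(n, max_samples))


# The example from the answer: 0.4% anomalies gives 250 samples per tree.
n_samples = samples_per_tree(0.4)
```

In the rrcf code from the question, the corresponding knob would be `tree_size`.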

The relation between the number of trees and the underlying data is more complex. As the range of "normal" points grows, you would want more trees.
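On the question's point about the mean and standard deviation of co_disp: a common use of those statistics is to threshold the scores after the forest is built, rather than to choose the hyperparameters. One widely used rule of thumb flags scores more than three standard deviations above the mean. A minimal sketch, assuming scores are stored in a dict like `avg_codisp` from the question (`flag_anomalies` and `num_std` are hypothetical names):

```python
import statistics


def flag_anomalies(scores, num_std=3.0):
    """Return the indices whose score exceeds mean + num_std * std."""
    values = list(scores.values())
    mean = statistics.mean(values)
    std = statistics.pstdev(values)  # population standard deviation
    threshold = mean + num_std * std
    return [index for index, score in scores.items() if score > threshold]


# Toy scores: 99 ordinary points and one obvious outlier at index 99.
toy_scores = {i: 1.0 for i in range(99)}
toy_scores[99] = 100.0
outliers = flag_anomalies(toy_scores)
```

The cutoff `num_std` is itself a judgment call; with labeled anomalies it can be tuned in the same way as the tree parameters.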
