
Extract patterns from device log data

I am working on a project in which we have to extract patterns (user behavior) from device log data. The device log contains different device actions with a timestamp, such as when a device was switched on or switched off.

For example:
When a person enters a room, he first switches on the light and then
switches on the fan; or whenever the temperature is below 20 C, he switches
off the AC.

I am thinking of using Bayesian networks to extract these patterns.

  • Learn a Bayesian network from the data (using Weka or Netica).
  • The arcs in the Bayesian network will give the patterns/dependencies among the different devices.

Is this the right approach?

Edit: The chronological order between devices matters.

Is this the right approach?

There are many possible approaches, but here's a very simple and effective one that fits the domain:

Given the nature of the application, chronological order doesn't really matter; for example, it doesn't matter if the Fan gets turned on before the Light.

Also, given that you can have, e.g., a motion sensor trigger a routine that reads the sensors, and perhaps a periodic temperature check, you can use the network below to act upon the extracted patterns. There is no need to complicate it further with chronological order and event tracking: we extract data to act upon, and event order in this domain isn't interesting.

For example: when a person enters a room, he first switches on the light and then switches on the fan; or whenever the temperature is below 20 C, he switches off the AC.

The raw device log might look something like this, with T/F being True/False:

Person in room | Temperature | Light | Fan | AC
-----------------------------------------------
T              | 20          | T     | T   | T
T              | 19          | T     | T   | F
F              | 18          | F     | F   | F 

With sufficient samples you can train a model on the above. Naive Bayes, for example, is not sensitive to irrelevant features/inputs: if you look at my first raw table above, which includes all the variables, and try to predict AC, with sufficient data it will learn that some inputs are not very important or are completely irrelevant.
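As a rough illustration of training Naive Bayes on a table like the one above, here is a minimal pure-Python sketch that predicts AC from the other columns. The first three rows are the ones shown in the table; the remaining rows are made up purely so the counts are not degenerate. The names `ROWS`, `features`, `train`, and `predict_proba` are hypothetical, and splitting Temperature at 20 C is an assumption borrowed from the example in the question. In practice you would probably use a library implementation (Weka, for instance) rather than this.

```python
from collections import Counter, defaultdict

# Snapshot rows: (person_in_room, temperature_C, light, fan, ac).
# The first three rows come from the raw table; the last three are invented.
ROWS = [
    (True, 20, True, True, True),
    (True, 19, True, True, False),
    (False, 18, False, False, False),
    (True, 22, True, True, True),
    (True, 18, True, False, False),
    (False, 21, False, False, False),
]


def features(row):
    """Turn a row into binary features; AC (the last field) is the label."""
    person, temp, light, fan, _ac = row
    # Assumption: discretise temperature at the 20 C threshold.
    return (person, temp >= 20, light, fan)


def train(rows):
    """Count class frequencies and per-class feature value frequencies."""
    class_counts = Counter()
    feat_counts = defaultdict(Counter)  # (feature index, label) -> value counts
    for row in rows:
        label = row[4]
        class_counts[label] += 1
        for i, value in enumerate(features(row)):
            feat_counts[(i, label)][value] += 1
    return class_counts, feat_counts


def predict_proba(model, row):
    """Return P(AC = label | features) for each label seen in training."""
    class_counts, feat_counts = model
    total = sum(class_counts.values())
    scores = {}
    for label, n in class_counts.items():
        score = n / total  # class prior
        for i, value in enumerate(features(row)):
            # Laplace smoothing; every feature is binary, hence the + 2.
            score *= (feat_counts[(i, label)][value] + 1) / (n + 2)
        scores[label] = score
    norm = sum(scores.values())
    return {label: s / norm for label, s in scores.items()}


model = train(ROWS)
warm_occupied = predict_proba(model, (True, 23, True, True, None))
cool_empty = predict_proba(model, (False, 18, False, False, None))
print("P(AC on | occupied, 23 C):", round(warm_occupied[True], 3))
print("P(AC on | empty, 18 C):", round(cool_empty[True], 3))
```

With so few rows the probabilities are only indicative; the point is that each input contributes an independent factor, so an input that carries no information about AC ends up with nearly the same factor for both classes.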

Or, if you know beforehand what the Light, Fan, and AC depend on (e.g. we know Light isn't going to depend on Temperature, and Fan and AC don't care whether Light is turned on, since they can operate even while the person is sleeping), you can break it down like below:


Person in Room | Light 
----------------------
T              | T
F              | F

Person in Room | Temperature | Fan
----------------------------------
T              | 20          | T
F              | 25          | F

Person in room | Temperature | AC
---------------------------------
T              | 20          | T
T              | 19          | F
F              | 20          | F
F              | 19          | F
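Once the data is broken down per device like this, each small table can be summarised as a conditional frequency table, i.e. an estimate of P(device on | its inputs). The sketch below does that for the AC table, using the four rows shown above; `AC_ROWS` and `conditional_table` are hypothetical names, and the same function would work for the Light and Fan tables.

```python
from collections import defaultdict

# Rows copied from the AC table above: (person_in_room, temperature_C, ac_on).
AC_ROWS = [
    (True, 20, True),
    (True, 19, False),
    (False, 20, False),
    (False, 19, False),
]


def conditional_table(rows):
    """Estimate P(device on | input values) by relative frequency."""
    counts = defaultdict(lambda: [0, 0])  # inputs -> [times on, times seen]
    for *inputs, device_on in rows:
        entry = counts[tuple(inputs)]
        entry[0] += int(device_on)
        entry[1] += 1
    return {inputs: on / seen for inputs, (on, seen) in counts.items()}


ac_table = conditional_table(AC_ROWS)
for inputs, p_on in sorted(ac_table.items()):
    print(inputs, p_on)
```

With a real log each combination of inputs would be seen many times, so the estimates would be fractions rather than the 0.0 and 1.0 that four rows produce.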
