Suggestions for feature engineering
I am having a problem during feature engineering and am looking for some suggestions.
Problem statement: I have usage data for multiple customers over 3 days. Some customers have just 1 day of usage, some have 2, and some have 3. The data covers things like the number of emails sent and the number of contacts added on each day.
I am converting this time series data to a column-wise layout, i.e., the number of emails sent by a customer on day 1 is one feature, the number of emails sent on day 2 is another feature, and so on. The problem is that usage can be either increasing or decreasing for different customers.
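The column-wise conversion described above can be sketched with pandas. This is only an illustration: the long-format layout, the column names (`customer`, `day`, `emails_sent`), and the numbers are assumptions, not the asker's actual data.

```python
import pandas as pd

# Hypothetical long-format usage log: one row per customer per day.
# Column names and values are made up for illustration.
usage = pd.DataFrame({
    "customer": ["A", "A", "A", "B", "B", "C"],
    "day": [1, 2, 3, 1, 2, 1],
    "emails_sent": [100, 0, 20, 0, 100, 5],
})

# One row per customer, one column per day.
wide = usage.pivot(index="customer", columns="day", values="emails_sent")
wide.columns = [f"emails_day{d}" for d in wide.columns]

# Customers with fewer than 3 days of usage get NaN for the missing days.
print(wide)
```

Whether a missing day should stay NaN or become 0 depends on whether "no data" and "no usage" mean the same thing in this dataset, which the question does not say.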
For example:
Example 1: customer 'A' --> 'number of emails sent on 1st day' = 100, 'number of emails sent on 2nd day' = 0
Example 2: customer 'B' --> 'number of emails sent on 1st day' = 0, 'number of emails sent on 2nd day' = 100
Example 3: customer 'C' --> 'number of emails sent on 1st day' = 0, 'number of emails sent on 2nd day' = 0
Example 4: customer 'D' --> 'number of emails sent on 1st day' = 100, 'number of emails sent on 2nd day' = 100
In the first two cases, my new feature (the change from day 1 to day 2) will have "-100" and "100" as values, which I guess is good for differentiating. But the problem arises for the 3rd and 4th examples, where the new feature value will be "0" in both scenarios. Can anyone suggest a way to handle this?
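A minimal sketch of the collision, assuming the new feature is the day 2 count minus the day 1 count (the question does not spell out the formula, but this matches the "-100" and "100" values). The column names are hypothetical.

```python
import pandas as pd

# The four example customers from the question.
df = pd.DataFrame(
    {"emails_day1": [100, 0, 0, 100], "emails_day2": [0, 100, 0, 100]},
    index=["A", "B", "C", "D"],
)

# Assumed definition of the "new feature": day 2 minus day 1.
df["diff"] = df["emails_day2"] - df["emails_day1"]
print(df)
# diff: A = -100, B = 100, C = 0, D = 0
```

The difference captures the direction of change but not the level of usage, so C (no usage at all) and D (steady usage) end up with the same value.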
You can extract the following features:
Simple moving averages for day 2 and day 3 respectively. This means you now have two extra columns.
Percentage change from the previous day
Percentage change from day 1 to day 3
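The features listed above could be computed roughly as follows. This is a sketch under several assumptions: the moving-average window is 2 days (the answer does not give a window), the day 3 values are made up, the column names are hypothetical, and a percentage change from a zero baseline is stored as NaN.

```python
import numpy as np
import pandas as pd

# Example customers from the question; the day 3 values are invented.
df = pd.DataFrame(
    [[100, 0, 50], [0, 100, 100], [0, 0, 0], [100, 100, 100]],
    index=["A", "B", "C", "D"],
    columns=["emails_day1", "emails_day2", "emails_day3"],
    dtype=float,
)

# Simple moving averages with an assumed window of 2 days.
df["sma_day2"] = df[["emails_day1", "emails_day2"]].mean(axis=1)
df["sma_day3"] = df[["emails_day2", "emails_day3"]].mean(axis=1)


def pct_change(prev, cur):
    """Percentage change from prev to cur.

    A zero baseline yields inf (or NaN for 0 -> 0); inf is mapped to NaN
    here, which is an assumed convention rather than the only option.
    """
    change = (cur - prev) / prev * 100
    return change.replace([np.inf, -np.inf], np.nan)


# Percentage change from the previous day.
df["pct_day1_to_day2"] = pct_change(df["emails_day1"], df["emails_day2"])
df["pct_day2_to_day3"] = pct_change(df["emails_day2"], df["emails_day3"])

# Percentage change from day 1 to day 3.
df["pct_day1_to_day3"] = pct_change(df["emails_day1"], df["emails_day3"])

print(df)
```

With these columns, customers C and D no longer collide: their moving averages are 0 and 100, while A and B, which share the same day 2 average of 50, are separated by the percentage change. Percentage change is undefined when the earlier day is 0, so those rows need some explicit handling before modelling.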