Kusto Query Pipeline Query

I am trying to derive an additional column from pipeline-run data using Azure Data Explorer (Kusto) queries. I am very new to Kusto and not sure how to go about it. The goal, for each customer, is:

  1. If the previous run failed and the last run failed, get the difference in days/hours between the two failures.
  2. If the previous run succeeded and the last run failed, get the difference in days/hours between the two events.
  3. If the previous run failed and the last run succeeded, ignore it.

Dataset

Customers   PipelineType   PipelineState   TimeStamp
CustomerA   Pipeline A     Success         2021-08-13 12:59:03.0073653
CustomerA   Pipeline A     Fail            2021-08-13 09:59:03.0124853
CustomerA   Pipeline B     Success         2021-08-13 11:56:03.0151948
CustomerA   Pipeline B     Fail            2021-08-12 17:56:03.0019445
CustomerA   Pipeline C     Success         2021-08-13 13:16:03.0015617
CustomerA   Pipeline C     Fail            2021-07-30 21:52:03.0157372
CustomerB   Pipeline A     Success         2021-08-13 12:59:03.0073331
CustomerB   Pipeline A     Success         2021-08-13 12:57:03.0099138
CustomerB   Pipeline B     Fail            2021-07-30 03:33:03.0123262
CustomerB   Pipeline B     Success         2021-08-13 13:16:03.0015297
CustomerB   Pipeline C     Fail            2021-08-13 12:57:03.0099499
CustomerB   Pipeline C     Success         2021-08-13 13:16:03.0016348
CustomerC   Pipeline A     Success         2021-08-13 13:16:03.0016999
CustomerC   Pipeline A     Success         2021-08-13 12:59:03.0074113
CustomerC   Pipeline B     Success         2021-08-13 10:56:03.0075546
CustomerC   Pipeline B     Fail            2021-08-11 06:54:03.0118628
CustomerC   Pipeline C     Fail            2021-08-13 13:16:03.0016233
CustomerC   Pipeline C     Fail            2021-08-13 12:59:03.0072337

If I understand the requirements correctly, you could sort your data set and then use the case() and prev() functions.

For example:

datatable(customer:string, PipelineType:string, PipelineState:string, TimeStamp:datetime)
[
    'CustomerA', 'Pipeline A', 'Fail',    datetime(2021-08-13 12:59:03.0073653),
    'CustomerA', 'Pipeline A', 'Fail',    datetime(2021-08-13 09:59:03.0124853),
    'CustomerA', 'Pipeline B', 'Success', datetime(2021-08-13 11:56:03.0151948),
    'CustomerA', 'Pipeline B', 'Fail',    datetime(2021-08-12 17:56:03.0019445),
    'CustomerA', 'Pipeline C', 'Success', datetime(2021-08-13 13:16:03.0015617),
    'CustomerA', 'Pipeline C', 'Fail',    datetime(2021-07-30 21:52:03.0157372),
    'CustomerB', 'Pipeline A', 'Fail',    datetime(2021-08-13 12:59:03.0073331),
    'CustomerB', 'Pipeline A', 'Success', datetime(2021-08-13 12:57:03.0099138),
    'CustomerB', 'Pipeline B', 'Fail',    datetime(2021-07-30 03:33:03.0123262),
    'CustomerB', 'Pipeline B', 'Success', datetime(2021-08-13 13:16:03.0015297),
    'CustomerB', 'Pipeline C', 'Fail',    datetime(2021-08-13 12:57:03.0099499),
    'CustomerB', 'Pipeline C', 'Success', datetime(2021-08-13 13:16:03.0016348),
    'CustomerC', 'Pipeline A', 'Fail',    datetime(2021-08-13 13:16:03.0016999),
    'CustomerC', 'Pipeline A', 'Fail',    datetime(2021-08-13 12:59:03.0074113),
    'CustomerC', 'Pipeline B', 'Success', datetime(2021-08-13 10:56:03.0075546),
    'CustomerC', 'Pipeline B', 'Fail',    datetime(2021-08-11 06:54:03.0118628),
    'CustomerC', 'Pipeline C', 'Fail',    datetime(2021-08-13 13:16:03.0016233),
    'CustomerC', 'Pipeline C', 'Fail',    datetime(2021-08-13 12:59:03.0072337),
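]
// order by serializes the rows, which prev() requires, and groups each customer/pipeline chronologically.
| order by customer asc, PipelineType asc, TimeStamp asc
// A failed run gets the time elapsed since the previous run of the same customer and pipeline
// (whether that run failed or succeeded); every other row gets a null timespan.
| extend result = case(
    prev(customer) == customer and prev(PipelineType) == PipelineType and PipelineState == 'Fail',
    TimeStamp - prev(TimeStamp),
    timespan(null))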

customer    PipelineType   PipelineState   TimeStamp                     result
CustomerA   Pipeline A     Fail            2021-08-13 09:59:03.0124853
CustomerA   Pipeline A     Fail            2021-08-13 12:59:03.0073653   02:59:59.9948800
CustomerA   Pipeline B     Fail            2021-08-12 17:56:03.0019445
CustomerA   Pipeline B     Success         2021-08-13 11:56:03.0151948
CustomerA   Pipeline C     Fail            2021-07-30 21:52:03.0157372
CustomerA   Pipeline C     Success         2021-08-13 13:16:03.0015617
CustomerB   Pipeline A     Success         2021-08-13 12:57:03.0099138
CustomerB   Pipeline A     Fail            2021-08-13 12:59:03.0073331   00:01:59.9974193
CustomerB   Pipeline B     Fail            2021-07-30 03:33:03.0123262
CustomerB   Pipeline B     Success         2021-08-13 13:16:03.0015297
CustomerB   Pipeline C     Fail            2021-08-13 12:57:03.0099499
CustomerB   Pipeline C     Success         2021-08-13 13:16:03.0016348
CustomerC   Pipeline A     Fail            2021-08-13 12:59:03.0074113
CustomerC   Pipeline A     Fail            2021-08-13 13:16:03.0016999   00:16:59.9942886
CustomerC   Pipeline B     Fail            2021-08-11 06:54:03.0118628
CustomerC   Pipeline B     Success         2021-08-13 10:56:03.0075546
CustomerC   Pipeline C     Fail            2021-08-13 12:59:03.0072337
CustomerC   Pipeline C     Fail            2021-08-13 13:16:03.0016233   00:16:59.9943896
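
Since the question asks for the difference in days/hours, the timespan in the result column can be turned into a number by dividing it by a unit timespan. This is only a minimal sketch: the column names gap, gap_hours and gap_days are made up for illustration, and the two timestamps are rounded versions of the first CustomerA / Pipeline A pair.

```kusto
// Subtracting two datetime values yields a timespan.
print gap = datetime(2021-08-13 12:59:03) - datetime(2021-08-13 09:59:03)
// Dividing a timespan by a timespan yields a real number,
// so 1h and 1d act as the units of the result.
| extend gap_hours = gap / 1h,
         gap_days  = gap / 1d
```

The same division should work on the result column of the query above, for example by appending `| extend resultHours = result / 1h`.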

Update: in reply to your comment, just add the appropriate filters.

For example:

datatable(customer:string, PipelineType:string, PipelineState:string, TimeStamp:datetime)
[
    'CustomerA', 'Pipeline A', 'Fail',    datetime(2021-08-13 12:59:03.0073653),
    'CustomerA', 'Pipeline A', 'Fail',    datetime(2021-08-13 09:59:03.0124853),
    'CustomerA', 'Pipeline B', 'Success', datetime(2021-08-13 11:56:03.0151948),
    'CustomerA', 'Pipeline B', 'Fail',    datetime(2021-08-12 17:56:03.0019445),
    'CustomerA', 'Pipeline C', 'Success', datetime(2021-08-13 13:16:03.0015617),
    'CustomerA', 'Pipeline C', 'Fail',    datetime(2021-07-30 21:52:03.0157372),
    'CustomerB', 'Pipeline A', 'Fail',    datetime(2021-08-13 12:59:03.0073331),
    'CustomerB', 'Pipeline A', 'Success', datetime(2021-08-13 12:57:03.0099138),
    'CustomerB', 'Pipeline B', 'Fail',    datetime(2021-07-30 03:33:03.0123262),
    'CustomerB', 'Pipeline B', 'Success', datetime(2021-08-13 13:16:03.0015297),
    'CustomerB', 'Pipeline C', 'Fail',    datetime(2021-08-13 12:57:03.0099499),
    'CustomerB', 'Pipeline C', 'Success', datetime(2021-08-13 13:16:03.0016348),
    'CustomerC', 'Pipeline A', 'Fail',    datetime(2021-08-13 13:16:03.0016999),
    'CustomerC', 'Pipeline A', 'Fail',    datetime(2021-08-13 12:59:03.0074113),
    'CustomerC', 'Pipeline B', 'Success', datetime(2021-08-13 10:56:03.0075546),
    'CustomerC', 'Pipeline B', 'Fail',    datetime(2021-08-11 06:54:03.0118628),
    'CustomerC', 'Pipeline C', 'Fail',    datetime(2021-08-13 13:16:03.0016233),
    'CustomerC', 'Pipeline C', 'Fail',    datetime(2021-08-13 12:59:03.0072337),
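]
| order by customer asc, PipelineType asc, TimeStamp asc
// Filter out two kinds of rows before computing the difference:
//   1. a Success whose previous row is a Fail of the same customer and pipeline;
//   2. a Fail that is not the first row of its customer/pipeline group and whose next row is a Success.
| where not((prev(customer) == customer and prev(PipelineType) == PipelineType and PipelineState == 'Success' and prev(PipelineState) == 'Fail') or 
            (prev(customer) == customer and prev(PipelineType) == PipelineType and PipelineState == 'Fail' and next(PipelineState) == 'Success'))
// Same calculation as before, applied to the rows that remain.
| extend result = case(prev(customer) == customer and prev(PipelineType) == PipelineType and PipelineState == 'Fail', TimeStamp - prev(TimeStamp), timespan(null))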

customer    PipelineType   PipelineState   TimeStamp                     result
CustomerA   Pipeline A     Fail            2021-08-13 09:59:03.0124853
CustomerA   Pipeline A     Fail            2021-08-13 12:59:03.0073653   02:59:59.9948800
CustomerA   Pipeline B     Fail            2021-08-12 17:56:03.0019445
CustomerA   Pipeline C     Fail            2021-07-30 21:52:03.0157372
CustomerB   Pipeline A     Success         2021-08-13 12:57:03.0099138
CustomerB   Pipeline A     Fail            2021-08-13 12:59:03.0073331   00:01:59.9974193
CustomerB   Pipeline B     Fail            2021-07-30 03:33:03.0123262
CustomerB   Pipeline C     Fail            2021-08-13 12:57:03.0099499
CustomerC   Pipeline A     Fail            2021-08-13 12:59:03.0074113
CustomerC   Pipeline A     Fail            2021-08-13 13:16:03.0016999   00:16:59.9942886
CustomerC   Pipeline B     Fail            2021-08-11 06:54:03.0118628
CustomerC   Pipeline C     Fail            2021-08-13 12:59:03.0072337
CustomerC   Pipeline C     Fail            2021-08-13 13:16:03.0016233   00:16:59.9943896
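
If only the rows that actually carry a difference are of interest, one possible variation (not part of the original answer) is to filter on the computed column instead of on the pipeline states. The sketch below uses just the four CustomerA rows for Pipeline A and Pipeline B from the sample data; the helper column sameGroup is a made-up name, and iff() is used in place of the three-argument case().

```kusto
datatable(customer:string, PipelineType:string, PipelineState:string, TimeStamp:datetime)
[
    'CustomerA', 'Pipeline A', 'Fail',    datetime(2021-08-13 12:59:03.0073653),
    'CustomerA', 'Pipeline A', 'Fail',    datetime(2021-08-13 09:59:03.0124853),
    'CustomerA', 'Pipeline B', 'Success', datetime(2021-08-13 11:56:03.0151948),
    'CustomerA', 'Pipeline B', 'Fail',    datetime(2021-08-12 17:56:03.0019445),
]
| order by customer asc, PipelineType asc, TimeStamp asc
// sameGroup (hypothetical helper column): true when the previous row
// belongs to the same customer and pipeline as the current row.
| extend sameGroup = prev(customer) == customer and prev(PipelineType) == PipelineType
// Only a failed run that has a predecessor in its group gets a difference.
| extend result = iff(sameGroup and PipelineState == 'Fail', TimeStamp - prev(TimeStamp), timespan(null))
// Keep just the rows for which a difference was computed.
| where isnotnull(result)
| project customer, PipelineType, PipelineState, TimeStamp, result
```

Because prev() is evaluated in the extend steps before the filter runs, the differences are still measured against the full, unfiltered sequence of runs.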
