简体   繁体   中英

Kusto - Materialized View on latest version of rows

I have a Table called 'metadata', that contains a list of Parameter and ParamterValue that is partitioned by a TestId. Everytime a Test is changed, the Test will be reingested to Azure Data Explorer with a never Version.

My overall goal is to:

  1. Define a Function (GetTestsFromSearch) that takes a Parameter (dynamic) of key value pairs, that lets me Query all Tests (Of latest version) for a match of the Key/Value pairs:
( {{"Search param1", "Search value1"},{"Search param2", "Search value2"}}

Example

GetTestsFromSearch({{"ProjectId", "SturnProject"},{"Product Name", "Nacelle "}})

Should return

TestId Version
fc76aa10-5cf8-447e-95f6-3bd801ef2ed0 3
ea5b688c-b61f-4c5b-bb87-af2eac94d454 1

from the example metadata.table below

  1. Another Goal is to Create a Materialized View that contains only latest metadata for each Test (Explained below table)

Example of metadata.table

TestId TestName Parameter ParameterValue Version
fc76aa10-5cf8-447e-95f6-3bd801ef2ed0 MyTest ProjectId SturnProject 1
fc76aa10-5cf8-447e-95f6-3bd801ef2ed0 MyTest Product Category 2MW 1
fc76aa10-5cf8-447e-95f6-3bd801ef2ed0 MyTest Project Start Date 2022-02-03 1
fc76aa10-5cf8-447e-95f6-3bd801ef2ed0 MyTest ProjectId SturnProject 2
fc76aa10-5cf8-447e-95f6-3bd801ef2ed0 MyTest Product Category 2MW 2
fc76aa10-5cf8-447e-95f6-3bd801ef2ed0 MyTest Project Start Date 2022-02-03 2
fc76aa10-5cf8-447e-95f6-3bd801ef2ed0 MyTest ProjectId SturnProject 3
fc76aa10-5cf8-447e-95f6-3bd801ef2ed0 MyTest Product Category 2MW 3
fc76aa10-5cf8-447e-95f6-3bd801ef2ed0 MyTest Project Start Date 2022-02-03 3
ea5b688c-b61f-4c5b-bb87-af2eac94d454 MyTest ProjectId SturnProject 1
ea5b688c-b61f-4c5b-bb87-af2eac94d454 MyTest Project State Open 1
ea5b688c-b61f-4c5b-bb87-af2eac94d454 MyTest Product Name Nacelle 1

Over time there will be thousands of Tests in several different Versions, and hence I anticipate, that it would be a good idea to create a Materialized View , that only maintains the Latest Versions of each Test - I have tried to create the view as:

metadata
| summarize arg_max(Version,*) by TestId

But this only gives me one Parameter and Parameter Value for each TestId/Version, not the entire result set of the Test.

  • Can anyone point me in the right direction for this materialized view?

I have included an example of a metadata.table as DataTable, which can be used in kusto directly.

Metadata Table as DataTable

datatable (TestId: string, Name: string, Parameter: string, ParameterValue: string, Version: int) [
    "fc76aa10-5cf8-447e-95f6-3bd801ef2ed0","ANFRB-FILEVIEW-TEST","Test Report, DMS number","1234-231",int(3),
    "fc76aa10-5cf8-447e-95f6-3bd801ef2ed0","ANFRB-FILEVIEW-TEST","Project name","Thor3",int(3),
    "fc76aa10-5cf8-447e-95f6-3bd801ef2ed0","ANFRB-FILEVIEW-TEST","GTRS reference","gtrs",int(3),
    "fc76aa10-5cf8-447e-95f6-3bd801ef2ed0","ANFRB-FILEVIEW-TEST","Product Category","2MW",int(3),
    "fc76aa10-5cf8-447e-95f6-3bd801ef2ed0","ANFRB-FILEVIEW-TEST","Project number","TE-12321",int(3),
    "fc76aa10-5cf8-447e-95f6-3bd801ef2ed0","ANFRB-FILEVIEW-TEST","DUT responsible person","ANFRB3",int(3),
    "fc76aa10-5cf8-447e-95f6-3bd801ef2ed0","ANFRB-FILEVIEW-TEST","Test execution person","ANFRB3",int(3),
    "fc76aa10-5cf8-447e-95f6-3bd801ef2ed0","ANFRB-FILEVIEW-TEST","Project Manager","ANFRB3",int(3),
    "fc76aa10-5cf8-447e-95f6-3bd801ef2ed0","ANFRB-FILEVIEW-TEST","DVPR, DMS number","1234-1234",int(3),
    "fc76aa10-5cf8-447e-95f6-3bd801ef2ed0","ANFRB-FILEVIEW-TEST","DVRE, DMS number","1231-1213",int(3),
    "fc76aa10-5cf8-447e-95f6-3bd801ef2ed0","ANFRB-FILEVIEW-TEST","Test Start Date","2022-02-23T00:00:00.0000000Z",int(3),
    "fc76aa10-5cf8-447e-95f6-3bd801ef2ed0","ANFRB-FILEVIEW-TEST","Test Category","Verification safety",int(3),
    "fc76aa10-5cf8-447e-95f6-3bd801ef2ed0","ANFRB-FILEVIEW-TEST","GTRS reference","gtrs",int(2),
    "fc76aa10-5cf8-447e-95f6-3bd801ef2ed0","ANFRB-FILEVIEW-TEST","Project number","TE-12321",int(2),
    "fc76aa10-5cf8-447e-95f6-3bd801ef2ed0","ANFRB-FILEVIEW-TEST","DUT responsible person","ANFRB2",int(2),
    "fc76aa10-5cf8-447e-95f6-3bd801ef2ed0","ANFRB-FILEVIEW-TEST","Test execution person","ANFRB2",int(2),
    "fc76aa10-5cf8-447e-95f6-3bd801ef2ed0","ANFRB-FILEVIEW-TEST","Project Manager","ANFRB2",int(2),
    "fc76aa10-5cf8-447e-95f6-3bd801ef2ed0","ANFRB-FILEVIEW-TEST","DVPR, DMS number","1234-1234",int(2),
    "fc76aa10-5cf8-447e-95f6-3bd801ef2ed0","ANFRB-FILEVIEW-TEST","DVRE, DMS number","1231-1213",int(2),
    "fc76aa10-5cf8-447e-95f6-3bd801ef2ed0","ANFRB-FILEVIEW-TEST","Test Start Date","2022-02-23T00:00:00.0000000Z",int(2),
    "fc76aa10-5cf8-447e-95f6-3bd801ef2ed0","ANFRB-FILEVIEW-TEST","Test Category","Verification safety",int(2),
    "fc76aa10-5cf8-447e-95f6-3bd801ef2ed0","ANFRB-FILEVIEW-TEST","Product Category","2MW",int(2),
    "fc76aa10-5cf8-447e-95f6-3bd801ef2ed0","ANFRB-FILEVIEW-TEST","Project name","Thor3",int(2),
    "fc76aa10-5cf8-447e-95f6-3bd801ef2ed0","ANFRB-FILEVIEW-TEST","Test Report, DMS number","1234-231",int(2),
    "fc76aa10-5cf8-447e-95f6-3bd801ef2ed0","ANFRB-FILEVIEW-TEST","Test Category","Verification safety",int(1),
    "fc76aa10-5cf8-447e-95f6-3bd801ef2ed0","ANFRB-FILEVIEW-TEST","Project Manager","ANFRB",int(1),
    "fc76aa10-5cf8-447e-95f6-3bd801ef2ed0","ANFRB-FILEVIEW-TEST","GTRS reference","gtrs",int(1),
    "fc76aa10-5cf8-447e-95f6-3bd801ef2ed0","ANFRB-FILEVIEW-TEST","Product Category","2MW",int(1),
    "fc76aa10-5cf8-447e-95f6-3bd801ef2ed0","ANFRB-FILEVIEW-TEST","Project name","Thor3",int(1),
    "fc76aa10-5cf8-447e-95f6-3bd801ef2ed0","ANFRB-FILEVIEW-TEST","Project number","TE-12321",int(1),
    "fc76aa10-5cf8-447e-95f6-3bd801ef2ed0","ANFRB-FILEVIEW-TEST","DUT responsible person","ANFRB",int(1),
    "fc76aa10-5cf8-447e-95f6-3bd801ef2ed0","ANFRB-FILEVIEW-TEST","Test execution person","ANFRB",int(1),
    "fc76aa10-5cf8-447e-95f6-3bd801ef2ed0","ANFRB-FILEVIEW-TEST","DVPR, DMS number","1234-1234",int(1),
    "fc76aa10-5cf8-447e-95f6-3bd801ef2ed0","ANFRB-FILEVIEW-TEST","Test Report, DMS number","1234-231",int(1),
    "fc76aa10-5cf8-447e-95f6-3bd801ef2ed0","ANFRB-FILEVIEW-TEST","DVRE, DMS number","1231-1213",int(1),
    "fc76aa10-5cf8-447e-95f6-3bd801ef2ed0","ANFRB-FILEVIEW-TEST","Test Start Date","2022-02-23T00:00:00.0000000Z",int(1),
    "ea5b688c-b61f-4c5b-bb87-af2eac94d454","ANFRB2-TEST","GTRS reference","gtrs232",int(1),
    "ea5b688c-b61f-4c5b-bb87-af2eac94d454","ANFRB2-TEST","Product Category","4MW",int(1),
    "ea5b688c-b61f-4c5b-bb87-af2eac94d454","ANFRB2-TEST","Project name","Myproject",int(1),
    "ea5b688c-b61f-4c5b-bb87-af2eac94d454","ANFRB2-TEST","Project number","43324534",int(1),
    "ea5b688c-b61f-4c5b-bb87-af2eac94d454","ANFRB2-TEST","DUT responsible person","ANFRB",int(1),
    "ea5b688c-b61f-4c5b-bb87-af2eac94d454","ANFRB2-TEST","Test execution person","ANFRB",int(1),
    "ea5b688c-b61f-4c5b-bb87-af2eac94d454","ANFRB2-TEST","Project Manager","ANFRB",int(1),
    "ea5b688c-b61f-4c5b-bb87-af2eac94d454","ANFRB2-TEST","DVPR, DMS number","435123454",int(1),
    "ea5b688c-b61f-4c5b-bb87-af2eac94d454","ANFRB2-TEST","Test Report, DMS number","123123123",int(1),
    "ea5b688c-b61f-4c5b-bb87-af2eac94d454","ANFRB2-TEST","DVRE, DMS number","12312312312",int(1),
    "ea5b688c-b61f-4c5b-bb87-af2eac94d454","ANFRB2-TEST","Test Start Date","2022-03-01T00:00:00.0000000Z",int(1),
    "ea5b688c-b61f-4c5b-bb87-af2eac94d454","ANFRB2-TEST","Test Category","Verification functionality",int(1),
    "ea5b688c-b61f-4c5b-bb87-af2eac94d454","ANFRB2-TEST","Test facility","CHE",int(1),
    "ea5b688c-b61f-4c5b-bb87-af2eac94d454","ANFRB2-TEST","Test rig","rig23",int(1),
    "ea5b688c-b61f-4c5b-bb87-af2eac94d454","ANFRB2-TEST","Sample ID","1",int(1),
    "ea5b688c-b61f-4c5b-bb87-af2eac94d454","ANFRB2-TEST","Link to test data","asdfsafdsdfa",int(1)
]

Thanks

If most tests include same properties (as in your example), you can consider changing the schema to a wide schema, in which each run ( Version ) of a TestId and Name is a single record. The result schema would look like the output of the following:

**datatable**
| extend pack(Parameter, ParameterValue)
| summarize make_bag(Column1) by TestId, Name, Version
| evaluate bag_unpack(bag_Column1)

Then, you can set up a materialized view with the following aggregation that will provide what you're looking for IIUC:

T | summarize arg_max(Version, *) by TestId, Name

To switch from the schema in your example to the suggested one, you can either change your ingestion pipeline to ingest in new schema format, or use an update policy for the transformation. If you choose the latter, avoid using bag_unpack plugin in the update policy function. Instead, project the columns you need explicitly, to avoid a non-deterministic schema.

Another alternative, is keeping all properties in single dynamic column, as in the result of:

**datatable**
| extend pack(Parameter, ParameterValue)
| summarize make_bag(Column1) by TestId, Name, Version

And using same materialized view definition as above.

For the 1st question - using the 2nd suggested schema, you can try something like the following:

let GetTestsFromSearch = (Filter:dynamic)
{
    T
    | extend pack(Parameter, ParameterValue)
    | summarize Properties = make_bag(Column1) by TestId, Name, Version  
    | summarize arg_max(Version, *) by TestId, Name
    | extend Filter
    | mv-apply Filter on 
    ( 
        extend key=tostring(bag_keys(Filter)[0]) 
        | extend expected = tostring(Filter[key]), actual = tostring(Properties[key])
        | summarize count(), countif(actual == expected)
        | where count_  == countif_
    )
};
GetTestsFromSearch(dynamic({"Test Category" : "Verification safety", "Project name" : "Thor3"}));
TestId Name Version Properties count_ countif_
fc76aa10-5cf8-447e-95f6-3bd801ef2ed0 ANFRB-FILEVIEW-TEST 3 {
"Test Report, DMS number": "1234-231",
"Project name": "Thor3",
"GTRS reference": "gtrs",
"Product Category": "2MW",
"Project number": "TE-12321",
"DUT responsible person": "ANFRB3",
"Test execution person": "ANFRB3",
"Project Manager": "ANFRB3",
"DVPR, DMS number": "1234-1234",
"DVRE, DMS number": "1231-1213",
"Test Start Date": "2022-02-23T00:00:00.0000000Z",
"Test Category": "Verification safety"
}
2 2

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM