
CouchDB View - Filter and Group By on Key Array

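Question

I have an array key in a CouchDB view: [doc.time, doc.address]. Neither element is unique. doc.time is a UNIX timestamp and doc.address is a string. The reduce function is set to _sum, because the only value emitted for each key is a number.

What I want is to filter by doc.time and then group the remaining records by doc.address. If I put doc.time first in the key, I can't seem to group by unique address no matter what I specify as group_level. If I put doc.address first, I can't seem to filter the query by time.

Two examples

Query: ?group_level=1&startkey=[0,1230000000]&endkey=[{},1340000000]

Key order: doc.address before doc.time

Problem: does not filter by time

Code: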

rows: [
  {
    key: [ "1126GDuGLQTX3LFHHmjCctdn8WKDjn7QNA" ],
    value: 50
  },
  {
    key: [ "112AobLhjLJQ3LGqXFrsdnWMPqWCQqoiS6" ],
    value: 50
  }
]

Query: ?group_level=1&startkey=[1230000000]&endkey=[1340000000,{}]

Key order: doc.time before doc.address

Problem: as you can see, the rows are not grouped by doc.address

Code:

rows: [
  {
    key: [ 1231469665 ],
    value: 50
  },
  {
    key: [ 1231469744 ],
    value: 50
  }
]

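You mentioned:

...If I put doc.time first in the key, I can't seem to group by unique address no matter what I specify as group_level...

The query parameter group_level=N groups rows on the first N elements of the array key. You can picture it as cutting the key at the Nth comma and grouping the rows whose left-hand part matches. So when your array key is [doc.time, doc.address], you cannot group by address alone, because address never appears on the left-hand side of the cut without time in front of it.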
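To make the grouping rule concrete, here is a minimal sketch in plain JavaScript that imitates what group_level does to already-sorted rows with a _sum-style reduce. It does not talk to CouchDB; the function name groupByLevel and the sample rows (with made-up addresses addrA and addrB) are assumptions for illustration only.

```javascript
// Imitates group_level=N: rows whose keys share the same first N
// array elements are merged into one row (values are summed, like _sum).
function groupByLevel(rows, level) {
  const groups = new Map();
  for (const { key, value } of rows) {
    const prefix = key.slice(0, level);   // the part "left of the Nth comma"
    const id = JSON.stringify(prefix);    // use the prefix as a lookup id
    const entry = groups.get(id) || { key: prefix, value: 0 };
    entry.value += value;
    groups.set(id, entry);
  }
  return [...groups.values()];
}

// Hypothetical rows keyed as [doc.time, doc.address]
const sampleRows = [
  { key: [1231469665, 'addrA'], value: 50 },
  { key: [1231469744, 'addrA'], value: 50 },
  { key: [1231469744, 'addrB'], value: 50 },
];

// Level 1 groups by time only; the address is dropped from the key.
console.log(groupByLevel(sampleRows, 1));
// Level 2 groups by the [time, address] pair, never by address alone.
console.log(groupByLevel(sampleRows, 2));
```

With the time first, neither level yields one row per address: addrA still shows up under two different timestamps at level 2.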

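...If I put doc.address first, I can't seem to filter the query by time...

When your array key looks like [doc.address, doc.time], note that you are emitting an array key from the map function. You need to keep in mind the following about array keys (compound keys) in CouchDB, as described in this reference:

...The first thing to note, and it is very important... the array output of the JavaScript map function... each index key is a string, and keys are ordered character by character as strings, including the brackets and commas...

For compound keys (array keys), that statement has a major impact on how CouchDB indexing works. Strictly speaking, CouchDB compares array keys element by element according to its view collation rules, but for keys like the ones here the resulting order matches the string picture: the first element decides the order, and the second element only matters among keys that share the same first element.

To clarify, let's create documents like the following in a sample database: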

{"time":"2011","address":"CT"}
{"time":"2012","address":"CT"}
...
{"time":"2011","address":"TX"}
...
{"time":"2015","address":"TX"}
...
{"time":"2014","address":"NY"}
...
{"time":"2014","address":"CA"}
{"time":"2015","address":"CA"}
{"time":"2016","address":"CA"}

I implemented the view's map function like this:

function (doc) {
  if(doc.time && doc.address){
    emit([doc.address, doc.time], null);
  }
}

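For now I'm not using any reduce function, so that we can ignore grouping and reduction and focus on plain indexing. The view above generates the following key/value pairs for the index: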

$ curl -k -X GET 'https://admin:****@192.168.1.106:6984/sample/_design/by_addr_time/_view/by_addr_time'
{"total_rows":25,"offset":0,"rows":[
{"id":"doc_0022","key":["CA","2014"],"value":null},
{"id":"doc_0023","key":["CA","2015"],"value":null},
{"id":"doc_0024","key":["CA","2016"],"value":null},
{"id":"doc_0000","key":["CT","2011"],"value":null},
{"id":"doc_0001","key":["CT","2012"],"value":null},
{"id":"doc_0002","key":["CT","2013"],"value":null},
{"id":"doc_0003","key":["CT","2014"],"value":null},
{"id":"doc_0004","key":["CT","2015"],"value":null},
{"id":"doc_0005","key":["CT","2016"],"value":null},
{"id":"doc_0014","key":["NY","2011"],"value":null},
{"id":"doc_0015","key":["NY","2012"],"value":null},
{"id":"doc_0016","key":["NY","2013"],"value":null},
{"id":"doc_0017","key":["NY","2014"],"value":null},
{"id":"doc_0018","key":["NY","2015"],"value":null},
{"id":"doc_0019","key":["NY","2016"],"value":null},
{"id":"doc_0020","key":["NY","2017"],"value":null},
{"id":"doc_0021","key":["NY","2018"],"value":null},
{"id":"doc_0006","key":["TX","2011"],"value":null},
{"id":"doc_0008","key":["TX","2012"],"value":null},
{"id":"doc_0007","key":["TX","2013"],"value":null},
{"id":"doc_0009","key":["TX","2014"],"value":null},
{"id":"doc_0010","key":["TX","2015"],"value":null},
{"id":"doc_0011","key":["TX","2016"],"value":null},
{"id":"doc_0012","key":["TX","2017"],"value":null},
{"id":"doc_0013","key":["TX","2018"],"value":null}
]}

Now I'll run a query that tries to filter the view by doc.time. My query parameters are:

?startkey=["AA","2017"]&endkey=["ZZ","2018"]

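I expect the query above to return only documents whose time field is between 2017 and 2018, with any value in the address field, since the range from AA to ZZ covers every address in the database. I'm running the query with curl as shown below: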

$ curl -k -X GET 'https://admin:****@192.168.1.106:6984/sample/_design/by_addr_time/_view/by_addr_time?startkey=\["AA","2017"\]&endkey=\["ZZ","2018"\]'
{"total_rows":25,"offset":0,"rows":[
{"id":"doc_0022","key":["CA","2014"],"value":null},
{"id":"doc_0023","key":["CA","2015"],"value":null},
{"id":"doc_0024","key":["CA","2016"],"value":null},
{"id":"doc_0000","key":["CT","2011"],"value":null},
{"id":"doc_0001","key":["CT","2012"],"value":null},
{"id":"doc_0002","key":["CT","2013"],"value":null},
{"id":"doc_0003","key":["CT","2014"],"value":null},
{"id":"doc_0004","key":["CT","2015"],"value":null},
{"id":"doc_0005","key":["CT","2016"],"value":null},
{"id":"doc_0014","key":["NY","2011"],"value":null},
{"id":"doc_0015","key":["NY","2012"],"value":null},
{"id":"doc_0016","key":["NY","2013"],"value":null},
{"id":"doc_0017","key":["NY","2014"],"value":null},
{"id":"doc_0018","key":["NY","2015"],"value":null},
{"id":"doc_0019","key":["NY","2016"],"value":null},
{"id":"doc_0020","key":["NY","2017"],"value":null},
{"id":"doc_0021","key":["NY","2018"],"value":null},
{"id":"doc_0006","key":["TX","2011"],"value":null},
{"id":"doc_0008","key":["TX","2012"],"value":null},
{"id":"doc_0007","key":["TX","2013"],"value":null},
{"id":"doc_0009","key":["TX","2014"],"value":null},
{"id":"doc_0010","key":["TX","2015"],"value":null},
{"id":"doc_0011","key":["TX","2016"],"value":null},
{"id":"doc_0012","key":["TX","2017"],"value":null},
{"id":"doc_0013","key":["TX","2018"],"value":null}
]}

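The response to the query above may look shocking, because it did NOT return only the documents whose time field is between 2017 and 2018. That is how CouchDB indexing works for array keys: CouchDB orders the index as if the whole array were a string, including the brackets and commas of the array! Every key whose address sorts between AA and ZZ falls inside the range, whatever its time is, because the time is only compared when the addresses are equal. If you read the reference, it starts to make sense.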
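The behavior can be reproduced without a database. The sketch below is a simplified model, not CouchDB's actual collation code: it assumes every key element is a string and compares elements with plain JavaScript string comparison, which is enough for the upper-case addresses and four-digit years used here. The names compareKeys and rangeQuery are made up for this illustration.

```javascript
// Compare two array keys element by element; the first difference decides.
// Simplification: string elements only, plain code-unit comparison.
function compareKeys(a, b) {
  const n = Math.min(a.length, b.length);
  for (let i = 0; i < n; i++) {
    if (a[i] < b[i]) return -1;
    if (a[i] > b[i]) return 1;
  }
  return a.length - b.length; // a shorter key sorts before its extensions
}

// Return the keys that fall inside [startkey, endkey] in sorted order.
function rangeQuery(keys, startkey, endkey) {
  return keys
    .slice()
    .sort(compareKeys)
    .filter(k => compareKeys(k, startkey) >= 0 && compareKeys(k, endkey) <= 0);
}

// A few of the [address, time] keys from the index above
const indexKeys = [
  ['CA', '2014'], ['CT', '2011'], ['NY', '2017'], ['TX', '2011'], ['TX', '2018'],
];

// Every address lies between "AA" and "ZZ", so the time is never consulted.
const wideRange = rangeQuery(indexKeys, ['AA', '2017'], ['ZZ', '2018']);
// Here the time matters only for the boundary addresses "CT" and "TX".
const narrowRange = rangeQuery(indexKeys, ['CT', '2016'], ['TX', '2011']);

console.log(wideRange.length, narrowRange);
```

In this model the first range keeps all five keys, while the second keeps only the keys that sort between ["CT","2016"] and ["TX","2011"].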

Now let's change the query:

?startkey=["CT","2016"]&endkey=["TX","2011"]

Given the explanation above, the result of this query, shown below, should make sense:

$ curl -k -X GET 'https://admin:****@192.168.1.106:6984/sample/_design/by_addr_time/_view/by_addr_time?startkey=\["CT","2016"\]&endkey=\["TX","2011"\]'
{"total_rows":25,"offset":8,"rows":[
{"id":"doc_0005","key":["CT","2016"],"value":null},
{"id":"doc_0014","key":["NY","2011"],"value":null},
{"id":"doc_0015","key":["NY","2012"],"value":null},
{"id":"doc_0016","key":["NY","2013"],"value":null},
{"id":"doc_0017","key":["NY","2014"],"value":null},
{"id":"doc_0018","key":["NY","2015"],"value":null},
{"id":"doc_0019","key":["NY","2016"],"value":null},
{"id":"doc_0020","key":["NY","2017"],"value":null},
{"id":"doc_0021","key":["NY","2018"],"value":null},
{"id":"doc_0006","key":["TX","2011"],"value":null}
]}

UPDATE

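...What I want is to filter by doc.time and then group the remaining records by doc.address...

So what can we do? There is a good Q&A that provides the basic ideas.

I'm not sure which idea is best, but I implemented one like this: create a view named t_red as shown below, with the built-in _count reduce: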

function (doc) {
  if(doc.time && doc.address){
    emit([doc.time, doc.address], null);
  }
}

I also created a view named a_red, again with the built-in _count reduce:

function (doc) {
  if(doc.address && doc.time){
    emit([doc.address, doc.time], null);
  }
}
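For readers who want to see where the map function and the built-in reduce end up, the following sketch builds the JSON body of a design document for each view. It assumes, as the query URLs in this answer suggest, that each view lives in a design document of the same name. The helper buildDesignDoc is hypothetical, and the sketch only constructs the objects; storing them (for example with a PUT to /sample/_design/t_red) is left out.

```javascript
// Hypothetical helper: builds a design document with a single view that
// uses the given map function and the built-in _count reduce.
function buildDesignDoc(name, mapFn) {
  return {
    _id: `_design/${name}`,
    views: {
      [name]: {
        map: mapFn.toString(), // CouchDB stores the map function as source text
        reduce: '_count',
      },
    },
  };
}

// The functions are only stringified here, never called, so the
// CouchDB-provided emit() does not need to exist in Node.
const tRedDoc = buildDesignDoc('t_red', function (doc) {
  if (doc.time && doc.address) {
    emit([doc.time, doc.address], null);
  }
});

const aRedDoc = buildDesignDoc('a_red', function (doc) {
  if (doc.address && doc.time) {
    emit([doc.address, doc.time], null);
  }
});

console.log(JSON.stringify(tRedDoc, null, 2));
console.log(JSON.stringify(aRedDoc, null, 2));
```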

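Then I wrote the following NodeJS code to query for doc.time between 2012 and 2015 and group the results by doc.address. The console output is shown as comments inside the code. I hope this code helps (and doesn't confuse!):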

process.env.NODE_TLS_REJECT_UNAUTHORIZED = "0"; // Skip certificate verification, because the CouchDB SSL certificate is self-signed

const fetch=require('node-fetch')

// query "t_red" view/index
fetch(`https://admin:****@192.168.1.106:6984/sample/_design/t_red/_view/t_red?group_level=2&startkey=["2012", "AA"]&endkey=["2015", "ZZ"]`, {
    method: 'GET',
    headers: {
        'Content-Type': 'application/json',
    }
}).then(
    res=>res.json()
).then(data=>{
    let unique_addr=[]
    data.rows.map(row=>{
        console.log('row.key-> ', row.key, '  row.value-> ', row.value)

        // console log is shown below:
        //
        // row.key->  [ '2012', 'CT' ]   row.value->  1
        // row.key->  [ '2012', 'NY' ]   row.value->  1
        // row.key->  [ '2012', 'TX' ]   row.value->  1
        // row.key->  [ '2013', 'CT' ]   row.value->  1
        // row.key->  [ '2013', 'NY' ]   row.value->  1
        // row.key->  [ '2013', 'TX' ]   row.value->  1
        // row.key->  [ '2014', 'CA' ]   row.value->  1
        // row.key->  [ '2014', 'CT' ]   row.value->  1
        // row.key->  [ '2014', 'NY' ]   row.value->  1
        // row.key->  [ '2014', 'TX' ]   row.value->  1
        // row.key->  [ '2015', 'CA' ]   row.value->  1
        // row.key->  [ '2015', 'CT' ]   row.value->  1
        // row.key->  [ '2015', 'NY' ]   row.value->  1
        // row.key->  [ '2015', 'TX' ]   row.value->  1

        if(unique_addr.indexOf(row.key[1])==-1){ // Push unique addresses into an array
            unique_addr.push(row.key[1])
        }
    })

    console.log(unique_addr)

    // Console log is shown below:
    //
    // [ 'CT', 'NY', 'TX', 'CA' ]

    return unique_addr

}).then(unique_addr=>{

    // Group the unique addresses
    let group_by_address=unique_addr.map(addr=>{
        // For each unique address, do a query of "a_red" view/index
        return fetch(`https://admin:****@192.168.1.106:6984/sample/_design/a_red/_view/a_red?group_level=2&startkey=["${addr}","2012"]&endkey=["${addr}","2015"]`, {
            method: 'GET',
            headers: {
                'Content-Type': 'application/json',
            }
        }).then(
            res=>res.json()
        ).then(data=>{
            data.rows.map(row=>{console.log('row.key-> ', row.key, '  row.value-> ', row.value)})

            // Console logs related to this section of code are shown below

            //row.key->  [ 'CA', '2014' ]   row.value->  1
            //row.key->  [ 'CA', '2015' ]   row.value->  1

            //row.key->  [ 'NY', '2012' ]   row.value->  1
            //row.key->  [ 'NY', '2013' ]   row.value->  1
            //row.key->  [ 'NY', '2014' ]   row.value->  1
            //row.key->  [ 'NY', '2015' ]   row.value->  1

            //row.key->  [ 'CT', '2012' ]   row.value->  1
            //row.key->  [ 'CT', '2013' ]   row.value->  1
            //row.key->  [ 'CT', '2014' ]   row.value->  1
            //row.key->  [ 'CT', '2015' ]   row.value->  1

            //row.key->  [ 'TX', '2012' ]   row.value->  1
            //row.key->  [ 'TX', '2013' ]   row.value->  1
            //row.key->  [ 'TX', '2014' ]   row.value->  1
            //row.key->  [ 'TX', '2015' ]   row.value->  1

            let obj={}
            obj[addr]=data.rows.length // This object contains unique address and its corresponding frequency in above query
            return obj

        }).catch(err=>{
            console.log('err-> ', err)
        })
    })

    return group_by_address

}).then(group_by_address=>{
    group_by_address.map(group=>{
        group.then(()=>{
            console.log('Grouped by address-> ', group)

            // Console logs related this section of code are shown below:

            //Grouped by address->  Promise { { CA: 2 } }

            //Grouped by address->  Promise { { NY: 4 } }

            //Grouped by address->  Promise { { CT: 4 } }

            //Grouped by address->  Promise { { TX: 4 } }
        })
    })
}).catch(err=>{
    console.log('err-> ', err)
})
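As a possible simplification, the rows returned by the first query (the time-first view with group_level=2) already carry both the address and a count, so the per-address totals could also be computed on the client without a second round of requests. This is a sketch of that idea rather than part of the original approach: the function name countByAddress is an assumption, and the input rows are copied from the console output logged above.

```javascript
// Sum the per-[time, address] counts into one total per address.
function countByAddress(rows) {
  const totals = {};
  for (const { key, value } of rows) {
    const address = key[1]; // key is [time, address]
    totals[address] = (totals[address] || 0) + value;
  }
  return totals;
}

// Rows as logged for the time-first view with times "2012" to "2015"
const timeFirstRows = [
  { key: ['2012', 'CT'], value: 1 },
  { key: ['2012', 'NY'], value: 1 },
  { key: ['2012', 'TX'], value: 1 },
  { key: ['2013', 'CT'], value: 1 },
  { key: ['2013', 'NY'], value: 1 },
  { key: ['2013', 'TX'], value: 1 },
  { key: ['2014', 'CA'], value: 1 },
  { key: ['2014', 'CT'], value: 1 },
  { key: ['2014', 'NY'], value: 1 },
  { key: ['2014', 'TX'], value: 1 },
  { key: ['2015', 'CA'], value: 1 },
  { key: ['2015', 'CT'], value: 1 },
  { key: ['2015', 'NY'], value: 1 },
  { key: ['2015', 'TX'], value: 1 },
];

const totalsByAddress = countByAddress(timeFirstRows);
console.log(totalsByAddress);
```

The totals agree with the per-address counts logged above (CA: 2, NY: 4, CT: 4, TX: 4). The trade-off is that the grouping happens in application code, so the whole filtered result has to be transferred to the client.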
