How to fetch the table or view metadata from KDB and save it to a data structure?

I have been trying to fetch metadata from a KDB+ database using Python. I installed a library called qpython and use it to connect to and query the database.

I want to store the metadata for all the columns of a table/view in a KDB+ database using Python. Despite trying many different approaches, I have been unable to separate out the metadata part.

For example, I tried casting the output to a list/tuple, iterating over it with for, and so on.

from qpython import qconnection

def fetch_metadata_from_kdb(params):
    try:
        kdb_connection_obj = qconnection.QConnection(host=params['host'], port=params['port'], username=params['username'], password=params['password'])
        kdb_connection_obj.open()
        PREDICATE = "meta[{}]".format(params['table'])
        metadata = kdb_connection_obj(PREDICATE)
        kdb_connection_obj.close()
        return metadata

    except Exception as error_msg: 
        return error_msg

def fetch_tables_from_kdb(params):
    try:
        kdb_connection_obj = qconnection.QConnection(host=params['host'], port=params['port'], username=params['username'], password=params['password'])
        kdb_connection_obj.open()
        tables = kdb_connection_obj("tables[]")
        views = kdb_connection_obj("views[]")
        kdb_connection_obj.close()
        return [table.decode() for table in list(tables)], [view.decode() for view in list(views)]

    except Exception as error_msg:
        return error_msg

parms_q = {'host':'localhost', 'port':5010,
           'username':'kdb', 'password':'kdb', 'table':'testing'}

print("fetch_tables_from_kdb:", fetch_tables_from_kdb(parms_q), "\n")
print("fetch_metadata_from_kdb:", fetch_metadata_from_kdb(parms_q), "\n")

The output I currently get is as follows:

fetch_tables_from_kdb: (['testing'], ['viewname']) 

fetch_metadata_from_kdb: [(b'time',) (b'sym',) (b'price',) (b'qty',)]![(b'p', b'', b'') (b's', b'', b'') (b'f', b'', b'') (b'j', b'', b'')] 

I am not able to separate the column part from the metadata part. How can I store only the metadata for the corresponding columns of a table/view in KDB using Python?

The metadata returned from kdb is correct, but it is displayed in Python in a kdb dictionary format, which I agree is not very useful.

If you pass the pandas=True flag into your QConnection call, qPython will parse kdb data structures, such as tables, into pandas data structures or sensible Python types, which looks more useful in your case.

Please see the example below. kdb setup (all on localhost):

$ q -p 5000
q)testing:([]date:.z.d+0 1 2;`g#sym:`abc`def`ghi;num:`s#10 20 30)
q)testing
date       sym num
------------------
2022.01.31 abc 10
2022.02.01 def 20
2022.02.02 ghi 30
q)meta testing
c   | t f a
----| -----
date| d
sym | s   g
num | j   s

Python code:

from qpython import qconnection

#create and open 2 connections to the kdb process - one without the pandas flag and one with it
q = qconnection.QConnection(host="localhost", port=5000)
qpandas = qconnection.QConnection(host="localhost", port=5000, pandas=True)
q.open()
qpandas.open()

#see what is returned with a q table 
print(q("testing"))
[(8066, b'abc', 10) (8067, b'def', 20) (8068, b'ghi', 30)]
#the data is a qPython data object
type(q("testing"))
qpython.qcollection.QTable

#whereas using the pandas=True flag a dataframe is returned.
print(qpandas("testing"))
        date     sym  num
0 2022-01-31  b'abc'   10
1 2022-02-01  b'def'   20
2 2022-02-02  b'ghi'   30

#This is the same for the meta of a table
print(q("meta testing"))
[(b'date',) (b'sym',) (b'num',)]![(b'd', b'', b'') (b's', b'', b'g') (b'j', b'', b's')]

print(qpandas("meta testing"))
         t    f     a
c
b'date'  d  b''   b''
b'sym'   s  b''  b'g'
b'num'   j  b''  b's'

With the above you can now access the columns and rows using pandas (b'num' etc. is how qPython represents a q symbol such as `num).
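As a minimal sketch of that access, the helper below flattens a meta DataFrame shaped like the one printed above into a plain dictionary. The name meta_frame_to_dict and the output keys are illustrative, not part of qPython, and the sample frame is built by hand as a stand-in for qpandas("meta testing"). Values are decoded only when they arrive as bytes, since that may differ between columns.

```python
import pandas as pd


def _to_text(value):
    """Decode bytes (how qPython returns q symbols); pass other values through as str."""
    return value.decode() if isinstance(value, bytes) else str(value)


def meta_frame_to_dict(meta_df):
    """Turn a meta DataFrame (index c, columns t, f, a) into {column: {...}}."""
    result = {}
    for col_name, row in meta_df.iterrows():
        result[_to_text(col_name)] = {
            "type": _to_text(row["t"]),
            "foreign_key": _to_text(row["f"]),
            "attribute": _to_text(row["a"]),
        }
    return result


# Hand-built stand-in for qpandas("meta testing"); not fetched from a live kdb+ process.
sample_meta = pd.DataFrame(
    {"t": ["d", "s", "j"], "f": [b"", b"", b""], "a": [b"", b"g", b"s"]},
    index=pd.Index([b"date", b"sym", b"num"], name="c"),
)

print(meta_frame_to_dict(sample_meta))
```

With a live connection, the hand-built frame would be replaced by the result of the meta query.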

You can also now use DataFrame.info() to extract datatypes if you are more interested in the Python data structure than in the kdb data structure/types. qPython converts the q types to sensible Python types automatically.

qpandas("testing").info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 3 entries, 0 to 2
Data columns (total 3 columns):
 #   Column  Non-Null Count  Dtype
---  ------  --------------  -----
 0   date    3 non-null      datetime64[ns]
 1   sym     3 non-null      object
 2   num     3 non-null      int64
dtypes: datetime64[ns](1), int64(1), object(1)
memory usage: 200.0+ bytes
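DataFrame.info() only prints its summary. If the dtypes are wanted as data, DataFrame.dtypes exposes them as a Series. A small sketch, using a hand-built frame in place of qpandas("testing") (the helper name dtype_map is illustrative):

```python
import pandas as pd


def dtype_map(df):
    """Return {column name: dtype name} for a DataFrame."""
    return {col: str(dtype) for col, dtype in df.dtypes.items()}


# Hand-built stand-in for the frame qPython returns for the `testing` table.
sample = pd.DataFrame(
    {
        "date": pd.to_datetime(["2022-01-31", "2022-02-01", "2022-02-02"]),
        "sym": [b"abc", b"def", b"ghi"],
        "num": [10, 20, 30],
    }
)

print(dtype_map(sample))
```

Note that these are pandas dtypes, not q type chars; the q types still come from meta.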

In the meantime, I checked quite a bit of the KDB documentation and found that meta returns the following columns as output. You can see that in the kdb metadata documentation.

c | t f a

c: column name; t: data type (type char); f: foreign key association; a: attributes associated with the column

We can access the metadata object (<class 'qpython.qcollection.QKeyedTable'>) by iterating over it with a for loop, as shown below:

from qpython import qconnection

def fetch_metadata_from_kdb(params):
    try:
        col_list, metadata_list = [], []
        kdb_connection_obj = qconnection.QConnection(host=params['host'], port=params['port'], username=params['username'], password=params['password'])
        kdb_connection_obj.open()
        PREDICATE = "meta[{}]".format(params['table'])

        ############# FOR LOOP ##############
        for i,j in kdb_connection_obj(PREDICATE).items():
            col_list.append(i[0].decode())
            metadata_list.append(j[0].decode())

        kdb_connection_obj.close()
        return col_list, metadata_list

    except Exception as error_msg: 
        return error_msg
    
parms_q = {'host':'localhost', 'port':5010,
           'username':'kdb', 'password':'kdb', 'table':'testing'}
           
print(fetch_metadata_from_kdb(parms_q))

Output: (['time', 'sym', 'price', 'qty'], ['p', 's', 'f', 'j'])

I also got the KDB char types / q data types from the documentation. Below is an implementation that uses them:

import pandas as pd
from qpython import qconnection

kdb_type_char_dict = dict()

df = pd.read_html('https://code.kx.com/q4m3/2_Basic_Data_Types_Atoms/')[1].iloc[:17, 0:3][['Type', 'CharType']]
for i, j in zip(df.iloc[:, 0], df.iloc[:, 1]): kdb_type_char_dict[str(j)] = str(i)

####### Q DATA TYPES DICTIONARY #######
print("Char types / q data types dictionary:", kdb_type_char_dict)

def fetch_metadata_from_kdb(params):
    try:
        col_list, metadata_list, temp_list = [], [], []
        kdb_connection_obj = qconnection.QConnection(host=params['host'], port=params['port'],
                                                     username=params['username'], password=params['password'])
        kdb_connection_obj.open()
        PREDICATE = "meta[{}]".format(params['table'])
        # keys hold the column names, values hold (t, f, a); keep the type char only
        for i, j in kdb_connection_obj(PREDICATE).items():
            col_list.append(i[0].decode())
            temp_list.append(j[0].decode())
        # translate each type char into its q data type name
        for i in temp_list:
            metadata_list.append(kdb_type_char_dict[i])
        kdb_connection_obj.close()
        return col_list, metadata_list

    except Exception as error_msg:
        return error_msg


params = {'host': 'localhost', 'port': 5010,
          'username': 'kdb', 'password': 'kdb', 'table': 'testing'}

print(fetch_metadata_from_kdb(params))

Output:

Char types / q data types dictionary: {'b': 'boolean', 'x': 'byte', 'h': 'short', 'i': 'int', 'j': 'long', 'e': 'real', 'f': 'float', 'c': 'char', 's': 'symbol', 'p': 'timestamp', 'm': 'month', 'd': 'date', 'z': '(datetime)', 'n': 'timespan', 'u': 'minute', 'v': 'second', 't': 'time'}


(['time', 'sym', 'price', 'qty'], ['timestamp', 'symbol', 'float', 'long'])
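
Scraping the documentation page on every run depends on network access and on the page layout staying the same. As an alternative sketch, the same mapping can be hard-coded. The entries below mirror the dictionary printed above (with z written as plain datetime), and the names Q_TYPE_NAMES and describe_columns are illustrative. Falling back to 'unknown' for unmapped type chars is an assumption of this sketch, not something the original code does.

```python
# Type chars as printed by q's meta, mapped to q data type names.
Q_TYPE_NAMES = {
    "b": "boolean", "x": "byte", "h": "short", "i": "int", "j": "long",
    "e": "real", "f": "float", "c": "char", "s": "symbol", "p": "timestamp",
    "m": "month", "d": "date", "z": "datetime", "n": "timespan",
    "u": "minute", "v": "second", "t": "time",
}


def describe_columns(col_list, type_chars):
    """Pair column names with readable type names; unmapped chars become 'unknown'."""
    return {
        col: Q_TYPE_NAMES.get(char, "unknown")
        for col, char in zip(col_list, type_chars)
    }


print(describe_columns(["time", "sym", "price", "qty"], ["p", "s", "f", "j"]))
# prints {'time': 'timestamp', 'sym': 'symbol', 'price': 'float', 'qty': 'long'}
```

Returning a dictionary keyed by column name keeps each column and its type together, instead of two parallel lists that have to be matched by position.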
