
Data caching with ClickHouse

Intro

I use ClickHouse as a data warehouse (tables with billions of rows). Users interact with the DWH through my application backend, which generates SQL queries against ClickHouse. Different users can access the same data, although the WHERE filter conditions sometimes differ between queries. ClickHouse is expected to scale out across several servers in the future.

The task

At the moment, I cache the results of frequent SQL queries by creating new tables from those already stored in the database and declaring a TTL of 1 day on each such table. If another query hits the table during that day, I run ALTER TABLE and extend the TTL by another day. I doubt that this method is efficient. I also maintain an additional table where I record each cache table's name and the time of its last access (so that my application can delete obsolete empty tables).
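The bookkeeping described above (one cache table per query, a last-access record, and a sliding 1-day expiry) could be sketched on the application side roughly as follows. This is only an illustration under assumptions: the names `cache_table_name` and `CacheRegistry` are hypothetical, the table name is derived by hashing the query text, and the actual ClickHouse statements that create or drop the cache tables are left to the caller.

```python
import hashlib
import time
from typing import Callable, Dict, List, Tuple

ONE_DAY_SECONDS = 24 * 60 * 60


def cache_table_name(sql: str, prefix: str = "cache_") -> str:
    """Derive a stable table name from the query text (hypothetical naming scheme)."""
    digest = hashlib.sha256(sql.strip().encode("utf-8")).hexdigest()[:16]
    return prefix + digest


class CacheRegistry:
    """Tracks the last access time per cache table with a sliding expiry."""

    def __init__(self, ttl_seconds: float = ONE_DAY_SECONDS,
                 clock: Callable[[], float] = time.time) -> None:
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so the logic can be tested without waiting
        self._last_access: Dict[str, float] = {}

    def touch(self, sql: str) -> Tuple[str, bool]:
        """Record an access and return (table_name, hit).

        hit is True when the table was accessed less than ttl_seconds ago,
        i.e. the cached result is assumed to still exist.
        """
        name = cache_table_name(sql)
        now = self._clock()
        last = self._last_access.get(name)
        hit = last is not None and now - last < self._ttl
        self._last_access[name] = now  # every access extends the lifetime
        return name, hit

    def pop_expired(self) -> List[str]:
        """Forget and return the tables that were not accessed within the TTL."""
        now = self._clock()
        expired = [n for n, t in self._last_access.items() if now - t >= self._ttl]
        for name in expired:
            del self._last_access[name]
        return expired
```

On a miss the backend would create and fill the cache table under the returned name; a periodic job would drop every table returned by `pop_expired()`. Keeping the expiry in the application like this avoids issuing an ALTER TABLE on every cache hit, at the cost of the registry having to be persisted or shared if several backend instances are running.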

Are there established patterns for efficient access to the most frequently used data, or ready-made mechanisms in ClickHouse? I would also be grateful for links to literature where I can read up on this or approach the issue from a different angle.

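Older ClickHouse releases do not have a built-in cache for query results (newer releases add a query cache, enabled with the use_query_cache setting). Instead, ClickHouse relies heavily on the file system cache, so data that was read recently is usually served from memory.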
