Is it possible to get multiple values from an Ignite cache by their keys, applying additional filtering server-side, in one operation?

I have an Ignite cache:

IgniteCache<String, Record> cache;

A collection of keys for this cache is given. I need to do the following:

  1. Get the records with the specified keys
  2. ... but additionally filter them by some dynamically defined logic (like 'where field name has value John')
  3. ... do it as fast as possible
  4. ... under a transaction

One way I tried was using the getAll() method and applying the filtering on my side:

cache.getAll(keys).values().stream()
        .filter(... filter logic...)
        .collect(toList());

This works, but if the additional filter is highly selective (i.e. it rejects a lot of data), we waste a lot of time sending unneeded data over the network.

Another option is using a scan:

cache.query(new ScanQuery<>(new IsKeyIn(keys).and(new CustomFilter())))

This does all the filtering on the server nodes, but it is a full scan. If the cache has many entries and the input keys make up only a small fraction of them, a lot of time is wasted again, this time on unneeded scanning.

And there is invokeAll(), which allows filtering on the server nodes:

cache.invokeAll(new TreeSet<>(keys), new AdditionalFilter())
        .values().stream()
        .map(EntryProcessorResult::get)
        .collect(toList());

where

private static class AdditionalFilter implements CacheEntryProcessor<String, Record, Record> {
    @Override
    public Record process(MutableEntry<String, Record> entry,
            Object... arguments) throws EntryProcessorException {
        if (... record matches the filter ...) {
            return entry.getValue();
        }
        return null;
    }
}

It finds entries by their keys and executes the filtering logic on the server nodes, but on my data it is even slower than the scanning solution. I suppose (but am not sure) this is because invokeAll() is potentially an updating operation, so (according to its Javadoc) it takes locks on the corresponding keys.

I would like to be able to find entries by the given keys, apply additional filtering on the server nodes, and not pay for additional locks (in my case it is a read-only operation).

Is it possible?

My cache is distributed among 3 server nodes, and its atomicity mode is TRANSACTIONAL_SNAPSHOT. The reads are done under a transaction.

  1. SQL is the simplest solution, and possibly the fastest, given proper indexes.

  2. IgniteCompute#broadcast + IgniteCache#localPeek:

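Collection<Key> keys = ...;
// The closure runs once on every node in the compute's cluster group;
// each run returns the matching values found on that node.
Collection<Collection<Value>> results = compute.broadcast(new LocalGetter(), keys);

...

class LocalGetter implements IgniteClosure<Collection<Key>, Collection<Value>> {
    @Override public Collection<Value> apply(Collection<Key> keys) {
        IgniteCache<Key, Value> cache = ...;

        Collection<Value> res = new ArrayList<>(keys.size());

        for (Key key : keys) {
            // Look only at entries this node holds as primary; null if the key lives elsewhere.
            Value val = cache.localPeek(key, CachePeekMode.PRIMARY);

            // Filter on the server so that only matching values travel back.
            if (val != null && filterMatches(val)) {
                res.add(val);
            }
        }

        return res;
    }
}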

This way we retrieve cache entries efficiently by key, then apply the filter locally, and only send the matching entries back over the network. There are only N network calls, where N is the number of server nodes.
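The broadcast call returns one collection per node, so the caller still has to merge them. A minimal sketch of that merge step, using only java.util; the class name BroadcastResults and the method flatten are hypothetical helpers, not part of Ignite:

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;

// Hypothetical helper (not an Ignite API): merges the per-node result
// collections returned by a broadcast into a single list.
class BroadcastResults {
    static <V> List<V> flatten(Collection<? extends Collection<V>> perNode) {
        List<V> all = new ArrayList<>();
        for (Collection<V> nodeResult : perNode) {
            // Tolerate a node whose closure returned null instead of an empty collection.
            if (nodeResult != null) {
                all.addAll(nodeResult);
            }
        }
        return all;
    }

    public static void main(String[] args) {
        // Stand-in for what three server nodes might return; the middle one matched nothing.
        List<List<String>> perNode = List.of(List.of("a", "b"), List.of(), List.of("c"));
        System.out.println(flatten(perNode)); // prints [a, b, c]
    }
}
```

Because localPeek is called with CachePeekMode.PRIMARY, each key should be reported by at most one node, so the merged list is not expected to need de-duplication.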
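For option 1, the SQL route, the query has to combine the key list with the dynamic condition. The sketch below only builds the SQL text and its positional arguments. It assumes the value type is exposed to SQL as a table named Record with a queryable field name (taken from the question's 'name is John' example); both depend on the cache's query configuration. The class KeyFilterSql is a hypothetical helper, while _key and _val are Ignite's built-in columns for the cache key and the whole value.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.Collections;
import java.util.List;

// Hypothetical helper (not an Ignite API): builds the query text and the
// positional arguments for "fetch these keys, keep only rows with a given name".
class KeyFilterSql {
    // Assumption: table "Record" with a queryable field "name".
    static String sql(int keyCount) {
        if (keyCount < 1) {
            throw new IllegalArgumentException("at least one key is required");
        }
        // One "?" placeholder per key.
        String placeholders = String.join(", ", Collections.nCopies(keyCount, "?"));
        return "SELECT _val FROM Record WHERE _key IN (" + placeholders + ") AND name = ?";
    }

    // Arguments in placeholder order: all keys first, then the name to match.
    static Object[] args(Collection<String> keys, String name) {
        List<Object> out = new ArrayList<>(keys);
        out.add(name);
        return out.toArray();
    }

    public static void main(String[] unused) {
        List<String> keys = List.of("k1", "k2", "k3");
        System.out.println(sql(keys.size()));
        // With Ignite on the classpath, the pair would be passed on roughly like this:
        //   cache.query(new SqlFieldsQuery(sql(keys.size())).setArgs(args(keys, "John"))).getAll();
    }
}
```

Whether the IN list is served from the primary-key index is worth checking with EXPLAIN on the real data before relying on this being the fastest option.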
