
Can Apache Storm be used to process tuples with a dynamic set of properties?

I am currently evaluating Apache Storm to process heterogeneous data from multiple data sources. While there may be some common properties shared by all data (e.g., a "type" property), I would like to be able to handle many different "classes" of tuples and also be able to handle new data types with minimal changes to the topology. To give an example of what these data types might look like:

{type=LogTransaction,timestamp=...,user=...,duration=...}
{type=LogEvent,timestamp=...,user=...,message=...}

The examples on the Storm page primarily deal with simple Tuples which are well-defined in advance so that the spouts / bolts can statically declare the output fields.

My initial idea was to declare the type as one field and store all other properties in a Map<String,Object>, which could then be declared as follows:

public void declareOutputFields(OutputFieldsDeclarer ofd) {
    ofd.declare(new Fields("type", "attributes"));
}
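
A minimal sketch of how a record could be split to match that declaration, written in plain Java with no Storm dependency so it runs on its own. The class and method names are hypothetical; in a real spout the resulting list would be what gets emitted.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of Storm.
class TypeAndAttributes {

    // Returns values in the order of the declared fields: ("type", "attributes").
    static List<Object> toValues(Map<String, Object> record) {
        // Copy so the caller's map is left untouched.
        Map<String, Object> attributes = new HashMap<>(record);
        Object type = attributes.remove("type");

        List<Object> values = new ArrayList<>();
        values.add(type);        // field "type"
        values.add(attributes);  // field "attributes": everything else
        return values;
    }

    public static void main(String[] args) {
        Map<String, Object> record = new HashMap<>();
        record.put("type", "LogEvent");
        record.put("user", "alice");
        record.put("message", "login");
        System.out.println(toValues(record).get(0)); // prints LogEvent
    }
}
```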

However, I believe at that point many of the more advanced features of Storm will no longer work correctly. For example, it is my understanding that I could no longer use Trident to execute a groupBy on any of the attributes.

Is there a better way to handle this type of data in Apache Storm that I have missed? I did find this post describing a similar issue; however, I would like to avoid having to create a Java class for each data type.

You can use your own custom field types as long as each field value is serializable. That will work fine in Storm, even with more than one supervisor.

Storm is a distributed data processing tool, so when more than one supervisor exists, certain bolts will, depending on the grouping, emit their fields to bolts running on a different supervisor. In such situations, the output fields are serialized and sent over the network. This serialization can be either regular Java serialization or Kryo serialization (to reduce network latency).

Hence, you might see exceptions if your JVM is not able to serialize your output fields.
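
One way to catch such problems early is to test the attribute values before they reach the topology. The following is a minimal sketch, assuming the Map<String,Object> layout from the question; the helper class is hypothetical and not part of Storm, and it only exercises standard Java serialization (Kryo has its own rules and may accept or reject different classes).

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of Storm: finds attribute values that
// standard Java serialization cannot handle.
class AttributeSerializationCheck {

    // Returns the keys whose values fail Java serialization.
    static List<String> nonSerializableKeys(Map<String, Object> attributes) {
        List<String> bad = new ArrayList<>();
        for (Map.Entry<String, Object> e : attributes.entrySet()) {
            try (ObjectOutputStream out =
                     new ObjectOutputStream(new ByteArrayOutputStream())) {
                out.writeObject(e.getValue());
            } catch (IOException ex) {
                // NotSerializableException is a subclass of IOException.
                bad.add(e.getKey());
            }
        }
        return bad;
    }

    public static void main(String[] args) {
        Map<String, Object> attributes = new HashMap<>();
        attributes.put("user", "alice");
        attributes.put("duration", 42L);
        attributes.put("handle", new Object()); // java.lang.Object is not Serializable
        System.out.println(nonSerializableKeys(attributes)); // prints [handle]
    }
}
```

Running a check like this in a unit test, against sample records of each data type, surfaces the failure on a single machine instead of only when tuples cross the network.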
