Flink Pipeline is as follows:
Below is the code for pattern matching using grok.
SingleOutputStreamOperator<JSONObject> mainStream = messageStream.rebalance()
    .map(new MapFunction<String, JSONObject>() {
        private static final long serialVersionUID = 6L;

        @Override
        public JSONObject map(String value) throws Exception {
            JSONObject logJson = new JSONObject();
            // Recompiled for every message; pattern is defined in the enclosing class
            grok.compile(pattern);
            Match gm = grok.match(value);
            gm.captures(); // populate the capture map before reading it
            logJson.putAll(gm.toMap());
            return logJson;
        }
    })
In the above code, calling grok.compile(pattern) inside the map function works fine.
Calling it outside the map function instead gives the following error:
The implementation of the MapFunction is not serializable
Caused by: java.io.NotSerializableException: com.google.code.regexp.Pattern
Is there any way to move the grok.compile call outside the map function? As I understand it, compiling the pattern for every message is unnecessary and might become a bottleneck when the number of messages grows large.
PS: I have imported the class io.thekraken.grok.api.Grok
EDIT:
I looked through the grok implementation, and the Grok class implements Serializable: https://github.com/thekrakken/java-grok/blob/master/src/main/java/io/thekraken/grok/api/Grok.java
Your code does not show where the local variable grok comes from, but:
Flink requires all operators to be Serializable because they might be moved around in a cluster. This also holds true for all members of operators. Can you post a complete non-working example? This might make it easier to see where serialization might fail.
More information about Flink serialization can be found in the Flink documentation at https://flink.apache.org/faq.html#why-am-i-getting-a-nonserializableexception- and https://ci.apache.org/projects/flink/flink-docs-release-1.2/dev/types_serialization.html
Basically, you can register a Kryo serializer for custom types, or implement (de-)serialization yourself if you need operator members that are not directly serializable.
Btw.: I think you are right in trying to reduce the number of times the pattern is compiled
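As a rough illustration of the "implement (de-)serialization yourself" option, here is a minimal plain-Java sketch with no Flink or grok dependency. It keeps only the pattern string in the serialized form, marks the compiled object transient, and rebuilds it once on first use after deserialization. LazyPatternMatcher, CompiledPattern and firstGroup are hypothetical names made up for this example, and CompiledPattern is only a stand-in for a non-serializable class such as com.google.code.regexp.Pattern. In a Flink job the same idea would probably mean a transient Grok field that is compiled in RichMapFunction#open, assuming the Grok instance can be recreated there.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper for illustration; not part of Flink or java-grok. */
public class LazyPatternMatcher implements Serializable {
    private static final long serialVersionUID = 1L;

    /** Stand-in for a compiled pattern type that does NOT implement Serializable. */
    static final class CompiledPattern {
        final Pattern regex;

        CompiledPattern(String source) {
            this.regex = Pattern.compile(source);
        }
    }

    // Only the pattern source travels with the serialized object.
    private final String patternSource;
    // transient: skipped by Java serialization, so it is null after deserialization.
    private transient CompiledPattern compiled;

    public LazyPatternMatcher(String patternSource) {
        this.patternSource = patternSource;
    }

    /** Compiles on first use, then reuses the compiled instance. */
    private CompiledPattern pattern() {
        if (compiled == null) {
            compiled = new CompiledPattern(patternSource);
        }
        return compiled;
    }

    /** Returns capture group 1 of the first match, or null if nothing matches. */
    public String firstGroup(String input) {
        Matcher m = pattern().regex.matcher(input);
        return m.find() ? m.group(1) : null;
    }

    /** Serializes and deserializes the object, similar to shipping it to a worker. */
    static LazyPatternMatcher roundTrip(LazyPatternMatcher original)
            throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(original);
        }
        try (ObjectInputStream in =
                new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
            return (LazyPatternMatcher) in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        LazyPatternMatcher original = new LazyPatternMatcher("level=(\\w+)");
        original.firstGroup("level=INFO"); // force compilation before serializing
        LazyPatternMatcher copy = roundTrip(original);
        System.out.println(copy.firstGroup("ts=1 level=ERROR msg=x"));
    }
}
```

The round trip succeeds even though the compiled field holds a non-serializable object, because transient fields are left out of the serialized form. The pattern is then compiled once per deserialized instance rather than once per message.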