
Use StanfordCoreNLP in parallel

This thread contains a nice example of how to use a wrapper for Stanford's CoreNLP library. Here is the example I am using:

from pycorenlp import StanfordCoreNLP

# Connect to a CoreNLP server already running on port 9000
nlp = StanfordCoreNLP('http://localhost:9000')

# Annotate the text; the server splits it into sentences and scores each one
res = nlp.annotate("I love you. I hate him. You are nice. He is dumb",
                   properties={
                       'annotators': 'sentiment',
                       'outputFormat': 'json',
                       'timeout': 1000,
                   })

# Print the index, text, and sentiment of every sentence
for s in res["sentences"]:
    print("%d: '%s': %s %s" % (
        s["index"],
        " ".join([t["word"] for t in s["tokens"]]),
        s["sentimentValue"], s["sentiment"]))

Say I have 10,000+ sentences that I want to analyze as in this example. Is it possible to process these in parallel, with multiple threads?

Not sure about this exact approach, but the general idea works. In Java I have a singleton class set up with CoreNLP and the pipeline I want to use. I then call a method on that singleton from multiple threads, all sharing the same instance; the method takes a few sentences, annotates them, and does some work with the result. So this type of multithreading does work. I have been doing it for a few years with no issues.

Could you refactor your code to do the same? That is, set up your pipeline once, then call annotate on a few sentences at a time from a thread pool, as in the sketch below? It shouldn't be too much effort.
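Here is a minimal sketch of that refactoring in Python, assuming a CoreNLP server is already running on localhost:9000 as in the question. The batch size, worker count, and the chunks() helper are illustrative choices, not part of the original answer:

from concurrent.futures import ThreadPoolExecutor

from pycorenlp import StanfordCoreNLP

# One shared client; the heavy lifting happens in the CoreNLP server,
# which can serve several requests concurrently.
nlp = StanfordCoreNLP('http://localhost:9000')

props = {
    'annotators': 'sentiment',
    'outputFormat': 'json',
    'timeout': 50000,  # assumption: large batches need more than 1000 ms
}

def annotate_batch(sentences):
    # Join a batch into one request; the server re-splits it into sentences
    res = nlp.annotate(" ".join(sentences), properties=props)
    return [(s["sentimentValue"], s["sentiment"],
             " ".join(t["word"] for t in s["tokens"]))
            for s in res["sentences"]]

def chunks(items, size):
    # Hypothetical helper: split the sentence list into fixed-size batches
    for i in range(0, len(items), size):
        yield items[i:i + size]

sentences = ["I love you.", "I hate him.", "You are nice.", "He is dumb."] * 2500

with ThreadPoolExecutor(max_workers=4) as pool:
    for batch in pool.map(annotate_batch, chunks(sentences, 100)):
        for value, sentiment, text in batch:
            print("%s %s: '%s'" % (value, sentiment, text))

Each thread just issues its own HTTP request, so the parallelism really comes from the server handling several requests at once; the client only needs to keep enough requests in flight.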

Hope that makes sense.
