Stanford OpenIE in Google Colab

Does anybody know whether, and how, Stanford OpenIE can be set up in Google Colab?

I've followed the Colab tutorial for the CoreNLP client before, and that seems to work.

I get the following error when running the example from their GitHub repository ( https://github.com/philipperemy/Stanford-OpenIE-Python ):

---------------------------------------------------------------------------
PermanentlyFailedException                Traceback (most recent call last)
<ipython-input-2-01d7100eb03f> in <module>()
      4     text = 'Barack Obama was born in Hawaii. Richard Manning wrote this sentence.'
      5     print('Text: %s.' % text)
----> 6     for triple in client.annotate(text):
      7         print('|-', triple)
      8 

3 frames
/usr/local/lib/python3.6/dist-packages/stanfordnlp/server/client.py in ensure_alive(self)
    135                 time.sleep(1)
    136             else:
--> 137                 raise PermanentlyFailedException("Timed out waiting for service to come alive.")
    138 
    139         # At this point we are guaranteed that the service is alive.

PermanentlyFailedException: Timed out waiting for service to come alive.

Any advice is appreciated :-)

Try this before starting the server:

%env NO_PROXY='localhost'
%env no_proxy='localhost'
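
For use outside a notebook cell magic, the same variables can be set from plain Python. This is a minimal sketch, assuming the only goal is to make requests to the local server bypass any configured proxy:

```python
import os

# Exclude localhost from any configured proxy so that the client can reach
# the local CoreNLP server directly. Both spellings are set because
# different tools read different ones.
for name in ("NO_PROXY", "no_proxy"):
    os.environ[name] = "localhost"
```

As with the `%env` lines, this has to run before the server is started.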

I tested it with stanza-corenlp, and it solved the time-out problem.

I found a workaround for Stanford CoreNLP (OpenIE) using stanza. Using stanza is now the suggested approach for all annotators.

The problem in my case was that Colab couldn't find the path to the stanford-corenlp-full-2018-10-05.zip file.

First, download the file below and upload it to your Google Drive.

https://nlp.stanford.edu/software/stanford-corenlp-full-2018-10-05.zip

After uploading it, mount your Drive in Colab (see the Colab documentation for details).

from google.colab import drive
drive.mount('/content/drive')

You should see the stanford-corenlp-full-2018-10-05.zip file in Colab's file explorer under the content folder. Copy its path.

Install stanza.

!pip install stanza

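Set CORENLP_HOME to the path you copied. Note that the path below has no .zip extension: CORENLP_HOME has to point to the extracted folder that contains the CoreNLP jar files, so unzip the archive first if you uploaded it as a zip.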

import os
os.environ["CORENLP_HOME"] = '/content/drive/MyDrive/stanford-corenlp-full-2018-10-05'
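
If only the zip was uploaded, it can be extracted from Python before CORENLP_HOME is set. The following is a sketch rather than part of the original answer: `extract_corenlp` and `has_jars` are hypothetical helper names, and the paths in the commented usage are assumptions about the Drive layout.

```python
import os
import zipfile


def extract_corenlp(zip_path, dest_dir):
    """Unzip the archive into dest_dir and return the extracted top-level folder."""
    with zipfile.ZipFile(zip_path) as archive:
        # Assumes the archive holds a single top-level folder, whose name is
        # taken from the first entry.
        top_level = archive.namelist()[0].split("/")[0]
        archive.extractall(dest_dir)
    return os.path.join(dest_dir, top_level)


def has_jars(corenlp_home):
    """Return True if the folder directly contains at least one .jar file."""
    return os.path.isdir(corenlp_home) and any(
        name.endswith(".jar") for name in os.listdir(corenlp_home)
    )


# Example use in Colab (assumed paths; adjust to your own Drive layout):
# home = extract_corenlp(
#     "/content/drive/MyDrive/stanford-corenlp-full-2018-10-05.zip", "/content")
# if has_jars(home):
#     os.environ["CORENLP_HOME"] = home
```

Checking for jar files before setting the variable catches the case where the path still points at the zip or at an empty folder.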

Run the following code.

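import time

import stanza
# Import the client module
from stanza.server import CoreNLPClient

# Create a client that requests the openie annotator from a server on port 9001
client = CoreNLPClient(timeout=150000000, be_quiet=True, annotators=['openie'],
                       endpoint='http://localhost:9001')
client.start()
# Give the server a few seconds to come up
time.sleep(10)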
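
Instead of a fixed sleep, the wait can be done by polling the server port. This is a minimal sketch using only the standard library; `wait_for_port` is a hypothetical helper, and an open port only shows that something is listening, not that CoreNLP has finished loading its models.

```python
import socket
import time


def wait_for_port(host, port, timeout=60.0, interval=1.0):
    """Poll until a TCP connection to host:port succeeds or timeout (seconds) elapses."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            # A successful connection means a process is listening on the port.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)


# Example, assuming the endpoint used above (port 9001):
# if not wait_for_port("localhost", 9001):
#     print("CoreNLP server did not open its port in time")
```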

Make sure that the timeout is set and that annotators is set to openie in the code above.

Check that the Java process is running.

# Print background processes and look for java.
# You should be able to see a StanfordCoreNLPServer java process running in the
# background.
!ps -o pid,cmd | grep java

Run the following code on your text to get the triples.

text = "Albert Einstein was a German-born theoretical physicist. He developed the theory of relativity."
document = client.annotate(text, output_format='json')
triples = []
for sentence in document['sentences']:
    for triple in sentence['openie']:
        triples.append({
            'subject': triple['subject'],
            'relation': triple['relation'],
            'object': triple['object']
        })
print(triples)
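
The extraction loop can be wrapped in a small function so that it can be reused and checked without a running server. This is a sketch: `extract_triples` is a hypothetical helper, and `sample` is a hand-written dictionary that only mimics the response shape used above, not real CoreNLP output.

```python
def extract_triples(document):
    """Collect (subject, relation, object) tuples from a CoreNLP JSON response."""
    triples = []
    for sentence in document.get("sentences", []):
        # .get() tolerates sentences that carry no "openie" entry.
        for triple in sentence.get("openie", []):
            triples.append(
                (triple["subject"], triple["relation"], triple["object"])
            )
    return triples


# Hand-written sample for illustration only.
sample = {
    "sentences": [
        {"openie": [
            {"subject": "Einstein", "relation": "was", "object": "physicist"},
        ]},
        {},  # a sentence without any extracted triples
    ]
}
print(extract_triples(sample))
```

With the real client, `extract_triples(client.annotate(text, output_format='json'))` would take the place of the loop. Calling `client.stop()` when finished shuts down the background server.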
