Connecting to AWS Elasticsearch instance using Python

I have an Elasticsearch instance hosted on AWS. I can connect to it from my terminal with curl. I am now trying to use the Python elasticsearch wrapper. I have:

from elasticsearch import Elasticsearch

client = Elasticsearch(host='https://ec2-xx-xx-xxx-xxx.us-west-2.compute.amazonaws.com', port=9200)

and the query is:

data = client.search(index="mynewindex", body={"query": {"match": {"email": "gmail"}}})
    for hit in data:
        print(hit.email)
    print data

The full traceback, from heroku, is:

2016-07-22T14:06:06.031347+00:00 heroku[router]: at=info method=GET path="/" host=elastictest.herokuapp.com request_id=9a96d447-fe02-4670-bafe-efba842927f3 fwd="88.106.66.168" dyno=web.1 connect=1ms service=393ms status=500 bytes=456
2016-07-22T14:09:18.035805+00:00 heroku[slug-compiler]: Slug compilation started
2016-07-22T14:09:18.035810+00:00 heroku[slug-compiler]: Slug compilation finished
2016-07-22T14:09:18.147278+00:00 heroku[web.1]: Restarting
2016-07-22T14:09:18.147920+00:00 heroku[web.1]: State changed from up to starting
2016-07-22T14:09:20.838784+00:00 heroku[web.1]: Starting process with command `gunicorn application:application --log-file=-`
2016-07-22T14:09:20.834521+00:00 heroku[web.1]: Stopping all processes with SIGTERM
2016-07-22T14:09:17.850918+00:00 heroku[api]: Deploy b7187d3 by hector@fastmail.se
2016-07-22T14:09:17.850993+00:00 heroku[api]: Release v21 created by hector@fastmail.se
2016-07-22T14:09:21.372589+00:00 app[web.1]: [2016-07-22 14:09:21 +0000] [3] [INFO] Handling signal: term
2016-07-22T14:09:21.383946+00:00 app[web.1]: [2016-07-22 14:09:21 +0000] [3] [INFO] Shutting down: Master
2016-07-22T14:09:21.367656+00:00 app[web.1]: [2016-07-22 14:09:21 +0000] [9] [INFO] Worker exiting (pid: 9)
2016-07-22T14:09:21.366309+00:00 app[web.1]: [2016-07-22 14:09:21 +0000] [10] [INFO] Worker exiting (pid: 10)
2016-07-22T14:09:22.286766+00:00 heroku[web.1]: Process exited with status 0
2016-07-22T14:09:23.344822+00:00 app[web.1]: [2016-07-22 14:09:23 +0000] [3] [INFO] Starting gunicorn 19.6.0
2016-07-22T14:09:23.345481+00:00 app[web.1]: [2016-07-22 14:09:23 +0000] [3] [INFO] Using worker: sync
2016-07-22T14:09:23.351173+00:00 app[web.1]: [2016-07-22 14:09:23 +0000] [9] [INFO] Booting worker with pid: 9
2016-07-22T14:09:23.370580+00:00 app[web.1]: [2016-07-22 14:09:23 +0000] [10] [INFO] Booting worker with pid: 10
2016-07-22T14:09:23.345376+00:00 app[web.1]: [2016-07-22 14:09:23 +0000] [3] [INFO] Listening at: http://0.0.0.0:59867 (3)
2016-07-22T14:09:24.536725+00:00 heroku[web.1]: State changed from starting to up
2016-07-22T14:09:39.043240+00:00 app[web.1]:   File "/app/.heroku/python/lib/python2.7/site-packages/flask/app.py", line 1544, in handle_user_exception
2016-07-22T14:09:39.043239+00:00 app[web.1]:     rv = self.handle_user_exception(e)
2016-07-22T14:09:39.043241+00:00 app[web.1]:     reraise(exc_type, exc_value, tb)
2016-07-22T14:09:39.043233+00:00 app[web.1]: Traceback (most recent call last):
2016-07-22T14:09:39.043238+00:00 app[web.1]:   File "/app/.heroku/python/lib/python2.7/site-packages/flask/app.py", line 1641, in full_dispatch_request
2016-07-22T14:09:39.043236+00:00 app[web.1]:     response = self.full_dispatch_request()
2016-07-22T14:09:39.043235+00:00 app[web.1]:   File "/app/.heroku/python/lib/python2.7/site-packages/flask/app.py", line 1988, in wsgi_app
2016-07-22T14:09:39.043214+00:00 app[web.1]: [2016-07-22 14:09:39,041] ERROR in app: Exception on / [GET]
2016-07-22T14:09:39.043241+00:00 app[web.1]:   File "/app/.heroku/python/lib/python2.7/site-packages/flask/app.py", line 1639, in full_dispatch_request
2016-07-22T14:09:39.043242+00:00 app[web.1]:     rv = self.dispatch_request()
2016-07-22T14:09:39.043242+00:00 app[web.1]:   File "/app/.heroku/python/lib/python2.7/site-packages/flask/app.py", line 1625, in dispatch_request
2016-07-22T14:09:39.043243+00:00 app[web.1]:     return self.view_functions[rule.endpoint](**req.view_args)
2016-07-22T14:09:39.043243+00:00 app[web.1]:   File "/app/application.py", line 23, in index
2016-07-22T14:09:39.043246+00:00 app[web.1]:     return func(*args, params=params, **kwargs)
2016-07-22T14:09:39.043245+00:00 app[web.1]:   File "/app/.heroku/python/lib/python2.7/site-packages/elasticsearch/client/utils.py", line 69, in _wrapped
2016-07-22T14:09:39.043246+00:00 app[web.1]:   File "/app/.heroku/python/lib/python2.7/site-packages/elasticsearch/client/__init__.py", line 548, in search
2016-07-22T14:09:39.043247+00:00 app[web.1]:     doc_type, '_search'), params=params, body=body)
2016-07-22T14:09:39.043250+00:00 app[web.1]:     status, headers, data = connection.perform_request(method, url, params, body, ignore=ignore, timeout=timeout)
2016-07-22T14:09:39.043250+00:00 app[web.1]:   File "/app/.heroku/python/lib/python2.7/site-packages/elasticsearch/connection/http_urllib3.py", line 105, in perform_request
2016-07-22T14:09:39.043244+00:00 app[web.1]:     data = client.search(index="mynewindex", body={"query": {"match": {"email": "gmail"}}})
2016-07-22T14:09:39.043251+00:00 app[web.1]:     raise ConnectionError('N/A', str(e), e)
2016-07-22T14:09:39.043249+00:00 app[web.1]:   File "/app/.heroku/python/lib/python2.7/site-packages/elasticsearch/transport.py", line 329, in perform_request
2016-07-22T14:09:39.043253+00:00 app[web.1]: ConnectionError: ConnectionError(<urllib3.connection.HTTPConnection object at 0x7f185a94d8d0>: Failed to establish a new connection: [Errno -2] Name or service not known) caused by: NewConnectionError(<urllib3.connection.HTTPConnection object at 0x7f185a94d8d0>: Failed to establish a new connection: [Errno -2] Name or service not known)
2016-07-22T14:09:42.692817+00:00 app[web.1]:   File "/app/.heroku/python/lib/python2.7/site-packages/flask/app.py", line 1641, in full_dispatch_request
2016-07-22T14:09:42.692816+00:00 app[web.1]:     response = self.full_dispatch_request()
2016-07-22T14:09:42.692795+00:00 app[web.1]: [2016-07-22 14:09:42,691] ERROR in app: Exception on / [GET]
2016-07-22T14:09:42.692820+00:00 app[web.1]:   File "/app/.heroku/python/lib/python2.7/site-packages/flask/app.py", line 1639, in full_dispatch_request
2016-07-22T14:09:42.692819+00:00 app[web.1]:     reraise(exc_type, exc_value, tb)
2016-07-22T14:09:42.692819+00:00 app[web.1]:   File "/app/.heroku/python/lib/python2.7/site-packages/flask/app.py", line 1544, in handle_user_exception
2016-07-22T14:09:42.692827+00:00 app[web.1]:   File "/app/.heroku/python/lib/python2.7/site-packages/elasticsearch/transport.py", line 329, in perform_request
2016-07-22T14:09:42.692828+00:00 app[web.1]:     status, headers, data = connection.perform_request(method, url, params, body, ignore=ignore, timeout=timeout)
2016-07-22T14:09:42.692828+00:00 app[web.1]:   File "/app/.heroku/python/lib/python2.7/site-packages/elasticsearch/connection/http_urllib3.py", line 105, in perform_request
2016-07-22T14:09:42.692829+00:00 app[web.1]:     raise ConnectionError('N/A', str(e), e)
2016-07-22T14:09:42.692831+00:00 app[web.1]: ConnectionError: ConnectionError(<urllib3.connection.HTTPConnection object at 0x7f185a946d10>: Failed to establish a new connection: [Errno -2] Name or service not known) caused by: NewConnectionError(<urllib3.connection.HTTPConnection object at 0x7f185a946d10>: Failed to establish a new connection: [Errno -2] Name or service not known)
2016-07-22T14:09:42.692821+00:00 app[web.1]:     rv = self.dispatch_request()
2016-07-22T14:09:42.692821+00:00 app[web.1]:   File "/app/.heroku/python/lib/python2.7/site-packages/flask/app.py", line 1625, in dispatch_request
2016-07-22T14:09:42.692822+00:00 app[web.1]:     return self.view_functions[rule.endpoint](**req.view_args)
2016-07-22T14:09:42.692823+00:00 app[web.1]:   File "/app/application.py", line 23, in index
2016-07-22T14:09:42.692823+00:00 app[web.1]:     data = client.search(index="mynewindex", body={"query": {"match": {"email": "gmail"}}})
2016-07-22T14:09:42.692824+00:00 app[web.1]:   File "/app/.heroku/python/lib/python2.7/site-packages/elasticsearch/client/utils.py", line 69, in _wrapped
2016-07-22T14:09:42.692814+00:00 app[web.1]: Traceback (most recent call last):
2016-07-22T14:09:42.692818+00:00 app[web.1]:     rv = self.handle_user_exception(e)
2016-07-22T14:09:42.692815+00:00 app[web.1]:   File "/app/.heroku/python/lib/python2.7/site-packages/flask/app.py", line 1988, in wsgi_app
2016-07-22T14:09:42.692825+00:00 app[web.1]:     return func(*args, params=params, **kwargs)
2016-07-22T14:09:42.692826+00:00 app[web.1]:   File "/app/.heroku/python/lib/python2.7/site-packages/elasticsearch/client/__init__.py", line 548, in search
2016-07-22T14:09:42.692826+00:00 app[web.1]:     doc_type, '_search'), params=params, body=body)
2016-07-22T14:09:42.685540+00:00 heroku[router]: at=info method=GET path="/" host=elastictest.herokuapp.com request_id=87ae9ec2-edb6-4e58-b9d6-89709b883091 fwd="88.106.66.168" dyno=web.1 connect=1ms service=11ms status=500 bytes=456

I assume the error is with the "connection string", because the principal error appears to be a ConnectionError.

So two questions:

1) How can I connect correctly? Inbound security rules are currently configured to accept all incoming traffic.

2) Is there an error in the query code?

Many thanks as always.

This is the correct way to connect to an Elasticsearch server using Python:

es = Elasticsearch(['IP:PORT',])

The Elasticsearch constructor has neither a host nor a port parameter of its own. The first parameter should be a list, where each item is either a string representing the host:

'schema://ip:port'

Or a dictionary with extended parameters for that host:

{'host': 'ip/hostname', 'port': 443, 'url_prefix': 'es', 'use_ssl': True}

In your case, you probably want to use:

client = Elasticsearch(['https://ec2-xx-xx-xxx-xxx.us-west-2.compute.amazonaws.com:9200'])

If the instance listens on the default port for the scheme, you can omit the port:

client = Elasticsearch(['https://ec2-xx-xx-xxx-xxx.us-west-2.compute.amazonaws.com'])
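To illustrate the two host formats described above, here is a minimal sketch that turns a full URL into the dictionary form. The helper name host_entry_from_url is hypothetical (it is not part of the elasticsearch package), and the fallback ports are assumptions chosen for this example. It also shows why a value such as 'https://...' does not work as a bare host name: the scheme has to be split off before the name can be resolved, which likely explains the "Name or service not known" error in the traceback.

```python
from urllib.parse import urlparse


def host_entry_from_url(url):
    """Split a URL into a host dictionary (hypothetical helper, not a library API)."""
    parsed = urlparse(url)
    use_ssl = parsed.scheme == 'https'
    # Assumption for this sketch: fall back to 443 for https and 9200 otherwise.
    port = parsed.port or (443 if use_ssl else 9200)
    return {'host': parsed.hostname, 'port': port, 'use_ssl': use_ssl}


entry = host_entry_from_url('https://ec2-xx-xx-xxx-xxx.us-west-2.compute.amazonaws.com:9200')
print(entry)
```

The resulting dictionary has the same shape as the extended form shown earlier, with the scheme expressed through use_ssl instead of being part of the host name.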

from elasticsearch import Elasticsearch, RequestsHttpConnection
from requests_aws4auth import AWS4Auth

host = 'ec2-xx-xx-xxx-xxx.us-west-2.compute.amazonaws.com'  # without 'https://'
YOUR_ACCESS_KEY = ''
YOUR_SECRET_KEY = ''
REGION = 'us-west-2'  # change to your region
awsauth = AWS4Auth(YOUR_ACCESS_KEY, YOUR_SECRET_KEY, REGION, 'es')

es = Elasticsearch(
    hosts=[{'host': host, 'port': 443}],
    http_auth=awsauth,  # sign each request with the AWS credentials
    use_ssl=True,
    verify_certs=True,
    connection_class=RequestsHttpConnection
)
print(es.info())
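
Rather than pasting the keys into the source, they can be read from the environment, which on Heroku means config vars. The following is only a sketch: load_aws_credentials is a hypothetical helper, and the variable names AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY are assumed here because they are the names the AWS tools conventionally use.

```python
import os


def load_aws_credentials(env=os.environ):
    """Return (access_key, secret_key) from the environment (hypothetical helper)."""
    names = ('AWS_ACCESS_KEY_ID', 'AWS_SECRET_ACCESS_KEY')
    # Fail early with a clear message if either variable is unset or empty.
    missing = [name for name in names if not env.get(name)]
    if missing:
        raise RuntimeError('Missing environment variables: ' + ', '.join(missing))
    return env['AWS_ACCESS_KEY_ID'], env['AWS_SECRET_ACCESS_KEY']
```

The returned pair can be passed to AWS4Auth in place of YOUR_ACCESS_KEY and YOUR_SECRET_KEY.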

Here is a small Python script that creates a connection to an AWS Elasticsearch instance:

from elasticsearch import Elasticsearch, RequestsHttpConnection
from requests_aws4auth import AWS4Auth


host = '' # For example, my-test-domain.us-east-1.es.amazonaws.com
region = '' # e.g. us-west-1

service = 'es'

credentials = {
    'access_key': '',
    'secret_key': ''
}

awsauth = AWS4Auth(credentials['access_key'], credentials['secret_key'], region, service)

es = Elasticsearch(
    hosts=[{'host': host, 'port': 443}],
    http_auth=awsauth,
    use_ssl=True,
    verify_certs=True,
    connection_class=RequestsHttpConnection
)

print(es.info())

Reference: AWS Elasticsearch Signing Requests
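
On the second question: the query body itself looks fine, but the loop is probably not doing what was intended. search() returns a plain dictionary, so iterating over it yields its top-level keys rather than documents, and hit.email would raise an AttributeError. The matching documents are under data['hits']['hits'], and each one keeps its fields in '_source'. A minimal sketch, using a hand-written sample response (the documents are made up for illustration) and a hypothetical helper name:

```python
# Hand-written stand-in for what client.search(...) returns; the values are invented.
sample_response = {
    'took': 3,
    'timed_out': False,
    'hits': {
        'total': 2,
        'hits': [
            {'_index': 'mynewindex', '_id': '1', '_source': {'email': 'alice@gmail.com'}},
            {'_index': 'mynewindex', '_id': '2', '_source': {'email': 'bob@gmail.com'}},
        ],
    },
}


def emails_from_response(data):
    """Collect the 'email' field of every hit (hypothetical helper)."""
    # .get() yields None for a document that has no 'email' field.
    return [hit['_source'].get('email') for hit in data['hits']['hits']]


for email in emails_from_response(sample_response):
    print(email)
```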
