Comments (12)

jaehyeongAN avatar jaehyeongAN commented on May 30, 2024 3

Hi, I have the same issue!

from haystack.document_store import ElasticsearchDocumentStore

ES_HOST = 'host ip' # my host ip
document_store = ElasticsearchDocumentStore(ES_HOST)

The code above works when I connect to 'localhost' with Elasticsearch running in Docker on my local PC, but it fails when I connect to an external Elasticsearch host!

Below is the error log.

Traceback (most recent call last):
  File "/home/jaehyeong/venv_py37/lib/python3.7/site-packages/elasticsearch/connection/http_urllib3.py", line 246, in perform_request
    method, url, body, retries=Retry(False), headers=request_headers, **kw
  File "/home/jaehyeong/venv_py37/lib/python3.7/site-packages/urllib3/connectionpool.py", line 756, in urlopen
    method, url, error=e, _pool=self, _stacktrace=sys.exc_info()[2]
  File "/home/jaehyeong/venv_py37/lib/python3.7/site-packages/urllib3/util/retry.py", line 507, in increment
    raise six.reraise(type(error), error, _stacktrace)
  File "/home/jaehyeong/venv_py37/lib/python3.7/site-packages/urllib3/packages/six.py", line 770, in reraise
    raise value
  File "/home/jaehyeong/venv_py37/lib/python3.7/site-packages/urllib3/connectionpool.py", line 706, in urlopen
    chunked=chunked,
  File "/home/jaehyeong/venv_py37/lib/python3.7/site-packages/urllib3/connectionpool.py", line 394, in _make_request
    conn.request(method, url, **httplib_request_kw)
  File "/home/jaehyeong/venv_py37/lib/python3.7/site-packages/urllib3/connection.py", line 239, in request
    super(HTTPConnection, self).request(method, url, body=body, headers=headers)
  File "/usr/local/lib/python3.7/http/client.py", line 1277, in request
    self._send_request(method, url, body, headers, encode_chunked)
  File "/usr/local/lib/python3.7/http/client.py", line 1323, in _send_request
    self.endheaders(body, encode_chunked=encode_chunked)
  File "/usr/local/lib/python3.7/http/client.py", line 1272, in endheaders
    self._send_output(message_body, encode_chunked=encode_chunked)
  File "/usr/local/lib/python3.7/http/client.py", line 1032, in _send_output
    self.send(msg)
  File "/usr/local/lib/python3.7/http/client.py", line 972, in send
    self.connect()
  File "/home/jaehyeong/venv_py37/lib/python3.7/site-packages/urllib3/connection.py", line 205, in connect
    conn = self._new_conn()
  File "/home/jaehyeong/venv_py37/lib/python3.7/site-packages/urllib3/connection.py", line 187, in _new_conn
    self, "Failed to establish a new connection: %s" % e
urllib3.exceptions.NewConnectionError: <urllib3.connection.HTTPConnection object at 0x7fec5925d510>: Failed to establish a new connection: [Errno -2] Name or service not known
11/17/2021 18:53:50 - WARNING - elasticsearch -   HEAD http://http://bb8-elasticsearch-prod.okc1.opsnow.com/:9300/ [status:N/A request:0.000s]
Traceback (most recent call last):
  File "/home/jaehyeong/venv_py37/lib/python3.7/site-packages/urllib3/connection.py", line 175, in _new_conn
    (self._dns_host, self.port), self.timeout, **extra_kw
  File "/home/jaehyeong/venv_py37/lib/python3.7/site-packages/urllib3/util/connection.py", line 73, in create_connection
    for res in socket.getaddrinfo(host, port, family, socket.SOCK_STREAM):
  File "/usr/local/lib/python3.7/socket.py", line 752, in getaddrinfo
    for res in _socket.getaddrinfo(host, port, family, type, proto, flags):
socket.gaierror: [Errno -2] Name or service not known

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/jaehyeong/venv_py37/lib/python3.7/site-packages/elasticsearch/connection/http_urllib3.py", line 246, in perform_request
    method, url, body, retries=Retry(False), headers=request_headers, **kw
  File "/home/jaehyeong/venv_py37/lib/python3.7/site-packages/urllib3/connectionpool.py", line 756, in urlopen
    method, url, error=e, _pool=self, _stacktrace=sys.exc_info()[2]
  File "/home/jaehyeong/venv_py37/lib/python3.7/site-packages/urllib3/util/retry.py", line 507, in increment
    raise six.reraise(type(error), error, _stacktrace)
  File "/home/jaehyeong/venv_py37/lib/python3.7/site-packages/urllib3/packages/six.py", line 770, in reraise
    raise value
  File "/home/jaehyeong/venv_py37/lib/python3.7/site-packages/urllib3/connectionpool.py", line 706, in urlopen
    chunked=chunked,
  File "/home/jaehyeong/venv_py37/lib/python3.7/site-packages/urllib3/connectionpool.py", line 394, in _make_request
    conn.request(method, url, **httplib_request_kw)
  File "/home/jaehyeong/venv_py37/lib/python3.7/site-packages/urllib3/connection.py", line 239, in request
    super(HTTPConnection, self).request(method, url, body=body, headers=headers)
  File "/usr/local/lib/python3.7/http/client.py", line 1277, in request
    self._send_request(method, url, body, headers, encode_chunked)
  File "/usr/local/lib/python3.7/http/client.py", line 1323, in _send_request
    self.endheaders(body, encode_chunked=encode_chunked)
  File "/usr/local/lib/python3.7/http/client.py", line 1272, in endheaders
    self._send_output(message_body, encode_chunked=encode_chunked)
  File "/usr/local/lib/python3.7/http/client.py", line 1032, in _send_output
    self.send(msg)
  File "/usr/local/lib/python3.7/http/client.py", line 972, in send
    self.connect()
  File "/home/jaehyeong/venv_py37/lib/python3.7/site-packages/urllib3/connection.py", line 205, in connect
    conn = self._new_conn()
  File "/home/jaehyeong/venv_py37/lib/python3.7/site-packages/urllib3/connection.py", line 187, in _new_conn
    self, "Failed to establish a new connection: %s" % e
urllib3.exceptions.NewConnectionError: <urllib3.connection.HTTPConnection object at 0x7fec59268650>: Failed to establish a new connection: [Errno -2] Name or service not known
Traceback (most recent call last):
  File "/home/jaehyeong/venv_py37/lib/python3.7/site-packages/haystack/document_store/elasticsearch.py", line 202, in _init_elastic_client
    f"Initial connection to Elasticsearch failed. Make sure you run an Elasticsearch instance "
ConnectionError: Initial connection to Elasticsearch failed. Make sure you run an Elasticsearch instance at `[{'host': 'http://bb8-elasticsearch-prod.okc1.opsnow.com/', 'port': 9300}]` and that it has finished the initial ramp up (can take > 30s).

However, connecting directly with the Elasticsearch client, as in the code below, works fine.

from elasticsearch import Elasticsearch

ES_HOST = 'host ip' # my host ip
es = Elasticsearch(ES_HOST)
es.info() 

# {'name': 'bb8-elasticsearch-0', 'cluster_name': 'elasticsearch', 'cluster_uuid': 'CgeTCo1QT96Hy0cDOy15pw', 'version': {'number': '7.15.0', 'build_flavor': 'default', 'build_type': 'docker', 'build_hash': '79d65f6e357953a5b3cbcc5e2c7c21073d89aa29', 'build_date': '2021-09-16T03:05:29.143308416Z', 'build_snapshot': False, 'lucene_version': '8.9.0', 'minimum_wire_compatibility_version': '6.8.0', 'minimum_index_compatibility_version': '6.0.0-beta1'}, 'tagline': 'You Know, for Search'}

from haystack.

tholor avatar tholor commented on May 30, 2024

Hey @laifuchicago,

Thanks for reporting. Could you please provide a bit more information on your case so that we can try to replicate it:

  • How are you running Elasticsearch? Are you using a local docker as described in the README (docker run -d -p 9200:9200 -p 9300:9300 -e "discovery.type=single-node" elasticsearch:7.5.1)?
  • How do you initialize the ElasticsearchDocumentStore?
  • What Retriever are you using?

Thanks!

laifuchicago avatar laifuchicago commented on May 30, 2024

To Tholor:
This is my code. I have already started Elasticsearch with the docker run command:
from haystack import Finder
#from haystack.database.sql import SQLDocumentStore
from haystack.database.elasticsearch import ElasticsearchDocumentStore
from haystack.indexing.cleaning import clean_wiki_text
from haystack.indexing.io import write_documents_to_db, fetch_archive_from_http
from haystack.reader.farm import FARMReader
from haystack.reader.transformers import TransformersReader
from haystack.utils import print_answers
from elasticsearch import Elasticsearch

document_store = ElasticsearchDocumentStore(host="localhost:9200",username="", password="", index="squad500")

laifuchicago avatar laifuchicago commented on May 30, 2024

I guess it's an elasticsearch_dsl error: we cannot import it. My elasticsearch_dsl version is 7.1.0 and my elasticsearch version is 7.5.1.

tholor avatar tholor commented on May 30, 2024

The versions should be fine.
However, your DocumentStore initialization is slightly off. Can you please try the following:

document_store = ElasticsearchDocumentStore(host="localhost", username="", password="", index="document")

host doesn't expect a port number.
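
As a rough illustration of that point (this is not Haystack code), an endpoint string such as "localhost:9200", or a full URL like the one in the error log earlier in this thread, can be split into separate scheme, host, and port values before anything is passed to the document store. The helper name split_endpoint and its default values are assumptions made for this sketch; it uses only the Python standard library:

```python
from urllib.parse import urlsplit


def split_endpoint(endpoint, default_scheme="http", default_port=9200):
    """Split an endpoint string into scheme, host and port.

    Hypothetical helper for illustration only; the defaults are assumptions.
    """
    # urlsplit only recognises the host part if the string contains "//",
    # so a bare "localhost:9200" gets a "//" prefix first.
    if "://" not in endpoint:
        endpoint = "//" + endpoint
    parts = urlsplit(endpoint)
    return {
        "scheme": parts.scheme or default_scheme,
        "host": parts.hostname,              # no scheme, port or trailing slash
        "port": parts.port or default_port,  # fall back when no port is given
    }


# Example call:
# endpoint = split_endpoint("localhost:9200")
```

The resulting values could then be passed separately, e.g. host=endpoint["host"] and port=endpoint["port"], instead of one combined string.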

I also just saw that the index name is currently bound to "document". We will add a PR to allow flexible naming here ...
@tanaysoni can you please implement this? I think the issue is that we don't initialize the index correctly:

class Document(ESDoc):
    name = Text()
    text = Text()
    tags = Text()

    class Index:
        name = "document"

jyotikhetan avatar jyotikhetan commented on May 30, 2024

Hi @tholor, can't we use Elasticsearch directly from the source?
I tried it, but I get errors saying the connection is refused, and Elasticsearch gets killed by itself!

f"Initial connection to Elasticsearch failed. Make sure you run an Elasticsearch instance at {hosts} and that it has finished the initial ramp up (can take > 30s).")
ConnectionError: Initial connection to Elasticsearch failed. Make sure you run an Elasticsearch instance at [{'host': 'localhost', 'port': 9200}] and that it has finished the initial ramp up (can take > 30s).

tholor avatar tholor commented on May 30, 2024

Hey @jyotikhetan , What do you mean by "directly from the source"?
If you mean running elasticsearch via a daemon like this, it is definitely possible:

import os
from subprocess import Popen, PIPE, STDOUT
es_server = Popen(['elasticsearch-7.9.2/bin/elasticsearch'],
                   stdout=PIPE, stderr=STDOUT,
                   preexec_fn=lambda: os.setuid(1)  # as daemon
                  )

In the end you can run Elasticsearch as you wish (docker, daemon, remote cloud service ...). I'd suspect a network issue, or that Elasticsearch has not fully started yet. Maybe try a simple curl to Elasticsearch (outside of Haystack) to verify that it is up and running correctly.
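
For anyone who prefers to do that check from Python rather than curl, here is a minimal sketch using only the standard library. The function names is_up and wait_until_up, the retry count, and the delay are assumptions for illustration, not part of Haystack or the Elasticsearch client:

```python
import http.client
import time
import urllib.request


def is_up(url, timeout=5.0):
    """Return True if a plain GET on the URL answers with HTTP 200."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except (OSError, http.client.HTTPException):
        # Covers refused connections, DNS failures, timeouts, TLS errors
        # and non-2xx responses (urllib raises HTTPError for those).
        return False


def wait_until_up(url, attempts=10, delay=3.0, probe=is_up):
    """Probe the URL up to `attempts` times, sleeping `delay` seconds between tries."""
    for attempt in range(attempts):
        if probe(url):
            return True
        if attempt < attempts - 1:
            time.sleep(delay)
    return False


# Example call (assumes an unsecured local instance on the default port):
# wait_until_up("http://localhost:9200/")
```

If this returns False while Elasticsearch is supposedly running, the problem is likely outside Haystack (network, port mapping, or a startup that has not finished).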

chicuong209 avatar chicuong209 commented on May 30, 2024

@jaehyeongAN Did you resolve this issue? If yes, could you please share your solution?

jaehyeongAN avatar jaehyeongAN commented on May 30, 2024

@chicuong209 My external Elasticsearch host was a cloud service, so it was only accessible over https. I added the 'scheme' parameter:

document_store = ElasticsearchDocumentStore(
    host=es_host,
    port=es_port,
    scheme='https',
           :
)

 avatar commented on May 30, 2024

I am using Elasticsearch version 8.5.1 and the matching 8.5.1 release of the Elasticsearch Python library.

I am also having trouble with ElasticsearchDocumentStore. After following the Elasticsearch instructions (https://www.elastic.co/guide/en/elasticsearch/reference/current/docker.html) for deploying a single-node instance in a container from a Docker image, I was able to run the following two code blocks successfully:

import requests
from datetime import datetime
from elasticsearch import Elasticsearch
from elasticsearch import RequestsHttpConnection

client = Elasticsearch( [{ 'host': '127.0.0.1', 'port': 9200,'scheme': 'https'}], ca_certs="../http_ca.crt", http_auth=('username', 'password'))
resp = client.info()
resp  # this executed correctly

and this:

r = requests.get('https://localhost:9200/_cluster/health', verify="../http_ca.crt", headers={"Authorization": 'Basic ' + TOKEN})
r.json()

Then I tried

from haystack.document_stores.elasticsearch import ElasticsearchDocumentStore

doc_store = ElasticsearchDocumentStore(
    host="localhost",
    port=9200,
    scheme="https",
    username = "username",
    password = "password",
    index = "doc1",

)

and no matter what I try above, I get this error:

Output exceeds the size limit. Open the full output data in a text editor
WARNING:elasticsearch:GET https://localhost:9200/ [status:N/A request:0.029s]
Traceback (most recent call last):
File "c:\Users\k.mufti\Desktop\QA_system.venv\lib\site-packages\urllib3\connectionpool.py", line 703, in urlopen
httplib_response = self._make_request(
File "c:\Users\k.mufti\Desktop\QA_system.venv\lib\site-packages\urllib3\connectionpool.py", line 386, in _make_request
self._validate_conn(conn)
File "c:\Users\k.mufti\Desktop\QA_system.venv\lib\site-packages\urllib3\connectionpool.py", line 1042, in _validate_conn
conn.connect()
File "c:\Users\k.mufti\Desktop\QA_system.venv\lib\site-packages\urllib3\connection.py", line 414, in connect
self.sock = ssl_wrap_socket(
File "c:\Users\k.mufti\Desktop\QA_system.venv\lib\site-packages\urllib3\util\ssl_.py", line 449, in ssl_wrap_socket
ssl_sock = _ssl_wrap_socket_impl(
File "c:\Users\k.mufti\Desktop\QA_system.venv\lib\site-packages\urllib3\util\ssl_.py", line 493, in _ssl_wrap_socket_impl
return ssl_context.wrap_socket(sock, server_hostname=server_hostname)
File "C:\Python310\lib\ssl.py", line 512, in wrap_socket
return self.sslsocket_class._create(
File "C:\Python310\lib\ssl.py", line 1070, in _create
self.do_handshake()
File "C:\Python310\lib\ssl.py", line 1341, in do_handshake
self._sslobj.do_handshake()
ssl.SSLCertVerificationError: [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: self signed certificate in certificate chain (_ssl.c:997)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
...
self.do_handshake()
File "C:\Python310\lib\ssl.py", line 1341, in do_handshake
self._sslobj.do_handshake()
urllib3.exceptions.SSLError: [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: self signed certificate in certificate chain (_ssl.c:997)
Output exceeds the size limit. Open the full output data in a text editor

ConnectionError Traceback (most recent call last)
File c:\Users\k.mufti\Desktop\QA_system.venv\lib\site-packages\haystack\document_stores\elasticsearch.py:272, in ElasticsearchDocumentStore._init_elastic_client(cls, host, port, username, password, api_key_id, api_key, aws4auth, scheme, ca_certs, verify_certs, timeout, use_system_proxy)
271 if not status:
--> 272 raise ConnectionError(
273 f"Initial connection to Elasticsearch failed. Make sure you run an Elasticsearch instance "
274 f"at {hosts} and that it has finished the initial ramp up (can take > 30s)."
275 )
276 except Exception:

ConnectionError: Initial connection to Elasticsearch failed. Make sure you run an Elasticsearch instance at [{'host': 'localhost', 'port': 9200}] and that it has finished the initial ramp up (can take > 30s).

During handling of the above exception, another exception occurred:

ConnectionError Traceback (most recent call last)
Cell In [97], line 1
----> 1 doc_store = ElasticsearchDocumentStore(
2 host="localhost",
3 port=9200,
4 scheme="https",
5 username = "username",
6 password = "password",
7 index = "aurelius",
8
9 )
...
278 f"Initial connection to Elasticsearch failed. Make sure you run an Elasticsearch instance at {hosts} and that it has finished the initial ramp up (can take > 30s)."
279 )
280 return client

ConnectionError: Initial connection to Elasticsearch failed. Make sure you run an Elasticsearch instance at [{'host': 'localhost', 'port': 9200}] and that it has finished the initial ramp up (can take > 30s).

Any ideas or solutions?

devvspaces avatar devvspaces commented on May 30, 2024

Just for others: I used Elasticsearch version 8 and tried the method above, but also passed the CA certificate to the constructor:

document_store = ElasticsearchDocumentStore(host="localhost", username="elastic", password="***", index="document", scheme="https", ca_certs="./http_ca.crt")
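
To confirm outside Haystack that a given CA file actually verifies the server certificate, something like the following standard-library sketch may help. The function names basic_auth_header, make_tls_context, and cluster_info are assumptions for illustration; the URL, certificate path, and credentials in the example call are the placeholders already used in this thread:

```python
import base64
import json
import ssl
import urllib.request


def basic_auth_header(username, password):
    """Build the value of an HTTP Basic 'Authorization' header."""
    token = base64.b64encode(f"{username}:{password}".encode("utf-8"))
    return "Basic " + token.decode("ascii")


def make_tls_context(ca_file=None):
    """Create a verifying TLS context; ca_file=None uses the system trust store."""
    return ssl.create_default_context(cafile=ca_file)


def cluster_info(url, ca_file, username, password, timeout=5.0):
    """GET the cluster root endpoint over https and return the parsed JSON."""
    request = urllib.request.Request(
        url, headers={"Authorization": basic_auth_header(username, password)}
    )
    context = make_tls_context(ca_file)
    with urllib.request.urlopen(request, timeout=timeout, context=context) as response:
        return json.loads(response.read().decode("utf-8"))


# Example call (requires a running, secured instance):
# cluster_info("https://localhost:9200/", "./http_ca.crt", "elastic", "***")
```

If this call fails with the same CERTIFICATE_VERIFY_FAILED error as in the traceback above, the CA file most likely does not match the certificate the server presents.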

slavaGanzin avatar slavaGanzin commented on May 30, 2024

This worked for me with 8.10.2

  elasticsearch:
    image: "docker.elastic.co/elasticsearch/elasticsearch:8.10.2"
    ports:
      - 9200:9200
    restart: on-failure
    environment:
      - xpack.security.enabled=false
      - xpack.license.self_generated.type=basic
      - network.host=0.0.0.0
      ...
