Comments (12)
Hi, I have the same issue!
from haystack.document_store import ElasticsearchDocumentStore
ES_HOST = 'host ip' # my host ip
document_store = ElasticsearchDocumentStore(ES_HOST)
The code above works when I connect to 'localhost' with Elasticsearch running in Docker on my local PC, but it fails when I connect to an external Elasticsearch IP.
The error log is below.
Traceback (most recent call last):
File "/home/jaehyeong/venv_py37/lib/python3.7/site-packages/elasticsearch/connection/http_urllib3.py", line 246, in perform_request
method, url, body, retries=Retry(False), headers=request_headers, **kw
File "/home/jaehyeong/venv_py37/lib/python3.7/site-packages/urllib3/connectionpool.py", line 756, in urlopen
method, url, error=e, _pool=self, _stacktrace=sys.exc_info()[2]
File "/home/jaehyeong/venv_py37/lib/python3.7/site-packages/urllib3/util/retry.py", line 507, in increment
raise six.reraise(type(error), error, _stacktrace)
File "/home/jaehyeong/venv_py37/lib/python3.7/site-packages/urllib3/packages/six.py", line 770, in reraise
raise value
File "/home/jaehyeong/venv_py37/lib/python3.7/site-packages/urllib3/connectionpool.py", line 706, in urlopen
chunked=chunked,
File "/home/jaehyeong/venv_py37/lib/python3.7/site-packages/urllib3/connectionpool.py", line 394, in _make_request
conn.request(method, url, **httplib_request_kw)
File "/home/jaehyeong/venv_py37/lib/python3.7/site-packages/urllib3/connection.py", line 239, in request
super(HTTPConnection, self).request(method, url, body=body, headers=headers)
File "/usr/local/lib/python3.7/http/client.py", line 1277, in request
self._send_request(method, url, body, headers, encode_chunked)
File "/usr/local/lib/python3.7/http/client.py", line 1323, in _send_request
self.endheaders(body, encode_chunked=encode_chunked)
File "/usr/local/lib/python3.7/http/client.py", line 1272, in endheaders
self._send_output(message_body, encode_chunked=encode_chunked)
File "/usr/local/lib/python3.7/http/client.py", line 1032, in _send_output
self.send(msg)
File "/usr/local/lib/python3.7/http/client.py", line 972, in send
self.connect()
File "/home/jaehyeong/venv_py37/lib/python3.7/site-packages/urllib3/connection.py", line 205, in connect
conn = self._new_conn()
File "/home/jaehyeong/venv_py37/lib/python3.7/site-packages/urllib3/connection.py", line 187, in _new_conn
self, "Failed to establish a new connection: %s" % e
urllib3.exceptions.NewConnectionError: <urllib3.connection.HTTPConnection object at 0x7fec5925d510>: Failed to establish a new connection: [Errno -2] Name or service not known
11/17/2021 18:53:50 - WARNING - elasticsearch - HEAD http://http://bb8-elasticsearch-prod.okc1.opsnow.com/:9300/ [status:N/A request:0.000s]
Traceback (most recent call last):
File "/home/jaehyeong/venv_py37/lib/python3.7/site-packages/urllib3/connection.py", line 175, in _new_conn
(self._dns_host, self.port), self.timeout, **extra_kw
File "/home/jaehyeong/venv_py37/lib/python3.7/site-packages/urllib3/util/connection.py", line 73, in create_connection
for res in socket.getaddrinfo(host, port, family, socket.SOCK_STREAM):
File "/usr/local/lib/python3.7/socket.py", line 752, in getaddrinfo
for res in _socket.getaddrinfo(host, port, family, type, proto, flags):
socket.gaierror: [Errno -2] Name or service not known
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/home/jaehyeong/venv_py37/lib/python3.7/site-packages/elasticsearch/connection/http_urllib3.py", line 246, in perform_request
method, url, body, retries=Retry(False), headers=request_headers, **kw
File "/home/jaehyeong/venv_py37/lib/python3.7/site-packages/urllib3/connectionpool.py", line 756, in urlopen
method, url, error=e, _pool=self, _stacktrace=sys.exc_info()[2]
File "/home/jaehyeong/venv_py37/lib/python3.7/site-packages/urllib3/util/retry.py", line 507, in increment
raise six.reraise(type(error), error, _stacktrace)
File "/home/jaehyeong/venv_py37/lib/python3.7/site-packages/urllib3/packages/six.py", line 770, in reraise
raise value
File "/home/jaehyeong/venv_py37/lib/python3.7/site-packages/urllib3/connectionpool.py", line 706, in urlopen
chunked=chunked,
File "/home/jaehyeong/venv_py37/lib/python3.7/site-packages/urllib3/connectionpool.py", line 394, in _make_request
conn.request(method, url, **httplib_request_kw)
File "/home/jaehyeong/venv_py37/lib/python3.7/site-packages/urllib3/connection.py", line 239, in request
super(HTTPConnection, self).request(method, url, body=body, headers=headers)
File "/usr/local/lib/python3.7/http/client.py", line 1277, in request
self._send_request(method, url, body, headers, encode_chunked)
File "/usr/local/lib/python3.7/http/client.py", line 1323, in _send_request
self.endheaders(body, encode_chunked=encode_chunked)
File "/usr/local/lib/python3.7/http/client.py", line 1272, in endheaders
self._send_output(message_body, encode_chunked=encode_chunked)
File "/usr/local/lib/python3.7/http/client.py", line 1032, in _send_output
self.send(msg)
File "/usr/local/lib/python3.7/http/client.py", line 972, in send
self.connect()
File "/home/jaehyeong/venv_py37/lib/python3.7/site-packages/urllib3/connection.py", line 205, in connect
conn = self._new_conn()
File "/home/jaehyeong/venv_py37/lib/python3.7/site-packages/urllib3/connection.py", line 187, in _new_conn
self, "Failed to establish a new connection: %s" % e
urllib3.exceptions.NewConnectionError: <urllib3.connection.HTTPConnection object at 0x7fec59268650>: Failed to establish a new connection: [Errno -2] Name or service not known
Traceback (most recent call last):
File "/home/jaehyeong/venv_py37/lib/python3.7/site-packages/haystack/document_store/elasticsearch.py", line 202, in _init_elastic_client
f"Initial connection to Elasticsearch failed. Make sure you run an Elasticsearch instance "
ConnectionError: Initial connection to Elasticsearch failed. Make sure you run an Elasticsearch instance at `[{'host': 'http://bb8-elasticsearch-prod.okc1.opsnow.com/', 'port': 9300}]` and that it has finished the initial ramp up (can take > 30s).
However, connecting directly with the Elasticsearch client works fine, as the code below shows:
from elasticsearch import Elasticsearch
ES_HOST = 'host ip' # my host ip
es = Elasticsearch(ES_HOST)
es.info()
# {'name': 'bb8-elasticsearch-0', 'cluster_name': 'elasticsearch', 'cluster_uuid': 'CgeTCo1QT96Hy0cDOy15pw', 'version': {'number': '7.15.0', 'build_flavor': 'default', 'build_type': 'docker', 'build_hash': '79d65f6e357953a5b3cbcc5e2c7c21073d89aa29', 'build_date': '2021-09-16T03:05:29.143308416Z', 'build_snapshot': False, 'lucene_version': '8.9.0', 'minimum_wire_compatibility_version': '6.8.0', 'minimum_index_compatibility_version': '6.0.0-beta1'}, 'tagline': 'You Know, for Search'}
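A note on the log above: the final `ConnectionError` reports `'host': 'http://bb8-elasticsearch-prod.okc1.opsnow.com/'` with port 9300, and the warning line shows a doubled `http://http://` prefix. That suggests a full URL ended up in the `host` field, which would explain the "Name or service not known" DNS failure. Also, 9300 is Elasticsearch's default transport port, while the HTTP API listens on 9200 by default. Below is a minimal sketch, using only the standard library, of splitting a URL into separate host, port and scheme values before passing them on. The helper name `split_es_url` and the example address are hypothetical.

```python
from urllib.parse import urlsplit


def split_es_url(url, default_port=9200):
    """Split 'https://example.com:9200/' into host, port and scheme.

    Hypothetical helper for illustration; not part of Haystack.
    """
    # urlsplit only recognises the host part when a scheme or a leading
    # '//' is present, so bare 'host:port' strings get '//' prepended.
    parts = urlsplit(url if "://" in url else "//" + url)
    return {
        "host": parts.hostname,               # no scheme, no trailing slash
        "port": parts.port or default_port,   # fall back to the HTTP port
        "scheme": parts.scheme or "http",
    }


params = split_es_url("http://es.example.com:9200/")  # hypothetical address
print(params)

# Assumed usage, relying on the host/port/scheme parameters that appear
# elsewhere in this thread:
# document_store = ElasticsearchDocumentStore(**params)
```

The plain `Elasticsearch(ES_HOST)` client accepts a full URL, which would be consistent with the direct connection succeeding while the document store did not.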
from haystack.
Hey @laifuchicago,
Thanks for reporting. Could you please provide a bit more information on your case so that we can try to replicate it:
- How are you running Elasticsearch? Are you using a local docker as described in the README (`docker run -d -p 9200:9200 -p 9300:9300 -e "discovery.type=single-node" elasticsearch:7.5.1`)?
- How do you initialize the `ElasticsearchDocumentStore`?
- What Retriever are you using?
Thanks!
@tholor:
This is my code. I have already started Elasticsearch with the docker run command.
from haystack import Finder
#from haystack.database.sql import SQLDocumentStore
from haystack.database.elasticsearch import ElasticsearchDocumentStore
from haystack.indexing.cleaning import clean_wiki_text
from haystack.indexing.io import write_documents_to_db, fetch_archive_from_http
from haystack.reader.farm import FARMReader
from haystack.reader.transformers import TransformersReader
from haystack.utils import print_answers
from elasticsearch import Elasticsearch
document_store = ElasticsearchDocumentStore(host="localhost:9200",username="", password="", index="squad500")
I guess it's an elasticsearch_dsl error: we cannot import it. My elasticsearch_dsl version is 7.1.0 and my elasticsearch version is 7.5.1.
The versions should be fine.
However, you initialize the DocumentStore slightly wrong. Can you please try the following:
document_store = ElasticsearchDocumentStore(host="localhost", username="", password="", index="document")
`host` doesn't expect a port number.
I also just saw that the index name is currently bound to "document". We will add a PR to allow flexible naming here ...
@tanaysoni can you please implement this? I think the issue is that we don't initialize the index correctly:
haystack/haystack/database/elasticsearch.py
Lines 6 to 12 in d33ef9c
Hi @tholor, can't we use Elasticsearch directly from the source?
I tried it, but I get errors saying the connection is refused, and Elasticsearch gets killed by itself.
f"Initial connection to Elasticsearch failed. Make sure you run an Elasticsearch instance at `{hosts}` and that it has finished the initial ramp up (can take > 30s).")
ConnectionError: Initial connection to Elasticsearch failed. Make sure you run an Elasticsearch instance at `[{'host': 'localhost', 'port': 9200}]` and that it has finished the initial ramp up (can take > 30s).
Hey @jyotikhetan, what do you mean by "directly from the source"?
If you mean running elasticsearch via a daemon like this, it is definitely possible:
import os
from subprocess import Popen, PIPE, STDOUT
es_server = Popen(['elasticsearch-7.9.2/bin/elasticsearch'],
                  stdout=PIPE, stderr=STDOUT,
                  preexec_fn=lambda: os.setuid(1)  # as daemon
                  )
In the end you can run elasticsearch as you wish (docker, daemon, remote cloud service ...). I'd suspect some network issues or elasticsearch has not fully started yet. Maybe try a simple curl to elasticsearch (outside of Haystack) to verify that it is up and running correctly.
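The "check it outside of Haystack" advice can also be done from Python. Below is a minimal sketch using only the standard library: one function probes the HTTP endpoint once, the other retries a probe until it succeeds. It assumes an unauthenticated Elasticsearch at `http://localhost:9200`. The function names, retry count and delay are illustrative choices, not anything Haystack provides.

```python
import time
import urllib.error
import urllib.request


def es_is_up(url="http://localhost:9200", timeout=2.0):
    """Return True if a GET on the root endpoint answers with HTTP 200."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, DNS failure, timeout, or a non-2xx response.
        return False


def wait_until(probe, retries=30, delay=1.0, sleep=time.sleep):
    """Call probe() up to `retries` times, sleeping `delay` seconds between tries."""
    for _ in range(retries):
        if probe():
            return True
        sleep(delay)
    return False


# Assumed usage: wait for the node to finish starting before creating
# the document store.
# if not wait_until(es_is_up):
#     raise RuntimeError("Elasticsearch did not come up in time")
```

If this probe fails as well, the problem lies with the network or the Elasticsearch process rather than with Haystack.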
@jaehyeongAN Did you resolve this issue? If yes, could you please share your solution?
@chicuong209 My external Elasticsearch IP belonged to a cloud service, so it was only reachable over HTTPS. I added the 'scheme' parameter:
document_store = ElasticsearchDocumentStore(
host=es_host,
port=es_port,
scheme='https',
:
)
I am using Elasticsearch 8.5.1 and the matching 8.5.1 release of the Elasticsearch Python client.
I am also having trouble with `ElasticsearchDocumentStore`. After following the Elasticsearch instructions for deploying a single-node instance in a container from a Docker image (https://www.elastic.co/guide/en/elasticsearch/reference/current/docker.html), I was able to run the following two code blocks successfully:
import requests
from datetime import datetime
from elasticsearch import Elasticsearch
from elasticsearch import RequestsHttpConnection
client = Elasticsearch( [{ 'host': '127.0.0.1', 'port': 9200,'scheme': 'https'}], ca_certs="../http_ca.crt", http_auth=('username', 'password'))
resp = client.info()
resp # this executed correctly
and this:
r = requests.get('https://localhost:9200/_cluster/health', verify="../http_ca.crt", headers={"Authorization": 'Basic ' + TOKEN})
r.json()
Then I tried
from haystack.document_stores.elasticsearch import ElasticsearchDocumentStore
doc_store = ElasticsearchDocumentStore(
host="localhost",
port=9200,
scheme="https",
username = "username",
password = "password",
index = "doc1",
)
and no matter what I try above, I get this error:
WARNING:elasticsearch:GET https://localhost:9200/ [status:N/A request:0.029s]
Traceback (most recent call last):
File "c:\Users\k.mufti\Desktop\QA_system.venv\lib\site-packages\urllib3\connectionpool.py", line 703, in urlopen
httplib_response = self._make_request(
File "c:\Users\k.mufti\Desktop\QA_system.venv\lib\site-packages\urllib3\connectionpool.py", line 386, in _make_request
self._validate_conn(conn)
File "c:\Users\k.mufti\Desktop\QA_system.venv\lib\site-packages\urllib3\connectionpool.py", line 1042, in validate_conn
conn.connect()
File "c:\Users\k.mufti\Desktop\QA_system.venv\lib\site-packages\urllib3\connection.py", line 414, in connect
self.sock = ssl_wrap_socket(
File "c:\Users\k.mufti\Desktop\QA_system.venv\lib\site-packages\urllib3\util\ssl.py", line 449, in ssl_wrap_socket
ssl_sock = ssl_wrap_socket_impl(
File "c:\Users\k.mufti\Desktop\QA_system.venv\lib\site-packages\urllib3\util\ssl.py", line 493, in _ssl_wrap_socket_impl
return ssl_context.wrap_socket(sock, server_hostname=server_hostname)
File "C:\Python310\lib\ssl.py", line 512, in wrap_socket
return self.sslsocket_class._create(
File "C:\Python310\lib\ssl.py", line 1070, in _create
self.do_handshake()
File "C:\Python310\lib\ssl.py", line 1341, in do_handshake
self._sslobj.do_handshake()
ssl.SSLCertVerificationError: [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: self signed certificate in certificate chain (_ssl.c:997)
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
...
self.do_handshake()
File "C:\Python310\lib\ssl.py", line 1341, in do_handshake
self._sslobj.do_handshake()
urllib3.exceptions.SSLError: [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: self signed certificate in certificate chain (_ssl.c:997)
ConnectionError Traceback (most recent call last)
File c:\Users\k.mufti\Desktop\QA_system.venv\lib\site-packages\haystack\document_stores\elasticsearch.py:272, in ElasticsearchDocumentStore._init_elastic_client(cls, host, port, username, password, api_key_id, api_key, aws4auth, scheme, ca_certs, verify_certs, timeout, use_system_proxy)
271 if not status:
--> 272 raise ConnectionError(
273 f"Initial connection to Elasticsearch failed. Make sure you run an Elasticsearch instance "
274 f"at `{hosts}` and that it has finished the initial ramp up (can take > 30s)."
275 )
276 except Exception:
ConnectionError: Initial connection to Elasticsearch failed. Make sure you run an Elasticsearch instance at `[{'host': 'localhost', 'port': 9200}]` and that it has finished the initial ramp up (can take > 30s).
During handling of the above exception, another exception occurred:
ConnectionError Traceback (most recent call last)
Cell In [97], line 1
----> 1 doc_store = ElasticsearchDocumentStore(
2 host="localhost",
3 port=9200,
4 scheme="https",
5 username = "username",
6 password = "password",
7 index = "aurelius",
8
9 )
...
278 f"Initial connection to Elasticsearch failed. Make sure you run an Elasticsearch instance at `{hosts}` and that it has finished the initial ramp up (can take > 30s)."
279 )
280 return client
ConnectionError: Initial connection to Elasticsearch failed. Make sure you run an Elasticsearch instance at `[{'host': 'localhost', 'port': 9200}]` and that it has finished the initial ramp up (can take > 30s).
Any ideas or solutions?
For others landing here: I am on Elasticsearch 8. I used the method above but also passed the CA certificate to the constructor:
document_store = ElasticsearchDocumentStore(host="localhost", username="elastic", password="***", index="document", scheme="https", ca_certs="./http_ca.crt")
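The `CERTIFICATE_VERIFY_FAILED ... self signed certificate in certificate chain` error reported earlier in the thread is consistent with this fix: the working `Elasticsearch(...)` and `requests.get(...)` calls both pointed at `http_ca.crt`, while the failing `ElasticsearchDocumentStore(...)` call did not. Below is a small sketch of keeping the connection settings in one place so the CA file is never left out when HTTPS is used. The helper `build_store_kwargs` is hypothetical. The keyword names (`host`, `port`, `username`, `password`, `index`, `scheme`, `ca_certs`) are the ones that appear in this thread.

```python
def build_store_kwargs(host, port=9200, username="", password="",
                       index="document", ca_certs=None):
    """Assemble keyword arguments for ElasticsearchDocumentStore.

    Hypothetical helper: switches to HTTPS whenever a CA bundle is given.
    """
    kwargs = {
        "host": host,          # bare hostname, without scheme or port
        "port": port,
        "username": username,
        "password": password,
        "index": index,
    }
    if ca_certs is not None:
        # Elasticsearch 8 enables TLS with a self-signed CA by default,
        # so the client needs that CA file to verify the server.
        kwargs["scheme"] = "https"
        kwargs["ca_certs"] = ca_certs
    return kwargs


kwargs = build_store_kwargs("localhost", username="elastic",
                            password="***", ca_certs="./http_ca.crt")
print(kwargs)

# Assumed usage:
# document_store = ElasticsearchDocumentStore(**kwargs)
```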
This worked for me with 8.10.2:
elasticsearch:
  image: "docker.elastic.co/elasticsearch/elasticsearch:8.10.2"
  ports:
    - 9200:9200
  restart: on-failure
  environment:
    - xpack.security.enabled=false
    - xpack.license.self_generated.type=basic
    - network.host=0.0.0.0
...