Comments (1)
Merged with #44
Basic example:
import pandas as pd
# ElasticsearchDocumentStore, ElasticsearchRetriever, Finder and print_answers come from Haystack;
# import them from the module paths of your installed version.

document_store = ElasticsearchDocumentStore(host="localhost", username="", password="",
                                            index="document", text_field="answer",
                                            embedding_field="question_emb",
                                            embedding_dim=768,
                                            excluded_meta_data=["question_emb"])
retriever = ElasticsearchRetriever(document_store=document_store, embedding_model="bert-base-cased", gpu=False)

# Get dataframe with columns "question", "answer" and some custom metadata
df = pd.read_csv("data/covid/faq_covidbert.csv")
df.fillna(value="", inplace=True)

# Prepare docs for indexing (list of dicts)
docs_to_index = []
for idx, row in df.iterrows():
    d = row.to_dict()
    # Strip whitespace from string fields only; non-string metadata has no .strip()
    d = {k: v.strip() if isinstance(v, str) else v for k, v in d.items()}
    d["document_id"] = idx
    # Add the embedding of the question (not the answer); matching is done question-to-question
    question_embedding = retriever.create_embedding(row["question"])
    d["question_emb"] = question_embedding
    docs_to_index.append(d)

# Index to ES
document_store.write_documents(docs_to_index)

# Use Finder to get answers (no reader needed, the stored answer of the best-matching question is returned)
finder = Finder(reader=None, retriever=retriever)
prediction = finder.get_answers_via_similar_questions(question="How is the virus spreading?", top_k_retriever=10)
print_answers(prediction, details="all")
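For anyone wondering what "answers via similar questions" means conceptually, here is a rough, library-free sketch. It is not Haystack's implementation: `embed` is a toy bag-of-words stand-in for the BERT embedding, `most_similar_questions` is a hypothetical helper name, and the FAQ entries are made-up placeholders. The idea it illustrates is only that the query is compared against the stored *question* embeddings and the stored answer of the closest question is returned.

```python
import math
from collections import Counter


def embed(text):
    """Toy stand-in for a sentence embedding: lowercase bag-of-words counts."""
    return Counter(text.lower().replace("?", "").split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def most_similar_questions(query, faq, top_k=1):
    """Rank FAQ entries by similarity between the query and each stored question."""
    q = embed(query)
    scored = [(cosine(q, embed(item["question"])), item) for item in faq]
    # Sort on the score only, so the dicts themselves are never compared
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]


# Placeholder FAQ data, purely illustrative
faq = [
    {"question": "How is the virus spreading?", "answer": "placeholder answer about transmission"},
    {"question": "What are the symptoms?", "answer": "placeholder answer about symptoms"},
]

top = most_similar_questions("How does the virus spread?", faq, top_k=1)
score, best = top[0]
print(f"{score:.2f}", best["answer"])
```

In the real setup the embeddings come from the model and the similarity search runs inside Elasticsearch over `question_emb`, so this only mirrors the ranking logic.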
Will add a tutorial soon.