Comments (3)
The text key in the metadata stores the text the vector represents. If the function didn't return an error when the metadata lacks the text key, some documents could come back empty, which can lead to bugs that are hard to track down. As for the metadata, only the text key that stores the text is deleted. Any other metadata is added to the metadata in the document.
You are right that this is unclear, and it should be better documented. Adding the ID to the document also sounds like a good idea. Maybe under the "id" key in the metadata?
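To make the behavior discussed above concrete, here is a minimal sketch in Go. It is not the langchaingo implementation: `Document` is a simplified stand-in for schema.Document, and `match`, `toDocument`, and `errMissingTextKey` are hypothetical names used only for illustration. It shows the text key being moved into the page content, the remaining metadata being kept, and the vector ID being stored under the proposed "id" key.

```go
package main

import (
	"errors"
	"fmt"
)

// Document is a simplified stand-in for schema.Document (assumption).
type Document struct {
	PageContent string
	Metadata    map[string]any
}

// match is a hypothetical, trimmed-down view of one Pinecone query match.
type match struct {
	ID       string
	Metadata map[string]any
}

// errMissingTextKey is a hypothetical error for matches without the text key.
var errMissingTextKey = errors.New("metadata does not contain the text key")

// toDocument moves the text stored under textKey into PageContent,
// copies every other metadata entry, and records the vector ID under "id".
// Note: an existing "id" metadata entry would be overwritten by the vector ID.
func toDocument(m match, textKey string) (Document, error) {
	text, ok := m.Metadata[textKey].(string)
	if !ok {
		// Fail loudly instead of returning a document with empty content.
		return Document{}, fmt.Errorf("match %q: %w", m.ID, errMissingTextKey)
	}

	meta := make(map[string]any, len(m.Metadata))
	for k, v := range m.Metadata {
		if k != textKey {
			meta[k] = v
		}
	}
	meta["id"] = m.ID

	return Document{PageContent: text, Metadata: meta}, nil
}

func main() {
	doc, err := toDocument(match{
		ID:       "vec-1",
		Metadata: map[string]any{"text": "hello", "source": "notes.txt"},
	}, "text")
	fmt.Println(doc.PageContent, doc.Metadata, err)
}
```

Returning an error here is one possible design; skipping matches that lack the text key would be another, at the cost of silently dropping results.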
from langchaingo.
Yeah, the metadata struct is definitely one option. Metadata is an optional parameter when calling Pinecone's query endpoint, but here in the restQuery func it's hardcoded to true:
func (s Store) restQuery(
	ctx context.Context,
	vector []float64,
	numVectors int,
	nameSpace string,
) ([]schema.Document, error) {
	payload := queryPayload{
		IncludeValues:   true,
		IncludeMetadata: true,
		Vector:          vector,
		TopK:            numVectors,
		Namespace:       nameSpace,
	}
It kind of conflicts with the idiomatic structure of the Pinecone API, which returns the ID as a separate field. Adding a new ID field to schema.Document would probably confuse things further, since it's really a Pinecone thing. So maybe the Metadata field is a happy medium, with some comments to explain the nuance. I can work on a PR for that change.
On the textKey, you are right: it only removes that specific key from the metadata struct. It still seems restrictive that I have to limit Pinecone results to only those documents containing a specific metadata field. The Pinecone query API has a filter option for this type of use case. If this is the intended behavior, then the docs probably just need more explicit instructions. Right or wrong, when I tried to build a similarity search I was relying on consistency with the Pinecone API.
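As a rough illustration of the two points above (metadata being optional and the filter option), here is a sketch of a query payload where both are left to the caller. `queryRequest` and `buildQuery` are hypothetical names, not part of langchaingo; the JSON field names follow Pinecone's query endpoint, and the `genre` filter is just example data.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// queryRequest is a hypothetical payload type for Pinecone's query endpoint.
type queryRequest struct {
	Vector          []float64      `json:"vector"`
	TopK            int            `json:"topK"`
	Namespace       string         `json:"namespace,omitempty"`
	IncludeValues   bool           `json:"includeValues"`
	IncludeMetadata bool           `json:"includeMetadata"`
	Filter          map[string]any `json:"filter,omitempty"`
}

// buildQuery encodes a query body. Unlike the restQuery excerpt, the caller
// decides whether metadata is returned and may pass an optional metadata filter.
func buildQuery(
	vector []float64,
	topK int,
	namespace string,
	includeMetadata bool,
	filter map[string]any,
) ([]byte, error) {
	return json.Marshal(queryRequest{
		Vector:          vector,
		TopK:            topK,
		Namespace:       namespace,
		IncludeValues:   true, // mirrors the excerpt above
		IncludeMetadata: includeMetadata,
		Filter:          filter, // omitted from the JSON when nil
	})
}

func main() {
	// Example filter: only match vectors whose "genre" metadata equals "drama".
	filter := map[string]any{"genre": map[string]any{"$eq": "drama"}}

	body, err := buildQuery([]float64{0.1, 0.2}, 3, "docs", true, filter)
	if err != nil {
		panic(err)
	}
	fmt.Println(string(body))
}
```

With something along these lines, a caller who wants only documents that carry a given metadata field could say so in the filter rather than relying on an error from the store.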
Going to close this issue: neither the JS nor the Python version of langchain returns the id field; both instead place the content in the metadata struct.