
What is the role of the textKey option when defining a new Pinecone store, and is it possible to circumvent it being used as a filter? (langchaingo, closed)

tmc commented on July 23, 2024
What is the role of the textKey option when defining a new Pinecone store, and is it possible to circumvent it being used as a filter?

from langchaingo.

Comments (3)

FluffyKebab commented on July 23, 2024

The text key in the metadata stores the text that the vector represents. If the function did not return an error when the metadata lacks the text key, some documents could come back empty, which can lead to bugs that are hard to debug. As for the metadata, only the text key that stores the text is deleted. Any other metadata is added to the metadata in the document.
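
To make the behaviour described above concrete, here is a minimal sketch of how a match's metadata might be turned into a document. This is not the library's actual code: Document is a simplified stand-in for schema.Document, and matchToDocument and ErrMissingTextKey are hypothetical names used only for illustration.

```go
package main

import (
	"errors"
	"fmt"
)

// Document is a simplified stand-in for schema.Document (assumption).
type Document struct {
	PageContent string
	Metadata    map[string]any
}

// ErrMissingTextKey is a hypothetical error returned when a match has no
// string value stored under the text key.
var ErrMissingTextKey = errors.New("metadata does not contain the text key")

// matchToDocument is a hypothetical helper. It reads the text stored under
// textKey, fails if it is absent, and copies every other metadata entry
// into the resulting document.
func matchToDocument(metadata map[string]any, textKey string) (Document, error) {
	text, ok := metadata[textKey].(string)
	if !ok {
		// Erroring here avoids silently returning an empty document.
		return Document{}, ErrMissingTextKey
	}

	// Copy the remaining entries so the caller's map is not mutated.
	rest := make(map[string]any, len(metadata))
	for k, v := range metadata {
		if k != textKey {
			rest[k] = v
		}
	}
	return Document{PageContent: text, Metadata: rest}, nil
}

func main() {
	doc, err := matchToDocument(map[string]any{"text": "hello", "source": "a.txt"}, "text")
	fmt.Println(doc.PageContent, doc.Metadata, err)

	_, err = matchToDocument(map[string]any{"source": "b.txt"}, "text")
	fmt.Println(err)
}
```

Only the entry under the text key is dropped from the metadata; everything else is carried over unchanged.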

You are right that this is unclear and should be better documented. Adding the ID to the document also sounds like a good idea. Maybe under the "id" key in the metadata?


cduggn commented on July 23, 2024

Yeah, the metadata struct is definitely one option. Metadata is an optional parameter when calling Pinecone's query endpoint, but here in the restQuery func it's hardcoded to true.

func (s Store) restQuery(
	ctx context.Context,
	vector []float64,
	numVectors int,
	nameSpace string,
) ([]schema.Document, error) {
	payload := queryPayload{
		IncludeValues:   true,
		IncludeMetadata: true, // hardcoded here; optional in Pinecone's query API
		Vector:          vector,
		TopK:            numVectors,
		Namespace:       nameSpace,
	}

It kind of conflicts with the idiomatic structure of the Pinecone API, which returns the ID as a separate field. Adding a new ID field to schema.Document could confuse things further, since it's really a Pinecone thing. So maybe the Metadata field is a happy medium, with some comments to explain the nuance. I can work on a PR for that change.

On the textKey, you are right: it only removes that specific key from the metadata struct. It still seems restrictive that I have to limit Pinecone results to only those documents containing a specific metadata field. The Pinecone query API has a filter option for this type of use case. If this is the way, then the docs probably just need more explicit instructions. Right or wrong, when I tried to build a similarity search I was relying on consistency with the Pinecone API.
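
For reference, a rough sketch of what a query payload with an explicit filter could look like. The JSON field names follow Pinecone's REST query endpoint, but the Filter field, the buildQuery helper and the "genre" metadata key are assumptions for illustration; they are not part of the restQuery code quoted above.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// queryPayload is a hypothetical variant of the struct used in restQuery,
// extended with an optional metadata filter.
type queryPayload struct {
	IncludeValues   bool           `json:"includeValues"`
	IncludeMetadata bool           `json:"includeMetadata"`
	Vector          []float64      `json:"vector"`
	TopK            int            `json:"topK"`
	Namespace       string         `json:"namespace"`
	Filter          map[string]any `json:"filter,omitempty"`
}

// buildQuery is a hypothetical helper that serialises the request body.
// A nil filter is omitted from the JSON entirely.
func buildQuery(vector []float64, topK int, namespace string, filter map[string]any) ([]byte, error) {
	return json.Marshal(queryPayload{
		IncludeMetadata: true,
		Vector:          vector,
		TopK:            topK,
		Namespace:       namespace,
		Filter:          filter,
	})
}

func main() {
	// Restrict matches to vectors whose "genre" metadata equals "docs".
	filter := map[string]any{"genre": map[string]any{"$eq": "docs"}}

	body, err := buildQuery([]float64{0.1, 0.2}, 3, "ns", filter)
	if err != nil {
		panic(err)
	}
	fmt.Println(string(body))
}
```

With something like this, restricting results by metadata would be an explicit choice made by the caller rather than a side effect of the text key lookup.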


cduggn commented on July 23, 2024

Going to close this issue. Neither the JS nor the Python version of langchain returns the id field; both instead place the content in the metadata struct.
