feat: RAG support · ollama-webui · 22 comments · closed

ollama-webui commented on September 25, 2024 (11 👍)
feat: RAG support

from ollama-webui.

Comments (22)

JDRay42 commented on September 25, 2024 (12 👍)

Document parsing should (probably) be implemented as an external service that can be called (Apache Tika, etc.). The same goes for the RAG workflow, which involves a lot of "knowns" (vectorizer, vector store, block size, etc.). But having support for initiating those functions from the UI would be nice.
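To illustrate the external-service idea: a Tika server exposes a simple REST endpoint, so the app only needs a thin HTTP client. A minimal sketch using only the standard library (assumes a Tika server running locally, e.g. via `docker run -p 9998:9998 apache/tika`; function names are illustrative, not project code):

```python
import urllib.request

TIKA_SERVER = "http://localhost:9998"  # assumed local Tika server address


def build_tika_request(body: bytes, server: str = TIKA_SERVER) -> urllib.request.Request:
    """Build a PUT request against Tika's /tika endpoint asking for plain text."""
    return urllib.request.Request(
        f"{server}/tika",
        data=body,
        method="PUT",
        headers={"Accept": "text/plain"},  # Tika returns extracted plain text
    )


def tika_extract(path: str, server: str = TIKA_SERVER) -> str:
    """Extract plain text from a document by sending it to the Tika server."""
    with open(path, "rb") as f:
        req = build_tika_request(f.read(), server)
    with urllib.request.urlopen(req) as resp:
        return resp.read().decode("utf-8", errors="replace")
```

The extracted text could then be handed to whatever splitter/embedder the RAG pipeline uses, keeping parsing concerns out of the UI process entirely.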

justinh-rahb commented on September 25, 2024 (7 👍)

Congratulations are in order, Timothy: you really knocked this one out of the park. I can say without a hint of exaggeration that this is the best and easiest local AI interface going right now for chat, and now RAG too. I've tried a lot of them; I've even done my own forks of Chatbot-UI (who hasn't at this point), and none of them measure up. I don't know if it's quite production-ready for most folks yet, but it's good enough for me and my small team to hammer away on and see what comes loose 👍

gitmybox commented on September 25, 2024 (6 👍)

I have tried many other local ChatPDF tools, and none of them comes close to Ollama Web UI. The GUI is intuitive to navigate, and the document upload process (image, PDF, DOC) is reliable. Great effort, very well done.

As a suggestion, for text-based documents it would be good to include the source of the extracted sentences, lines, pages, etc. in the chat reply. I believe this can be obtained from the metadata produced during the text-splitting and embedding process. This extra info could be displayed as another icon beside the existing icons for each message.
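The suggestion above boils down to attaching source positions to each chunk at splitting time, so a retrieved chunk can be traced back to its document. A toy sketch in plain Python (the splitter and field names are illustrative assumptions, not Ollama Web UI internals):

```python
from dataclasses import dataclass


@dataclass
class Chunk:
    text: str
    source: str      # original file name
    start_line: int  # first line of the chunk in the source
    end_line: int    # last line of the chunk in the source


def split_with_metadata(text: str, source: str, max_chars: int = 500) -> list[Chunk]:
    """Split text line-by-line into chunks, recording each chunk's line range.

    The (source, start_line, end_line) triple can later be shown next to a
    retrieved chunk in the chat reply as a citation.
    """
    chunks, buf, start = [], [], 1
    for i, line in enumerate(text.splitlines(), start=1):
        buf.append(line)
        if sum(len(l) for l in buf) >= max_chars:
            chunks.append(Chunk("\n".join(buf), source, start, i))
            buf, start = [], i + 1
    if buf:  # flush the trailing partial chunk
        chunks.append(Chunk("\n".join(buf), source, start, start + len(buf) - 1))
    return chunks
```

A real implementation would split on token counts rather than characters, but the metadata flow is the same.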

tjbck commented on September 25, 2024 (5 👍)

@justinh-rahb Thank you so much for your kind words; I'm glad you and your team are finding this project useful, it means a lot to me! I'd also like to take this opportunity to thank you for being a part of our journey and helping with all the troubleshooting. We're just getting started, and there's a lot more to come. Stay tuned for more updates! 🌟

Also, FYI: the Docker image has been updated with the embedding model weights baked into the container, so you won't have to redownload them every time you update.

amirvenus commented on September 25, 2024 (5 👍)

While this has been a really great feature and I really appreciate it, I find the manual document upload rather tedious.

I was wondering if admins could make available a vector database in which they have already stored [large] embeddings?

Thanks
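An admin-provisioned store like the one requested above is essentially a set of (text, embedding) pairs persisted server-side, queried by similarity at chat time. A toy sketch of the retrieval half, with a plain Python list standing in for a real vector database such as Chroma (all data and names hypothetical):

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0


# A pre-built "vector database": embeddings computed once by an admin
# and shared read-only with all users (toy 3-dim vectors for illustration).
STORE = [
    ("Ollama runs models locally.", [0.9, 0.1, 0.0]),
    ("RAG retrieves relevant chunks.", [0.1, 0.9, 0.1]),
    ("Docker images bundle dependencies.", [0.0, 0.2, 0.9]),
]


def retrieve(query_emb: list[float], k: int = 2) -> list[str]:
    """Return the k stored texts most similar to the query embedding."""
    ranked = sorted(STORE, key=lambda item: cosine(query_emb, item[1]), reverse=True)
    return [text for text, _ in ranked[:k]]
```

In practice the store would live on disk (or in a dedicated vector DB) so users never pay the embedding cost for documents the admin has already indexed.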

justinh-rahb commented on September 25, 2024 (4 👍)

YES! HOLY GRAIL DIVINE TRINITY ACHIEVED! We've got external APIs, we've got local AI, and now RAG 🚀
(screenshot attached, 2024-01-07)

justinh-rahb commented on September 25, 2024 (4 👍)

Embedding models are relatively lightweight, and the performance enhancement from running them on a GPU is often minimal for smaller ones. In some cases, they may even run more slowly on a GPU than on a CPU due to the overhead of shuffling data around.

Given this, it seems reasonable that Ollama WebUI does not currently support GPU acceleration for text-embedding; implementing such a feature would likely require no small amount of effort for what might only benefit a limited number of users, and questionably at that.

However, if someone were to submit a well-crafted pull request demonstrating reliable GPU support for the Docker image on compatible systems, it would undoubtedly receive serious consideration.
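For reference, exposing a GPU to the container is mostly a Compose/runtime concern rather than application code. A sketch of the standard Compose device reservation (assumes the NVIDIA Container Toolkit is installed on the host; the service and image names are illustrative):

```yaml
services:
  ollama-webui:
    image: ghcr.io/ollama-webui/ollama-webui:main
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: all          # or a specific number of GPUs
              capabilities: [gpu]
```

The harder part of such a PR would be shipping a CUDA-enabled image for the embedding runtime, which is where the effort/benefit trade-off mentioned above comes in.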

gitmybox commented on September 25, 2024 (3 👍)

Another possible scenario for document upload is audio data. One use case: a user uploads an audio file recorded during a meeting, class lesson, interview, etc. It would be very useful to chat with this kind of voice-to-text content seamlessly, without scrubbing up and down the recording.

tjbck commented on September 25, 2024 (1 👍)

@justinh-rahb If you just updated the container, it might take a while to download the embedding model weights. You might also want to clear your browser cache. Let me know if the issue persists!

justinh-rahb commented on September 25, 2024 (1 👍)

@justinh-rahb If you just updated the container, it might take a while to download the embedding model weights. You might also want to clear your browser cache. Let me know if the issue persists!

Ah yes, you may be right; it seems to be working now, several minutes later. I'll need to note that for next time I re-pull the image.

EDIT: Is there a directory that can be persistent volume mounted to prevent redownloading of weights?
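One plausible answer (an assumption based on the project's Dockerfile, not confirmed in this thread): the backend keeps its data, including downloaded embedding weights, under /app/backend/data, so mounting a named volume there should survive image re-pulls.

```shell
# Persist the backend data directory (embedding weights, uploads, vector DB)
# across image re-pulls. The container path /app/backend/data and the image
# tag are assumptions; verify against the image's Dockerfile before relying
# on them.
docker run -d -p 3000:8080 \
  -v ollama-webui-data:/app/backend/data \
  --name ollama-webui \
  ghcr.io/ollama-webui/ollama-webui:main
```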

justinh-rahb commented on September 25, 2024 (1 👍)

@tjbck thanks for the addition of the document section mate. Could we probably add a functionality to hide private docs from users (or is this the default behavior)? Just pulled the most recent update. Tried it only in the admin account.

@oliverbob The Documents button in the sidebar is only available to admins. I believe the intention here is that you can upload some documents that are globally available to all users, and anything a user drops into their chat box themselves is only for them. @tjbck can correct me if I've gotten any of the above wrong.

oliverbob commented on September 25, 2024 (1 👍)

Thanks for the clarification.

justinh-rahb commented on September 25, 2024

Doesn't seem to work with .docx files, is that a known issue?

tjbck commented on September 25, 2024

@justinh-rahb Added support for .docx files with #418, try it out!

justinh-rahb commented on September 25, 2024

@justinh-rahb Added support for .docx files with #418, try it out!

Will do as soon as the docker image finishes building 👌

justinh-rahb commented on September 25, 2024

.docx is working now, excellent work! There does seem to be an issue with starting a new chat and uploading a document that has already been uploaded to another chat: the upload doesn't complete.

jukofyork commented on September 25, 2024

Yeah, just trying this now and it seems really useful!

oliverbob commented on September 25, 2024

@tjbck thanks for the addition of the document section mate. Could we probably add a functionality to hide private docs from users (or is this the default behavior)? Just pulled the most recent update. Tried it only in the admin account.

tjbck commented on September 25, 2024

@gitmybox sounds like a great idea, will try to take a look and see what can be done!

@oliverbob @justinh-rahb's explanation is on point: the documents page is intended to let admins add documents that are globally available to all users. Users with the 'user' role can still drag and drop their own files to use the RAG feature; those just won't be shared with everyone the way documents on the documents page are. Hope that clarifies things a bit.

gitmybox commented on September 25, 2024

I have noticed that Ollama Web UI uses the CPU to embed PDF documents, while chat conversations use the GPU if one is present.

However, in some testing I did in the past with PrivateGPT, I remember both PDF embedding and chat used the GPU when available, yet embedding performance in PrivateGPT was very, very slow.

My understanding was that a GPU should outperform a CPU on any LLM task, embedding included. But Ollama Web UI uses the CPU with very good results, which puzzles me.

I'd appreciate it if anyone could share some insight on whether CPU or GPU embedding is faster.

gitmybox commented on September 25, 2024

Embedding models are relatively lightweight, and the performance enhancement from running them on a GPU is often minimal for smaller ones. In some cases, they may even run more slowly on a GPU than on a CPU due to the overhead of shuffling data around.

Given this, it seems reasonable that Ollama WebUI does not currently support GPU acceleration for text-embedding; implementing such a feature would likely require no small amount of effort for what might only benefit a limited number of users, and questionably at that.

However, if someone were to submit a well-crafted pull request demonstrating reliable GPU support for the Docker image on compatible systems, it would undoubtedly receive serious consideration.

@justinh-rahb thanks for the explanation. You mentioned the overhead of shuffling data around, which got me thinking about memory: my system's GPU has 8 GB of VRAM, but the CPU has 32 GB of RAM (4× more). That could explain why CPU embedding so dramatically outperforms GPU embedding in my use case. I'm perfectly fine with Ollama Web UI using CPU embedding.

araffin commented on September 25, 2024

@tjbck for clarification, is there any way currently to have full document parsing/text upload? (as suggested in #60, the same way it is done in https://huggingface.co/chat/).

What I mean is that currently it is a RAG system, so the document is parsed, embedded, and then chunks are retrieved.
With models with longer context (e.g. llama 3.1) it would be nice to be able to give the whole document as input (and not just chunks, so just parse and give as input).

As a concrete example, it would be nice to upload a PDF and ask for a summary section by section.
With the current system, it only works partially as only part of the document is retrieved.
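One simple way to support both modes would be to estimate the parsed document's token count and only fall back to retrieval when it exceeds the model's context window. A rough sketch (the 4-characters-per-token estimate, budget numbers, and function names are assumptions for illustration, not the project's actual logic):

```python
def estimate_tokens(text: str) -> int:
    """Very rough token estimate: ~4 characters per token for English text."""
    return max(1, len(text) // 4)


def build_context(document: str, chunks: list[str],
                  context_window: int = 128_000,
                  reserved_for_chat: int = 8_000) -> str:
    """Stuff the whole document if it fits, otherwise fall back to RAG chunks.

    `context_window` matches long-context models such as Llama 3.1 (128k
    tokens); `reserved_for_chat` keeps room for the prompt and the answer.
    """
    budget = context_window - reserved_for_chat
    if estimate_tokens(document) <= budget:
        return document              # full-document mode: parse and pass as-is
    return "\n\n".join(chunks)       # retrieval mode: only the top-k chunks
```

With something like this, a section-by-section summary of a short PDF would see the entire document, while very large documents would still go through the existing RAG path.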
