
farzad-r / advanced-qa-and-rag-series


This repository contains advanced LLM-based chatbots for Q&A and Retrieval Augmented Generation (RAG) using LLM agents with different databases (VectorDB, GraphDB, SQLite, CSV, XLSX, etc.).

Python 5.28% Jupyter Notebook 94.72%

advanced-qa-and-rag-series's Introduction

Advanced-RAG-Series

This repository contains advanced LLM-based chatbots for Retrieval Augmented Generation (RAG) and Q&A with different databases (VectorDB, GraphDB, SQLite, CSV, XLSX, etc.). The repository provides a guide on using both the Azure OpenAI and OpenAI APIs for each project.

List of projects:

General structure of the projects:

Project-folder
  ├── README.md           <- The top-level README for developers using this project.
  ├── HELPER.md           <- Contains extra information that might be useful to know for executing the project.
  ├── .env                <- dotenv file for local configuration.
  ├── .here               <- Marker for project root.
  ├── configs             <- Holds yml files for project configs
  ├── explore             <- Contains my exploration notebooks and the teaching material for YouTube videos. 
  ├── data                <- Contains the sample data for the project.
  ├── src                 <- Contains the source code(s) for executing the project.
  │   └── utils           <- Contains all the necessary project modules.
  └── images              <- Contains all the images used in the user interface and the README file. 

NOTE: This is the general structure of the projects; however, there might be small changes due to the specific needs of each project.
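The `.here` file in the layout above marks the project root so that code can build paths that do not depend on the current working directory. The following is a minimal, standard-library sketch of that idea. The function names `find_project_root` and `project_path` are hypothetical and are not taken from the repository (the issue further down imports `here` from the `pyprojroot` package for this purpose).

```python
from pathlib import Path


def find_project_root(start: Path, marker: str = ".here") -> Path:
    """Walk upward from `start` until a directory containing `marker` is found."""
    start = start.resolve()
    for candidate in (start, *start.parents):
        if (candidate / marker).exists():
            return candidate
    raise FileNotFoundError(f"No {marker!r} marker found at or above {start}")


def project_path(start: Path, *parts: str) -> Path:
    """Build a path relative to the project root, e.g. project_path(p, "configs")."""
    return find_project_root(start).joinpath(*parts)
```

Called from a module inside `src/utils`, `project_path(Path(__file__).parent, "configs")` would resolve to the `configs` folder of the same project regardless of where the script is launched from.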

Key Notes:

Key Note 1: All the projects use Azure OpenAI. To use the OpenAI API directly, change the credentials and switch the model completion calls.
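One way to keep the Azure/OpenAI switch in a single place is to resolve the provider settings from environment variables before any client is created. The sketch below is an assumption about how this could be organised, not the repository's code: the environment variable names are the ones used in the issue further down, while the function name `resolve_llm_settings`, the returned keys, and the fallback model name `gpt-3.5-turbo` are illustrative.

```python
from typing import Any, Mapping


def resolve_llm_settings(env: Mapping[str, str]) -> dict[str, Any]:
    """Choose the provider from OPENAI_API_TYPE and collect its credentials."""
    if env.get("OPENAI_API_TYPE", "openai").lower() == "azure":
        return {
            "provider": "azure",
            "deployment": env["OPENAI_CHAT_MODEL"],  # Azure deployment name
            "endpoint": env["OPENAI_API_BASE"],
            "api_version": env["OPENAI_API_VERSION"],
            "api_key": env["OPENAI_API_KEY"],
        }
    return {
        "provider": "openai",
        # Assumed default; the README only says the projects use GPT 3.5.
        "model": env.get("OPENAI_CHAT_MODEL", "gpt-3.5-turbo"),
        "api_key": env["OPENAI_API_KEY"],
    }
```

The returned dictionary would then be mapped onto whichever chat-model class the project uses, so switching providers only requires changing the `.env` file.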

Key Note 2: When we interact with databases using LLM agents, informative column names help the agents navigate the database more easily.

Key Note 3: When we interact with databases using LLM agents, remember NOT to connect to the database with WRITE privileges. Grant only READ access and limit the scope. Otherwise, your users can manipulate the data (e.g., ask your chain to delete data).
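For SQLite, read-only access can be enforced at the connection level by opening the file through a URI with `mode=ro`. The sketch below uses only the standard-library `sqlite3` module; the helper names `open_read_only` and `try_delete` are hypothetical, and the same principle (a read-only role or user) would apply to other database engines.

```python
import sqlite3
from pathlib import Path


def open_read_only(db_file: Path) -> sqlite3.Connection:
    """Open an existing SQLite file so that every write attempt fails."""
    uri = f"{db_file.resolve().as_uri()}?mode=ro"
    return sqlite3.connect(uri, uri=True)


def try_delete(conn: sqlite3.Connection, table: str) -> str:
    """Attempt a destructive statement and report what the database did."""
    try:
        conn.execute(f"DELETE FROM {table}")
        return "deleted"
    except sqlite3.OperationalError as exc:
        # SQLite rejects writes on a connection opened with mode=ro.
        return f"blocked: {exc}"
```

With a connection opened this way, `SELECT` statements still work, while a `DELETE` generated by an agent is rejected by the database itself rather than by prompt instructions.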

Key Note 4: Familiarity with data query tools and languages such as Pandas (Python), SQL, and Cypher can enhance the user's ability to ask better questions and have a richer interaction with the graph agent.

Project description:

Q&A-and-RAG-with-SQL-and-TabularData is a chatbot project that uses GPT 3.5, Langchain, SQLite, and ChromaDB to let users interact (perform Q&A and RAG) with SQL databases, CSV, and XLSX files using natural language.

Features:

  • Chat with SQL data.
  • Chat with preprocessed CSV and XLSX data.
  • Chat with CSV and XLSX files uploaded through the user interface during the interaction.
  • RAG with Tabular datasets.

Databases:

  • Diabetes dataset: Link
  • Cancer dataset: Link
  • Chinook SQL database: Link

YouTube video: Link

KnowledgeGraph-Q&A-and-RAG-with-TabularData is a chatbot project that uses a knowledge graph, GPT 3.5, a Langchain graph agent, and the Neo4j graph database to let users interact (perform Q&A and RAG) with tabular databases (CSV, XLSX, etc.) using natural language. This project also demonstrates an approach for constructing the knowledge graph from unstructured data by leveraging LLMs.

Features:

  • Chat with a graphDB created from tabular data.
  • RAG with a graphDB created from tabular data.

Databases:

  • Movie dataset: Link
  • Medical reports dataset: Link

YouTube video: Link


advanced-qa-and-rag-series's Issues

ValueError: An output parsing error occurred. In order to pass this error back to the agent and have it try again, pass `handle_parsing_errors=True` to the AgentExecutor. This is the error: Could not parse tool input: {'arguments': "# First, let's list the tables in the database\nfrom functions import sql_db_list_tables\n\ntables = sql_db_list_tables({})\nprint(tables)", 'name': 'python'} because the `arguments` is not valid JSON.

Hi,
I'm using the code below but I get the following error. Any help would be really appreciated.

Code:
import pandas as pd
from pyprojroot import here
df = pd.read_csv(here(r"D:\Python\csv_data_yani\data\titanic.csv"))
print(df.shape)
print(df.columns.tolist())
display(df.head(3))

from langchain_community.utilities import SQLDatabase
from sqlalchemy import create_engine
db_path = r"D:\Python\csv_data_yani\data\titanic.db"
db_path = f"sqlite:///{db_path}"

engine = create_engine(db_path)

# Write the DataFrame once; a second identical call would fail because the table already exists.
df.to_sql("titanic", engine, index=False, if_exists="replace")

db = SQLDatabase(engine=engine)
print(db.dialect)
print(db.get_usable_table_names())
db.run("SELECT * FROM titanic WHERE Age < 2;")

import os
os.environ["OPENAI_API_TYPE"]="azure"
os.environ["OPENAI_API_VERSION"]="2023-07-01-preview"
os.environ["OPENAI_API_BASE"]="https://xxxxxx.openai.azure.com/"
os.environ["OPENAI_API_KEY"]="xxxxxxxxf" # Your Azure OpenAI resource key
os.environ["OPENAI_CHAT_MODEL"]="xxxxxxxx" # Use name of deployment

from langchain.chat_models import AzureChatOpenAI

model_name = "ai_deployment"
llm = AzureChatOpenAI(
    azure_deployment=model_name,
    model_name=model_name,
    temperature=0.0,
)

from langchain_community.agent_toolkits import create_sql_agent

agent_executor = create_sql_agent(llm, db=db, agent_type="openai-tools", verbose=True, handle_parsing_errors=True)

agent_executor.invoke({"input": "Tell me more about Anders Johan Andersson"})

ERROR:

JSONDecodeError Traceback (most recent call last)
File d:\Python\csv_data_yani\venv\lib\site-packages\langchain\agents\output_parsers\tools.py:43, in parse_ai_message_to_tool_action(message)
42 try:
---> 43 args = json.loads(function["arguments"] or "{}")
44 tool_calls.append(
45 ToolCall(name=function_name, args=args, id=tool_call["id"])
46 )

File ~\AppData\Local\Programs\Python\Python310\lib\json\__init__.py:346, in loads(s, cls, object_hook, parse_float, parse_int, parse_constant, object_pairs_hook, **kw)
343 if (cls is None and object_hook is None and
344 parse_int is None and parse_float is None and
345 parse_constant is None and object_pairs_hook is None and not kw):
--> 346 return _default_decoder.decode(s)
347 if cls is None:

File ~\AppData\Local\Programs\Python\Python310\lib\json\decoder.py:337, in JSONDecoder.decode(self, s, _w)
333 """Return the Python representation of s (a str instance
334 containing a JSON document).
335
336 """
--> 337 obj, end = self.raw_decode(s, idx=_w(s, 0).end())
338 end = _w(s, end).end()

File ~\AppData\Local\Programs\Python\Python310\lib\json\decoder.py:355, in JSONDecoder.raw_decode(self, s, idx)
...
1182 )
1183 text = str(e)
1184 if isinstance(self.handle_parsing_errors, bool):

ValueError: An output parsing error occurred. In order to pass this error back to the agent and have it try again, pass handle_parsing_errors=True to the AgentExecutor. This is the error: Could not parse tool input: {'arguments': "# First, let's list the tables in the database\nfrom functions import sql_db_list_tables\n\ntables = sql_db_list_tables({})\nprint(tables)", 'name': 'python'} because the arguments is not valid JSON.
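The error message shows that the model requested a tool named `python` and passed Python source code as its arguments, whereas the parser in the traceback expects a JSON object. The sketch below reproduces only that parsing step with the standard library, using the payload copied from the error message. The helper name `parse_tool_arguments` is hypothetical; it mirrors the `json.loads(function["arguments"] or "{}")` line shown in the traceback and is not a fix for the agent itself.

```python
import json

# The tool call copied from the error message above.
tool_call = {
    "name": "python",
    "arguments": (
        "# First, let's list the tables in the database\n"
        "from functions import sql_db_list_tables\n\n"
        "tables = sql_db_list_tables({})\n"
        "print(tables)"
    ),
}


def parse_tool_arguments(raw: str) -> dict:
    """Mirror the json.loads step from the traceback: empty input means no arguments."""
    return json.loads(raw or "{}")


try:
    parse_tool_arguments(tool_call["arguments"])
    outcome = "parsed"
except json.JSONDecodeError as exc:
    # Python source is not a JSON document, so decoding fails at the first character.
    outcome = f"JSONDecodeError: {exc.msg}"

print(outcome)
```

A well-formed tool call would carry a JSON object string such as `'{"query": "SELECT 1"}'`, which the same helper parses without error.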
