haruiz / geminiplayground

License: MIT License

Python 30.06% HTML 22.09% JavaScript 31.35% Shell 0.03% CSS 16.38% PowerShell 0.09%

geminiplayground's Introduction

Gemini Playground

Gemini Playground provides a Python interface and a user interface to interact with different Gemini model variants. With Gemini Playground, you can:

  • Engage in conversation with your data either through a simple code API or using the UI: Upload images and videos and generate responses based on your prompts.
  • Chat with your codebase as you do with images, PDFs and audio files: Ask Gemini to analyze your code, explain its functionality, suggest improvements or even write documentation for it.
  • Explore multimodal capabilities: Combine different data types in your prompts, like asking Gemini to describe what's happening in a video and an image simultaneously.

Features

  • Intuitive API: The GeminiClient class offers a simple and easy-to-use interface for interacting with the Gemini API.
  • Multimodal Support: Upload and use text, images, videos, and code in your prompts.
  • File Management: Upload, list, and remove files from your Gemini storage.
  • Token Counting: Estimate the number of tokens required for a prompt and response.
  • Response Generation: Generate responses from Gemini based on your prompts and uploaded content.
  • Rich Logging: Get informative and colorful logging messages for better understanding of the process.

You can find usage examples in the examples directory.

Installation

pip install --index-url https://test.pypi.org/simple/ --extra-index-url https://pypi.org/simple/ geminiplayground

Usage

  1. Set up your API key:

    • Obtain an API key from Google AI Studio.
    • Set the AISTUDIO_API_KEY environment variable with your API key.
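
Before constructing the client, a script can fail fast when the key is missing. The following is a minimal sketch using only the standard library. The helper name `require_api_key` is hypothetical and not part of geminiplayground; the only thing it takes from this README is the AISTUDIO_API_KEY variable name.

```python
import os


def require_api_key(env=None, name="AISTUDIO_API_KEY"):
    """Return the API key from the environment, or raise a clear error.

    Hypothetical helper for illustration; not part of geminiplayground.
    """
    # Allow a plain dict to be passed in so the check is easy to test.
    env = os.environ if env is None else env
    key = (env.get(name) or "").strip()
    if not key:
        raise RuntimeError(
            f"{name} is not set; export it or add it to a .env file first."
        )
    return key
```

Calling `require_api_key()` at the top of a script surfaces a missing key as one readable error instead of a failure deeper in the request path.
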
  2. Create a GeminiClient instance:

from geminiplayground.core import GeminiClient
from geminiplayground.parts import VideoFile, ImageFile

gemini_client = GeminiClient()
  3. Define your files:
video_file_path = "./data/BigBuckBunny_320x180.mp4"
video_file = VideoFile(video_file_path, gemini_client=gemini_client)

image_file_path = "https://upload.wikimedia.org/wikipedia/commons/4/47/PNG_transparency_demonstration_1.png"
image_file = ImageFile(image_file_path, gemini_client=gemini_client)
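
The two files above come from different kinds of sources: a local path and an HTTP URL. A small pre-check, sketched below with the standard library, can catch a mistyped local path before any upload is attempted. `describe_source` is a hypothetical helper and makes no assumption about how VideoFile or ImageFile handle their input internally.

```python
from pathlib import Path
from urllib.parse import urlparse


def describe_source(source):
    """Classify a file source as 'url', 'local', or 'missing'.

    Hypothetical helper for illustration; not part of geminiplayground.
    """
    scheme = urlparse(str(source)).scheme.lower()
    if scheme in ("http", "https"):
        return "url"
    # Anything else is treated as a filesystem path.
    return "local" if Path(source).is_file() else "missing"
```
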
  4. Create a prompt:
multimodal_prompt = [
    "See this video",
    video_file,
    "and this image",
    image_file,
    "Explain what you see."
]
  5. Generate a response:
response = gemini_client.generate_response("models/gemini-1.5-pro-latest", multimodal_prompt,
                                           generation_config={"temperature": 0.0, "top_p": 1.0})
# Print the response
for candidate in response.candidates:
    for part in candidate.content.parts:
        if part.text:
            print(part.text)
Example output:

The video is a short animated film called "Big Buck Bunny." It is a comedy about a large, white rabbit
who is bullied by three smaller animals. The rabbit eventually gets revenge on his tormentors. The film
was created using Blender, a free and open-source 3D animation software.

The image is of four dice, each a different color. The dice are transparent and have white dots. The 
image is isolated on a black background.
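
When the text is needed as a single string rather than printed part by part, the nested loop can be wrapped in a helper. This is a sketch under the assumption that the response exposes a candidates, content, parts, text hierarchy; `collect_text` and the stand-in `fake_response` object are hypothetical and exist only so the example runs without an API call.

```python
from types import SimpleNamespace


def collect_text(response):
    """Join the text of every part in every candidate into one string.

    Assumes response.candidates[i].content.parts[j].text; parts without
    text are skipped.
    """
    texts = []
    for candidate in response.candidates:
        for part in candidate.content.parts:
            text = getattr(part, "text", "")
            if text:
                texts.append(text)
    return "\n".join(texts)


# Stand-in object with the assumed shape (not a real API response).
fake_response = SimpleNamespace(
    candidates=[
        SimpleNamespace(
            content=SimpleNamespace(
                parts=[
                    SimpleNamespace(text="The video is a short animated film."),
                    SimpleNamespace(text=""),
                ]
            )
        )
    ]
)
summary = collect_text(fake_response)
```
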
  6. You can also chat with your data:

Chat with your codebase:

from rich import print

from geminiplayground.core import GeminiClient
from geminiplayground.parts import GitRepo
from dotenv import load_dotenv, find_dotenv

load_dotenv(find_dotenv())


def chat_with_your_code():
    """
    Get the content parts of a GitHub repo and generate a request.
    """
    repo = GitRepo.from_url(
        "https://github.com/karpathy/ng-video-lecture",
        branch="master",
        config={
            "content": "code-files",  # "code-files" or "issues"
            "file_extensions": [".py"],
        },
    )
    prompt = [
        "use this codebase:",
        repo,
        "Describe the `bigram.py` file, and generate some code snippets",
    ]
    model = "models/gemini-1.5-pro-latest"
    gemini_client = GeminiClient()
    tokens_count = gemini_client.count_tokens(model, prompt)
    print("Tokens count: ", tokens_count)
    response = gemini_client.generate_response(model, prompt, stream=True)

    # Print the response chunk by chunk as it streams in
    for message_chunk in response:
        if message_chunk.parts:
            print(message_chunk.text)


if __name__ == "__main__":
    chat_with_your_code()
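
The streaming loop above prints each chunk as it arrives. To keep the full answer for later use, the same loop can accumulate the chunks instead. This is a minimal sketch that assumes only what the loop above already relies on, namely that each chunk has `parts` and `text` attributes; `join_stream` and the `fake_stream` stand-in are hypothetical.

```python
from types import SimpleNamespace


def join_stream(chunks):
    """Concatenate the text of streamed chunks, skipping chunks with no parts."""
    pieces = []
    for chunk in chunks:
        if chunk.parts:
            pieces.append(chunk.text)
    return "".join(pieces)


# Stand-in chunks with the assumed attributes (not real API objects).
fake_stream = [
    SimpleNamespace(parts=["part"], text="The file defines "),
    SimpleNamespace(parts=[], text=""),
    SimpleNamespace(parts=["part"], text="a bigram model."),
]
full_text = join_stream(fake_stream)
```
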

Chat with your videos:

from rich import print

from geminiplayground.core import GeminiClient
from geminiplayground.parts import VideoFile
from dotenv import load_dotenv, find_dotenv

load_dotenv(find_dotenv())


def chat_with_your_video():
    """
    Get the content parts of a video and generate a request.
    """
    gemini_client = GeminiClient()
    model_name = "models/gemini-1.5-pro-latest"

    video_file_path = "./../data/transformers-explained.mp4"
    video_file = VideoFile(video_file_path, gemini_client=gemini_client)
    keyframes = video_file.extract_keyframes()
    print(keyframes)

    prompt = [
        "Describe the content of the video",
        video_file,
        "what is the video about?",
    ]
    tokens_count = gemini_client.count_tokens(model_name, prompt)
    print("Tokens count: ", tokens_count)
    response = gemini_client.generate_response(model_name, prompt, stream=True)
    # Print the response chunk by chunk as it streams in
    for message_chunk in response:
        if message_chunk.parts:
            print(message_chunk.text)


if __name__ == "__main__":
    chat_with_your_video()

Chat with your images:

from rich import print

from geminiplayground.core import GeminiClient
from geminiplayground.parts import ImageFile
from dotenv import load_dotenv, find_dotenv

load_dotenv(find_dotenv())


def chat_with_your_images():
    """
    Get the content parts of an image and generate a request.
    """
    gemini_client = GeminiClient()

    image_file_path = "https://upload.wikimedia.org/wikipedia/commons/4/47/PNG_transparency_demonstration_1.png"
    image_file = ImageFile(image_file_path, gemini_client=gemini_client)
    prompt = ["what do you see in this image?", image_file]
    model_name = "models/gemini-1.5-pro-latest"
    tokens_count = gemini_client.count_tokens(model_name, prompt)
    print(f"Tokens count: {tokens_count}")
    response = gemini_client.generate_response(model_name, prompt, stream=True)
    # Print the response chunk by chunk as it streams in
    for message_chunk in response:
        if message_chunk.parts:
            print(message_chunk.text)


if __name__ == "__main__":
    chat_with_your_images()
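
The examples print the token count before generating. A caller could also use that number to refuse oversized prompts. The sketch below assumes the count can be converted to a plain integer, which this README does not confirm for the return value of count_tokens; `check_token_budget` and the limit passed to it are hypothetical and chosen by the caller.

```python
def check_token_budget(token_count, limit):
    """Return the remaining headroom, or raise if the prompt is too large.

    Hypothetical helper; `limit` is a caller-chosen budget, not a value
    taken from the library.
    """
    count = int(token_count)
    if count > limit:
        raise ValueError(f"Prompt uses {count} tokens, over the limit of {limit}.")
    return limit - count


headroom = check_token_budget(258, limit=1000)
```
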

Chat with your PDFs:

from rich import print

from geminiplayground.core import GeminiClient
from geminiplayground.parts import PdfFile
from dotenv import load_dotenv, find_dotenv

load_dotenv(find_dotenv())


def chat_with_your_pdf():
    """
    Get the content parts of a PDF file and generate a request.
    """
    gemini_client = GeminiClient()
    pdf_file_path = "https://www.tnstate.edu/faculty/fyao/COMP3050/Py-tutorial.pdf"
    pdf_file = PdfFile(pdf_file_path, gemini_client=gemini_client)

    prompt = ["Please create a summary of the pdf file:", pdf_file]
    model_name = "models/gemini-1.5-pro-latest"
    tokens_count = gemini_client.count_tokens(model_name, prompt)
    print(f"Tokens count: {tokens_count}")
    response = gemini_client.generate_response(model_name, prompt, stream=True)
    # Print the response chunk by chunk as it streams in
    for message_chunk in response:
        if message_chunk.parts:
            print(message_chunk.text)


if __name__ == "__main__":
    chat_with_your_pdf()

Function calling in chat:

from dotenv import load_dotenv, find_dotenv

from geminiplayground.core import GeminiPlayground, ToolCall

load_dotenv(find_dotenv())

if __name__ == "__main__":
    playground = GeminiPlayground(
        model="models/gemini-1.5-flash-latest"
    )


    @playground.tool
    def subtract(a: int, b: int) -> int:
        """This function only subtracts two numbers"""
        return a - b


    @playground.tool
    def write_poem() -> str:
        """write a poem"""
        return "Roses are red, violets are blue, sugar is sweet, and so are you."


    chat = playground.start_chat(history=[])
    while True:
        user_input = input("You: ")
        if user_input == "exit":
            print(chat.history)
            break
        try:
            model_response = chat.send_message(user_input, stream=True)
            for response_chunk in model_response:
                if isinstance(response_chunk, ToolCall):
                    print(
                        f"Tool: {response_chunk.tool_name}, "
                        f"Result: {response_chunk.tool_result}"
                    )
                    continue
                print(response_chunk.text, end="")
            print()
        except Exception as e:
            print("Something went wrong: ", e)
            break

This is a basic example. Explore the codebase and documentation for more advanced functionalities and examples.
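
To make the decorator pattern in the function-calling example more concrete, here is a toy registry showing one way a `tool` decorator could record a function's name, docstring, and type hints. It is an illustration only and is not how geminiplayground implements `playground.tool`; `ToolRegistry` and its methods are hypothetical names.

```python
import inspect


class ToolRegistry:
    """Toy registry illustrating a decorator-based tool API (illustration only)."""

    def __init__(self):
        self.tools = {}

    def tool(self, func):
        """Register func under its name and return it unchanged."""
        self.tools[func.__name__] = {
            "callable": func,
            "description": inspect.getdoc(func) or "",
            # Map each parameter name to its annotation for later schema building.
            "parameters": {
                name: param.annotation
                for name, param in inspect.signature(func).parameters.items()
            },
        }
        return func

    def call(self, name, **kwargs):
        """Invoke a registered tool by name with keyword arguments."""
        return self.tools[name]["callable"](**kwargs)


registry = ToolRegistry()


@registry.tool
def subtract(a: int, b: int) -> int:
    """This function only subtracts two numbers"""
    return a - b


result = registry.call("subtract", a=7, b=2)
```

The docstring and annotations matter in this kind of design because they are the only description of the tool that a model could be given when deciding whether to call it.
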

GUI

You can also use the GUI to interact with Gemini. Remember to set the AISTUDIO_API_KEY environment variable with your API key. You can do so globally, pass it as an argument to the command, or create a .env file in the root of your project and set the AISTUDIO_API_KEY variable there.
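
For the .env option, the file only needs one line of the form AISTUDIO_API_KEY=value. The sketch below writes such a file with the standard library; `write_env_file` is a hypothetical helper, and YOUR_API_KEY is a placeholder rather than a real key.

```python
from pathlib import Path


def write_env_file(directory, api_key, name="AISTUDIO_API_KEY"):
    """Write a one-line .env file into `directory` and return its path.

    Hypothetical helper for illustration; not part of geminiplayground.
    """
    env_path = Path(directory) / ".env"
    env_path.write_text(f"{name}={api_key}\n", encoding="utf-8")
    return env_path
```

For example, `write_env_file(".", "YOUR_API_KEY")` creates the file in the current directory. Keep the resulting .env file out of version control.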

For running the GUI, use the following command:

geminiplayground ui

or

geminiplayground ui --api-key YOUR_API_KEY

This will start a local server and open the GUI in your default browser.

To access the uploaded files from the UI, just type @. It will open a popup list where you can select the file you want.

Contributing

Contributions are welcome! Please see the CONTRIBUTING.md file for guidelines (coming soon).

License

This codebase is licensed under the MIT License. See the LICENSE file for details.

geminiplayground's People

Contributors

haruiz, jggomez

geminiplayground's Issues

Image and Video upload

I have uploaded my file(s) in the "my data" section of the UI, but I don't know how to attach them to a message, as in the example you showed. Please provide clear instructions.
Also, for convenience, please add drag-and-drop file upload to the chat, or an attachment button in the chat message box.

windows-support

Test Gemini Playground on Windows:

  1. Validate if the UI works.
  2. Check and test the API.
