darianmiller / infero

High-performance, CUDA-powered LLM inference library

License: BSD 3-Clause "New" or "Revised" License

C++ 1.20% C 3.23% Pascal 95.49% Batchfile 0.08%

Infero

Overview

Infero is a streamlined, user-friendly library for running local LLM inference directly from your preferred programming language. It loads LLMs in GGUF format into CPU or GPU memory and uses a CUDA backend to speed up processing.

Installation

  • Download Infero and extract it to your preferred location.

  • Download a GGUF model from Hugging Face and make sure it is compatible with llama.cpp. See MODELS.txt for supported models.

  • Infero uses CUDA for better performance on supported GPUs. Without a CUDA-enabled GPU, computation falls back to the CPU. Make sure the model fits within the memory available on the device it runs on.

  • Consult the installdir\examples directory for demonstrations on integrating Infero with your programming language.

  • Include the following DLLs in your project distribution: cublas64_12.dll, cublasLt64_12.dll, cudart64_12.dll, llama.dll, and Infero.dll.

  • The Infero API can be used from any programming language that supports Win64 and Unicode, with out-of-the-box support for Pascal (Delphi/FreePascal) and C/C++ (C++Builder, Visual Studio 2022).

  • Ship-ready DLLs are included in the repository; if you need to rebuild Infero.dll, Delphi 12.1 is required.

  • This project is developed with RAD Studio 12.1 on Windows 11, on an Intel Core i5-12400F at 2500 MHz with 6 cores (12 logical), 36GB RAM, and an NVIDIA RTX 3060 GPU with 12GB of memory.

  • We encourage testing and welcome pull requests.

  • If you find this project beneficial, please consider starring the repository, sponsoring, or promoting it. Your support is invaluable and highly appreciated.
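
The DLL list in the installation notes lends itself to a pre-flight check before an application is packaged or started. The following is a minimal sketch in plain C, not part of the Infero API: the helper names file_exists and count_missing_dlls are hypothetical, and only the DLL file names come from the list above. It assumes all DLLs sit in a single directory.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* DLL names taken from the installation notes. */
static const char *const kRequiredDlls[] = {
    "cublas64_12.dll", "cublasLt64_12.dll", "cudart64_12.dll",
    "llama.dll", "Infero.dll",
};

/* Hypothetical helper: returns 1 if dir/name can be opened for reading. */
static int file_exists(const char *dir, const char *name)
{
    char path[1024];
    int n = snprintf(path, sizeof path, "%s/%s", dir, name);
    if (n < 0 || (size_t)n >= sizeof path)
        return 0; /* path too long: treat the file as missing */

    FILE *f = fopen(path, "rb");
    if (f == NULL)
        return 0;
    fclose(f);
    return 1;
}

/* Hypothetical helper: reports each required DLL that is absent from dir
   on stderr and returns how many are missing (0 means all are present). */
int count_missing_dlls(const char *dir)
{
    assert(dir != NULL);

    int missing = 0;
    size_t count = sizeof kRequiredDlls / sizeof kRequiredDlls[0];
    for (size_t i = 0; i < count; ++i) {
        if (!file_exists(dir, kRequiredDlls[i])) {
            fprintf(stderr, "missing: %s\n", kRequiredDlls[i]);
            ++missing;
        }
    }
    return missing;
}
```

Calling count_missing_dlls with the application's directory before initializing Infero gives a clearer error than a failed DLL load at startup. The check only confirms that the files can be opened; it does not verify versions or that a CUDA-capable GPU is present.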

Code Examples

Delphi example:

uses
  System.SysUtils,
  Infero;

begin
  // init
  if not Infero_Init('config.json', nil) then
    Exit;
  try
    // add message
    Infero_AddMessage(ROLE_SYSTEM, 'You are a helpful AI assistant');
    Infero_AddMessage(ROLE_USER, 'What is AI?');
    
    // do inference
    if Infero_Inference('phi3', 1024, nil, nil, nil) then
    begin
      // success
    end
    else
    begin
      // error
    end;
  finally
    Infero_Quit();
  end;
end.

C/C++ example:

#include <stddef.h>
#include <Infero.h>

int main(void)
{
    int result = 0;

    // init config
    Infero_InitConfig("config.json", NULL);

    // add message
    Infero_AddMessage(ROLE_SYSTEM, "You are a helpful AI assistant");
    Infero_AddMessage(ROLE_USER, "What is AI?");

    // do inference
    if (Infero_Inference("phi34", 1024, NULL, NULL, NULL))
    {
        // success
    }
    else
    {
        // error
        result = 1;
    }

    // always shut down, even when inference failed
    Infero_Quit();

    return result;
}

Support

Our development motto:

  • We will not release products that are buggy or incomplete, or add new features at the expense of fixing underlying issues.
  • We will strive to fix issues found with our products in a timely manner.
  • We will maintain an attitude of quality over quantity for our products.
  • We will establish a great rapport with users/customers through communication, transparency and respect, always encouraging feedback to help shape the direction of our products.
  • We will be decent, fair, remain humble and committed to the craft.

License

Infero is a community-driven project created by tinyBigGAMES LLC.

BSD-3-Clause license - Core developers:

Acknowledgments

Infero couldn't have been built without the help of wonderful people and great software already available from the community. Thank you!

Software

People

  • John Claw
  • Robert Jalarvo
