
flepied / second-brain-agent

140 stars · 6 forks · 1.33 MB

🧠 Second Brain AI agent

License: GNU General Public License v3.0

Shell 13.23% Python 86.06% Makefile 0.71%
artificial-intelligence chatgpt chatgpt-api chatgpt-bot langchain langchain-python personal-knowledge-management pkm second-brain


second-brain-agent's Issues

add some tests

Create tests that inject each document type and run some simple queries to validate that the documents have been injected.
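
A minimal pytest sketch of such a test; `ingest_document` and `query_vector_store` are hypothetical helper names used only for illustration, not the repo's actual API:

    import pytest

    # Hypothetical helpers standing in for the project's real entry points
    # (e.g. the transform/indexing pipeline); names are illustrative only.
    from second_brain_agent.testing import ingest_document, query_vector_store

    @pytest.mark.parametrize(
        "path",
        ["fixtures/note.md", "fixtures/page.pdf", "fixtures/article.html"],
    )
    def test_document_is_injected(path):
        # Inject one document of each supported type...
        ingest_document(path)
        # ...then run a simple query and check that a chunk coming from
        # that document is returned by the vector store.
        results = query_vector_store("what does this document talk about?", k=3)
        assert any(r.metadata.get("source") == path for r in results)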

Implement semantic splitting

Split the notes using semantic knowledge. A first semantic split could target project- or journal-oriented notes, which use a history format based on dates. Split these parts out of the document into their own documents, with date metadata corresponding to each extracted section.
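
A possible sketch, assuming dated sections are introduced by a heading such as `## 2023-08-22` (the actual note layout may differ):

    import re
    from langchain.docstore.document import Document

    # Assumed, illustrative note layout: journal/project sections introduced
    # by a date heading like "## 2023-08-22".
    DATE_HEADING = re.compile(r"^## (\d{4}-\d{2}-\d{2})\s*$", re.MULTILINE)

    def split_by_date(text, source):
        """Split a note into one Document per dated section."""
        docs = []
        matches = list(DATE_HEADING.finditer(text))
        for i, m in enumerate(matches):
            start = m.end()
            end = matches[i + 1].start() if i + 1 < len(matches) else len(text)
            docs.append(
                Document(
                    page_content=text[start:end].strip(),
                    metadata={"source": source, "date": m.group(1)},
                )
            )
        return docs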

create CI jobs

Create two GitHub Actions pipelines:

  1. run pre-commit to be sure everything is correct
  2. run the tests from #6

Needs: #6

add a language metadata field

Support multiple languages in the data by detecting which language each document uses and storing it in a language metadata field.
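
A possible sketch, assuming the `langdetect` package is an acceptable detector (the project may prefer another library):

    from langdetect import detect
    from langdetect.lang_detect_exception import LangDetectException

    def language_metadata(text):
        """Return the ISO 639-1 language code of a chunk, or "unknown"."""
        try:
            return detect(text)
        except LangDetectException:
            # Very short or symbol-only chunks cannot be classified reliably.
            return "unknown"

    # Example: store the detected language next to the existing metadata.
    metadata = {"source": "note.md", "language": language_metadata("Bonjour tout le monde")}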

Document Amplenote Workflow

If we can identify a workflow that works well with Amplenote, that will unlock this tool for Amplenote users.

requirements.txt

Hello, would it be possible to provide a requirements.txt file to help with setting up the environment? It would greatly assist in configuring it correctly. Thank you!

use ChromaDB in client/server mode

Create a Dockerfile and a docker-compose.yml to start ChromaDB in server mode, using the same ChromaDB version as the agent and the indexer.
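
On the agent/indexer side, a minimal sketch of what connecting to that server could look like; host, port and collection name are assumptions, not values taken from this repo:

    import chromadb

    # Connect to a ChromaDB server started by docker-compose instead of
    # using the embedded, in-process database. Host and port are placeholders.
    client = chromadb.HttpClient(host="localhost", port=8000)

    collection = client.get_or_create_collection("second_brain")
    collection.add(ids=["doc-1"], documents=["hello from the indexer"])
    print(collection.query(query_texts=["hello"], n_results=1))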

no newline in content is causing a backtrace

Backtrace:

août 22 00:15:57 laptop-fred.local sba-txt-service.sh[2698822]: Storing files under /home/fred/.second-brain
août 22 00:15:57 laptop-fred.local sba-txt-service.sh[2698822]: Traceback (most recent call last):
août 22 00:15:57 laptop-fred.local sba-txt-service.sh[2698822]:   File "/home/fred/perso/second-brain-agent/./transform_txt.py", line 106, in <module>
août 22 00:15:57 laptop-fred.local sba-txt-service.sh[2698822]:     main(sys.argv[1], sys.argv[2])
août 22 00:15:57 laptop-fred.local sba-txt-service.sh[2698822]:   File "/home/fred/perso/second-brain-agent/./transform_txt.py", line 101, in main
août 22 00:15:57 laptop-fred.local sba-txt-service.sh[2698822]:     process_file(os.path.join(in_dir, entry.name), out_dir, indexer, splitter)
août 22 00:15:57 laptop-fred.local sba-txt-service.sh[2698822]:   File "/home/fred/perso/second-brain-agent/./transform_txt.py", line 70, in process_file
août 22 00:15:57 laptop-fred.local sba-txt-service.sh[2698822]:     for chunk in splitter.split_text(content):
août 22 00:15:57 laptop-fred.local sba-txt-service.sh[2698822]:                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
août 22 00:15:57 laptop-fred.local sba-txt-service.sh[2698822]:   File "/home/fred/perso/second-brain-agent/.venv/lib/python3.11/site-packages/langchain/text_splitter.py", line 531, in split_text
août 22 00:15:57 laptop-fred.local sba-txt-service.sh[2698822]:     return split_text_on_tokens(text=text, tokenizer=tokenizer)
août 22 00:15:57 laptop-fred.local sba-txt-service.sh[2698822]:            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
août 22 00:15:57 laptop-fred.local sba-txt-service.sh[2698822]:   File "/home/fred/perso/second-brain-agent/.venv/lib/python3.11/site-packages/langchain/text_splitter.py", line 474, in split_text_on_tokens
août 22 00:15:57 laptop-fred.local sba-txt-service.sh[2698822]:     input_ids = tokenizer.encode(text)
août 22 00:15:57 laptop-fred.local sba-txt-service.sh[2698822]:                 ^^^^^^^^^^^^^^^^^^^^^^
août 22 00:15:57 laptop-fred.local sba-txt-service.sh[2698822]:   File "/home/fred/perso/second-brain-agent/.venv/lib/python3.11/site-packages/langchain/text_splitter.py", line 518, in _encode
août 22 00:15:57 laptop-fred.local sba-txt-service.sh[2698822]:     return self._tokenizer.encode(
août 22 00:15:57 laptop-fred.local sba-txt-service.sh[2698822]:            ^^^^^^^^^^^^^^^^^^^^^^^
août 22 00:15:57 laptop-fred.local sba-txt-service.sh[2698822]:   File "/home/fred/perso/second-brain-agent/.venv/lib64/python3.11/site-packages/tiktoken/core.py", line 117, in encode
août 22 00:15:57 laptop-fred.local sba-txt-service.sh[2698822]:     raise_disallowed_special_token(match.group())
août 22 00:15:57 laptop-fred.local sba-txt-service.sh[2698822]:   File "/home/fred/perso/second-brain-agent/.venv/lib64/python3.11/site-packages/tiktoken/core.py", line 351, in raise_disallowed_special_token
août 22 00:15:57 laptop-fred.local sba-txt-service.sh[2698822]:     raise ValueError(
août 22 00:15:57 laptop-fred.local sba-txt-service.sh[2698822]: ValueError: Encountered text corresponding to disallowed special token '<|endoftext|>'.
août 22 00:15:57 laptop-fred.local sba-txt-service.sh[2698822]: If you want this text to be encoded as a special token, pass it to `allowed_special`, e.g. `allowed_special={'<|endoftext|>', ...}`.
août 22 00:15:57 laptop-fred.local sba-txt-service.sh[2698822]: If you want this text to be encoded as normal text, disable the check for this token by passing `disallowed_special=(enc.special_tokens_set - {'<|endoftext|>'})`.
août 22 00:15:57 laptop-fred.local sba-txt-service.sh[2698822]: To disable this check for all special tokens, pass `disallowed_special=()`.
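
The error comes from tiktoken refusing to encode text that contains the special token `<|endoftext|>`. A possible workaround, assuming the splitter in use is langchain's TokenTextSplitter (a sketch, not a confirmed fix for this repo), is to relax the special-token check when building the splitter:

    from langchain.text_splitter import TokenTextSplitter

    # Treat "<|endoftext|>" (and other special tokens) as ordinary text
    # instead of raising ValueError when it shows up in a note.
    splitter = TokenTextSplitter(
        chunk_size=500,
        chunk_overlap=50,
        disallowed_special=(),
    )

    chunks = splitter.split_text("a note containing <|endoftext|> in its body")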

add support for video

When a YouTube video has no transcript, or for any other video without one, use a service to generate it.
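
One possible approach, assuming the audio has already been downloaded (for instance with yt-dlp) and that the open source openai-whisper package counts as an acceptable transcription service:

    import whisper

    # Load a small Whisper model and transcribe the downloaded audio track.
    # "audio.m4a" is a placeholder for whatever the downloader produced.
    model = whisper.load_model("base")
    result = model.transcribe("audio.m4a")

    transcript = result["text"]
    print(transcript[:200])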
