flepied / second-brain-agent

🧠 Second Brain AI agent

License: GNU General Public License v3.0

Shell 13.23% Python 86.06% Makefile 0.71%

artificial-intelligence chatgpt chatgpt-api chatgpt-bot langchain langchain-python personal-knowledge-management pkm second-brain


second-brain-agent's Issues

update to streamlit 1.25.0 or later

check if content has been modified before re-indexing

This will support export workflows from editors such as https://www.amplenote.com/, which rewrite files on every export, and it will also avoid re-indexing content that has not changed.
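One possible way to detect unchanged content is to compare a hash of the file bytes with the hash recorded at the previous indexing run. The sketch below is only an illustration: the function names and the JSON state file are assumptions, not part of the project.

```python
import hashlib
import json
from pathlib import Path


def content_digest(path: Path) -> str:
    """Return the SHA-256 hex digest of the file's bytes."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def needs_reindex(path: Path, state_file: Path) -> bool:
    """Return True if the file content differs from the digest recorded last time.

    The new digest is recorded as a side effect, so calling this again on an
    unchanged file returns False. `state_file` is a hypothetical JSON file
    mapping file paths to digests.
    """
    state = json.loads(state_file.read_text()) if state_file.exists() else {}
    digest = content_digest(path)
    if state.get(str(path)) == digest:
        return False
    state[str(path)] = digest
    state_file.write_text(json.dumps(state, indent=2))
    return True
```

Hashing the content rather than comparing modification times is what makes this robust to export workflows: a file that is rewritten with identical bytes gets a new mtime but the same digest.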

store dates about documents

Store the creation date and the last-updated date in the metadata.
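A minimal sketch of how such dates could be read from the filesystem, assuming the metadata is a plain dict. The key `updated_at` and the function name are hypothetical; `created_at` follows the naming used in the front-matter issue further down.

```python
import os
from datetime import datetime, timezone


def date_metadata(path: str) -> dict:
    """Return ISO 8601 creation and update dates (UTC) for a file."""
    st = os.stat(path)
    # st_birthtime only exists on some platforms (e.g. macOS, BSD).
    # Where it is missing, this sketch falls back to the modification time,
    # which is an approximation rather than a true creation date.
    created = getattr(st, "st_birthtime", st.st_mtime)

    def iso(ts: float) -> str:
        return datetime.fromtimestamp(ts, tz=timezone.utc).isoformat()

    return {"created_at": iso(created), "updated_at": iso(st.st_mtime)}
```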

Add links to the context documents when answering

The answer should display a list of sources (the url from the metadata of each document used as context).
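As a rough illustration, the source list could be built by collecting the url metadata of the context documents and removing duplicates. Documents are modeled here as plain dicts with a `metadata` key, which is an assumption made to keep the sketch self-contained; the function names are hypothetical too.

```python
def list_sources(documents: list) -> list:
    """Return the distinct urls of the context documents, in first-seen order."""
    urls = []
    for doc in documents:
        url = doc.get("metadata", {}).get("url")
        if url and url not in urls:
            urls.append(url)
    return urls


def answer_with_sources(answer: str, documents: list) -> str:
    """Append a 'Sources:' section to the answer when any url is available."""
    urls = list_sources(documents)
    if not urls:
        return answer
    return "\n".join([answer, "", "Sources:"] + [f"- {url}" for url in urls])
```

Several chunks of the same note usually share one url, so de-duplicating keeps the list short.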

add some tests

create tests that inject each document type and do some simple queries to validate they have been injected.

Display the question from the user

This was lost in 0.2.0.

Enhance the way documents are used in the dialogs

Make the agent able to discuss documents: find documents in the knowledge base and act on them. Actions could include answering questions about them, summarizing them, or comparing them.

use the metadata from the markdown file in the extracted documents.

Obsidian supports Markdown metadata (front matter), written this way:

---
date: 2023/06/18 10:06
---

Use these fields in the metadata of the vector store. The date field will be transformed into created_at.
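The transformation described above could look roughly like this. The sketch only handles flat `key: value` lines, which is enough for the example shown; real front matter is YAML and would need a YAML parser. The function names and the ISO output format are assumptions.

```python
from datetime import datetime


def parse_front_matter(text: str):
    """Split a note into (front_matter_dict, body).

    Only flat `key: value` lines between two `---` markers are understood.
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}, text
    end = next((i for i in range(1, len(lines)) if lines[i].strip() == "---"), None)
    if end is None:
        return {}, text  # no closing marker: treat the whole text as body
    front = {}
    for line in lines[1:end]:
        key, sep, value = line.partition(":")  # split on the first colon only
        if sep:
            front[key.strip()] = value.strip()
    return front, "\n".join(lines[end + 1:])


def to_vector_metadata(front: dict) -> dict:
    """Copy the front-matter fields, renaming `date` to `created_at`."""
    meta = dict(front)
    if "date" in meta:
        parsed = datetime.strptime(meta.pop("date"), "%Y/%m/%d %H:%M")
        meta["created_at"] = parsed.isoformat()
    return meta
```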

Split the notes using semantic knowledge. A first semantic split could target project-oriented or journal-oriented notes, which use a history format based on dates. Split these parts of the doc into their own documents, with the date metadata corresponding to the extracted section.
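A date-based split could be sketched as follows. The issue does not say how dated sections are marked, so this example assumes Markdown headings that start with a date such as `## 2023-06-18` or `## 2023/06/18`; that format, the function name, and the output shape are all hypothetical.

```python
import re

# Assumed section marker: a Markdown heading that starts with a date.
DATE_HEADING = re.compile(
    r"^#+[ \t]*(\d{4})[-/](\d{2})[-/](\d{2})\b.*$", re.MULTILINE
)


def split_by_date(text: str) -> list:
    """Return one dict per dated section, with its date as `created_at`.

    Text that appears before the first dated heading is ignored here.
    """
    matches = list(DATE_HEADING.finditer(text))
    sections = []
    for i, match in enumerate(matches):
        end = matches[i + 1].start() if i + 1 < len(matches) else len(text)
        sections.append(
            {
                "created_at": "-".join(match.groups()),
                "content": text[match.end():end].strip(),
            }
        )
    return sections
```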

create CI jobs

Create two GitHub Actions pipelines:

run pre-commit to check that everything is correct
run the tests from #6

Needs: #6

août 22 00:15:57 laptop-fred.local sba-txt-service.sh[2698822]: Storing files under /home/fred/.second-brain
août 22 00:15:57 laptop-fred.local sba-txt-service.sh[2698822]: Traceback (most recent call last):
août 22 00:15:57 laptop-fred.local sba-txt-service.sh[2698822]:   File "/home/fred/perso/second-brain-agent/./transform_txt.py", line 106, in <module>
août 22 00:15:57 laptop-fred.local sba-txt-service.sh[2698822]:     main(sys.argv[1], sys.argv[2])
août 22 00:15:57 laptop-fred.local sba-txt-service.sh[2698822]:   File "/home/fred/perso/second-brain-agent/./transform_txt.py", line 101, in main
août 22 00:15:57 laptop-fred.local sba-txt-service.sh[2698822]:     process_file(os.path.join(in_dir, entry.name), out_dir, indexer, splitter)
août 22 00:15:57 laptop-fred.local sba-txt-service.sh[2698822]:   File "/home/fred/perso/second-brain-agent/./transform_txt.py", line 70, in process_file
août 22 00:15:57 laptop-fred.local sba-txt-service.sh[2698822]:     for chunk in splitter.split_text(content):
août 22 00:15:57 laptop-fred.local sba-txt-service.sh[2698822]:                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
août 22 00:15:57 laptop-fred.local sba-txt-service.sh[2698822]:   File "/home/fred/perso/second-brain-agent/.venv/lib/python3.11/site-packages/langchain/text_splitter.py", line 531, in split_text
août 22 00:15:57 laptop-fred.local sba-txt-service.sh[2698822]:     return split_text_on_tokens(text=text, tokenizer=tokenizer)
août 22 00:15:57 laptop-fred.local sba-txt-service.sh[2698822]:            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
août 22 00:15:57 laptop-fred.local sba-txt-service.sh[2698822]:   File "/home/fred/perso/second-brain-agent/.venv/lib/python3.11/site-packages/langchain/text_splitter.py", line 474, in split_text_on_tokens
août 22 00:15:57 laptop-fred.local sba-txt-service.sh[2698822]:     input_ids = tokenizer.encode(text)
août 22 00:15:57 laptop-fred.local sba-txt-service.sh[2698822]:                 ^^^^^^^^^^^^^^^^^^^^^^
août 22 00:15:57 laptop-fred.local sba-txt-service.sh[2698822]:   File "/home/fred/perso/second-brain-agent/.venv/lib/python3.11/site-packages/langchain/text_splitter.py", line 518, in _encode
août 22 00:15:57 laptop-fred.local sba-txt-service.sh[2698822]:     return self._tokenizer.encode(
août 22 00:15:57 laptop-fred.local sba-txt-service.sh[2698822]:            ^^^^^^^^^^^^^^^^^^^^^^^
août 22 00:15:57 laptop-fred.local sba-txt-service.sh[2698822]:   File "/home/fred/perso/second-brain-agent/.venv/lib64/python3.11/site-packages/tiktoken/core.py", line 117, in encode
août 22 00:15:57 laptop-fred.local sba-txt-service.sh[2698822]:     raise_disallowed_special_token(match.group())
août 22 00:15:57 laptop-fred.local sba-txt-service.sh[2698822]:   File "/home/fred/perso/second-brain-agent/.venv/lib64/python3.11/site-packages/tiktoken/core.py", line 351, in raise_disallowed_special_token
août 22 00:15:57 laptop-fred.local sba-txt-service.sh[2698822]:     raise ValueError(
août 22 00:15:57 laptop-fred.local sba-txt-service.sh[2698822]: ValueError: Encountered text corresponding to disallowed special token '<|endoftext|>'.
août 22 00:15:57 laptop-fred.local sba-txt-service.sh[2698822]: If you want this text to be encoded as a special token, pass it to `allowed_special`, e.g. `allowed_special={'<|endoftext|>', ...}`.
août 22 00:15:57 laptop-fred.local sba-txt-service.sh[2698822]: If you want this text to be encoded as normal text, disable the check for this token by passing `disallowed_special=(enc.special_tokens_set - {'<|endoftext|>'})`.
août 22 00:15:57 laptop-fred.local sba-txt-service.sh[2698822]: To disable this check for all special tokens, pass `disallowed_special=()`.
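The error message itself points at a possible fix: when notes may contain the literal string `<|endoftext|>`, the text can be encoded as ordinary text by passing `disallowed_special=()`. The helper below is a minimal sketch of that idea; its name is hypothetical, and `encoding` is expected to be a tiktoken `Encoding` object or anything with a compatible `encode` method.

```python
def encode_as_plain_text(encoding, text: str) -> list:
    """Encode text so that strings like '<|endoftext|>' are tokenized as normal text."""
    # An empty `disallowed_special` disables the special-token check,
    # as suggested by the last line of the error message.
    return encoding.encode(text, disallowed_special=())


# Usage with tiktoken (assumes the package is installed; not executed here):
#   import tiktoken
#   enc = tiktoken.get_encoding("cl100k_base")
#   tokens = encode_as_plain_text(enc, "a note mentioning <|endoftext|>")
```

Since the failing call goes through LangChain's token splitter, the same option would have to be set there rather than on tiktoken directly. LangChain releases of that period accept a `disallowed_special` argument on `TokenTextSplitter`, but this should be verified against the installed version.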
