flepied / second-brain-agent
Second Brain AI agent
License: GNU General Public License v3.0
This will support export workflows from editors such as https://www.amplenote.com/ and will also prevent re-indexing content that did not change.
Store the creation date and the last-updated date in the metadata.
Display a list of sources (the URL from the metadata of each document used as context).
Create tests that inject each document type and run some simple queries to validate that the documents have been injected.
This was lost in 0.2.0.
Make the agent able to converse about documents: find documents in the knowledge base and act on them. Actions could be answering questions, summarizing, or comparing.
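One way to avoid re-indexing unchanged files is to compare a content hash against the one recorded on the previous run. The sketch below is only an illustration under assumptions: the function names and the JSON manifest file are hypothetical and are not taken from the project.

```python
import hashlib
import json
from pathlib import Path


def file_digest(path: Path) -> str:
    """Return the SHA-256 hex digest of a file's content."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def changed_files(in_dir: Path, manifest_path: Path) -> list[Path]:
    """List files under in_dir whose content differs from the recorded digest.

    The manifest (a hypothetical JSON file mapping path to digest) is
    rewritten on every call. Keep it outside in_dir so it is not scanned.
    """
    old = json.loads(manifest_path.read_text()) if manifest_path.exists() else {}
    new = {}
    changed = []
    for path in sorted(p for p in in_dir.rglob("*") if p.is_file()):
        digest = file_digest(path)
        new[str(path)] = digest
        # New files have no recorded digest, so they count as changed too.
        if old.get(str(path)) != digest:
            changed.append(path)
    manifest_path.write_text(json.dumps(new, indent=2))
    return changed
```

Only the files returned by `changed_files` would then need to be passed to the indexer. Hashing content rather than comparing modification times keeps the check stable when an editor re-exports files without changing them.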
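As a rough illustration of the source list, the sketch below collects the distinct URLs from the documents used as context. It assumes each document is a plain dict with a `metadata` dict holding a `url` key; the actual document type in the project may differ.

```python
def list_sources(documents: list[dict]) -> list[str]:
    """Return the distinct URLs of the context documents, in first-seen order."""
    seen = set()
    sources = []
    for doc in documents:
        # Assumed shape: {"metadata": {"url": "..."}}; documents without a URL are skipped.
        url = doc.get("metadata", {}).get("url")
        if url and url not in seen:
            seen.add(url)
            sources.append(url)
    return sources
```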
Obsidian supports Markdown metadata in this form:
---
date: 2023/06/18 10:06
---
Use these fields in the metadata of the vector store. The date field will be transformed into created_at.
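A minimal sketch of this mapping follows, using only the standard library. It handles flat `key: value` lines rather than full YAML, and it stores `created_at` as an ISO 8601 string; both choices are assumptions made for illustration, as are the function names.

```python
from datetime import datetime


def parse_front_matter(text: str) -> tuple[dict, str]:
    """Split a note into (metadata, body).

    Only flat `key: value` lines are handled; this is not a full YAML parser.
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}, text
    metadata = {}
    for index, line in enumerate(lines[1:], start=1):
        if line.strip() == "---":
            body = "\n".join(lines[index + 1:])
            return metadata, body
        # Split on the first colon only, so "10:06" stays inside the value.
        key, sep, value = line.partition(":")
        if sep:
            metadata[key.strip()] = value.strip()
    return {}, text  # no closing delimiter: treat the whole text as body


def to_store_metadata(front_matter: dict) -> dict:
    """Rename `date` to `created_at` (ISO 8601 string, an assumed storage format)."""
    metadata = dict(front_matter)
    if "date" in metadata:
        created = datetime.strptime(metadata.pop("date"), "%Y/%m/%d %H:%M")
        metadata["created_at"] = created.isoformat()
    return metadata
```

For the example note above, `to_store_metadata(parse_front_matter(note)[0])` would yield a `created_at` entry derived from `2023/06/18 10:06`.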
Split the notes using semantic knowledge. A first semantic split could target project- or journal-oriented notes, which use a history format based on dates. Split these parts of the note into their own documents, with date metadata corresponding to the extracted section.
Support multiple languages in your data by detecting which language is used and storing it as the language metadata.
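The date-based split could look roughly like the sketch below. It assumes that each dated section starts with a Markdown heading such as `## 2023-06-18` or `## 2023/06/18`; the real notes may use a different layout, so the pattern and the function name are hypothetical.

```python
import re

# Assumed heading shape: "## 2023-06-18" or "## 2023/06/18" (any heading level).
DATE_HEADING = re.compile(r"^#+\s+(\d{4})[-/](\d{2})[-/](\d{2})\s*$")


def split_by_date(text: str) -> list[dict]:
    """Split a note into one chunk per dated section.

    Text before the first dated heading is returned with date None.
    Dates are normalized to YYYY-MM-DD.
    """
    chunks = []
    current_date = None
    current_lines = []

    def flush():
        # Emit the pending section unless it is empty or whitespace only.
        if any(line.strip() for line in current_lines):
            chunks.append({"date": current_date, "text": "\n".join(current_lines).strip()})

    for line in text.splitlines():
        match = DATE_HEADING.match(line)
        if match:
            flush()
            current_date = "-".join(match.groups())
            current_lines = []
        else:
            current_lines.append(line)
    flush()
    return chunks
```

Each returned chunk can then be indexed as its own document, with its `date` value stored in the metadata.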
If we can identify a workflow that works well with Amplenote, that will unlock this tool for Amplenote users.
Hello, I'd like to open an issue and was wondering if it would be possible to provide the 'requirements.txt' file to help with setting up the environment. This would greatly assist in configuring the environment correctly. Thank you!
Improve the scoring of the document retriever by using the updated dates of the document as a parameter for similarity.
Needs #8
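One possible way to fold the updated date into the score is to blend the similarity with an exponentially decaying recency term. This is only a sketch of one option, not the project's scoring: the function name, the half-life, and the blending weight are all hypothetical values chosen for illustration.

```python
from datetime import datetime, timedelta, timezone


def recency_weighted_score(
    similarity: float,
    updated_at: datetime,
    now: datetime,
    half_life_days: float = 180.0,  # assumed: recency halves every 180 days
    weight: float = 0.2,            # assumed: share of the score given to recency
) -> float:
    """Blend a similarity score with a recency term in [0, 1]."""
    age_days = max((now - updated_at).total_seconds() / 86400.0, 0.0)
    recency = 0.5 ** (age_days / half_life_days)
    return (1.0 - weight) * similarity + weight * recency


def rerank(results: list[tuple[str, float, datetime]], now: datetime) -> list[str]:
    """Sort (doc_id, similarity, updated_at) tuples by the blended score, best first."""
    scored = sorted(
        results,
        key=lambda item: recency_weighted_score(item[1], item[2], now),
        reverse=True,
    )
    return [doc_id for doc_id, _, _ in scored]
```

With these defaults, two documents with the same similarity are ordered by freshness, while a much more similar old document can still outrank a recent one.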
Create a Dockerfile and docker-compose.yml to start ChromaDB in server mode, using the same version as the agent and indexer.
Update pyproject.toml to the latest available versions and see if the workarounds (constraints) are still needed.
Backtrace:
août 22 00:15:57 laptop-fred.local sba-txt-service.sh[2698822]: Storing files under /home/fred/.second-brain
août 22 00:15:57 laptop-fred.local sba-txt-service.sh[2698822]: Traceback (most recent call last):
août 22 00:15:57 laptop-fred.local sba-txt-service.sh[2698822]: File "/home/fred/perso/second-brain-agent/./transform_txt.py", line 106, in <module>
août 22 00:15:57 laptop-fred.local sba-txt-service.sh[2698822]: main(sys.argv[1], sys.argv[2])
août 22 00:15:57 laptop-fred.local sba-txt-service.sh[2698822]: File "/home/fred/perso/second-brain-agent/./transform_txt.py", line 101, in main
août 22 00:15:57 laptop-fred.local sba-txt-service.sh[2698822]: process_file(os.path.join(in_dir, entry.name), out_dir, indexer, splitter)
août 22 00:15:57 laptop-fred.local sba-txt-service.sh[2698822]: File "/home/fred/perso/second-brain-agent/./transform_txt.py", line 70, in process_file
août 22 00:15:57 laptop-fred.local sba-txt-service.sh[2698822]: for chunk in splitter.split_text(content):
août 22 00:15:57 laptop-fred.local sba-txt-service.sh[2698822]: ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
août 22 00:15:57 laptop-fred.local sba-txt-service.sh[2698822]: File "/home/fred/perso/second-brain-agent/.venv/lib/python3.11/site-packages/langchain/text_splitter.py", line 531, in split_text
août 22 00:15:57 laptop-fred.local sba-txt-service.sh[2698822]: return split_text_on_tokens(text=text, tokenizer=tokenizer)
août 22 00:15:57 laptop-fred.local sba-txt-service.sh[2698822]: ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
août 22 00:15:57 laptop-fred.local sba-txt-service.sh[2698822]: File "/home/fred/perso/second-brain-agent/.venv/lib/python3.11/site-packages/langchain/text_splitter.py", line 474, in split_text_on_tokens
août 22 00:15:57 laptop-fred.local sba-txt-service.sh[2698822]: input_ids = tokenizer.encode(text)
août 22 00:15:57 laptop-fred.local sba-txt-service.sh[2698822]: ^^^^^^^^^^^^^^^^^^^^^^
août 22 00:15:57 laptop-fred.local sba-txt-service.sh[2698822]: File "/home/fred/perso/second-brain-agent/.venv/lib/python3.11/site-packages/langchain/text_splitter.py", line 518, in _encode
août 22 00:15:57 laptop-fred.local sba-txt-service.sh[2698822]: return self._tokenizer.encode(
août 22 00:15:57 laptop-fred.local sba-txt-service.sh[2698822]: ^^^^^^^^^^^^^^^^^^^^^^^
août 22 00:15:57 laptop-fred.local sba-txt-service.sh[2698822]: File "/home/fred/perso/second-brain-agent/.venv/lib64/python3.11/site-packages/tiktoken/core.py", line 117, in encode
août 22 00:15:57 laptop-fred.local sba-txt-service.sh[2698822]: raise_disallowed_special_token(match.group())
août 22 00:15:57 laptop-fred.local sba-txt-service.sh[2698822]: File "/home/fred/perso/second-brain-agent/.venv/lib64/python3.11/site-packages/tiktoken/core.py", line 351, in raise_disallowed_special_token
août 22 00:15:57 laptop-fred.local sba-txt-service.sh[2698822]: raise ValueError(
août 22 00:15:57 laptop-fred.local sba-txt-service.sh[2698822]: ValueError: Encountered text corresponding to disallowed special token '<|endoftext|>'.
août 22 00:15:57 laptop-fred.local sba-txt-service.sh[2698822]: If you want this text to be encoded as a special token, pass it to `allowed_special`, e.g. `allowed_special={'<|endoftext|>', ...}`.
août 22 00:15:57 laptop-fred.local sba-txt-service.sh[2698822]: If you want this text to be encoded as normal text, disable the check for this token by passing `disallowed_special=(enc.special_tokens_set - {'<|endoftext|>'})`.
août 22 00:15:57 laptop-fred.local sba-txt-service.sh[2698822]: To disable this check for all special tokens, pass `disallowed_special=()`.
Look up the transcript or create one using a service.
When a YouTube video, or any other video, has no transcript, use a service to build one.