sciphi-ai / synthesizer
A multi-purpose LLM framework for RAG and data creation.
License: Apache License 2.0
Describe the bug
On some reruns when generating textbooks, I'm getting overlapping sections and an off-by-one error (i.e., it regenerates the same last section). In addition, although I removed it in my own code, restarting a run partway through will always write the "This is an AI Generated..." notice into the textbook.
To Reproduce
Start a job, kill it halfway, watch where it restarts.
Expected behavior
I think the code is basically setting the start_flag too early (or the order of the check and the generation should be switched).
Basically, right now, it regenerates the last section (i.e., the section recorded in {textbook}_progress.txt), but it should actually generate the next one.
I will open this issue rn and investigate later today, keeping this issue updated and if needed, opening a PR to fix the issue. (I have some runs going, so don't want to disturb those)
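For illustration, here is a minimal sketch of the resume behavior described above. It assumes the progress file records the sections that have already been completed, in order. Every name in it (next_section_index, resume, generate, write_header) is hypothetical and not taken from the synthesizer codebase; it only shows the "skip what is done, then generate" ordering and writing the header on a fresh run only.

```python
def next_section_index(sections, completed):
    """Return the index of the first section not yet marked as completed."""
    done = set(completed)
    for index, section in enumerate(sections):
        if section not in done:
            return index
    return len(sections)  # everything is already done


def resume(sections, completed, generate, write_header):
    """Generate only the sections that are missing from `completed`.

    `generate` and `write_header` are hypothetical callbacks standing in for
    the real generation and file-writing steps.
    """
    # Write the header only on a fresh run, never on a restart.
    if not completed:
        write_header()
    start = next_section_index(sections, completed)
    for section in sections[start:]:
        generate(section)
        # Record progress only after the section has been generated.
        completed.append(section)


if __name__ == "__main__":
    generated = []
    resume(["1.1", "1.2", "1.3"], ["1.1", "1.2"], generated.append, lambda: None)
    print(generated)
```

With this ordering, a run killed after section 1.2 restarts at 1.3 instead of regenerating 1.2.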
This is an ongoing effort to train with existing synthetic data and evaluate on downstream tasks in order to identify what techniques are most effective and what distribution the synthetic data should be generated over.
It looks like this project needs a table of contents as input, but I don't have many of them. Is the table_of_content itself also generated in some way? If so, how do I prepare it?
Thanks!
Hi, thanks for the great project. I may have missed something but are there any pointers to the textbook generation prompt?
Is your feature request related to a problem? Please describe.
Currently, each prompt is used as a YAML key, which makes the key a very long string, while YAML limits implicit keys to 1024 characters. It would be nice to refactor the template so that the prompt text becomes a string value and the weight moves into a separate attribute.
Describe the solution you'd like
Current:
prompt_template:
  "While studying {course_name} and delving into {course_topic}, one encounters the importance of {sub_topic}. Could you offer {context} followed by an illustration of {example_style}?": 1
Suggested:
prompt_template:
  - content: >
      While studying {course_name} and delving into {course_topic}, one
      encounters the importance of {sub_topic}. Could you offer {context}
      followed by an illustration of {example_style}?
    weight: 1
  - content: Another prompt
    weight: 1
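As a rough sketch of how a loader could accept both layouts, the snippet below normalizes either the current mapping (prompt text as key, weight as value) or the suggested list of content/weight entries into (content, weight) pairs and then samples one by weight. It works on data that has already been parsed from YAML; normalize_templates and sample_template are hypothetical names, not functions from this project.

```python
import random


def normalize_templates(prompt_template):
    """Return a list of (content, weight) pairs for either layout."""
    if isinstance(prompt_template, dict):
        # Current layout: {"prompt text": weight}
        return [(content, weight) for content, weight in prompt_template.items()]
    # Suggested layout: [{"content": "...", "weight": n}, ...]
    # A folded scalar (">") keeps a trailing newline, so strip it.
    # Assumption: a missing weight defaults to 1.
    return [
        (entry["content"].strip(), entry.get("weight", 1))
        for entry in prompt_template
    ]


def sample_template(prompt_template, rng=random):
    """Pick one prompt, with probability proportional to its weight."""
    pairs = normalize_templates(prompt_template)
    contents = [content for content, _ in pairs]
    weights = [weight for _, weight in pairs]
    return rng.choices(contents, weights=weights, k=1)[0]


if __name__ == "__main__":
    suggested = [
        {"content": "Explain {sub_topic} in {course_name}.\n", "weight": 1},
        {"content": "Another prompt", "weight": 1},
    ]
    print(sample_template(suggested))
```

Accepting both shapes would let existing config files keep working while new ones move to the list form.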