Comments (10)
Thanks @rootsongjc for this initiative and proposal!
@howardjohn @linsun @craigbox @kfaseela @rcernich @justinpettit @ctrath @hzxuzhonghu please take a look at the proposal.
from community.
Hi @rootsongjc, thanks for proposing this. I saw the cost and complexity concerns from other steering members, which make sense to me. Is it possible to start with a simple version where we just prompt users to check a list of things before writing up a post in discuss?
from community.
I want to adapt and use the existing bots. What is available now? I've heard that some bots are no longer going to be maintained.
from community.
Any further thoughts, Jimmy? I think that having a model trained on Istio documentation attempting to manually answer some user queries would be a good first step. Do you have such a model?
from community.
@craigbox I don't have a model directly available for open source yet. We could train one, but we would also need to choose the technology or platform to use, and there is the cost of integrating the model and calling its APIs. I haven't figured out how to do all that yet.
from community.
@rootsongjc is there still interest in doing this?
@craigbox has provided some feedback on my tool (https://devboard.gitsense.com/istio). I'm looking to convert all my data into embeddings for future AI features, and I would be interested in working with Istio to gather requirements. I'm currently capturing comments (up to 30,000 per repository due to GitHub limitations), issues, pull requests, commits, etc., so experimenting with different types of embedding models and chunking methods will be trivial.
My only concern right now is performance (not of my indexing engine, but of the process of generating embeddings). My indexing engine can scale horizontally and you can rent GPUs by the hour, so I'm hoping there will only be a one-time initial cost hit, and from then on commodity hardware can be used.
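As a rough illustration of what one of the "chunking methods" mentioned above could look like (this is not the tool's actual implementation), here is a minimal sketch of fixed-size word windows with overlap. The function name `chunk_text` and the parameters `max_words` and `overlap` are hypothetical, chosen only for this example; an embedding model would then be run on each returned chunk.

```python
def chunk_text(text, max_words=50, overlap=10):
    """Split text into overlapping word windows (hypothetical helper).

    Consecutive chunks share `overlap` words so that a sentence cut at a
    chunk boundary still appears whole in at least one chunk.
    """
    if max_words <= 0 or not 0 <= overlap < max_words:
        raise ValueError("need max_words > 0 and 0 <= overlap < max_words")
    words = text.split()
    step = max_words - overlap  # how far each window advances
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + max_words]))
        if start + max_words >= len(words):
            break  # this window already reached the end of the text
    return chunks
```

Word-based windows are only one option; splitting on sentences, paragraphs, or comment boundaries may suit issue and pull request data better, which is the kind of thing that would need experimentation.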
from community.
What Jimmy is proposing is effectively training a transformer model on the Istio and Envoy documentation, Q&A, etc., and then using that to answer user questions in the first instance.
Some GitHub data might be useful in this, but I think that is more likely to trend towards developer questions/answers.
from community.
I'm actually interested in generating embeddings that can support multiple personas (customers, developers, managers, executives, etc.), so what Jimmy is proposing does interest me, but I can see how including more development-related data can increase complexity.
I'm not sure what stage things are at, but I am very interested in learning more about your findings and knowing what sources (documents, questions, etc.) you are planning on training on.
from community.
@terrchen I haven't started yet. Your tool is very useful for showing the contribution data, but do you have any bot to answer questions in GitHub issues or discussions?
from community.
@rootsongjc Not yet. The goal is to get to a point where I can create agents/bots to answer questions and to perform tasks for maintainers, developers, team leads, etc.
The conclusion I've come to is that, in order to create a useful automated Q and A system, you'll need an easy way to do the following:
- Classify and clean data.
- Generate Q and A pairs.
- Review and iterate on generated Q and A pairs.
The part that I will be tackling in the near future is creating a system to classify and clean data, which is the most critical, since garbage in = garbage out. Once you have clean data (with good metadata), generating Q and A pairs should be pretty straightforward, as you can use LLMs to generate them.
If you have a system or thoughts on how to clean/organize the data, I'd be interested to hear them. I'm currently planning on creating a system that will leverage LLMs to help classify and prep data.
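To make the "classify and clean" step above more concrete, here is a minimal sketch that uses keyword rules in place of an LLM. Everything in it is an assumption for illustration: the label names, the regular expressions in `RULES`, and the helpers `clean` and `classify` are hypothetical and are not part of any existing tool.

```python
import re

# Hypothetical labels and keyword rules; a real system might ask an LLM instead.
RULES = {
    "question": re.compile(r"\?\s*$|^\s*(how|why|what|is|does|can)\b", re.I | re.M),
    "bug": re.compile(r"\b(error|crash|broken|fail(s|ed|ure)?)\b", re.I),
}

def clean(text):
    """Strip URLs and @mentions, then collapse runs of whitespace."""
    text = re.sub(r"https?://\S+", " ", text)
    text = re.sub(r"@[\w-]+", " ", text)
    return re.sub(r"\s+", " ", text).strip()

def classify(text):
    """Return the cleaned text with every matching label, or 'other'."""
    cleaned = clean(text)
    labels = [name for name, pattern in RULES.items() if pattern.search(cleaned)]
    return {"text": cleaned, "labels": labels or ["other"]}
```

For example, `classify("Slack invite link is broken")` is labeled `bug` under these rules. The labels attached here are the kind of metadata that a later Q and A generation step could filter on.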
from community.