Comments (31)
Hi folks, unfortunately, creating an AWS account requires adding valid payment information. I could not find any way to request an exception for this, and creating an internal AWS account would not work for our use case.
Here is a proposal that aims to create a maintainable and sustainable path forward on this:
Creating a maintainable AWS Account
- Create a non-personal email which can be owned/shared by WG leads. This way it can be handed over if people change over time.
- Create an AWS account associated with this email with valid payment information. Let's call this the management account.
- Since we want to create separate accounts for each WG, instead of creating individual accounts, we can create an organization within the management account and create member accounts for each WG within this organization. The benefit is that organizations have consolidated billing, so the AWS credits can be applied to the management account and shared by member accounts.
- With this approach WGs will have flexibility with respect to accounts, e.g. each working group can decide to create a testing account and a separate production account, or there could be one production account for hosting released artifacts (samples, container images, charts, etc.) for the whole of Kubeflow.
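The account layout described above can be sketched as a small data model. This is only an illustration under stated assumptions: the `WorkingGroup` class, the `plan_member_accounts` helper, the `kubeflow-<wg>-<env>` naming scheme, and the `example.org` domain are all hypothetical. Nothing here calls AWS; the `Email` and `AccountName` keys simply mirror the parameters accepted by the AWS Organizations `CreateAccount` API, so the plan could later be handed to an SDK call or an IaC tool.

```python
from dataclasses import dataclass, field


@dataclass
class WorkingGroup:
    name: str  # e.g. "notebooks"
    # Each WG decides which accounts it wants; "testing" is an assumed default.
    environments: list = field(default_factory=lambda: ["testing"])


def plan_member_accounts(groups, email_domain):
    """Return one account request per WG environment.

    Each dict uses the Email/AccountName keys accepted by the AWS
    Organizations CreateAccount API. Nothing is sent to AWS here.
    """
    requests = []
    for wg in groups:
        for env in wg.environments:
            account = f"kubeflow-{wg.name}-{env}"  # hypothetical naming scheme
            requests.append({
                "AccountName": account,
                # Hypothetical shared, non-personal mailbox per account.
                "Email": f"{account}@{email_domain}",
            })
    return requests


groups = [
    WorkingGroup("notebooks"),
    WorkingGroup("training", environments=["testing", "production"]),
]
plan = plan_member_accounts(groups, "example.org")
for request in plan:
    print(request["AccountName"], request["Email"])
```

Keeping the plan as plain data means it can be reviewed in a PR before any account is actually created.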
Making this approach sustainable in the long term
- Next, I want to propose creating a mechanism which can be used to ensure the AWS account is accessible and funded appropriately throughout the year.
- Add an item to the release checklist(in the beginning of release cycle) in which WGs:
- i. Would review the credits spent and remaining over the last quarter and determine if there are sufficient credits to complete the current release. If there is a need to renew for the NEXT release, connect with AWS. This will allow sufficient time on both sides to complete the process
- ii. Baseline the accounts to make sure only active members have permissions to the account
- iii. Baseline the list of point of contact information from AWS
- iv. Add/update the information related to accounts, maintainers, AWS points of contacts, infrastructure access etc. to a document or README
- (Optional) We can give a POC from AWS access to these resources if WGs think it is needed (this was brought up in some discussions)
- Set up Alarms to detect when account is running low on credits or spending exceeds expectation
- Set up best practices and guidelines for adding people to the account, deploying using IaC, etc., which can be fleshed out later
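The per-release review above (credits and account access) could be scripted along these lines. This is a minimal sketch with hypothetical inputs: the function names, the figures, and the member lists are made up for illustration, and real values would come from the AWS billing console and the account's access configuration.

```python
def months_of_runway(credits_remaining, spent_last_quarter):
    """Estimate how many months the remaining credits last, assuming
    spend continues at last quarter's average monthly rate."""
    monthly_rate = spent_last_quarter / 3
    if monthly_rate == 0:
        return float("inf")
    return credits_remaining / monthly_rate


def needs_renewal(credits_remaining, spent_last_quarter, release_months):
    """True if credits would run out before the current release ends."""
    return months_of_runway(credits_remaining, spent_last_quarter) < release_months


def stale_access(account_members, active_members):
    """Members who still have account permissions but are no longer active."""
    return sorted(set(account_members) - set(active_members))


# Hypothetical figures, in USD of credits.
print(months_of_runway(12000, 9000))
print(needs_renewal(12000, 9000, release_months=6))
print(stale_access(["alice", "bob", "carol"], ["alice", "carol"]))
```

The same runway threshold could also drive the low-credit alarm, so the checklist review and the alarm agree on what "running low" means.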
Action item: The question about adding valid payment information to the account still remains. Hence I would like to ask: is there any other organization/company willing to partner here by adding valid payment information to the management account described above?
Please let us know what the community thinks about this proposal.
from testing.
Hi folks; it's been a minute.
As a bit of additional context, the issue of a sustainable and scalable approach to test infrastructure was identified almost two years ago in #737.
What is the current thinking in terms of making each WG responsible for its own test-infra?
from testing.
If anyone needs assistance setting up GitHub CI for their working group, etc., please reach out (on this issue or in Kubeflow Slack). I'm not an expert, but I have some experience and am happy to put some time in.
from testing.
Overall sounds good. Pipelines will continue to use GCP infra provided by Google. /cc @zijianjoy
from testing.
Update: I was trying to find out if there is a way to create an AWS account without payment info such as a credit card, but I haven't found any documentation or a way to get an exception for it. I will reach out to some more folks and will update if it is possible.
General requirement: as per the Customer Agreement, all AWS accounts must have a valid form of payment to access our services. https://aws.amazon.com/agreement/
from testing.
@kubeflow/wg-manifests-leads @kubeflow/wg-automl-leads @kubeflow/wg-notebooks-leads @yuzisun @james-jwu
Please let me know what the community thinks about the above proposal assuming we have a partner for payment information. This will help us with #1008 as well and hence a timely response will be helpful.
from testing.
@surajkota thanks for moving this forward. I have been asking companies (that provide integration services for Kubeflow on AWS) to support this effort. I believe that we need to scope the effort, i.e. one headcount is needed to 1) manage the accounts and credits and 2) configure, operate, and tear down the clusters on the testing infra, 3) for a defined period of time, i.e. 12 months. Additionally, the responsibilities and SLA need to be defined, e.g. only for the current release (1.6), with change requests tracked, acknowledged and implemented based on a simple approval process. Finally, IMO, the companies that provide testing infrastructure and related services should be given a special designation by the Community. This is an investment that test infra operators are making and (IMO) the Community should provide a designation / benefit back to these contributors.
from testing.
@surajkota I believe that we said that interested parties should respond by COB today. It appears that we have Arrikto, Maven Code, One Convergence and @ca-scribner offering support. @annajung Perhaps we should ask the contributors to select a Working Group to support? @kimwnasptd @charlesa101 @songole do you have a preference for a working group to support? I think it would be good to have representatives from 2+ companies in each working group.
from testing.
@johnugeorge All credit cards have an expiration date, so adding more than one would not add much value IMO. If one company wants to remove their payment info in the future, we will do another callout, and we also have this issue as a reference in case we want to reach out to others who expressed interest.
from testing.
Apologies for the late reply here. First of all @songole @charlesa101 nice to meet you! I'm sure WGs would be more than happy to have some more engineering firepower for the testing, thank you very much for the interest!
As @surajkota described above we are splitting the testing infra migration into 2 orthogonal efforts:
- Establishing a process for a team that will be responsible for the root AWS account, that will be funded from AWS, as well as how the WGs can use that account in a secure manner
- Deciding on how the testing infra for the affected WGs will be, how will we set up CI/CD, ECR registries etc.
1. Management root account
For the first part we have made progress and created the initial management account and we will create an AWS organization, in which WGs can join with an email they will own. The practical part for this is almost done, and what remains is for each WG to use the AWS Pricing Calculator to estimate their credit needs for the next 12 months.
The pricing calculation part is crucial, as without it we can't bootstrap the process. So we kindly ask the WGs interested in this to provide such an estimate by the end of this week or early next week. This will hugely help @surajkota push for this as well, since it will require some communication to get the credits in.
For Notebooks WG @thesuperzapper and I are already in the process of calculating the cost and will post an update tomorrow.
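As a rough illustration of how the per-WG estimates could be combined into a single 12-month credit request for the shared pool, here is a minimal sketch. The `total_credit_request` helper, the optional `buffer` headroom, and all dollar figures are hypothetical placeholders; real values would come from each WG's AWS Pricing Calculator estimate.

```python
def total_credit_request(monthly_estimates, months=12, buffer=0.0):
    """Sum per-WG monthly estimates into one request for the shared pool.

    monthly_estimates: mapping of WG name -> estimated USD per month.
    buffer: optional fractional headroom (0.1 adds 10%).
    """
    yearly = {wg: cost * months for wg, cost in monthly_estimates.items()}
    total = sum(yearly.values()) * (1 + buffer)
    return yearly, round(total, 2)


# Placeholder numbers only, not actual WG estimates.
estimates = {"notebooks": 400.0, "manifests": 650.0, "training": 500.0}
per_wg, total = total_credit_request(estimates, buffer=0.1)
print(per_wg)
print(total)
```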
Lastly, we are preparing a basic proposal for the team responsible for the management account. Specifically we want to document:
- What are the selection criteria for members of that team
- What are the expectations and time commitments from that team
- Actions and setup that needs to happen within that account
2. Setting up the infra per WG
@songole @charlesa101 @ca-scribner for this part I highly suggest reaching out to the WGs you are interested in to discuss your thoughts and expertise on how to set up the CI/CD. We can then form proposals and even generalize a solution across WGs once we have a solid understanding and approach.
You can find links for all the WG's calendars and info in https://github.com/kubeflow/community/blob/master/wgs.yaml#L80
cc @kubeflow/wg-automl-leads @kubeflow/wg-training-leads @pvaneck @yuzisun
from testing.
Hi everyone, thanks to all the WGs for the estimates, we got the management account created and credits approved!
Next steps: I have a draft of the design with the next steps on this document:
https://docs.google.com/document/d/1Z3K4q21Vko6SzQDu2JSov9DO2fRehsDB_X9Z663fym4/edit?usp=sharing and we will be looking into setting up the AWS organization to get this going.
As we previously discussed, each WG can choose its own test/release infrastructure depending on its requirements, and it would run in separate accounts. I am looking for contributors and WGs to come up with the requirements for the testing infrastructure, or a proposal for the infrastructure based on their requirements (Infrastructure per WG section of the doc). I have laid out a high-level expectation for each of the sections; please ping me or request access on the doc if you would like to contribute. Can we target having a draft by 07/27?
cc @songole @charlesa101 @ca-scribner @kubeflow/wg-automl-leads @kubeflow/wg-training-leads @kubeflow/wg-notebooks-leads @kubeflow/wg-manifests-leads @pvaneck @yuzisun
from testing.
Thanks @surajkota. Someone from my team would start with training wg. @mak-454 @anil3
from testing.
Thanks, @surajkota - We can start with the notebooks & manifest wg. Thank you!
from testing.
Hi WG leads, there is a discussion happening in parallel with AWS, but I'd also like to kick off a discussion to see if you would be transitioning to a different CI/CD pipeline before the release. If so, do you have any timelines in mind?
from testing.
@annajung I have created a new private Slack channel with all the working group leads to organize the creation of separate AWS accounts for each Working Group, and to get AWS credits applied to them.
from testing.
As discussed in the Community Meeting on 05/31, each Working Group will need to set up its own testing infrastructure, as the optional-test-infra has been deleted.
AWS is willing to provide credits to the Working Groups for testing. To facilitate this, please do the following:
- Each WG must delegate a single “testing manager” who will be the point of contact between AWS and their WG about credits/etc
- Each WG creates their own AWS account for any tests they may need to run
- Each “testing manager” must send an AWS Pricing Calculator estimate of the costs for their testing infra (in this Slack channel)
- Each “testing manager” will work with AWS to get credits applied to their AWS account (in this Slack channel)
The above is based on the assumption that a decentralised infra is a more scalable and sustainable approach in the long term.
from testing.
@johnugeorge @kimwnasptd @yuzisun @james-jwu @annajung This seems like a very generous offer from @surajkota. Would you please reply in a timely manner with your perspective / response? Thanks!
from testing.
Thanks @surajkota !!
from testing.
Hi everyone, here is an update from the June 6th release team meeting
- As a short-term solution, affected WGs are migrating to GitHub Actions for the 1.6 release while working in parallel for a long term solution
- @pvaneck from KServe, @kimwnasptd from Notebooks and Manifest were present in the meeting and confirmed they will be able to meet the new feature freeze deadline of June 15th using their short term solution
- @surajkota from AWS was present in the meeting and has taken action items to find out answers to a few questions about creating a personal AWS account and the use of the previous AWS registry, see details in the meeting notes
- As mentioned by @surajkota before, each WG should go ahead and create an AWS account and work with Suraj and @akartsky to leverage AWS credit / infrastructure for testing
As for the 1.6 release, we need to confirm with @johnugeorge @andreyvelich whether the Training Operators and AutoML WGs can meet the June 15th feature freeze. If not, we will work with them to determine if another extension is needed and update the community accordingly.
from testing.
Some notes from the June 7th Community meeting:
- Johnu from Katib and Training Operators confirmed they are migrating to GitHub Actions and should be ready by June 15th
- All WGs are focused on short term solutions and the 1.6 release to get themselves unblocked, will review the long term solutions afterward
from testing.
Thank you very much for driving this @surajkota!
This proposal seems solid for allowing all WGs to share the same credit pool. Thumbs up from manifests and notebooks.
> Add an item to the release checklist(in the beginning of release cycle) in which WGs
I really like this approach as well, since it will ensure we have a cadence for the status checks.
from testing.
Thank you @surajkota. The proposal looks solid. We do a lot of work with Kubeflow, my company MavenCode will be able to provide the needed partnership support to get this going.
> @kubeflow/wg-manifests-leads @kubeflow/wg-automl-leads @kubeflow/wg-notebooks-leads @yuzisun @james-jwu
> Please let me know what the community thinks about the above proposal assuming we have a partner for payment information. This will help us with #1008 as well and hence a timely response will be helpful.
from testing.
Thank you @surajkota @jbottum for the proposal. We are very much interested in contributing to the effort. I am part of dkube.io and our product DKube is built on top of Kubeflow and MLflow and provides MLOps and Monitoring solutions to enterprise customers.
We look forward to partnering with other community members and providing the needed support
from testing.
@jbottum We would like to represent the following working groups: AutoML, Pipelines, Training and Serving.
from testing.
> @surajkota I believe that we said that interested parties should respond by COB today. It appears that we have Arrikto, Maven Code, One Convergence and @ca-scribner offering support. @annajung Perhaps we should ask the contributors to select a Working Group to support? @kimwnasptd @charlesa101 @songole do you have a preference for a working group to support ? I think it would be good to have representatives 2+ companies in each working group.
@jbottum - automl, notebook, manifest, pipelines but we are open to support any other WG
from testing.
@surajkota did you get a credit card for the AWS account from a partner? Do you need the credit card to move forward?
from testing.
@kimwnasptd @songole @charlesa101 @ca-scribner In the Release team meeting today, we discussed next steps. We propose that the parties interested (Maven Code, One, Arrikto and CA-Scribner) should contact the Working Groups, and create a PR for the test-infra config and operations effort. The Issue/PR should propose a design for the test infra and support. Is that a reasonable request ?
Please note that this issue (1006) will be used to track the account set-up, and the config and operations of the test infra for each working group should have an independent issue / PR. @surajkota @annajung @DomFleischmann please confirm that I captured this correctly. Thanks.
from testing.
@johnugeorge @kimwnasptd @pvaneck - @surajkota needs an estimate of each Working Group's expenses for the next 12 months. Please submit by Friday (July 1) for Manifests, Notebooks, Training, Katib/AutoML, and KServe. Please use the AWS Pricing Calculator (https://calculator.aws/#/). cc'ing @annajung
from testing.
Hi everyone, the initial proposal required us to attach a credit card per WG account. The current proposal that uses AWS Organization approach requires only one credit card which needs to be added to the management account since it offers consolidated billing. I propose that we move forward with Arrikto's payment information for the management account since @kimwnasptd has been testing it out and was the first one to respond.
Thank you Maven Code, One Convergence, Arrikto and CA-Scribner for the interest in this initiative. Creating the management account is the first step of this project. It is exciting to see all the folks who are interested to contribute to this effort and I am confident the WGs will appreciate all the help they can get to make this effort useful for the product!
from testing.
@surajkota One question. Will the credit card again become a single point of failure for the management account, similar to the earlier personal account for the AWS infra? How can this be handled?
from testing.
Hi @kubeflow/wg-automl-leads, @kubeflow/wg-training-leads @kubeflow/wg-manifests-leads, @kubeflow/wg-notebooks-leads, @pvaneck
We are looking into creating the AWS organization and organizational units for each WG using Infrastructure as Code, based on the proposal in this doc. Everyone has already seen the brief overview on this issue, but the document goes into the details. If you have any comments or would like to contribute to any of the TODO items, please let us know.
Following are the things we need from your end:
- We want to use an IaC tool and not have manual creation for the AWS organization. Does the community have a preference for using CDK or Terraform?
- Please help me with the email address you would like to use for your WG account by EOD 07/20. Let me know if we should go ahead and create one on your behalf. We can create something like [email protected], [email protected] and share it with each of the WGs.
- Manifests WG: @kimwnasptd, Email: ?
- Notebooks WG: @kimwnasptd, Email: ?
- Training WG: @johnugeorge, Email: ?
- AutoML WG: @johnugeorge, Email: ?
- KServe Project: @pvaneck, Email: ?
Please let me know if you want to designate anyone else in the WG for this.
from testing.