boavizta / cloud-scanner
Get Boavizta impact data for your AWS cloud account usage.
License: GNU Affero General Public License v3.0
Publish an additional Docker image that includes a local copy of the Boavizta API (to allow usage with reduced internet access). In that case it could be simpler to build on top of the Boavizta API Docker image.
Today, manufacturing impacts of instances are given globally (for the entire lifetime of the underlying infrastructure).
As a user, I would like to get the impacts corresponding to a given period of time.
Boavizta API v0.2.x brings the possibility to amortize manufacturing impacts over a given time (using linear allocation).
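As a client-side sketch of what linear allocation means, the amortized share is the total manufacturing impact scaled by the fraction of the hardware lifetime actually used. The lifetime value below is an assumption for illustration; Boavizta API applies its own.

```rust
/// Amortize a total manufacturing impact over a usage period, assuming
/// linear allocation over the hardware lifetime.
/// `lifetime_hours` is a hypothetical value, not one taken from the API.
fn amortized_manufacture_impact(total_impact: f64, use_hours: f64, lifetime_hours: f64) -> f64 {
    total_impact * (use_hours / lifetime_hours)
}

fn main() {
    // e.g. 1000 kgCO2eq manufacturing impact, 1 year of use, 5 year lifetime
    let share = amortized_manufacture_impact(1000.0, 8760.0, 5.0 * 8760.0);
    assert!((share - 200.0).abs() < 1e-9);
    println!("amortized share: {share} kgCO2eq");
}
```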
Some AWS instance types are not yet in the Boavizta dataset.
The current behavior is to query Boavizta API for any instance type; if the type is not found (resulting in an error 500), the error is caught by cloud-scanner and an empty impact string ({}) is returned for this instance.
It would be more efficient to pre-check which instance types are supported by the Boavizta dataset and query the API only for relevant instances.
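A minimal sketch of such a pre-filter, assuming the supported types are known ahead of time (the instance type names below are illustrative, not the actual dataset contents):

```rust
use std::collections::HashSet;

/// Keep only the instance types known to exist in the Boavizta dataset,
/// so the API is queried only for relevant instances.
fn filter_supported<'a>(instance_types: &'a [&'a str], supported: &HashSet<&str>) -> Vec<&'a str> {
    instance_types
        .iter()
        .copied()
        .filter(|t| supported.contains(t))
        .collect()
}

fn main() {
    // Hypothetical supported set; the real one would come from the dataset.
    let supported: HashSet<&str> = ["m6g.xlarge", "t2.micro"].into_iter().collect();
    let requested = ["m6g.xlarge", "exotic.type", "t2.micro"];
    let to_query = filter_supported(&requested, &supported);
    assert_eq!(to_query, vec!["m6g.xlarge", "t2.micro"]);
}
```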
When running cloud scanner in Docker, the serve function does not start (nothing seems exposed on 127.0.0.1:8000).
This is not only related to the missing EXPOSE port in the Dockerfile (#153): adding it in a local test does not fix the startup.
The cloud scanner outputs logs and info messages directly to stderr or stdout.
We would like to be able to configure verbosity, from nothing (except errors) to more detailed debug info.
Could use https://docs.rs/loggerv/latest/loggerv/ as a simple solution.
See https://deterministic.space/rust-cli-tips.html#bonus-logging
This means updating the CLI options to support multiple verbosity levels (like -v for info level, -vv for debug level).
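A sketch of how repeated -v flags could map to log levels (loggerv exposes a similar verbosity-count initializer; the exact mapping below is an assumption, not the project's decision):

```rust
/// Map the number of `-v` occurrences to a log level name.
/// Assumed mapping: 0 => errors only, 1 => info, 2 or more => debug.
fn level_for_verbosity(v_count: u8) -> &'static str {
    match v_count {
        0 => "error",
        1 => "info",
        _ => "debug",
    }
}

fn main() {
    assert_eq!(level_for_verbosity(0), "error"); // no -v flag
    assert_eq!(level_for_verbosity(1), "info");  // -v
    assert_eq!(level_for_verbosity(3), "debug"); // -vvv
}
```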
Get the default impacts of the instances of an account (i.e. based on instance type and location only).
Original context: in #3
Support assessment of instances of an Azure account.
Feature on hold for the time being because we lack Azure VM impact data in the Boavizta dataset.
As a person who manages an AWS account, I want to run cloud scanner to retrieve the impacts of the resources in my account.
At first, this scanner is intended to be run manually, as a CLI tool, but we should consider wrapping it in a serverless function later.
The current serverless version of cloud-scanner uses hardcoded parameters to do a default scan (default region, default API URL, 1 hour use time, and so on).
Map the input parameters of the CLI to the serverless app.
Doc is incomplete and the Readme is becoming a bit complicated to follow.
Areas to improve:
Clippy generates a lot of warnings.
Follow clippy recommendations and update the code to avoid these warnings.
The generated SDK used to wrap calls to Boavizta API is embedded in the codebase of cloud-scanner.
This makes it heavy and impractical to maintain (or reuse).
Extract the Boavizta API Rust SDK to its own repository and publish it as a crate.
As a first step, this extraction will be just a manual copy/publish of the existing code (not necessary to automate it).
We first tried to publish the Rust SDK in the CI chain of the Boavizta API repository (see Boavizta/boaviztapi#84), but it turns out not to be very practical because:
The Docker image is unnecessarily big (~90MB because it is based on Ubuntu).
Update the Dockerfile to use Alpine as a base image.
We can use cloud scanner to export metrics that can be aggregated in Prometheus and displayed in Grafana.
Document this setup, and maybe provide a demo dashboard that can easily be imported into Grafana.
At the moment only a limited list of AWS regions is supported.
We need to match ISO country codes with AWS regions.
The actual region <-> ISO code translation should be externalized, and even better we should rely on a third-party package for this.
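An illustrative sketch of that mapping, from AWS region to the ISO 3166-1 alpha-3 country code used as `usage_location` by Boavizta API (only a few regions shown; a third-party crate would cover them all):

```rust
/// Translate an AWS region to an ISO 3166-1 alpha-3 country code.
/// Illustrative subset only; not the project's actual lookup table.
fn iso_country_for_region(aws_region: &str) -> Option<&'static str> {
    match aws_region {
        "eu-west-3" => Some("FRA"),    // Paris
        "eu-west-1" => Some("IRL"),    // Ireland
        "eu-central-1" => Some("DEU"), // Frankfurt
        "us-east-1" => Some("USA"),    // N. Virginia
        _ => None,                     // unsupported region
    }
}

fn main() {
    assert_eq!(iso_country_for_region("eu-west-3"), Some("FRA"));
    assert_eq!(iso_country_for_region("mars-north-1"), None);
}
```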
The current version of cloud-scanner only returns default impacts of the AWS cloud instances (i.e. the usage rate of instances is not considered, only instance types).
Retrieve instance usage metrics from the AWS API and use them when querying Boavizta API to get more realistic results.
See meta issue here #3
More info about how to query metric statistics (like average CPU utilization for a given time period) of a single instance: https://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/US_SingleMetricPerInstance.html
Python sample code using the AWS SDK:
https://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/example_cloudwatch_GetMetricStatistics_section.html
And the corresponding function in the Rust AWS SDK:
https://docs.rs/aws-sdk-cloudwatch/0.11.0/aws_sdk_cloudwatch/client/struct.Client.html#method.get_metric_statistics
Cloudwatch general concepts:
https://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/cloudwatch_concepts.html#dimension-combinations
Example payload format for a custom cloud usage query on Boavizta API:
{
  "hours_use_time": 2,
  "usage_location": "FRA",
  "workload": {
    "10": { "time": 0 },
    "50": { "time": 1 },
    "100": { "time": 0 },
    "idle": { "time": 0 }
  }
}
Some parts of the generated code for the embedded Boavizta Rust SDK generate clippy warnings.
Fix the generated code directly in this repo to avoid clippy warnings.
The Docker image is currently built in CI and available only on the private registry of this project.
It cannot be retrieved without authentication / proper permissions, which makes deployment difficult in an enterprise context.
Publish the image to the Docker registry.
Not sure if the Boavizta org has an account / credentials set up for the Docker registry.
As a new user I would like to quickly test the scanner without compiling or installing anything.
Running the cloud-scanner on an AWS account from outside the account is not the preferred setup for large organizations using AWS, for several reasons:
Wrap the scanner as a serverless app that can easily be deployed to an AWS account.
This allows:
Preference for using the Serverless Framework to easily wrap the Rust binary: https://www.serverless.com/
The currently used plugin (serverless-rust) is left unmaintained.
Test using https://github.com/fdaciuk/sls-rust instead.
The current version of cloud-scanner only returns default impacts of the AWS cloud instances (i.e. the usage duration and location of instances is not considered, only instance types).
This means that we retrieve only default values.
Pass the following JSON body when querying the API:
{
  "hours_use_time": 2,
  "usage_location": "FRA"
}
Note: we do not consider the workload for the time being, only global use time and location.
The rationale is that the current workload implementation is evolving on the API side, so we wait for the implementation of:
API doc:
Reduce compile time
We only deploy using the REST API gateway, so we can make use of lambda_http feature flags: https://github.com/awslabs/aws-lambda-rust-runtime#feature-flags-in-lambda_http
[dependencies.lambda_http]
version = "0.6"
default-features = false
features = ["apigw_rest"]
We built a demo dashboard to display metrics of cloud scanner, but the units and sample rate can be questioned.
The objective of this issue is to discuss which metrics would be relevant to display to an end user.
Cargo run asks for a --bin option to specify which binary should start; we could make the CLI the default.
Use Cargo's default-run: The Manifest Format - The Cargo Book
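Setting a default binary in Cargo.toml could look like this (the binary name below is an assumption for illustration, not necessarily the actual target name in the repository):

```toml
[package]
name = "cloud-scanner"
# Hypothetical binary name: lets `cargo run` work without `--bin`
default-run = "cloud_scanner_cli"
```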
It would be easier for quick testing if cloud scanner could be bundled with Prometheus and Grafana in a docker compose file.
But in standalone mode (i.e. non serverless), cloud scanner does not offer a metric HTTP endpoint that can be refreshed / scraped.
Add an option in the CLI to serve metrics continuously on localhost:300/metrics.
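As a rough sketch of what such an endpoint involves, the following stdlib-only snippet answers one HTTP scrape with a Prometheus text payload (port, metric name, and value are all illustrative, not the actual cloud-scanner implementation):

```rust
use std::io::{Read, Write};
use std::net::{TcpListener, TcpStream};
use std::thread;

/// Prometheus text-format payload (metric name is illustrative).
const BODY: &str = "# TYPE boavizta_number_of_instances_total gauge\n\
                    boavizta_number_of_instances_total 5\n";

/// Answer a single scrape request with the metrics payload.
fn serve_metrics_once(listener: &TcpListener) -> std::io::Result<()> {
    let (mut stream, _) = listener.accept()?;
    let mut buf = [0u8; 1024];
    let _ = stream.read(&mut buf)?; // consume the scrape request
    let response = format!(
        "HTTP/1.1 200 OK\r\nContent-Type: text/plain\r\nContent-Length: {}\r\n\r\n{}",
        BODY.len(),
        BODY
    );
    stream.write_all(response.as_bytes())
}

fn main() -> std::io::Result<()> {
    // Bind an ephemeral port; the real CLI option would use a fixed one.
    let listener = TcpListener::bind("127.0.0.1:0")?;
    let addr = listener.local_addr()?;
    // Simulate one Prometheus scrape from another thread.
    let scrape = thread::spawn(move || {
        let mut s = TcpStream::connect(addr).unwrap();
        s.write_all(b"GET /metrics HTTP/1.1\r\n\r\n").unwrap();
        let mut response = String::new();
        s.read_to_string(&mut response).unwrap();
        response
    });
    serve_metrics_once(&listener)?;
    let response = scrape.join().unwrap();
    assert!(response.contains("boavizta_number_of_instances_total 5"));
    Ok(())
}
```

A real implementation would loop over incoming connections instead of answering once.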
When using the scanner without passing a custom API URL, queries fail.
Running the scanner without passing any custom URL fails because the default set in the code includes a trailing slash (which makes requests fail).
Unit tests pass because they use the correct URL (only the CLI, which is not tested, fails).
Should use https://api.boavizta.org (without trailing slash) as the default.
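A defensive alternative to hardcoding the slash-free form is to normalize whatever endpoint is configured (a small sketch; the function name is hypothetical):

```rust
/// Normalize an API endpoint so that joining a path never produces `//`.
fn normalize_endpoint(url: &str) -> String {
    url.trim_end_matches('/').to_string()
}

fn main() {
    assert_eq!(
        normalize_endpoint("https://api.boavizta.org/"),
        "https://api.boavizta.org"
    );
    // Already-clean URLs pass through unchanged.
    assert_eq!(
        normalize_endpoint("https://api.boavizta.org"),
        "https://api.boavizta.org"
    );
}
```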
The code is a bit hard to follow and there are several interdependencies between modules and data objects.
Refactor to use more API-agnostic (cloud or Boavizta) data objects and interfaces to ease testing.
The idea is also to ease future support of different cloud providers, impact providers and result exporters.
See https://github.com/Boavizta/cloud-scanner/blob/chore/refactor-for-testing/docs/refactor.mm.md (use a plantuml renderer to see the map).
Add link to the github repository and to Boavizta website in the cloud-scanner documentation.
Cloud scanner uses a local version of the Boavizta Rust SDK that can be outdated compared to the API release.
Migrate to the publicly available SDK (from crates.io) once it is published, see Boavizta/boaviztapi#84.
The scan returns all instances of a given account. For large accounts with a lot of instances, we may want to filter results.
Pass tags to the scan command so that only the instances having these tags are returned.
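A sketch of such a tag filter, assuming AND semantics between the requested tags (the actual AWS tag-query semantics, discussed next, may differ):

```rust
use std::collections::HashMap;

/// Keep only instances carrying all requested tags.
/// Illustrative filter; instance IDs and tag names are made up.
fn filter_by_tags<'a>(
    instances: &'a [(&'a str, HashMap<&'a str, &'a str>)],
    required: &HashMap<&str, &str>,
) -> Vec<&'a str> {
    instances
        .iter()
        .filter(|(_, tags)| required.iter().all(|(k, v)| tags.get(k) == Some(v)))
        .map(|(id, _)| *id)
        .collect()
}

fn main() {
    let instances = vec![
        ("i-aaa", HashMap::from([("team", "web"), ("env", "prod")])),
        ("i-bbb", HashMap::from([("team", "data")])),
    ];
    let required = HashMap::from([("team", "web")]);
    assert_eq!(filter_by_tags(&instances, &required), vec!["i-aaa"]);
}
```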
The general semantics of tag queries on AWS seems to be:
Use GitHub Actions to automate CI.
Should we use cloudquery: https://www.cloudquery.io/ ?
Document minimal AWS role/permissions to use cloud scanner
Cloud scanner CLI is lacking some options and lacks adaptability.
The CLI covers 2 use cases (subcommands) which need different flags or sets of options.
In addition we need to provide
This clap tutorial seems to explain all we need: https://blog.logrocket.com/command-line-argument-parsing-rust-using-clap/
I want to quickly visualize the evolution of my scans.
Add routes to return scan results as OpenMetrics (Prometheus format). This makes it easy to plug in a Grafana visualization.
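A small sketch of what one metric in Prometheus exposition format looks like (the metric name is an assumption for illustration, not an actual cloud-scanner metric):

```rust
/// Render one gauge in Prometheus exposition format:
/// a HELP line, a TYPE line, then the sample itself.
fn format_gauge(name: &str, help: &str, value: f64) -> String {
    format!("# HELP {name} {help}\n# TYPE {name} gauge\n{name} {value}\n")
}

fn main() {
    let text = format_gauge("boavizta_pe_use_megajoules", "Primary energy used", 12.5);
    assert!(text.contains("# TYPE boavizta_pe_use_megajoules gauge"));
    assert!(text.ends_with("boavizta_pe_use_megajoules 12.5\n"));
    print!("{text}");
}
```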
Documentation is unclear about how to pass AWS credentials when used as a CLI.
Currently only exporting the AWS profile through an env var (for Linux) is explained... and not in all sections.
See issue #76
Improve documentation. Describe and test the various ways to pass AWS credentials for Unixes and Windows.
By default the scanner only lists resources of the default region of the current AWS profile (or the region where the lambda is deployed).
Allow scanning resources located in a different region.
We need an easy way to know the actual version of the scanner (when deployed as a serverless application).
Have a basic route in the serverless app to return the version of cloud-scanner, or write it to the logs.
See https://docs.rs/pkg-version/latest/pkg_version/
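A minimal sketch of reading the crate version at compile time (using the built-in Cargo env var rather than the pkg-version crate; the function name is hypothetical):

```rust
/// Expose the crate version baked in at compile time.
/// `option_env!` degrades gracefully when the build is not driven by Cargo.
fn scanner_version() -> &'static str {
    option_env!("CARGO_PKG_VERSION").unwrap_or("unknown")
}

fn main() {
    let v = scanner_version();
    assert!(!v.is_empty());
    println!("cloud-scanner version: {v}");
}
```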
I want to quickly spin up a full demo of cloud-scanner, used to show a real time dashboard.
A docker compose file to showcase the monitoring use case:
cloud-scanner has to be started with a specific flag to ensure it uses the local API (not the public one):
cloud-scanner -b http://localname/v1 serve
Get the detailed impacts of the instances of an account (taking into consideration the usage of the instance through CloudWatch metrics or other mechanisms).
Original request and way to implement it in #3
Both the CLI and serverless versions allow using a private (or specific) instance of Boaviztapi as a datasource (instead of the default public instance of Boaviztapi).
This is important for enterprise customer setups where we want to be sure that no cloud usage data leaks out to a public API.
The parameter exists (as an env var for the serverless version and as a CLI parameter for the CLI version), but in practice both use the hardcoded URL of the public API.
Ensure that
Support other cloud providers like Azure.
May imply CLI option changes, see #14.
Errors are not well managed; they can result in the program exiting without much information.
Investigate how errors could be better managed.
see:
The Dockerfile is missing the EXPOSE port (8000) required to expose a metric server.