dgzlopes / cloud-detect
Module that determines a host's cloud provider.
Home Page: https://pypi.org/project/cloud-detect/
License: MIT License
Hello, the example in the README does not work (cloud-detect 0.0.5):
import cloud_detect
cloud_detect.provider()
gives
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
/tmp/ipykernel_7506/1274223439.py in <module>
----> 1 cloud_detect.provider()
~/anaconda3/envs/monk-mathis/lib/python3.8/site-packages/cloud_detect/__init__.py in provider(excluded)
10
11 def provider(excluded=[]):
---> 12 if 'alibaba' not in excluded and AlibabaProvider.identify():
13 logging.debug('Cloud_detect result is alibaba')
14 return 'alibaba'
TypeError: identify() missing 1 required positional argument: 'self'
AlibabaProvider should be instantiated, since identify() is not a class method.
AWSProvider is broken too (cloud_detect.AWSProvider().identify()): check_metadata_server() checks the value of instanceID, but it should be instanceId.
I will make a PR.
EDIT: the tests are completely broken.
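The TypeError above can be reproduced without the library. The following is a minimal sketch, assuming a hypothetical `DemoProvider` class that stands in for `AlibabaProvider`; it is not the library's code, only an illustration of why calling an instance method on the class fails and why instantiating first works.

```python
class DemoProvider:
    """Hypothetical stand-in for a provider class; not the library's code."""

    identifier = "demo"

    def identify(self):
        # Instance method: it needs `self`, so it must be called on an instance.
        return True


def call_on_class():
    # Mirrors the failing pattern `AlibabaProvider.identify()`.
    try:
        DemoProvider.identify()
    except TypeError as exc:
        return str(exc)
    return None


def call_on_instance():
    # Mirrors the suggested fix: instantiate first, then call identify().
    return DemoProvider().identify()
```

Declaring `identify` as a `@classmethod` or `@staticmethod` would be another way to make the class-level call valid, if the providers hold no per-instance state.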
Running cloud-detect on my laptop, I get 'unknown' only after a very long time. The same happens in my OpenStack environment. It appears that the metadata URL used by the Alibaba provider is a public address, so the request takes a long time before it errors out. Adding a timeout to ClientSession fixes this.
time python3 -c 'from cloud_detect import provider; provider()'
python3 -c 'from cloud_detect import provider; provider()' 0.37s user 0.06s system 0% cpu 2:11.18 total
time curl -sq http://100.100.100.200/latest/meta-data/latest/meta-data/instance/virtualization-solution
curl -sq 0.01s user 0.02s system 0% cpu 2:10.17 total
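The effect of a timeout can be sketched with the standard library alone. This is an illustration under stated assumptions: `slow_probe` is a hypothetical stand-in for the HTTP request to the metadata address, and the timeout values are arbitrary. With aiohttp, the comparable change would be passing an `aiohttp.ClientTimeout(total=...)` as the `timeout` argument of `ClientSession`.

```python
import asyncio


async def slow_probe(delay):
    # Hypothetical stand-in for an HTTP request to a metadata address
    # that takes `delay` seconds to answer.
    await asyncio.sleep(delay)
    return True


async def probe_with_timeout(delay, timeout):
    # Give up after `timeout` seconds and treat the provider as not detected.
    try:
        return await asyncio.wait_for(slow_probe(delay), timeout=timeout)
    except asyncio.TimeoutError:
        return False


def detect(delay, timeout):
    # Synchronous entry point, similar in spirit to provider().
    return asyncio.run(probe_with_timeout(delay, timeout))
```

A call such as `detect(delay=130, timeout=2)` would return `False` after about two seconds instead of waiting more than two minutes for the connection to fail.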
I'm getting 'unknown' even when running on an AWS instance (Windows).
It seems that the request to http://169.254.169.254/latest/dynamic/instance-identity/document returns a 401 status code.
According to the documentation, we first need to acquire a token and then pass it in the call to the instance metadata service.
You can look at this project to see how it is implemented: https://pypi.org/project/ec2-metadata/
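The token flow described here (IMDSv2) can be sketched with `urllib` from the standard library. The function names, the 21600-second TTL, and the 2-second timeout are assumptions made for illustration; the library itself uses a different HTTP client, so this shows only the shape of the two requests.

```python
import urllib.request

IMDS_BASE = "http://169.254.169.254/latest"


def build_token_request(ttl_seconds=21600):
    # Step 1: a PUT to /api/token asks the metadata service for a session token.
    return urllib.request.Request(
        IMDS_BASE + "/api/token",
        method="PUT",
        headers={"X-aws-ec2-metadata-token-ttl-seconds": str(ttl_seconds)},
    )


def build_document_request(token):
    # Step 2: send the token along with the identity document request.
    return urllib.request.Request(
        IMDS_BASE + "/dynamic/instance-identity/document",
        headers={"X-aws-ec2-metadata-token": token},
    )


def fetch_identity_document(timeout=2.0):
    # Only meaningful on an EC2 instance; elsewhere this raises or times out.
    with urllib.request.urlopen(build_token_request(), timeout=timeout) as resp:
        token = resp.read().decode()
    with urllib.request.urlopen(build_document_request(token), timeout=timeout) as resp:
        return resp.read().decode()
```

Splitting request construction from the network call keeps the header logic checkable on any machine, while `fetch_identity_document()` is the part that needs a real instance.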
I tried to run the detection on a SageMaker instance; the response keys seem to have changed (case mismatch):
curl http://169.254.169.254/latest/dynamic/instance-identity/document
{
...
"imageId" : "ami-08xxxxxxxxxxx",
"instanceId" : "i-00xxxxxxxxxxxx",
...
}
Also, the Azure detection does not check the response code, so the host ends up falsely detected as Azure instead of unknown (the metadata server does exist at the same IP).
Right now, I run make publish locally to build and push each new release to PyPI.
All this could be automated with a GitHub Action triggered on new releases.
cloud-detect can detect multiple cloud providers. However, some applications support only a few of them, and trying to detect the others is a waste of time and resources. I propose accepting a list of the cloud providers that a user wants to detect.
I have attached sample code below, but would be more than willing to create a PR to implement this if you are open to this feature request.
For example, an application may support running on AWS, Azure, or GCP but nothing else. It only cares whether the host is one of those, so attempting to detect the others makes no sense for that app.
A possible solution that keeps detection dynamic would be to modify cloud_detect/__init__.py so that usage looks like the sample script, with __init__.py along the lines of the code below it.
# sample.py
from cloud_detect import provider

def do_aws_work(p_id):
    print(p_id == "aws")

def do_azure_work(p_id):
    print(p_id == "azure")

def do_gcp_work(p_id):
    print(p_id == "gcp")

def error(p_id):
    print("Error unknown id = " + p_id)

# Map each supported provider id to its handler.
MY_APP_PROVIDERS = {
    'aws': do_aws_work,
    'azure': do_azure_work,
    'gcp': do_gcp_work,
}

def detect_env():
    only_these = list(MY_APP_PROVIDERS)
    # `providers` is the keyword proposed in this request.
    provider_id = provider(providers=only_these)
    MY_APP_PROVIDERS.get(provider_id, error)(provider_id)
# cloud_detect/__init__.py
__PROVIDER_CLASSES = {
    AlibabaProvider.identifier: AlibabaProvider,
    AWSProvider.identifier: AWSProvider,
    AzureProvider.identifier: AzureProvider,
    DOProvider.identifier: DOProvider,
    GCPProvider.identifier: GCPProvider,
    OCIProvider.identifier: OCIProvider,
}

async def _identify(timeout, providers=None):
    if not providers:
        providers = list(__PROVIDER_CLASSES)
    ......
    ......
    tasks = {
        __PROVIDER_CLASSES[p_id].identifier: asyncio.ensure_future(wrapper(__PROVIDER_CLASSES[p_id]))
        for p_id in providers
        if p_id in __PROVIDER_CLASSES
    }
    ......

def provider(timeout=None, providers=None):
    .....
    .....
    if py_version.minor >= 7:
        result = asyncio.run(_identify(timeout, providers))
    else:
        loop = asyncio.new_event_loop()
        result = loop.run_until_complete(_identify(timeout, providers))
        loop.close()
    return result
........
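Because the proposal above elides several parts, here is a self-contained sketch of the same idea that can be run as-is. Everything in it is hypothetical: `FakeProvider`, the `REGISTRY` contents, and the hard-coded detection results stand in for the real provider classes, and the `providers` argument is the one proposed in this request rather than an existing parameter.

```python
import asyncio


class FakeProvider:
    """Hypothetical provider stub; a real provider would query metadata."""

    def __init__(self, identifier, detected):
        self.identifier = identifier
        self.detected = detected

    async def identify(self):
        await asyncio.sleep(0)  # placeholder for real I/O
        return self.detected


# Registry keyed by identifier, as in the proposal above.
REGISTRY = {
    p.identifier: p
    for p in (
        FakeProvider("aws", False),
        FakeProvider("azure", True),
        FakeProvider("gcp", False),
    )
}


async def _identify(providers=None):
    # Fall back to every registered provider when no filter is given.
    wanted = providers or list(REGISTRY)
    # Unknown identifiers are silently skipped.
    selected = [REGISTRY[name] for name in wanted if name in REGISTRY]
    results = await asyncio.gather(*(p.identify() for p in selected))
    for candidate, hit in zip(selected, results):
        if hit:
            return candidate.identifier
    return "unknown"


def provider(providers=None):
    return asyncio.run(_identify(providers))
```

With this stub registry, `provider(["aws", "gcp"])` never touches the Azure stub, which is the saving the request is after.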
Hi, it would be great to see the MacStadium cloud supported.
One possible test is to curl https://ipinfo.io/ and check whether the .org field of the output is 'AS395336 MacStadium, Inc.'. I am not sure whether good API calls exist for that, as their entire API seems to require authentication.
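The check suggested here could look roughly like the sketch below. It works on a response body that has already been fetched, so it makes no network call; the function name is hypothetical, and the only facts taken from the request are the `org` field and its expected value.

```python
import json

# Value quoted in the request above.
MACSTADIUM_ORG = "AS395336 MacStadium, Inc."


def is_macstadium(ipinfo_body):
    # `ipinfo_body` is the raw text returned by https://ipinfo.io/.
    try:
        info = json.loads(ipinfo_body)
    except ValueError:
        return False
    return isinstance(info, dict) and info.get("org") == MACSTADIUM_ORG
```

Unlike the other providers, this relies on a public third-party service rather than a local metadata endpoint, so it would need outbound internet access.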
I was testing this library on AWS EC2.
It was stuck at the line below when I executed provider(). Hitting Ctrl+C printed 'aws' on the console and the program exited.
I get an instant response with AWSProvider().identify().
Unlike file reads, URL requests may take considerable time to respond even if the URL is not accessible.
So identifying a cloud provider that sits at the end of the if-clause (in __init__.py) could take a lot of time.
All the providers' URLs and files can be read concurrently using asyncio. That means that, irrespective of a provider's position in the if-else clause, all providers would be detected within a similar duration.
I can create a PR in a couple of days if this issue is accepted.
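The concurrent approach proposed here can be sketched as follows. The `probe` coroutine, its delays, and the `(name, delay, detected)` tuples are hypothetical stand-ins for the real file and URL checks; the sketch only shows that the first positive result is returned without waiting for the slower probes.

```python
import asyncio


async def probe(name, delay, detected):
    # Hypothetical check: waits `delay` seconds, then reports a hit or a miss.
    await asyncio.sleep(delay)
    return name if detected else None


async def first_detected(specs):
    # Start every probe at once instead of one after another.
    tasks = [asyncio.ensure_future(probe(*spec)) for spec in specs]
    try:
        # as_completed yields results in finishing order, not list order.
        for finished in asyncio.as_completed(tasks):
            name = await finished
            if name is not None:
                return name
        return "unknown"
    finally:
        # Cancel probes that are still pending; a no-op for finished ones.
        for task in tasks:
            task.cancel()


def detect(specs):
    return asyncio.run(first_detected(specs))
```

For example, `detect([("alibaba", 5, False), ("aws", 0.1, True)])` returns `'aws'` after roughly 0.1 seconds, even though the slower probe is listed first.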
When reading self.vendor_file, the file is not closed. I will be happy to create a PR that uses Path.read_text() instead of the open() call, so that the file is closed.
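A minimal sketch of the suggested change, assuming a hypothetical helper named `read_vendor_file`; the real attribute and method names in the providers may differ. `Path.read_text()` opens the file, reads it, and closes it in a single call.

```python
from pathlib import Path


def read_vendor_file(path):
    # Return the stripped file content, or "" if the file does not exist.
    vendor_file = Path(path)
    if not vendor_file.is_file():
        return ""
    # read_text() closes the file before returning.
    return vendor_file.read_text().strip()
```

Wrapping the existing `open()` call in a `with` block would close the file just as reliably; `read_text()` is simply shorter.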
cloud-detect identifies Vultr as Azure:
[root@test-id ~]# python3
Python 3.11.3 (main, May 24 2023, 00:00:00) [GCC 13.1.1 20230511 (Red Hat 13.1.1-2)] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from cloud_detect import provider
>>> provider()
'azure'
>>>
It looks like the Azure provider only checks for a return status of 200. On Vultr, the metadata endpoint also responds with a 200.
[root@test-id ~]# curl -v http://169.254.169.254/metadata/instance?api-version=2017-12-01
* Trying 169.254.169.254:80...
* Connected to 169.254.169.254 (169.254.169.254) port 80 (#0)
> GET /metadata/instance?api-version=2017-12-01 HTTP/1.1
> Host: 169.254.169.254
> User-Agent: curl/8.0.1
> Accept: */*
>
< HTTP/1.1 200 OK
< Server: nginx
< Date: Sat, 15 Jul 2023 11:25:35 GMT
< Content-Type: text/html; charset=UTF-8
< Transfer-Encoding: chunked
< Connection: keep-alive
< Expires: Sat, 15 Jul 2023 11:25:34 GMT
< Cache-Control: no-cache
< X-Frame-Options: DENY
< X-Content-Type-Options: nosniff
<
* Connection #0 to host 169.254.169.254 left intact
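A stricter check could look at more than the status code. The sketch below is one possible shape, not the library's implementation. It assumes that a genuine Azure metadata response is JSON with a top-level `compute` key, which should be verified against the Azure documentation before relying on it. The function takes an already-received status, content type, and body, so it makes no network call.

```python
import json


def looks_like_azure(status, content_type, body):
    # A 200 alone is not enough: Vultr answers 200 with text/html.
    if status != 200:
        return False
    if "application/json" not in (content_type or "").lower():
        return False
    try:
        doc = json.loads(body)
    except ValueError:
        return False
    # Assumption: Azure instance metadata has a top-level "compute" key.
    return isinstance(doc, dict) and "compute" in doc
```

Applied to the Vultr response shown above (status 200, `text/html; charset=UTF-8`), this returns `False`, so detection would fall through to the remaining providers.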
You can use the existing .travis.yaml as a reference. Also, the badge on the README needs to be changed.
I have an older AWS instance that is not detected. After investigating, I noticed 2 issues:
1. The /sys/class/dmi/id/product_version file is empty.
2. The http://169.254.169.254/latest/dynamic/instance-identity/document call returns Content-Type: text/plain, which causes the response.json() function to fail:
WARNING:cloud_detect.providers.aws_provider:0, message='Attempt to decode JSON with unexpected mimetype: text/plain', url=URL('http://169.254.169.254/latest/dynamic/instance-identity/document')
Will you accept a PR to ignore the content-type header?
response.json(content_type=None)
Also, maybe the file flow can be revisited: do we have to look at the file content? Isn't it good enough if it exists?
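The idea of ignoring the content type can be illustrated without an HTTP client: parse the body as JSON no matter what header came with it. This is a sketch with a hypothetical helper name; in aiohttp, the `response.json(content_type=None)` call quoted above is what disables the mimetype check.

```python
import json


def parse_identity_document(body, content_type=None):
    # `content_type` is accepted but deliberately ignored: older instances
    # label the JSON document as text/plain.
    try:
        doc = json.loads(body)
    except ValueError:
        return None
    return doc if isinstance(doc, dict) else None
```

A body that is not valid JSON yields `None`, so a wrong or missing header alone no longer causes a detection failure.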