
dgzlopes / cloud-detect


Module that determines a host's cloud provider.

Home Page: https://pypi.org/project/cloud-detect/

License: MIT License

Languages: Makefile 1.33%, Python 98.67%
Topics: agnostic, aws, azure, cloud, detect, devops, digitalocean, gcp, metadata, multicloud, python3, vendoring

cloud-detect's Introduction

Hi there 👋

cloud-detect's People

Contributors

arossert, arsh25, artemisart, cittarasu, dennis-pg, derekjc, dgzlopes, kshivakumar, rhyspowell

cloud-detect's Issues

Doesn't work + key error for aws

Hello, the example in the readme does not work (cloud-detect 0.0.5)

import cloud_detect
cloud_detect.provider()

gives

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
/tmp/ipykernel_7506/1274223439.py in <module>
----> 1 cloud_detect.provider()

~/anaconda3/envs/monk-mathis/lib/python3.8/site-packages/cloud_detect/__init__.py in provider(excluded)
     10 
     11 def provider(excluded=[]):
---> 12     if 'alibaba' not in excluded and AlibabaProvider.identify():
     13         logging.debug('Cloud_detect result is alibaba')
     14         return 'alibaba'

TypeError: identify() missing 1 required positional argument: 'self'

AlibabaProvider should be instantiated first, because identify() is an instance method, not a class method.

AWSProvider is broken too (cloud_detect.AWSProvider().identify()): check_metadata_server() checks the value of instanceID, but the key should be instanceId.
I will make a PR.

EDIT: the tests are completely broken
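
The TypeError in the traceback is what Python raises whenever an instance method is called on the class itself. The following is a minimal sketch using a hypothetical DemoProvider class (a stand-in, not cloud-detect's actual code) that reproduces the failure and shows two common ways around it: instantiating first, or declaring the method as a classmethod.

```python
class DemoProvider:
    """Hypothetical stand-in for a provider class; not cloud-detect's code."""

    def identify(self):
        # Instance method: Python needs an instance to bind to `self`.
        return False

    @classmethod
    def identify_cls(cls):
        # Class method: can be called directly on the class.
        return False


def call_on_class():
    """Return the error message raised by calling identify() on the class."""
    try:
        DemoProvider.identify()
    except TypeError as exc:
        return str(exc)
    return None


error_message = call_on_class()                   # calling on the class fails
instance_result = DemoProvider().identify()       # instantiating first works
classmethod_result = DemoProvider.identify_cls()  # so does a classmethod
```

Which of the two fixes suits the library depends on whether the provider objects need per-instance state; the sketch does not settle that.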

reporting unknown cloud takes very long

Executing cloud-detect on my laptop, I get an unknown after a very long time. This also happens in my openstack environment. It appears that the metadata url used by alibaba provider is a public address and takes a while before erroring. Adding a timeout to ClientSession helps fix this.

time python3 -c 'from cloud_detect import provider; provider()'
python3 -c 'from cloud_detect import provider; provider()'  0.37s user 0.06s system 0% cpu 2:11.18 total
time curl -sq http://100.100.100.200/latest/meta-data/latest/meta-data/instance/virtualization-solution
curl -sq   0.01s user 0.02s system 0% cpu 2:10.17 total
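
As a rough illustration of the timeout idea, the sketch below bounds a slow probe with asyncio.wait_for. The slow_probe coroutine only simulates an unreachable metadata address by sleeping, and the function names are made up for this example. With aiohttp, the equivalent would presumably be passing a timeout to ClientSession, as the report suggests.

```python
import asyncio


async def slow_probe(delay):
    """Simulated metadata probe that takes `delay` seconds to answer."""
    await asyncio.sleep(delay)
    return "alibaba"


async def identify_with_timeout(probe, timeout):
    """Await one probe, giving up after `timeout` seconds."""
    try:
        return await asyncio.wait_for(probe, timeout=timeout)
    except asyncio.TimeoutError:
        # The probe did not answer in time; treat the host as unknown.
        return "unknown"


def detect(delay, timeout):
    # asyncio.run creates and closes an event loop (Python 3.7+).
    return asyncio.run(identify_with_timeout(slow_probe(delay), timeout))
```

Calling detect(0.5, 0.05) gives up after about 50 ms instead of waiting for the probe to finish.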

AWS IMDSv2 not supported

I'm getting 'unknown' even when running on an AWS instance (Windows).
It seems that the request to http://169.254.169.254/latest/dynamic/instance-identity/document returns a 401 status code.

According to the documentation, we first need to acquire a token and then pass it with the instance metadata call.

For an example of how this is implemented, see https://pypi.org/project/ec2-metadata/
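
The two-step IMDSv2 flow can be sketched with the standard library alone. This is an illustration, not the library's implementation: the helper names and the default TTL are assumptions chosen for the example, while the token endpoint and the two header names follow the AWS IMDSv2 documentation.

```python
import urllib.request

IMDS_BASE = "http://169.254.169.254"
TOKEN_TTL_HEADER = "X-aws-ec2-metadata-token-ttl-seconds"
TOKEN_HEADER = "X-aws-ec2-metadata-token"


def build_token_request(ttl_seconds=60):
    """Step 1: PUT request that asks IMDSv2 for a session token."""
    return urllib.request.Request(
        IMDS_BASE + "/latest/api/token",
        method="PUT",
        headers={TOKEN_TTL_HEADER: str(ttl_seconds)},
    )


def build_document_request(token):
    """Step 2: GET request for the identity document, carrying the token."""
    return urllib.request.Request(
        IMDS_BASE + "/latest/dynamic/instance-identity/document",
        headers={TOKEN_HEADER: token},
    )


def fetch_identity_document(timeout=2.0):
    """Perform both steps; only meaningful when run on an EC2 instance."""
    with urllib.request.urlopen(build_token_request(), timeout=timeout) as resp:
        token = resp.read().decode("ascii")
    with urllib.request.urlopen(build_document_request(token), timeout=timeout) as resp:
        return resp.read().decode("utf-8")
```

Keeping the request builders separate from the network call makes the header logic checkable off-cloud; fetch_identity_document will simply time out or fail anywhere else.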

AWS and Azure detection is broken

Tried to run the detection on a Sagemaker instance, the response keys seem to have changed (case mismatch):

curl http://169.254.169.254/latest/dynamic/instance-identity/document
{
...
  "imageId" : "ami-08xxxxxxxxxxx",
  "instanceId" : "i-00xxxxxxxxxxxx",
...
}

Also, the Azure detection does not check the response code, and thus ends up being falsely detected in this case instead of unknown (since the metadata server does exist at the same IP).
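
A hedged sketch of a stricter AWS check, based only on the response shown above: it looks up the camelCase keys instanceId and imageId and verifies the "i-" and "ami-" prefixes. The function name is hypothetical, and the prefix checks are an assumption drawn from the sample output rather than from the library.

```python
import json


def looks_like_ec2_document(raw_body):
    """Return True if the body is an identity document with EC2-style IDs."""
    try:
        doc = json.loads(raw_body)
    except (TypeError, ValueError):
        # Not JSON at all (json.JSONDecodeError is a ValueError subclass).
        return False
    if not isinstance(doc, dict):
        return False
    # Keys are camelCase ("instanceId"), not "instanceID".
    instance_id = doc.get("instanceId", "")
    image_id = doc.get("imageId", "")
    return (
        isinstance(instance_id, str) and instance_id.startswith("i-")
        and isinstance(image_id, str) and image_id.startswith("ami-")
    )
```

Because the check depends on the body rather than on the server merely answering, a non-AWS metadata server at the same IP would not pass it.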

Automate the release process

Right now, I run make publish locally to build and push each new release to PyPI.

All of this could be automated with a GitHub Action triggered on new releases.

Detect specific supported cloud providers

cloud-detect can detect multiple cloud providers. However, some applications support only a few of them, and trying to detect the others wastes time and resources. I propose accepting a list of the cloud providers that a user wants to detect.

I have attached sample code below, and I would be more than willing to create a PR to implement this if you are open to this feature request.

For example, an application may support running on AWS, Azure, or GCP but nothing else. That application only cares whether it is running on one of those, so attempting to detect the others serves no purpose.

A possible solution that keeps detection dynamic would be to modify the cloud_detect/__init__.py module so that usage looks like the sample script, with __init__.py changed along the lines of the code below the sample script.

# sample.py
from cloud_detect import provider

def do_aws_work(p_id):
    print(p_id == "aws")

def do_azure_work(p_id):
    print(p_id == "azure")

def do_gcp_work(p_id):
    print(p_id == "gcp")

def error(p_id):
    print("Error unknown id = " + p_id)

# Map each provider this application supports to its handler.
MY_APP_PROVIDERS = {
    'aws': do_aws_work,
    'azure': do_azure_work,
    'gcp': do_gcp_work,
}

def detect_env():
    only_these = list(MY_APP_PROVIDERS)
    # Pass the list by keyword: the first positional parameter of the
    # proposed provider() below is timeout.
    provider_id = provider(providers=only_these)

    MY_APP_PROVIDERS.get(provider_id, error)(provider_id)

# cloud_detect/__init__.py
__PROVIDER_CLASSES = {
    AlibabaProvider.identifier: AlibabaProvider,
    AWSProvider.identifier: AWSProvider,
    AzureProvider.identifier: AzureProvider,
    DOProvider.identifier: DOProvider,
    GCPProvider.identifier: GCPProvider,
    OCIProvider.identifier: OCIProvider,
}

async def _identify(timeout, providers=None):
    # Default to every known provider when no filter is given.
    if not providers:
        providers = list(__PROVIDER_CLASSES)
    ......
    ......
    tasks = {
        __PROVIDER_CLASSES[p_id].identifier: asyncio.ensure_future(wrapper(__PROVIDER_CLASSES[p_id]))
        for p_id in providers if p_id in __PROVIDER_CLASSES
    }
    ......

def provider(timeout=None, providers=None):
    .....
    .....
    if py_version.minor >= 7:
        result = asyncio.run(_identify(timeout, providers))
    else:
        loop = asyncio.new_event_loop()
        result = loop.run_until_complete(_identify(timeout, providers))
        loop.close()
    return result
........

Support for MacStadium cloud

Hi, it would be great to see the MacStadium cloud supported.
One possible test is to curl https://ipinfo.io/ and check whether the .org field of the output is 'AS395336 MacStadium, Inc.'. I'm not sure whether a better API call exists for this, as their entire API seems to require authentication.
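
The check described above could look roughly like the sketch below. It assumes the ipinfo.io response has already been fetched and parsed into a dict; the function name is hypothetical, and matching on the substring "MacStadium" rather than the exact string is a choice made for this example.

```python
def is_macstadium(ipinfo_payload):
    """Return True if the parsed ipinfo.io payload names MacStadium as the org."""
    org = ipinfo_payload.get("org", "")
    # Match on the organisation name so a changed AS prefix does not break it.
    return isinstance(org, str) and "MacStadium" in org
```

Unlike the metadata-server checks, this approach depends on outbound internet access and a third-party service.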

Improve detection speed using asyncio

I was testing this library on AWS EC2.
It was stuck at the below line when I executed provider(). Hitting Ctrl+C printed 'aws' on the console and the program exited.

response = requests.get(self.metadata_url)

I get an instant response from AWSProvider().identify().

Unlike file reads, URL requests may take considerable time to return even when the URL is not reachable.
So identifying a cloud provider that sits at the end of the if-chain (in __init__.py) can take a long time.

All the providers' URLs and files can be read concurrently using asyncio. That way, regardless of a provider's position in the if-else chain, every provider would be detected within a similar duration.

I can create a PR in a couple of days if this issue is accepted.
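
A minimal sketch of the concurrent approach, using simulated probes instead of real metadata requests. The names fake_probe and identify_concurrently are made up for this illustration; each probe stands in for one provider's URL or file check.

```python
import asyncio


async def fake_probe(name, delay, matches):
    """Simulated provider check: waits `delay` seconds, then reports a result."""
    await asyncio.sleep(delay)
    return name if matches else None


async def identify_concurrently(probes):
    """Run all probes at once and return the first positive answer."""
    tasks = [asyncio.ensure_future(p) for p in probes]
    try:
        # as_completed yields results in completion order, not list order.
        for finished in asyncio.as_completed(tasks):
            result = await finished
            if result is not None:
                return result
        return "unknown"
    finally:
        # Cancel whatever is still running once an answer is known.
        for task in tasks:
            task.cancel()


def detect(probe_specs):
    """probe_specs is a list of (name, delay, matches) tuples."""
    async def runner():
        return await identify_concurrently(
            [fake_probe(*spec) for spec in probe_specs]
        )
    return asyncio.run(runner())
```

With this shape, a slow probe listed first no longer delays a fast positive answer from a probe listed later, which is the behaviour the issue asks for.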

Add Alibaba cloud provider

Determine if the host cloud provider is Alibaba.

You can see how we detect the other providers (e.g. AWS) here.

For reference, you can check banzaicloud/satellite code.

Vultr detected as Azure

cloud-detect identifies Vultr as Azure:

[root@test-id ~]# python3
Python 3.11.3 (main, May 24 2023, 00:00:00) [GCC 13.1.1 20230511 (Red Hat 13.1.1-2)] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from cloud_detect import provider
>>> provider()
'azure'
>>>

It looks like the Azure provider only checks for a return status of 200, and Vultr's metadata server also responds with 200 at that URL.

[root@test-id ~]# curl -v http://169.254.169.254/metadata/instance?api-version=2017-12-01
*   Trying 169.254.169.254:80...
* Connected to 169.254.169.254 (169.254.169.254) port 80 (#0)
> GET /metadata/instance?api-version=2017-12-01 HTTP/1.1
> Host: 169.254.169.254
> User-Agent: curl/8.0.1
> Accept: */*
>
< HTTP/1.1 200 OK
< Server: nginx
< Date: Sat, 15 Jul 2023 11:25:35 GMT
< Content-Type: text/html; charset=UTF-8
< Transfer-Encoding: chunked
< Connection: keep-alive
< Expires: Sat, 15 Jul 2023 11:25:34 GMT
< Cache-Control: no-cache
< X-Frame-Options: DENY
< X-Content-Type-Options: nosniff
<
* Connection #0 to host 169.254.169.254 left intact
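
One way to tighten the check, sketched under assumptions: besides the status code, require that the body parses as JSON and contains a "compute" section, which Azure's instance metadata response includes. The function name is hypothetical, and the exact fields worth checking are a judgment call rather than the library's behaviour.

```python
import json


def looks_like_azure(status, body):
    """Return True only for a 200 response whose JSON body has a "compute" key."""
    if status != 200:
        return False
    try:
        payload = json.loads(body)
    except (TypeError, ValueError):
        # An HTML page, like the nginx response above, ends up here.
        return False
    return isinstance(payload, dict) and "compute" in payload
```

An HTML 200 response such as the one in the curl output fails the JSON parse and is therefore not reported as Azure.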

Add Oracle cloud provider

Determine if the host cloud provider is Oracle.

You can see how we detect the other providers (e.g. AWS) here.

For reference, you can check banzaicloud/satellite code.

AWS instance not recognized

I have an older AWS instance that is not detected. After investigating, I noticed two issues:

  1. The /sys/class/dmi/id/product_version file is empty.
  2. The http://169.254.169.254/latest/dynamic/instance-identity/document call returns Content-Type: text/plain, which causes response.json() to fail:

WARNING:cloud_detect.providers.aws_provider:0, message='Attempt to decode JSON with unexpected mimetype: text/plain', url=URL('http://169.254.169.254/latest/dynamic/instance-identity/document')

Would you accept a PR that ignores the Content-Type header?
response.json(content_type=None)

Also, maybe the file check could be revisited: do we have to look at the file's content? Isn't it enough that the file exists?
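
To illustrate the difference between a strict and a lenient decode, here is a standard-library sketch that mimics the behaviour described above. decode_json is a hypothetical helper, not the aiohttp API: passing expected_type=None plays the role of content_type=None and skips the mimetype check.

```python
import json


def decode_json(body, content_type, expected_type="application/json"):
    """Parse `body` as JSON; skip the mimetype check when expected_type is None."""
    if expected_type is not None and expected_type not in content_type:
        raise ValueError("unexpected mimetype: " + content_type)
    return json.loads(body)


def try_decode(body, content_type, expected_type="application/json"):
    """Return the parsed document, or None if decoding is refused or fails."""
    try:
        return decode_json(body, content_type, expected_type)
    except ValueError:
        return None
```

With the default, a text/plain response is rejected even though its body is valid JSON; with expected_type=None the same body is parsed.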
