rchakode / kube-opex-analytics

🎨 Kubernetes Usage Analytics and Accounting for Cost Allocation and Capacity Planning - Hourly Trends, Daily and Monthly Accounting - Prometheus Exporter - Built-in & Grafana Dashboards.

Home Page: https://krossboard.app/

License: Apache License 2.0

JavaScript 28.73% Python 44.63% HTML 12.01% CSS 8.43% Dockerfile 0.73% Shell 3.58% Mustache 1.89%

kubernetes analytics capacity-planning cost-allocation dashboard monitoring prometheus-exporter grafana-dashboard

kube-opex-analytics's Introduction

Overview
Getting Started
Design Fundamentals
License
Support & Contributions

Overview

kube-opex-analytics (literally Kubernetes Opex Analytics) is a Kubernetes usage accounting and analytics tool that helps organizations track the resources consumed by their Kubernetes clusters over time (hourly, daily, monthly). Its purpose is to help prevent overpaying: it provides usage analytics metrics and charts that engineering and financial teams can use as key indicators when making cost optimization decisions.

Key features:

Hourly consumption trends, daily and monthly accounting per namespace. This feature provides analytics metrics tracking both actual usage and requested capacities over time. Metrics are namespace-based, collected every 5 minutes, and consolidated on an hourly basis for trends, from which daily and monthly accounting is derived.
Accounting of non-allocatable capacities. At node and cluster levels, kube-opex-analytics tracks and consolidates the share of non-allocatable capacities and highlights them against usable capacities (i.e. capacities used by actual application workloads). In contrast to usable capacities, non-allocatable capacities are dedicated to Kubernetes operations (OS, kubelets, etc.).
Cluster usage accounting and capacity planning. This feature makes it easy to account and visualize capacities consumed on a cluster, globally, instantly and over time.
Usage/requests efficiency. Based on hourly-consolidated trends, this feature shows how efficient the resource requests set on Kubernetes workloads are, compared against actual resource usage over time.
Cost allocation and charge back analytics: automatic processing and visualization of resource usage accounting per namespace on daily and monthly periods.
Insightful and extensible visualization. kube-opex-analytics enables built-in analytics dashboards, as well as a native Prometheus exporter that exposes its analytics metrics for third-party visualization tools like Grafana.

Read the design fundamentals documentation to learn more concepts and implementation decisions.

Multi-cluster Integration: kube-opex-analytics tracks the usage of a single Kubernetes cluster. For centralized multi-cluster usage analytics, use our Krossboard Kubernetes Operator product. Watch a demo video.

Getting Started

Design Fundamentals

Check out the Design Fundamentals documentation to learn more about kube-opex-analytics; it introduces concepts and implementation decisions.

License

kube-opex-analytics (code and documentation) is licensed under the terms of Apache License 2.0; read the LICENSE. Besides, it's bound to third-party libraries each with its specific license terms; read the NOTICE for additional information.

Support & Contributions

We encourage feedback and always do our best to handle any trouble you may encounter when using kube-opex-analytics.

Use this link to submit issues or improvement ideas.

To contribute bug patches or new features, please submit a Pull Request.

Contributions are accepted provided that the code and documentation are released under the terms of the Apache 2.0 License.

kube-opex-analytics's Issues

Add option to provide cacert file for self-signed certificate

Currently it's only possible to ignore certificate verification, which leaves the SSL communication unsecured. An option to provide a cacert file would keep the SSL communication secure.
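
A minimal sketch of how such an option could be resolved, assuming the backend makes its API calls with the `requests` library, whose `verify` argument accepts either a boolean or a path to a CA bundle. `KOA_K8S_API_VERIFY_SSL` is the setting seen in the configurations quoted on this page; the `KOA_K8S_CACERT` variable name and the `resolve_tls_verify` helper are hypothetical.

```python
import os


def resolve_tls_verify(env=None):
    """Return the value to pass as `verify=` to requests.get().

    A CA bundle path takes precedence; otherwise fall back to the
    boolean verification flag (verification stays on by default).
    """
    env = os.environ if env is None else env
    ca_file = env.get("KOA_K8S_CACERT", "").strip()  # hypothetical variable name
    if ca_file:
        return ca_file
    flag = env.get("KOA_K8S_API_VERIFY_SSL", "true").strip().lower()
    return flag not in ("false", "0", "no")


# Usage (not executed here):
#   requests.get(url, verify=resolve_tls_verify())
```

With this shape, pointing the hypothetical variable at a self-signed CA keeps verification enabled instead of switching it off.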

install error due to HELM client and server

hello

got error while installing with HELM

helm upgrade --install kube-opex-analytics --namespace=kube-opex-analytics helm/kube-opex-analytics/
UPGRADE FAILED
ROLLING BACK
Error: incompatible versions client[v2.13.0] server[v2.12.0]
Error: UPGRADE FAILED: incompatible versions client[v2.13.0] server[v2.12.0]

any idea

Unable to run kube-opex-analytics container in userns enabled system

Refer to Debugging Container ID Cannot Be Mapped to Host ID Error.
This is blocking us from using kube-opex-analytics. Could the RUNTIME_USER_UID be changed to a number between 0 and 65535?

Metrics aggregation crashes with KeyError

Describe the bug

ERROR:kube-opex-analytics:KeyError Exception in create_metrics_puller => Traceback (most recent call last):
  File "./backend.py", line 760, in create_metrics_puller
    k8s_usage.consolidate_ns_usage()
  File "./backend.py", line 491, in consolidate_ns_usage
    self.nodes[pod.nodeName].podsRunning.append(pod)
KeyError: 'gke-cluster-1-pool-1-9a9ea7a4-v23m'
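
One way to avoid this crash, sketched under the assumption that the pod list can reference a node that is no longer in the node list (for example, a node removed by an autoscaler between the two API calls). The `Node` and `Pod` classes and the `attach_pods_to_nodes` helper are hypothetical stand-ins, not the project's actual types.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    podsRunning: list = field(default_factory=list)


@dataclass
class Pod:
    name: str
    nodeName: str


def attach_pods_to_nodes(nodes, pods):
    """Append each pod to its node; return the pods whose node is unknown."""
    orphans = []
    for pod in pods:
        node = nodes.get(pod.nodeName)  # .get() avoids the KeyError
        if node is None:
            orphans.append(pod)
            continue
        node.podsRunning.append(pod)
    return orphans
```

Returning the orphaned pods lets the caller log them rather than silently dropping them.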

env variable KOA_ENABLE_PROMETHEUS_EXPORTER not correctly decoded

It appears that the environment variable KOA_ENABLE_PROMETHEUS_EXPORTER introduced by e5616c3 in v0.4.6 is not correctly processed.

The console output shows:

./entrypoint.sh: 20: [: true: unexpected operator
 * Serving Flask app "backend" (lazy loading)

The issue is due to the wrong shell interpreter being used to execute the script. The shebang specifies the bash interpreter, but the Docker build specifies sh in the ENTRYPOINT (the latter wins).

What does non-allocatable resources mean?

Update docs

Topics:

usage efficiency factor
refined dashboard for v22.02.00

Create Kustomize deployment manifests

Add occupation chart on the nodes' popup

Add option to ignore SSL validation

When running from within a kubernetes cluster we can use https://kubernetes.default for the API endpoint but the ssl cert will not be valid for that host name, resulting in:

2019-55-12 15:55:27 [ERROR] HTTP exception requesting 'https://kubernetes.default/api/v1/pods' (HTTPSConnectionPool(host='kubernetes.default', port=443): Max retries exceeded with url: /api/v1/pods (Caused by SSLError(SSLError(1, '[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed (_ssl.c:847)'),)))
2019-55-12 15:55:27 [ERROR] HTTP exception requesting 'https://kubernetes.default/apis/metrics.k8s.io/v1beta1/pods' (HTTPSConnectionPool(host='kubernetes.default', port=443): Max retries exceeded with url: /apis/metrics.k8s.io/v1beta1/pods (Caused by SSLError(SSLError(1, '[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed (_ssl.c:847)'),)))

so it would be nice to be able to skip ssl validation for this situation.

Ingress apiVersion extensions/v1beta1 is deprecated

Describe the bug
apiVersion: extensions/v1beta1 is deprecated

To Reproduce
Steps to reproduce the behavior:

check /manifests/helm/templates/ingress.yaml file

Expected behavior
migrate to networking.k8s.io/v1

Grafana dashboard explanation.

Hi team, can you explain the Grafana dashboard below? Does it refer to cost or to resource utilization? I am unable to understand the data shown here.

Thanks

Running on a virtual machine with NAT and port forward the backend crashes on start

Traceback (most recent call last):
  File "./backend.py", line 764, in <module>
    waitress_serve(wsgi_dispatcher, listen='*:5483')
  File "/usr/local/lib/python3.6/dist-packages/waitress/__init__.py", line 12, in serve
    server = _server(app, **kw)
  File "/usr/local/lib/python3.6/dist-packages/waitress/server.py", line 88, in create_server
    sockinfo=sockinfo,
  File "/usr/local/lib/python3.6/dist-packages/waitress/server.py", line 232, in __init__
    self.create_socket(self.family, self.socktype)
  File "/usr/local/lib/python3.6/dist-packages/waitress/wasyncore.py", line 354, in create_socket
    sock = socket.socket(family, type)
  File "/usr/lib/python3.6/socket.py", line 144, in __init__
    _socket.socket.__init__(self, family, type, proto, fileno)
OSError: [Errno 97] Address family not supported by protocol

My VM's network configuration was NAT plus port forwarding between the VM and the host (22 -> 2222, 80 -> 8080).
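
Errno 97 is raised when a socket is created for an address family the kernel does not support. A possible workaround, sketched under the assumption that the `*` wildcard in `listen='*:5483'` makes waitress try an IPv6 socket on a host where IPv6 is disabled, is to fall back to an IPv4-only address. The `ipv6_usable` and `choose_listen_address` helpers are hypothetical.

```python
import socket


def ipv6_usable():
    """Return True if the kernel lets us create an AF_INET6 socket."""
    if not socket.has_ipv6:
        return False
    try:
        sock = socket.socket(socket.AF_INET6, socket.SOCK_STREAM)
    except OSError:
        return False
    sock.close()
    return True


def choose_listen_address(port=5483):
    """Use the wildcard only when IPv6 sockets can actually be created."""
    return f"*:{port}" if ipv6_usable() else f"0.0.0.0:{port}"


# Usage (not executed here):
#   waitress_serve(wsgi_dispatcher, listen=choose_listen_address())
```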

Add Grafana dashboard templates to unify the integration with existing monitoring and visualization stacks within organizations.

Create a Helm 3 repo and make it available through Helm Hub

Allow accounting based on resource.requests instead of real usage

Is your feature request related to a problem? Please describe.
My users are charged by requested resources instead of used resources. For example, when someone asks for 1Gi and ends up using 500Mi, they are charged for 1Gi.

Describe the solution you'd like
Be able to calculate usage by allocated resources instead of used resources.

Describe alternatives you've considered
none

Additional context
none
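
The request amounts to choosing which quantity is billed. A minimal sketch, where the `billable_amount` helper and its mode names are hypothetical and amounts are expressed in MiB:

```python
def billable_amount(requested, used, mode="request"):
    """Pick the quantity to charge for, depending on the accounting mode."""
    if mode == "request":
        return requested          # charge what was reserved
    if mode == "usage":
        return used               # charge what was actually consumed
    if mode == "max":
        return max(requested, used)
    raise ValueError(f"unknown accounting mode: {mode}")


# The example from the issue: 1Gi requested, 500Mi used.
charged = billable_amount(requested=1024, used=500, mode="request")
```

In request mode the 1Gi reservation is billed even though only 500Mi was used.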

Helm based install service account

Hi,

It looks like the helm chart deploy needs a service account to be created?

2019-05-17 10:06:13,597 - kube-opex-analytics - ERROR - call to https://kubernetes.default/api/v1/namespaces returned error ({"kind":"Status","apiVersion":"v1","metadata":{},"status":"Failure","message":"namespaces is forbidden: User \"system:anonymous\" cannot list resource \"namespaces\" in API group \"\" at the cluster scope","reason":"Forbidden","details":{"kind":"namespaces"},"code":403}
)
/usr/local/lib/python3.6/dist-packages/urllib3/connectionpool.py:847: InsecureRequestWarning: Unverified HTTPS request is being made. Adding certificate verification is strongly advised. See: https://urllib3.readthedocs.io/en/latest/advanced-usage.html#ssl-warnings
  InsecureRequestWarning)
2019-05-17 10:06:13,612 - kube-opex-analytics - ERROR - call to https://kubernetes.default/api/v1/nodes returned error ({"kind":"Status","apiVersion":"v1","metadata":{},"status":"Failure","message":"nodes is forbidden: User \"system:anonymous\" cannot list resource \"nodes\" in API group \"\" at the cluster scope","reason":"Forbidden","details":{"kind":"nodes"},"code":403}
)
/usr/local/lib/python3.6/dist-packages/urllib3/connectionpool.py:847: InsecureRequestWarning: Unverified HTTPS request is being made. Adding certificate verification is strongly advised. See: https://urllib3.readthedocs.io/en/latest/advanced-usage.html#ssl-warnings
  InsecureRequestWarning)
2019-05-17 10:06:13,628 - kube-opex-analytics - ERROR - call to https://kubernetes.default/apis/metrics.k8s.io/v1beta1/nodes returned error ({"kind":"Status","apiVersion":"v1","metadata":{},"status":"Failure","message":"nodes.metrics.k8s.io is forbidden: User \"system:anonymous\" cannot list resource \"nodes\" in API group \"metrics.k8s.io\" at the cluster scope","reason":"Forbidden","details":{"group":"metrics.k8s.io","kind":"nodes"},"code":403}
)

I'm testing this out on minikube, with the following values yaml:

# Default values for kube-opex-analytics.
# This is a YAML-formatted file.
# Declare variables to be passed into your templates.

envs:
  KOA_DB_LOCATION: /data/db
  KOA_K8S_API_ENDPOINT: https://kubernetes.default
  KOA_K8S_API_VERIFY_SSL: false
  KOA_COST_MODEL: RATIO
  KOA_BILLING_HOURLY_RATE: 0.0
  KOA_BILLING_CURRENCY_SYMBOL: £

dataVolume:
  persist: false
  capacity: 4Gi

replicaCount: 1

image:
  repository: rchakode/kube-opex-analytics
  pullPolicy: Always

nameOverride: ""
fullnameOverride: ""

metadata:
  labels:
    app: kube-opex-analytics

service:
  type: ClusterIP
  port: 80
  targetPort: 5483

ingress:
  enabled: true
  hosts:
    - host: minikube
      paths:
        - /

# uncomment the following lines and ajust the values to enable TLS
  # tls:
  #  - secretName: kube-opex-analytics-tls
  #    hosts:
  #      - kube-opex-analytics.local

resources: {}
# No resource requests not limits is specified by default.
# We leave this as a conscious choice for the user, as this increases chances charts run
# on environments with little resources, such as Minikube.
# If you do want to specify resources, uncomment the following lines, adjust them as necessary,
# and remove the curly braces after 'resources:'.
  # limits:
  #   cpu: 256m
  #   memory: 512Mi
  # requests:
  #   cpu: 100m
  #   memory: 128Mi

nodeSelector: {}

tolerations: []

affinity: {}

Webserver component not working

Hi,

after starting the kube-opex-analytics pod, it starts collecting metrics, but the webserver isn't responding (timeout).

root@kube-opex-analytics-785b5f8876-vcpjl:/koa# curl http://127.0.0.1:5483
[timeout]

We also had to disable the readiness and liveness probes, to get to debug. With those active, the container repeatedly restarts, because it's never healthy.

*** Starting uWSGI 2.0.18 (64bit) on [Tue Nov 12 09:51:25 2019] ***
compiled with version: 7.4.0 on 10 November 2019 22:22:21
os: Linux-4.9.0-9-amd64 #1 SMP Debian 4.9.168-1+deb9u5 (2019-08-11)
nodename: kube-opex-analytics-785b5f8876-vcpjl
machine: x86_64
clock source: unix
detected number of CPU cores: 16
current working directory: /koa
detected binary path: /usr/local/bin/uwsgi
!!! no internal routing support, rebuild with pcre support !!!
uWSGI running as root, you can use --uid/--gid/--chroot options
*** WARNING: you are running uWSGI as root !!! (use the --uid flag) ***
your processes number limit is 1048576
your memory page size is 4096 bytes
detected max file descriptor number: 1048576
lock engine: pthread robust mutexes
thunder lock: disabled (you can enable it with --thunder-lock)
*** RRDtool library available at 0x55b9cc4e8e10 ***
uwsgi socket 0 bound to TCP address :5483 fd 3
uWSGI running as root, you can use --uid/--gid/--chroot options
*** WARNING: you are running uWSGI as root !!! (use the --uid flag) ***
Python version: 3.6.8 (default, Oct 7 2019, 12:59:55) [GCC 8.3.0]
Python main interpreter initialized at 0x55b9cc51a2b0
uWSGI running as root, you can use --uid/--gid/--chroot options
*** WARNING: you are running uWSGI as root !!! (use the --uid flag) ***
python threads support enabled
your server socket listen backlog is limited to 100 connections
your mercy for graceful operations on workers is 60 seconds
mapped 145808 bytes (142 KB) for 1 cores
*** Operational MODE: single process ***
2019-11-12 09:51:27,158 - kube-opex-analytics - DEBUG - {puller] collecting new metrics
WSGI app 0 (mountpoint='') ready in 2 seconds on interpreter 0x55b9cc51a2b0 pid: 8 (default app)
uWSGI running as root, you can use --uid/--gid/--chroot options
*** WARNING: you are running uWSGI as root !!! (use the --uid flag) ***
*** uWSGI is running in multiple interpreter mode ***
spawned uWSGI master process (pid: 8)
spawned uWSGI worker 1 (pid: 14, cores: 1)
/usr/local/lib/python3.6/dist-packages/urllib3/connectionpool.py:1004: InsecureRequestWarning: Unverified HTTPS request is being made. Adding certificate verification is strongly advised. See: https://urllib3.readthedocs.io/en/latest/advanced-usage.html#ssl-warnings
InsecureRequestWarning,
/usr/local/lib/python3.6/dist-packages/urllib3/connectionpool.py:1004: InsecureRequestWarning: Unverified HTTPS request is being made. Adding certificate verification is strongly advised. See: https://urllib3.readthedocs.io/en/latest/advanced-usage.html#ssl-warnings
InsecureRequestWarning,
/usr/local/lib/python3.6/dist-packages/urllib3/connectionpool.py:1004: InsecureRequestWarning: Unverified HTTPS request is being made. Adding certificate verification is strongly advised. See: https://urllib3.readthedocs.io/en/latest/advanced-usage.html#ssl-warnings
InsecureRequestWarning,
/usr/local/lib/python3.6/dist-packages/urllib3/connectionpool.py:1004: InsecureRequestWarning: Unverified HTTPS request is being made. Adding certificate verification is strongly advised. See: https://urllib3.readthedocs.io/en/latest/advanced-usage.html#ssl-warnings
InsecureRequestWarning,
/usr/local/lib/python3.6/dist-packages/urllib3/connectionpool.py:1004: InsecureRequestWarning: Unverified HTTPS request is being made. Adding certificate verification is strongly advised. See: https://urllib3.readthedocs.io/en/latest/advanced-usage.html#ssl-warnings
InsecureRequestWarning,
2019-11-12 09:51:54,873 - kube-opex-analytics - DEBUG - [puller][sample] non-allocatable, 0.000000, 0.336612

Config Values:

envs:
  KOA_DB_LOCATION: /data/db
  KOA_K8S_API_ENDPOINT: https://kubernetes.default
  KOA_K8S_API_VERIFY_SSL: false
  KOA_COST_MODEL: CUMULATIVE_RATIO
  KOA_BILLING_HOURLY_RATE: 0.0
  KOA_BILLING_CURRENCY_SYMBOL: $
  KOA_ENABLE_DEBUG: true
  KOA_K8S_AUTH_TOKEN:

Anything we can contribute to help debug this issue?

Best,
Alex

Add date filter for trends charts

Given a start and end dates, the trends charts shall be updated accordingly.
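
A server-side sketch of such a filter, assuming each chart data point carries an ISO-formatted `date` field. The data shape and the `filter_by_date` helper are hypothetical.

```python
from datetime import date


def filter_by_date(points, start, end):
    """Keep the points whose date falls within [start, end], inclusive."""
    return [
        point for point in points
        if start <= date.fromisoformat(point["date"]) <= end
    ]


points = [
    {"date": "2020-01-01", "cpu": 1.0},
    {"date": "2020-01-10", "cpu": 2.0},
    {"date": "2020-02-01", "cpu": 3.0},
]
january_tail = filter_by_date(points, date(2020, 1, 5), date(2020, 1, 31))
```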

Add exporters to push generated analytics to third-party timeseries databases such as InfluxDB, Graphite or ElasticSearch, so that they can be visualized using external systems such as Grafana and Kibana.

No cpuUsage and MeMusage metric popping up

Hi Team,

I have installed Kubernetes Opex Analytics using Helm. But I am unable to see the metrics on the dashboard.

Can anyone help?

Error:
The following items failed:

No cpuUsage metric on node X.X.X.X

Add exporters for charts to provide the ability to use the generated charts for reports

kubectl wording

Fix documentation kubectl wording in README.md

Get Kubernetes API Endpoint

$ kubetcl proxy

Add Prometheus exporter

Key error when API call fails

Describe the bug
2022-04-20 03:05:37,263 - kube-opex-analytics - ERROR - Exception calling HTTP endpoint https://kubernetes.default/api/v1/namespaces (HTTPSConnectionPool(host='kubernetes.default', port=443): Max retries exceeded with url: /api/v1/namespaces (Caused by NewConnectionError('<urllib3.connection.HTTPSConnection object at 0x7f90fc400780>: Failed to establish a new connection: [Errno 110]Connection timed out',)))
ERROR:kube-opex-analytics:Exception calling HTTP endpoint https://kubernetes.default/api/v1/namespaces (HTTPSConnectionPool(host='kubernetes.default', port=443): Max retries exceeded with url:/api/v1/namespaces (Caused by NewConnectionError('<urllib3.connection.HTTPSConnection object at 0x7f90fc400780>: Failed to establish a new connection: [Errno 110] Connection timed out',)))
2022-04-20 03:06:09,890 - kube-opex-analytics - ERROR - KeyError Exception in create_metrics_puller => Traceback (most recent call last):
  File "./backend.py", line 752, in create_metrics_puller
    k8s_usage.consolidate_ns_usage()
  File "./backend.py", line 476, in consolidate_ns_usage
    self.usageByNamespace[pod.namespace].cpu += pod.cpuUsage
KeyError: 'default'

ERROR:kube-opex-analytics:KeyError Exception in create_metrics_puller => Traceback (most recent call last):
  File "./backend.py", line 752, in create_metrics_puller
    k8s_usage.consolidate_ns_usage()
  File "./backend.py", line 476, in consolidate_ns_usage
    self.usageByNamespace[pod.namespace].cpu += pod.cpuUsage
KeyError: 'default'
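
The log suggests the namespaces call timed out while later calls succeeded, leaving the consolidation step with pods whose namespace was never loaded. A sketch of one possible guard, assuming a pull function that returns `None` on failure (both the `pull` contract and the `collect_cycle` helper are hypothetical): skip the whole cycle unless every endpoint answered.

```python
def collect_cycle(pull, endpoints):
    """Pull every endpoint; return None as soon as one pull fails."""
    results = {}
    for endpoint in endpoints:
        data = pull(endpoint)
        if data is None:
            return None  # incomplete snapshot: do not consolidate
        results[endpoint] = data
    return results


# Fake puller for illustration: the namespaces call fails.
def fake_pull(endpoint):
    return None if endpoint == "/api/v1/namespaces" else {"items": []}


snapshot = collect_cycle(fake_pull, ["/api/v1/namespaces", "/api/v1/pods"])
```

Here `snapshot` is `None`, so the caller would wait for the next collection cycle instead of consolidating partial data.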

Add exporters to provide CSV data so that the resulting analytics data can be used to produce other custom analytics.
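
A minimal sketch of a CSV export using Python's standard `csv` module. The column names and the `usage_to_csv` helper are hypothetical; the real analytics data may be shaped differently.

```python
import csv
import io


def usage_to_csv(rows):
    """Serialize per-namespace usage rows to CSV text."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer,
        fieldnames=["date", "namespace", "cpu", "memory"],
        lineterminator="\n",
    )
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


csv_text = usage_to_csv(
    [{"date": "2020-01-01", "namespace": "default", "cpu": 1.5, "memory": 2.0}]
)
```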

How to install kube-opex-analytics using helm3 (without tiller)

I am trying to install kube-opex-analytics using Helm 3 (without Tiller). I ran helm repo update a couple of times, but nothing seems to work. Do I need to have the "helm/kube-opex-analytics" repo downloaded locally to get this working? Any help would be appreciated.

Unauthorized error

I'm getting this error

2019-08-15 04:29:40,180 - kube-opex-analytics - ERROR - call to http://127.0.0.1:8001/api/v1/nodes returned error ({"kind":"Status","apiVersion":"v1","metadata":{},"status":"Failure","message":"Unauthorized","reason":"Unauthorized","code":401}
)

The documentation doesn't say where to put the config file or how to set up the KUBECONFIG environment variable.

unable to access Dashboard using kubectl proxy

I am getting the following error while accessing the dashboard .
Error: E1114 10:59:07.848191 26072 portforward.go:400] an error occurred forwarding 8080 -> 8080: error forwarding port 8080 to pod 8e17b8b61ec08320b960c1c00a32a108b1457186a8fd9fc2078688f4f26512a0, uid : exit status 1: 2019/11/14 05:29:07 socat[20928] E connect(5, AF=2 127.0.0.1:8080, 16): Connection refused

pod log: [pid: 13|app: 0|req: 28/28] 172.31.8.230 () {26 vars in 320 bytes} [Thu Nov 14 05:33:41 2019] GET / => generated 14604 bytes in 0 msecs (HTTP/1.1 200) 5 headers in 146 bytes (1 switches on core 0)

Test and validate RBAC settings on Cloud Container Engine (CCE)

By analyzing issue #17, it seems that on CCE access to the metrics API is forbidden to Kubernetes Opex Analytics, even though the RBAC resources (serviceaccount, clusterrole, clusterrolebinding) have been properly created.

We don't have access to a CCE cluster to qualify this issue; help from CCE users is welcome.

Add support for a cost model

We could provide a config file where we assign a dollar amount to the CPU and memory units and then we can have dashboards showing how much money is being used where.
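
The idea can be sketched as a unit price per resource multiplied by consumption. The prices, the usage shape, and the `namespace_costs` helper below are all hypothetical.

```python
def namespace_costs(usage, cpu_price, mem_price):
    """Compute the cost per namespace from CPU core-hours and memory GiB-hours."""
    return {
        namespace: u["cpu_core_hours"] * cpu_price + u["mem_gib_hours"] * mem_price
        for namespace, u in usage.items()
    }


costs = namespace_costs(
    {"default": {"cpu_core_hours": 10, "mem_gib_hours": 20}},
    cpu_price=0.03,   # dollars per core-hour (made-up value)
    mem_price=0.004,  # dollars per GiB-hour (made-up value)
)
```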

ValueError: invalid literal for int() with base 10

Hi,

It looks like the m unit is not handled properly in the decode_memory_capacity() function:

Traceback (most recent call last):
  File "/usr/lib/python3.6/threading.py", line 916, in _bootstrap_inner
    self.run()
  File "/usr/lib/python3.6/threading.py", line 864, in run
    self._target(*self._args, **self._kwargs)
  File "backend.py", line 661, in create_metrics_puller
    k8s_usage.extract_nodes(pull_k8s('/api/v1/nodes'))
  File "backend.py", line 326, in extract_nodes
    node.memAllocatable = self.decode_memory_capacity(item['status']['allocatable']['memory'])
  File "backend.py", line 273, in decode_memory_capacity
    return int(cap_value)
ValueError: invalid literal for int() with base 10: '14162554060800m'

The same applies to the E, P, T, G and K suffixes.

https://kubernetes.io/docs/concepts/configuration/manage-compute-resources-container/#meaning-of-memory
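
A sketch of a parser that accepts the Kubernetes quantity suffixes, both binary (Ki, Mi, Gi, Ti, Pi, Ei) and decimal (n, u, m, k, M, G, T, P, E). The `parse_quantity` and `decode_memory_bytes` helpers are hypothetical and not the project's actual `decode_memory_capacity()`; `Decimal` is used to avoid floating-point rounding on large values.

```python
from decimal import Decimal

_BINARY = {"Ki": 2**10, "Mi": 2**20, "Gi": 2**30,
           "Ti": 2**40, "Pi": 2**50, "Ei": 2**60}
_DECIMAL = {"n": Decimal("1e-9"), "u": Decimal("1e-6"), "m": Decimal("1e-3"),
            "k": Decimal("1e3"), "M": Decimal("1e6"), "G": Decimal("1e9"),
            "T": Decimal("1e12"), "P": Decimal("1e15"), "E": Decimal("1e18")}


def parse_quantity(text):
    """Convert a Kubernetes quantity string to a Decimal in base units."""
    for suffix, factor in _BINARY.items():
        if text.endswith(suffix):
            return Decimal(text[:-2]) * factor
    suffix = text[-1:]
    if suffix in _DECIMAL:
        return Decimal(text[:-1]) * _DECIMAL[suffix]
    return Decimal(text)  # plain number, no suffix


def decode_memory_bytes(text):
    """Return the memory quantity as whole bytes (fractions truncated)."""
    return int(parse_quantity(text))
```

With this approach, the value from the traceback, `'14162554060800m'`, is read as milli-bytes and truncated to whole bytes instead of raising a ValueError.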

deployed to Kubernetes, uwsgi doesn't respond, flask does

Continuing from #30 :

we're seeing the same thing in a Kubernetes 1.15 cluster deployed with RKE.

This deployment works:

apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  labels:
    app: kube-opex
  name: kube-opex
  namespace: kube-opex
spec:
  replicas: 1
  selector:
    matchLabels:
      app: kube-opex
  template:
    metadata:
      labels:
        app: kube-opex
    spec:
      containers:
      - env:
        - name: KOA_BILLING_CURRENCY_SYMBOL
          value: $
        - name: KOA_COST_MODEL
          value: RATIO
        - name: KOA_DB_LOCATION
          value: /data/db
        - name: KOA_ENABLE_PROMETHEUS_EXPORTER
          value: "false"
        - name: KOA_K8S_API_ENDPOINT
          value: https://kubernetes.default
        - name: KOA_K8S_API_VERIFY_SSL
          value: "false"
        image: rchakode/kube-opex-analytics
        imagePullPolicy: Always
        name: kube-opex
        resources:
          limits:
            cpu: "2"
            memory: 3Gi
          requests:
            cpu: 700m
            memory: 1Gi
        securityContext:
          allowPrivilegeEscalation: false
          capabilities: {}
          privileged: false
          readOnlyRootFilesystem: false
          runAsNonRoot: true
          runAsUser: 830405
        volumeMounts:
        - mountPath: /data
          name: kube-opex
      volumes:
      - name: kube-opex
        persistentVolumeClaim:
          claimName: kube-opex
      securityContext: {}
      serviceAccount: kube-opex
      serviceAccountName: kube-opex

but if I set KOA_ENABLE_PROMETHEUS_EXPORTER in that deployment to "true", the uwsgi server never responds. If I look at the process table, or install the iproute2 package and run "ss," I can see the process is up and listening on all interfaces:

# ss -tln
State                 Recv-Q                 Send-Q                                   Local Address:Port                                   Peer Address:Port
LISTEN                4                      100                                            0.0.0.0:5483                                        0.0.0.0:*

But uwsgi never returns a response, and doesn't log anything, it just hangs until the client times out the connection.

wsgi service is running as root

Pod logs show a warning about uWSGI being run as root; it needs to be run as a non-root user. Logs:

*** Starting uWSGI 2.0.18 (64bit) on [Sat Jul 13 07:39:24 2019] ***
compiled with version: 7.4.0 on 28 June 2019 21:50:41
os: Linux-4.14.106-97.85.amzn2.x86_64 #1 SMP Fri Mar 15 17:07:54 UTC 2019
nodename: kube-opex-analytics-5cbcd6d4df-w5cd8
machine: x86_64
clock source: unix
detected number of CPU cores: 8
current working directory: /koa
detected binary path: /usr/local/bin/uwsgi
!!! no internal routing support, rebuild with pcre support !!!
uWSGI running as root, you can use --uid/--gid/--chroot options
*** WARNING: you are running uWSGI as root !!! (use the --uid flag) *** 
*** WARNING: you are running uWSGI without its master process manager ***
your memory page size is 4096 bytes
detected max file descriptor number: 65536

Issue with CPU capacity reported by virtual nodes on Alicloud K8s

My cluster is running a combination of regular and virtual nodes. The virtual nodes report a cpu capacity of '192k' which is 192000000 according to the K9s console.

spec:
  capacity:
    cpu: 192k
    ephemeral-storage: 60000Gi
    hugepages-2Mi: 60Ti
    memory: 640Ti
    nvidia.com/gpu: 1k
    pods: 3k

This causes an error with kube-opex-analytics on startup:

ERROR:kube-opex-analytics:ValueError Exception in create_metrics_puller => Traceback (most recent call last):
  File "./backend.py", line 756, in create_metrics_puller
    k8s_usage.extract_nodes(pull_k8s('/api/v1/nodes'))
  File "./backend.py", line 371, in extract_nodes
    node.cpuCapacity = self.decode_cpu_capacity(status['capacity']['cpu'])
  File "./backend.py", line 337, in decode_cpu_capacity
    return int(cap_input)
ValueError: invalid literal for int() with base 10: '192k'

I can see a couple of similar closed issues relating to mem and cpu units of measurement, would it be possible to add this use-case?
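
A sketch of a CPU decoder that accepts decimal suffixes instead of calling `int()` directly, so that values such as `'192k'` (and `'8001u'`, reported in another issue on this page) are handled. The `decode_cpu_millicores` helper is hypothetical, and returning millicores is an assumption about what the backend expects.

```python
from decimal import Decimal

_CPU_FACTORS = {"n": Decimal("1e-9"), "u": Decimal("1e-6"), "m": Decimal("1e-3"),
                "k": Decimal("1e3"), "M": Decimal("1e6"), "G": Decimal("1e9")}


def decode_cpu_millicores(text):
    """Convert a Kubernetes CPU quantity string to millicores."""
    suffix = text[-1:]
    if suffix in _CPU_FACTORS:
        cores = Decimal(text[:-1]) * _CPU_FACTORS[suffix]
    else:
        cores = Decimal(text)  # plain number of cores
    return float(cores * 1000)
```

Under this reading, `'192k'` is 192,000 cores, i.e. 192,000,000 millicores, which matches the figure quoted above.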

Usage Accounting (costs $) charts are not displaying

Describe the bug
Usage Accounting (costs $) charts are not displaying:

2022-11-29 11:32:59,530 - kube-opex-analytics - ERROR - KeyError Exception in dump_analytics => Traceback (most recent call last):
  File "./backend.py", line 908, in dump_analytics
    Rrd.dump_histogram_analytics(dbfiles=ns_dbfiles, period=RrdPeriod.PERIOD_14_DAYS_SEC)
  File "./backend.py", line 763, in dump_histogram_analytics
    usage_ratio * usage_per_type_date[res][date_key][KOA_CONFIG.db_billing_hourly_rate],
KeyError: '.billing-hourly-rate'

404 In browser:

https://kube-opex-analytics-url/static/data/cpu_usage_period_1209600.json?_=1669722368117
https://kube-opex-analytics-url/static/data/memory_usage_period_1209600.json?_=1669722368118

Configuration:
k8s: 1.22.15
k8s.gcr.io/metrics-server/metrics-server:v0.6.1

KOA_COST_MODEL=CHARGE_BACK
KOA_BILLING_HOURLY_RATE=6.00

Steps to reproduce:

set KOA_COST_MODEL to CHARGE_BACK and KOA_BILLING_HOURLY_RATE to 6.00
Deploy kube-analytics-tool in k8s

[helm install] Pod doesn't get ready

Installed through the Helm chart with default values and the pod didn't get ready.

+ kube-opex-analytics-849ff45d4b-nt2qz › kube-opex-analytics
kube-opex-analytics-849ff45d4b-nt2qz kube-opex-analytics *** Starting uWSGI 2.0.18 (64bit) on [Tue Oct 29 14:00:19 2019] ***
kube-opex-analytics-849ff45d4b-nt2qz kube-opex-analytics compiled with version: 7.4.0 on 28 September 2019 14:43:37
kube-opex-analytics-849ff45d4b-nt2qz kube-opex-analytics os: Linux-4.15.0-43-generic #46~16.04.1-Ubuntu SMP Fri Dec 7 13:31:08 UTC 2018
kube-opex-analytics-849ff45d4b-nt2qz kube-opex-analytics nodename: kube-opex-analytics-849ff45d4b-nt2qz
kube-opex-analytics-849ff45d4b-nt2qz kube-opex-analytics machine: x86_64
kube-opex-analytics-849ff45d4b-nt2qz kube-opex-analytics clock source: unix
kube-opex-analytics-849ff45d4b-nt2qz kube-opex-analytics detected number of CPU cores: 8
kube-opex-analytics-849ff45d4b-nt2qz kube-opex-analytics current working directory: /koa
kube-opex-analytics-849ff45d4b-nt2qz kube-opex-analytics detected binary path: /usr/local/bin/uwsgi
kube-opex-analytics-849ff45d4b-nt2qz kube-opex-analytics !!! no internal routing support, rebuild with pcre support !!!
kube-opex-analytics-849ff45d4b-nt2qz kube-opex-analytics your memory page size is 4096 bytes
kube-opex-analytics-849ff45d4b-nt2qz kube-opex-analytics detected max file descriptor number: 1048576
kube-opex-analytics-849ff45d4b-nt2qz kube-opex-analytics lock engine: pthread robust mutexes
kube-opex-analytics-849ff45d4b-nt2qz kube-opex-analytics thunder lock: disabled (you can enable it with --thunder-lock)
kube-opex-analytics-849ff45d4b-nt2qz kube-opex-analytics *** RRDtool library available at 0x55d9b0349a20 ***
kube-opex-analytics-849ff45d4b-nt2qz kube-opex-analytics uwsgi socket 0 bound to TCP address :5483 fd 3
kube-opex-analytics-849ff45d4b-nt2qz kube-opex-analytics Python version: 3.6.8 (default, Aug 20 2019, 17:12:48) [GCC 8.3.0]
kube-opex-analytics-849ff45d4b-nt2qz kube-opex-analytics Python main interpreter initialized at 0x55d9b037a660
kube-opex-analytics-849ff45d4b-nt2qz kube-opex-analytics python threads support enabled
kube-opex-analytics-849ff45d4b-nt2qz kube-opex-analytics your server socket listen backlog is limited to 100 connections
kube-opex-analytics-849ff45d4b-nt2qz kube-opex-analytics your mercy for graceful operations on workers is 60 seconds
kube-opex-analytics-849ff45d4b-nt2qz kube-opex-analytics mapped 145808 bytes (142 KB) for 1 cores
kube-opex-analytics-849ff45d4b-nt2qz kube-opex-analytics *** Operational MODE: single process ***
kube-opex-analytics-849ff45d4b-nt2qz kube-opex-analytics WSGI app 0 (mountpoint='') ready in 1 seconds on interpreter 0x55d9b037a660 pid: 9 (default app)
kube-opex-analytics-849ff45d4b-nt2qz kube-opex-analytics *** uWSGI is running in multiple interpreter mode ***
kube-opex-analytics-849ff45d4b-nt2qz kube-opex-analytics spawned uWSGI master process (pid: 9)
kube-opex-analytics-849ff45d4b-nt2qz kube-opex-analytics spawned uWSGI worker 1 (pid: 15, cores: 1)
kube-opex-analytics-849ff45d4b-nt2qz kube-opex-analytics /usr/local/lib/python3.6/dist-packages/urllib3/connectionpool.py:1004: InsecureRequestWarning: Unverified HTTPS request is being made. Adding certificate verification is strongly advised. See: https://urllib3.readthedocs.io/en/latest/advanced-usage.html#ssl-warnings
kube-opex-analytics-849ff45d4b-nt2qz kube-opex-analytics InsecureRequestWarning,
kube-opex-analytics-849ff45d4b-nt2qz kube-opex-analytics /usr/local/lib/python3.6/dist-packages/urllib3/connectionpool.py:1004: InsecureRequestWarning: Unverified HTTPS request is being made. Adding certificate verification is strongly advised. See: https://urllib3.readthedocs.io/en/latest/advanced-usage.html#ssl-warnings
kube-opex-analytics-849ff45d4b-nt2qz kube-opex-analytics InsecureRequestWarning,
kube-opex-analytics-849ff45d4b-nt2qz kube-opex-analytics /usr/local/lib/python3.6/dist-packages/urllib3/connectionpool.py:1004: InsecureRequestWarning: Unverified HTTPS request is being made. Adding certificate verification is strongly advised. See: https://urllib3.readthedocs.io/en/latest/advanced-usage.html#ssl-warnings
kube-opex-analytics-849ff45d4b-nt2qz kube-opex-analytics InsecureRequestWarning,
kube-opex-analytics-849ff45d4b-nt2qz kube-opex-analytics /usr/local/lib/python3.6/dist-packages/urllib3/connectionpool.py:1004: InsecureRequestWarning: Unverified HTTPS request is being made. Adding certificate verification is strongly advised. See: https://urllib3.readthedocs.io/en/latest/advanced-usage.html#ssl-warnings
kube-opex-analytics-849ff45d4b-nt2qz kube-opex-analytics InsecureRequestWarning,
kube-opex-analytics-849ff45d4b-nt2qz kube-opex-analytics /usr/local/lib/python3.6/dist-packages/urllib3/connectionpool.py:1004: InsecureRequestWarning: Unverified HTTPS request is being made. Adding certificate verification is strongly advised. See: https://urllib3.readthedocs.io/en/latest/advanced-usage.html#ssl-warnings
kube-opex-analytics-849ff45d4b-nt2qz kube-opex-analytics InsecureRequestWarning,

Is there no way to have KOA_BILLING_HOURLY_RATE be dynamically updated as cluster usage increases and decreases using autoscaler?

Add support for multiple Kubernetes clusters.

Error displaying hourly, daily charts

The hourly 7d and daily 14d cumulative charts are not displaying. There is an error in the javascript console:

Uncaught TypeError: Cannot read property 'length' of undefined
    at drawStackedBar (stacked-bar.js:574)
    at HTMLDivElement.<anonymous> (stacked-bar.js:177)
    at R.each (d3-selection.min.js:2)
    at exports (stacked-bar.js:165)
    at R.call (d3-selection.min.js:2)
    at updateStackedBarChart (frontend.js:242)
    at Object.success (frontend.js:636)
    at j (jquery-1.11.0.min.js:2)
    at Object.fireWith [as resolveWith] (jquery-1.11.0.min.js:2)
    at x (jquery-1.11.0.min.js:4)

and an error in the server:

2020-12-01 12:31:02,437 - kube-opex-analytics - ERROR - KeyError Exception in dump_analytics ('cpu')
ERROR:kube-opex-analytics:KeyError Exception in dump_analytics ('cpu')
WARNING:waitress.queue:Task queue depth is 1

Config info:

      - env:
        - name: KOA_BILLING_CURRENCY_SYMBOL
          value: $
        - name: KOA_BILLING_HOURLY_RATE
          value: "0"
        - name: KOA_COST_MODEL
          value: CUMULATIVE_RATIO
        - name: KOA_DB_LOCATION
          value: /data/db
        - name: KOA_K8S_API_ENDPOINT
          value: https://kubernetes.default
        - name: KOA_K8S_API_VERIFY_SSL
          value: "false"

image: rchakode/kube-opex-analytics:20.10.4
kube version: v1.18.10

Investigate the cause of "Internal Server Error: "/api/v1/nodes": the server is currently unable to handle the request"

Serving on http://0.0.0.0:5483
2021-04-15 19:26:51,556 - kube-opex-analytics - ERROR - call to https://34.122.5.125/api/v1/nodes returned error (Internal Server Error: "/api/v1/nodes": the server is currently unable to handle the request
)
ERROR:kube-opex-analytics:call to https://34.122.5.125/api/v1/nodes returned error (Internal Server Error: "/api/v1/nodes": the server is currently unable to handle the request
)
2021-04-15 19:26:51,706 - kube-opex-analytics - ERROR - KeyError Exception in dump_analytics ('gke-cluster-1-default-pool-7f5e6673-m6so')
ERROR:kube-opex-analytics:KeyError Exception in dump_analytics ('gke-cluster-1-default-pool-7f5e6673-m6so')

ValueError: invalid literal for int() with base 10

I tried to install kube-opex-analytics via helm in my kubernetes cluster.

but get an error

  File "/usr/lib/python3.6/threading.py", line 916, in _bootstrap_inner
    self.run()
  File "/usr/lib/python3.6/threading.py", line 864, in run
    self._target(*self._args, **self._kwargs)
  File "backend.py", line 639, in create_metrics_puller
    k8s_usage.extract_pod_metrics( pull_k8s('/apis/metrics.k8s.io/v1beta1/pods') )
  File "backend.py", line 401, in extract_pod_metrics
    pod.cpuUsage += self.decode_cpu_capacity(container['usage']['cpu'])
  File "backend.py", line 285, in decode_cpu_capacity
    return int(cap_input)
ValueError: invalid literal for int() with base 10: '8001u'

kubernetes version is 1.11.5

Is there anything I can do about it?

Thank you