kubecost / cost-analyzer-helm-chart
Kubecost helm chart
Home Page: http://kubecost.com/install
License: Apache License 2.0
Ironic, given #12, but one can see from the Reason: field that the node process is being OOMKilled:
```
cost-analyzer-checks:
  Container ID:   docker://d9d86367a7c7b54e22331f2bc6db2178a94667239105d491f3bde9877e4b4171
  Image:          ajaytripathy/kubecost-checks:prod-1.18.2
  Image ID:       docker-pullable://ajaytripathy/kubecost-checks@sha256:f9c45b42c8facd366a0515544f2a9fbfc8b75af8dea5e54f29bcbd4ecbdfeff8
  Port:           <none>
  Host Port:      <none>
  Args:
    node
    ./node/cron.js
  State:          Terminated
    Reason:       OOMKilled
    Exit Code:    0
    Started:      Tue, 30 Apr 2019 15:00:21 -0700
    Finished:     Tue, 30 Apr 2019 15:01:48 -0700
  Ready:          False
  Restart Count:  0
  Limits:
    cpu:     10m
    memory:  55M
  Requests:
    cpu:     10m
    memory:  55M
```
We use Reckoner to manage our infrastructure charts, and this particular issue blocks us from using it to install kubecost. There is a workaround for the issue here that would be nice to have.
I completely understand if you do not want to implement the workaround, but I thought I would ask.
The service installed via helm currently does not use the fullname helper. This results in cost-analyzer-cost-analyzer when the release name is the same as the chart name, instead of the proper cost-analyzer.
version 1.41.0
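For comparison, the stock helper that `helm create` scaffolds avoids the doubled name by collapsing the release name into the chart name when one contains the other. A sketch of what the chart's `_helpers.tpl` could define (the helper name here is assumed; it must match whatever the service template references):

```
{{/* Standard fullname helper, as generated by `helm create` */}}
{{- define "cost-analyzer.fullname" -}}
{{- if contains .Chart.Name .Release.Name -}}
{{- .Release.Name | trunc 63 | trimSuffix "-" -}}
{{- else -}}
{{- printf "%s-%s" .Release.Name .Chart.Name | trunc 63 | trimSuffix "-" -}}
{{- end -}}
{{- end -}}
```

With this, a release named cost-analyzer yields resources named cost-analyzer rather than cost-analyzer-cost-analyzer.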
See previous discussion here: opencost/opencost#31
In short, this is doable, but there is enough trickiness that we would ideally like to work with teams to make sure this is a smooth experience (e.g. ensure all dependencies are there, Prom isn't being throttled, etc.)
Documentation available here: http://docs.kubecost.com/custom-prom
Describe the bug
Error in the logs cost-analyzer-7d84df98f9-bwvwn:cost-model I0909 12:58:15.784836 1 awsprovider.go:647] Skipping AWS spot data download: MissingRegion: could not find region configuration
To Reproduce
Deploy via helm to AWS
Expected behavior
The ability to discover or set a region
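Until region discovery is implemented, a workaround sketch: the AWS Go SDK (which emits the MissingRegion error) reads the region from the AWS_REGION environment variable, so injecting it into the cost-model container should unblock the spot data download. The extraEnv key below is an assumed hook, not a confirmed chart key; if the chart exposes no such hook, the env var has to be patched onto the deployment directly.

```yaml
# values.yaml override -- "extraEnv" is a hypothetical key
extraEnv:
  - name: AWS_REGION
    value: us-east-1  # set to the region your cluster runs in
```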
The label key used for app should be configurable. Let's allow it to be templated from values.yaml. More discussion available here.
```yaml
imagePullSecrets:
  - name: regcred
```
else kubelet whines oppressively:
W0505 06:20:09.240086 6808 kubelet_pods.go:832] Unable to retrieve pull secret kubecost/regcred for kubecost/kubecost-cost-analyzer-57c4b8fd7f-pfx27 due to secret "regcred" not found. The image pull may not succeed.
W0505 06:21:14.241350 6808 kubelet_pods.go:832] Unable to retrieve pull secret kubecost/regcred for kubecost/kubecost-cost-analyzer-57c4b8fd7f-pfx27 due to secret "regcred" not found. The image pull may not succeed.
W0505 06:22:35.240232 6808 kubelet_pods.go:832] Unable to retrieve pull secret kubecost/regcred for kubecost/kubecost-cost-analyzer-57c4b8fd7f-pfx27 due to secret "regcred" not found. The image pull may not succeed.
W0505 06:24:00.240323 6808 kubelet_pods.go:832] Unable to retrieve pull secret kubecost/regcred for kubecost/kubecost-cost-analyzer-57c4b8fd7f-pfx27 due to secret "regcred" not found. The image pull may not succeed.
W0505 06:25:30.240170 6808 kubelet_pods.go:832] Unable to retrieve pull secret kubecost/regcred for kubecost/kubecost-cost-analyzer-57c4b8fd7f-pfx27 due to secret "regcred" not found. The image pull may not succeed.
W0505 06:26:38.241061 6808 kubelet_pods.go:832] Unable to retrieve pull secret kubecost/regcred for kubecost/kubecost-cost-analyzer-57c4b8fd7f-pfx27 due to secret "regcred" not found. The image pull may not succeed.
W0505 06:28:02.240327 6808 kubelet_pods.go:832] Unable to retrieve pull secret kubecost/regcred for kubecost/kubecost-cost-analyzer-57c4b8fd7f-pfx27 due to secret "regcred" not found. The image pull may not succeed.
W0505 06:29:22.240383 6808 kubelet_pods.go:832] Unable to retrieve pull secret kubecost/regcred for kubecost/kubecost-cost-analyzer-57c4b8fd7f-pfx27 due to secret "regcred" not found. The image pull may not succeed.
W0505 06:30:32.242500 6808 kubelet_pods.go:832] Unable to retrieve pull secret kubecost/regcred for kubecost/kubecost-cost-analyzer-57c4b8fd7f-pfx27 due to secret "regcred" not found. The image pull may not succeed.
W0505 06:31:45.240160 6808 kubelet_pods.go:832] Unable to retrieve pull secret kubecost/regcred for kubecost/kubecost-cost-analyzer-57c4b8fd7f-pfx27 due to secret "regcred" not found. The image pull may not succeed.
W0505 06:32:59.240258 6808 kubelet_pods.go:832] Unable to retrieve pull secret kubecost/regcred for kubecost/kubecost-cost-analyzer-57c4b8fd7f-pfx27 due to secret "regcred" not found. The image pull may not succeed.
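The warnings stop once a regcred secret actually exists in the kubecost namespace. For reference, a minimal dockerconfigjson Secret looks like this (the payload below is a placeholder; generate the real one with `kubectl create secret docker-registry`):

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: regcred
  namespace: kubecost
type: kubernetes.io/dockerconfigjson
data:
  # Placeholder: base64 of your docker config.json. Typically generated with
  #   kubectl create secret docker-registry regcred \
  #     --docker-server=<registry> --docker-username=<user> --docker-password=<pass>
  .dockerconfigjson: PLACEHOLDER_BASE64
```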
See: opencost/opencost#77
Some deployments require auth for intra-app communication; we should provide an example of how to do this.
As of this issue, the cost-analyzer-server container does not declare a resources: block, which makes the kubernetes scheduler's job harder (and, ironically, AIUI makes assigning the cost associated with that container harder, too).
The Deployment/Statefulset/Daemonset utilization dashboard is installed with a preset absolute time range of Feb 5 to Feb 19, 2019. Obviously this is not a generally helpful default. Please change the default to a relative time frame; I suggest "Last 7 days".
The first feature request is to show cost in local currencies.
Allow creating grafana dashboard config maps even if grafana is not installed with this chart. If we already have grafana installed and we create these config maps, grafana should pick them up on restart as well, no?
Could you make the docker images configurable through the values file? Where I work, I can't pull public images directly, so it would be nice to be able to change them in the values file instead of having to go through the templates every time.
@mdaniel encountered the following error message in 4 separate "500 Internal Server Error" responses from Grafana:
{"status":"error","errorType":"internal","error":"many-to-many matching not allowed: matching labels must be unique on one side"}
The dashboard queries involved, deduplicated (the node-capacity expressions were cut off in the capture; the binary operator before `on (...)` was also lost, so `*` is assumed below):

```
# Node capacity (truncated in the capture)
sum(kube_node_status_capacity_cpu_cores) by (node)
sum(kube_node_status_capacity_memory_bytes) by (node)

# SSD-backed persistent volume claims
sum(
  sum(kube_persistentvolumeclaim_info{storageclass=~".ssd."}) by (persistentvolumeclaim, namespace, storageclass)
  * on (persistentvolumeclaim, namespace) group_right(storageclass)
  sum(kube_persistentvolumeclaim_resource_requests_storage_bytes) by (persistentvolumeclaim, namespace) or up * 0
) / 1024 / 1024 / 1024 * 0.17

# Non-SSD persistent volume claims
sum(
  sum(kube_persistentvolumeclaim_info{storageclass!~".ssd."}) by (persistentvolumeclaim, namespace, storageclass)
  * on (persistentvolumeclaim, namespace) group_right(storageclass)
  sum(kube_persistentvolumeclaim_resource_requests_storage_bytes) by (persistentvolumeclaim, namespace) or up * 0
) / 1024 / 1024 / 1024 * 0.040

# Root filesystem
sum(container_fs_limit_bytes{id="/"}) / 1024 / 1024 / 1024 * 1.03 * 0.040

# Network
sum(rate(node_network_transmit_bytes_total{device="eth0"}[60m]) / 1024 / 1024 / 1024) * (60 * 60 * 24 * 30) * 0.12
```
Describe the bug
From the kubernetes spec:
Labels are key/value pairs. Valid label keys have two segments: an optional prefix and name, separated by a slash (/). The name segment is required and must be 63 characters or less, beginning and ending with an alphanumeric character ([a-z0-9A-Z]) with dashes (-), underscores (_), dots (.), and alphanumerics between. The prefix is optional. If specified, the prefix must be a DNS subdomain: a series of DNS labels separated by dots (.), not longer than 253 characters in total, followed by a slash (/).
Prometheus only supports underscores in label names, so if you have an aggregation on label foo.bar and on foo-bar, they will both be merged into the label foo_bar in Prometheus, and therefore all costs will be charged to foo_bar in kubecost.
Expected behavior
Kubecost should reject invalid Prometheus label characters in first-class aggregation labels (owner, product, etc.)
In https://github.com/kubecost/cost-analyzer-helm-chart/pull/76/files, it looks like the name of the service was changed to be hard-coded to kubecost-cost-analyzer. However, that goes against Helm norms, where the release name is part of the generated service name.
It also broke the ingress definition, since the ingress spec does use the generated full name for the target service, which no longer matches.
Hi,
A good practice in Helm charts for persistence is to enable users to customize it further:
- storageClass, for people having multiple storage backends
- annotations, for instance for people using backup solutions that rely on volume annotations
- accessModes

So I created this issue to talk about that :-)
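A sketch of how those knobs commonly surface in a chart's values.yaml (key names below are a proposal modeled on other stable charts, not the chart's current API):

```yaml
persistentVolume:
  enabled: true
  size: 32Gi
  storageClass: fast-ssd                # for clusters with multiple storage backends
  accessModes:
    - ReadWriteOnce
  annotations:
    backup.example.com/enabled: "true"  # e.g. annotation-driven backup tooling
```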
What problem are you trying to solve?
We at Astronomer (astronomer.io) use node taints in combination with node-affinity and tolerations to organize components in node pools. In our case, this is because we want multi-tenant components on separate node pool(s) from our platform components. We hope to use Kubecost in our platform components.
When I say 'node selector', I am actually referring to nodeAffinity + nodeSelectorTerms, which is the 'new and improved' way of doing node selectors.
Describe the solution you'd like
I would like there to be a global configuration in the top-level values.yaml.
Example, node selectors:
```yaml
global:
  nodeSelectors:
    "astronomer.io/multi-tenant": "false"
    "astronomer.io/another-one": "ok"
```
I want this to end up on the containers using affinity.
```yaml
# in the container spec
affinity:
  nodeAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
      nodeSelectorTerms:
        - matchExpressions:
            - key: "astronomer.io/multi-tenant"
              operator: In
              values:
                - "false"
        - matchExpressions:
            - key: "astronomer.io/another-one"
              operator: In
              values:
                - "ok"
```
example, tolerations:
values.yaml:
```yaml
global:
  tolerations:
    - key: "platform"
      operator: "Equal"
      value: "true"
      effect: "NoSchedule"
```
outcome:
```
# (output of `kubectl describe pod` on any component)
Tolerations:  platform=true:NoSchedule
```
Describe alternatives you've considered
I have noticed some configurations like this exist in the subcharts. I will try to do it by configuring each sub-chart appropriately by passing the values from the top-level chart to the subcharts.
How would users interact with this feature?
helm values.
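The subchart route mentioned above can be sketched entirely from the top-level values.yaml, since the upstream prometheus and grafana charts already accept these keys (the exact paths should be verified against the pinned subchart versions):

```yaml
# top-level values.yaml; the YAML anchor just avoids repeating the toleration
tolerations: &platformTolerations
  - key: "platform"
    operator: "Equal"
    value: "true"
    effect: "NoSchedule"

prometheus:
  server:
    tolerations: *platformTolerations
  nodeExporter:
    tolerations: *platformTolerations

grafana:
  tolerations: *platformTolerations
```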
Describe the bug
In certain cases, spot node labels are not getting picked up correctly on the real-time view. In these cases, getConfig() is not correctly reporting what's provided in settings.
This should be easy to track down, because it's not related to the cost-model directly. Let's push for a fix in our next build.
The Go binary in the production docker images is showing up as +dirty?
```
I0430 19:44:55.279596 1 main.go:202] Starting cost-model (git commit "bd779830c98be5b101f2d9f0fe9b1e1f1fcea78f+dirty")
```
Today, the commercial Kubecost product allocates out-of-cluster AWS costs by a fixed set of tags (e.g. kubernetes_namespace). We should instead allow these to be easily configured on the frontend.
We should also support the ability to have multiple tags per individual Kubernetes concept (e.g. k8s_namespace and k8s/ns). This part can be considered out of scope for this issue if necessary.
This will allow us to consistently support spot and other pricing options provided by the cost-model.
Seeing new versions of urllib and openssl that we can upgrade to.
In v1.35, there's a regression that prevents deleted clusters from being properly removed from local storage. The result is that clusters are removed from the DOM but then unexpectedly reappear after page reload.
This issue is causing javascript parsing errors for browsers that were released before March 2017.
Several users have reported our Ingress as hard to use. Here are my initial ideas on improving it:
- servicePort: is it correct?

Describe the bug
cff5c39 introduced a persistent volume claim to the postgres deployment, but there is no associated PVC creation. This blocks the deployment from being created.
To Reproduce
Steps to reproduce the behavior:
- remoteWrite.postgres.enabled set to true

Expected behavior
I would expect there to be a PVC template either inline with the deployment or as a separate file in a manner similar to the cost-analyzer PVC template.
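A minimal sketch of such a template, guarded the same way as the deployment (the claim name and helper name are assumptions; the name must match the claimName the postgres deployment references):

```yaml
{{- if .Values.remoteWrite.postgres.enabled }}
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: {{ template "cost-analyzer.fullname" . }}-postgres  # must match the deployment's claimName
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 8Gi  # arbitrary starting size
{{- end }}
```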
Talking to a user, it would be helpful for us to show column totals on the cost allocation page. Otherwise, teams have to export data to Excel in order to determine total spend by asset class across all namespaces, etc.
What problem are you trying to solve?
I want to be able to run kubecost on a specific node
Thanks!
We just need to remove the slack on the getContainerAddress I believe.
Hi,
I need price listings in € (EUR). I am able to enable Custom Pricing and change it to euro, but some of the tabs still need to be modified, as below:
1) The Savings panel/tab still shows price details in $ (dollars).
2) The 'Switch Clusters' tab shows the total price / month in $ instead of €.
3) After exporting prices to CSV files via the allocation tab, the CSV contains a garbage value instead of '€'.
Kindly address these fixes in your next release.
Because multiple applications installed in a single cluster all want to have visibility via Prometheus, Grafana, and Alertmanager, CoreOS has developed the Prometheus-Operator chart which installs a single instance of these tools that can be shared by all the applications in the cluster. Following the principle of Kubernetes that each application should specify its own resource needs, Prometheus Operator includes new Custom Resources that allow applications to add configuration items to Prometheus, Grafana, and Alertmanager.
I am grateful that you did this work for the Grafana dashboards and ask that you now do it for ServiceMonitors (which replace scrape_configs) and PrometheusRules.
One tricky thing is that ServiceMonitors are meant to monitor Services, although technically they monitor Endpoints that are typically automatically created by Kubernetes for Services. The important point is that ServiceMonitors target endpoints to scrape via LabelSelectors, which means your Services need to have stable and unique labels. Currently, your kubecost-cost-analyzer Service, which is what you target with your custom scrape_config and I now want you to target with a ServiceMonitor, only has 1 label: chart: cost-analyzer-1.26.0. This is not robust enough for this purpose. I suggest you add the app: cost-analyzer label to it as well. With that, the following ServiceMonitor should work:
```yaml
kind: ServiceMonitor
metadata:
  labels:
    app: prometheus-operator
    release: prometheus-operator
  name: cost-analyzer-model
spec:
  endpoints:
    - honorLabels: true
      interval: 1m
      path: /metrics   # the Endpoint spec field is "path" (not "metrics_path")
      port: cost-analyzer-model
      scheme: http
      scrapeTimeout: 10s
  selector:
    matchLabels:
      app: cost-analyzer
```
I don't know if you have PrometheusRules meant for publishing. If the ones you have are just examples so your Prometheus installation is not empty, then they do not need to be converted to PrometheusRule resources. However, if you have some you want everyone to use, then please install them as PrometheusRule resources rather than as configuration files.
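The conversion is mechanical; as an illustration, one of the recording rules proposed elsewhere in this tracker wrapped in a PrometheusRule resource (the metadata labels must match the operator's ruleSelector, which varies per installation):

```yaml
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: kubecost-rules
  labels:
    release: prometheus-operator  # must match the operator's ruleSelector
spec:
  groups:
    - name: kubecost.rules
      rules:
        - record: kubecost_cluster_memory_working_set_bytes
          expr: sum(container_memory_working_set_bytes{container_name!="POD",container_name!=""})
```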
It should already work, and implementation would be as simple as setting server.statefulSet.enabled in the prometheus block of values.yaml.
One consideration: StatefulSets were only GA as of Kubernetes 1.9.
For testing purposes, it makes it easier if I don't have to worry about provisioning persistent storage.
It would be nice to have an option in the helm values file to use emptyDir volumes instead of persistent volume claims.
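One way this could look in the deployment template, assuming a persistentVolume.enabled toggle and the chart's fullname helper (both names are assumptions):

```yaml
volumes:
  - name: persistent-configs
    {{- if .Values.persistentVolume.enabled }}
    persistentVolumeClaim:
      claimName: {{ template "cost-analyzer.fullname" . }}
    {{- else }}
    emptyDir: {}   # ephemeral; fine for test installs
    {{- end }}
```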
This is happening for both requests & usage
As of this issue, none of the resources: blocks contain limits: clauses, meaning those containers can grow without bound.
That's not an ideal situation, and if the memory requests are only 55MB, then setting a 256M upper bound seems like a perfectly reasonable starting point -- although the correct answer is influenced by observing the actual memory pressure of a running instance.
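Concretely, pairing the existing 55M request with the suggested 256M ceiling would look like this (the CPU limit is an arbitrary placeholder; tune both against observed usage):

```yaml
resources:
  requests:
    cpu: 10m
    memory: 55M
  limits:
    cpu: 100m    # placeholder, not grounded in measurements
    memory: 256M # suggested starting ceiling from the discussion above
```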
Once the service monitor template is rendered, the selector includes an app: cost-analyzer label that is not included in the set of common labels, so it does not work out of the box.
cost-analyzer-helm-chart/cost-analyzer/templates/_helpers.tpl
Lines 46 to 61 in 004faa4
I've confirmed that either removing the selector from the ServiceMonitor or adding the label to the service fixes the problem.
I see no image tags here: https://github.com/kubecost/cost-analyzer-helm-chart/blob/master/cost-analyzer/values.yaml#L24.
I'm using https://github.com/keel-hq/keel, and it's very helpful to have a defined image repo and tag so updates can be done easily.
Thx!
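What keel-style tooling needs is the conventional repository/tag split; using the checks image mentioned earlier in this tracker as the example (the key path is a proposal):

```yaml
checks:
  image:
    repository: ajaytripathy/kubecost-checks
    tag: prod-1.18.2
```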
Currently breaking service discovery.
Brian Kruger is unable to connect to /api directly or any other endpoint that talks to other Kubecost containers. Let's add logs + tools to the frontend container to help debug this.
From: @mdaniel the "Usage Type" in the /resources.html dialog shows "On-demand" for a spot instance.
The specific request is for namespace emails but this applies to all.
this will make it easier for
Would be nice to be able to configure the imagePullPolicy to save on startup speed when a given node has already previously pulled a given image
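A conventional shape for this, assuming the chart grows an image block per component (key names are a proposal):

```yaml
image:
  pullPolicy: IfNotPresent  # reuse images already present on the node
```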
What problem are you trying to solve?
Looking to run kubecost on-premise (referencing cloud costs) as training for teams to see a representation of cloud costs, and to test out kube-cost capabilities for an enterprise offering that will run partially internally.
Describe the solution you'd like
A separate kube-cost yaml/chart with common NFS storage-provider resources added.
Ex.
```yaml
spec:
  # Add the server as an NFS volume for the pod
  volumes:
    - name: {{volume-name}}
      nfs:
        # URL for the NFS server
        server: {{ server_host_variable }} # Change this!
        path: {{ path_variable }}
```
Describe alternatives you've considered
Forking repository and updating existing kube-cost.yaml file to use nfs configuration.
How would users interact with this feature?
This would just allow developers the ability to deploy to on-premise and local clusters without a storage driver. It is relatively simple to create a set of NFS servers for a small cluster and stand up apps to utilize it, using no complicated storage providers, cloud, or cluster filesystems like gluster/ceph.
APP VERSION: 42
SOURCE: /detail.html
```
jQuery.Deferred exception: null is not an object (evaluating 'container.pvcData[0].persistentVolume.hourlyCost'),http://localhost:9090/helper.js:589:66
each@https://ajax.googleapis.com/ajax/libs/jquery/3.3.1/jquery.min.js:2:2627
aggregateCostModelByField@http://localhost:9090/helper.js:541:13
printByNamespace@http://localhost:9090/detail.html:475:57
http://localhost:9090/detail.html:1386:25
l@https://ajax.googleapis.com/ajax/libs/jquery/3.3.1/jquery.min.js:2:29380
https://ajax.googleapis.com/ajax/libs/jquery/3.3.1/jquery.min.js:2:29678 -- no stack
```
This should make it much easier to look at trends over time for these metrics. Here are my proposals:
```yaml
- record: kubecost_container_memory_working_set_bytes
  expr: sum(container_memory_working_set_bytes{container_name!="POD",container_name!=""}) by (container_name,pod_name,namespace)
- record: kubecost_cluster_memory_working_set_bytes
  expr: sum(container_memory_working_set_bytes{container_name!="POD",container_name!=""})
```