c4po / harbor_exporter

Harbor prometheus exporter

License: Apache License 2.0

Languages: Go 97.26%, Makefile 1.82%, Dockerfile 0.91%
Topics: prometheus, docker, harbor, kubernetes, grafana

harbor_exporter's People

Contributors

c4po, cjnosal, cmur2, czenker, dkulchinsky, ermakovdmitriy, gui13, gwiersma, nick-triller, pja-kit, thechristschn, v9rt3x, warroyo, wusphinx


harbor_exporter's Issues

Single pod running multiple containers

Hi,
I made some changes in your code, and I need to know if they are ok for you.

The change:

from: listenAddress = kingpin.Flag("web.listen-address", "Address to listen on for web interface and telemetry.").Default(":9107").String()
to: listenAddress = kingpin.Flag("web.listen-address", "Address to listen on for web interface and telemetry.").Envar("EXPORTER_PORT").Default(":9107").String()

Add this to harbor-exporter.yaml:

- name: EXPORTER_PORT
  value: ":PORT" # change this if you need to run your pod on another port or multiple containers in a single pod

It's important to use the leading colon in ":PORT"; I don't know how to solve this :(

Best Regards
Tiago Machado

Volume Metrics are not Generated

When we ran the exporter, we came across the error below:

level=info ts=2021-06-14T16:48:42.261Z caller=harbor_exporter.go:175 msg="check v1 with /api/systeminfo" code=404
level=info ts=2021-06-14T16:48:42.307Z caller=harbor_exporter.go:186 msg="check v2 with /api/v2.0/systeminfo" code=200
level=info ts=2021-06-14T16:48:42.308Z caller=harbor_exporter.go:302 msg="Listening on address" address=:9107
level=error ts=2021-06-14T16:49:44.161Z caller=metrics_systemvolumes.go:19 json:cannotunmarshalarrayintoGostructfieldsystemVolumesMetric.Storageoftypestruct{Totalfloat64;Freefloat64}=(MISSING)
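
Decoded, the error reads "json: cannot unmarshal array into Go struct field systemVolumesMetric.Storage of type struct { Total float64; Free float64 }", i.e. the v2.0 volumes response returns storage as an array while the exporter expects a single object. A minimal sketch of a struct shape that would accept the array (field and type names here are illustrative, not the exporter's actual code):

package main

import (
    "encoding/json"
    "fmt"
)

// storageInfo mirrors one entry of the "storage" array in the
// /api/v2.0/systeminfo/volumes response.
type storageInfo struct {
    Total float64 `json:"total"`
    Free  float64 `json:"free"`
}

// systemVolumes declares Storage as a slice, so an array in the JSON
// no longer triggers the "cannot unmarshal array" error.
type systemVolumes struct {
    Storage []storageInfo `json:"storage"`
}

func main() {
    body := []byte(`{"storage":[{"total":1000,"free":400}]}`)
    var v systemVolumes
    if err := json.Unmarshal(body, &v); err != nil {
        fmt.Println("unmarshal error:", err)
        return
    }
    fmt.Printf("total=%v free=%v\n", v.Storage[0].Total, v.Storage[0].Free)
}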

How to debug the issue related to: "cannot get harbor api version" err="unable to determine harbor version"

level=info ts=2020-09-05T03:13:22.421Z caller=harbor_exporter.go:200 msg="check v1 with /api/systeminfo" err="Get https://myregistry.com/api/systeminfo: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)"
level=info ts=2020-09-05T03:13:32.422Z caller=harbor_exporter.go:211 msg="check v2 with /api/v2.0/systeminfo" erro="Get https://myregistry.com/api/v2.0/systeminfo: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)"
level=error ts=2020-09-05T03:13:32.422Z caller=harbor_exporter.go:305 msg="cannot get harbor api version" err="unable to determine harbor version"
level=info ts=2020-09-05T03:13:32.946Z caller=harbor_exporter.go:286 msg="Starting harbor_exporter" version="(version=v0.5.2, branch=master, revision=69bff757da5d6676ea42f06064e0fd4dc3ea3820)"
level=info ts=2020-09-05T03:13:32.946Z caller=harbor_exporter.go:287 build_context="(go=go1.13.15, user=root, date=2020-08-26-18:43:17-UTC)"

Quotas metrics are not showing

Hi there,

I was using Harbor v2.0.1 and exporter v0.4.0 before, and everything worked well.
But recently I tried to upgrade Harbor to v2.1.0 and found the harbor_up metric stuck at 0 with exporter v0.4.0.
Then I changed the exporter to the latest version (v0.5.8), and harbor_up is back to 1 as expected.
However, the quotas metrics (harbor_quotas_*) are not showing anymore. Do you know the possible reason, or have any ideas?
I used Docker to run the exporter and didn't use the "--skip.metrics" flag.

All the components look good...

# TYPE harbor_components_health gauge
harbor_components_health{component="chartmuseum"} 1
harbor_components_health{component="core"} 1
harbor_components_health{component="database"} 1
harbor_components_health{component="jobservice"} 1
harbor_components_health{component="portal"} 1
harbor_components_health{component="redis"} 1
harbor_components_health{component="registry"} 1
harbor_components_health{component="registryctl"} 1

Thank you in advance.

Error handling request for /repositories

The latest build is throwing the error below:

level=error ts=2020-07-14T03:54:05.982Z caller=harbor_exporter.go:112 msg="Error handling request for /repositories?project_id=81" http-statuscode="404 Not Found"
level=error ts=2020-07-14T03:54:05.982Z caller=metrics_repositories.go:56 unexpectedendofJSONinput=(MISSING)

The code at line 52 of metrics_repositories.go seems to need updating to match the API endpoint:

/projects/{project_name}/repositories 

"422 Unprocessable Entity" error when scraping "/projects?page=1&page_size=500" with default HARBOR_PAGESIZE (500)

Hey folks,

We've been testing the latest version with pagination on a Harbor v2.1.0 deployment, and with the default HARBOR_PAGESIZE=500 we're getting the following error:

level=error ts=2020-10-05T13:12:30.165Z caller=harbor_exporter.go:249 msg="Error handling request for /projects?page=1&page_size=500" http-statuscode="422 Unprocessable Entity"
level=error ts=2020-10-05T13:12:30.165Z caller=metrics_repositories.go:64 unexpectedendofJSONinput=(MISSING)

Adjusting HARBOR_PAGESIZE to 100 "fixes" the issue.

As far as I can tell, the API does support a max of 500: https://github.com/goharbor/harbor/blob/58b7242a2551205428584b24bf11f1fd7d2b69fd/src/common/api/base.go#L32-L33

but any value >100 results in the above error and missing repositories metrics (it appears only the /projects endpoint is affected).

The API swagger does indeed suggest the max is 100:
https://github.com/goharbor/harbor/blob/5293c8ff4b6778cd79cda156dff994929517a674/api/v2.0/swagger.yaml#L1499-L1507
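
A defensive sketch, assuming the exporter simply clamps the configured page size to the documented /projects maximum of 100 before building the query (the helper and constant names are illustrative):

package main

import (
    "fmt"
    "net/url"
    "strconv"
)

// maxProjectsPageSize is the maximum the /projects endpoint accepts per the
// Harbor swagger, regardless of the higher global limit.
const maxProjectsPageSize = 100

func projectsQuery(page, configuredPageSize int) string {
    size := configuredPageSize
    if size > maxProjectsPageSize {
        size = maxProjectsPageSize // avoid the 422 Unprocessable Entity response
    }
    q := url.Values{}
    q.Set("page", strconv.Itoa(page))
    q.Set("page_size", strconv.Itoa(size))
    return "/projects?" + q.Encode()
}

func main() {
    fmt.Println(projectsQuery(1, 500)) // /projects?page=1&page_size=100
}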

/cc @c4po
/cc @xtreme-conor-nosal

404 error for repository with slash in name

Hello!
I have a repository with this name:

sberclass_dev/dataspace-core-sdp

On metrics scrape, the exporter logs this error:

harbor_exporter | level=error ts=2021-03-05T12:10:05.478Z caller=harbor_exporter.go:263 msg="Error handling request for /projects/dataspace/repositories/sberclass_dev%2Fdataspace-core-sdp/artifacts?with_tag=true&with_scan_overview=true&page=1&page_size=100" http-statuscode="404 Not Found"

It looks like the slash is incorrectly encoded. This issue can help you: goharbor/harbor#12224

harbor-exporter version: 0.6.3

use multiple collectors to reduce /metrics latency

The Prometheus gatherer spins off a goroutine for every Collector to collect metrics in parallel (https://github.com/prometheus/client_golang/blob/master/prometheus/registry.go#L440)

The harbor exporter is registered as a single collector that grabs all the metrics serially (https://github.com/c4po/harbor_exporter/blob/master/harbor_exporter.go#L310)

In my deployment scraping /metrics regularly takes 20-30 seconds, exceeding the default interval and scrape timeouts of 10 seconds.
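
A minimal sketch of the suggested direction, splitting the exporter into several small collectors so client_golang can run their Collect methods in parallel goroutines (the collector and metric below are hypothetical stand-ins):

package main

import (
    "net/http"

    "github.com/prometheus/client_golang/prometheus"
    "github.com/prometheus/client_golang/prometheus/promhttp"
)

// healthCollector is a hypothetical, self-contained collector covering only
// the health-related metrics.
type healthCollector struct{ up *prometheus.Desc }

func newHealthCollector() *healthCollector {
    return &healthCollector{
        up: prometheus.NewDesc("harbor_up", "Whether the Harbor API is reachable.", nil, nil),
    }
}

func (c *healthCollector) Describe(ch chan<- *prometheus.Desc) { ch <- c.up }

func (c *healthCollector) Collect(ch chan<- prometheus.Metric) {
    // ... query the Harbor health endpoint here ...
    ch <- prometheus.MustNewConstMetric(c.up, prometheus.GaugeValue, 1)
}

func main() {
    // Each registered collector is gathered in its own goroutine, so slow
    // collectors no longer serialize behind one another.
    prometheus.MustRegister(newHealthCollector() /*, newQuotaCollector(), newRepositoryCollector(), ... */)
    http.Handle("/metrics", promhttp.Handler())
    http.ListenAndServe(":9107", nil)
}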

harbor_repositories_tags_total metric is not working for Harbor 2

After we upgraded Harbor to v2.0.1 and harbor-exporter to v0.4.0, we found out the tag count metric harbor_repositories_tags_total is not working anymore (as it had been working with Harbor 1.x).

This metric is very useful with Harbor 2, since Harbor has dropped the native tag count support for Quotas, and without alerting on metrics from the exporter it's not possible to get warned before a repository blows up due to an extensive number of tags.

Since Harbor has shifted focus from tags to the more generic concept of artifacts, would it be possible to get metrics for the count of those per repo? They seem to be included in the response of /api/v2.0/projects/<...>/repositories
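
A minimal sketch of reading a per-repository artifact count from that response, assuming the repository objects carry an artifact_count field as in the v2.0 swagger (the metric name printed below is hypothetical):

package main

import (
    "encoding/json"
    "fmt"
)

// repository models just the fields of interest from
// /api/v2.0/projects/<project>/repositories.
type repository struct {
    Name          string `json:"name"`
    ArtifactCount int64  `json:"artifact_count"`
}

func main() {
    body := []byte(`[{"name":"library/nginx","artifact_count":12}]`)
    var repos []repository
    if err := json.Unmarshal(body, &repos); err != nil {
        fmt.Println("decode error:", err)
        return
    }
    for _, r := range repos {
        // e.g. exposed as harbor_repositories_artifacts_total{repo_name="library/nginx"} 12
        fmt.Printf("harbor_repositories_artifacts_total{repo_name=%q} %d\n", r.Name, r.ArtifactCount)
    }
}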

Prometheus alert rules

It would be nice to add Prometheus alert rules for metrics like:
harbor_up, harbor_system_volumes_bytes{storage="free"}, harbor_health, harbor_components_*, go_memstats_frees_total, harbor_image_vulnerability

prometheusRule:
  labels:
    app: prometheus-operator
    release: prometheus-operator
  rules:
    - alert: HarborImageScanVulnerability
      expr: harbor_image_vulnerability > 3
      for: 5m
      labels:
        severity: critical
      annotations:
        description: Harbor Image Scan Vulnerability Severity Critical
        summary: Image {{ "{{ $labels.image_name }}" }} has critical vulnerabilities

harbor v2.1.3 panic error

harbor version: v2.1.3-b6de84c5

error info:
level=info ts=2021-02-07T08:19:52.983Z caller=harbor_exporter.go:460 metrics_group=systeminfo collect=true
level=info ts=2021-02-07T08:19:52.983Z caller=harbor_exporter.go:460 metrics_group=artifacts collect=true
level=info ts=2021-02-07T08:19:52.983Z caller=harbor_exporter.go:460 metrics_group=health collect=true
level=info ts=2021-02-07T08:19:52.983Z caller=harbor_exporter.go:460 metrics_group=scans collect=true
level=info ts=2021-02-07T08:19:52.983Z caller=harbor_exporter.go:460 metrics_group=statistics collect=true
level=info ts=2021-02-07T08:19:52.983Z caller=harbor_exporter.go:460 metrics_group=quotas collect=true
level=info ts=2021-02-07T08:19:52.983Z caller=harbor_exporter.go:460 metrics_group=repositories collect=true
level=info ts=2021-02-07T08:19:52.983Z caller=harbor_exporter.go:460 metrics_group=replication collect=true
level=info ts=2021-02-07T08:19:52.998Z caller=harbor_exporter.go:278 msg="check v1 with /api/systeminfo" code=404
level=info ts=2021-02-07T08:19:53.010Z caller=harbor_exporter.go:289 msg="check v2 with /api/v2.0/systeminfo" code=200
panic: descriptor Desc{fqName: "harbor_harbor-registry_artifacts_vulnerabilities_scan_start", help: "Vulnerabilities scan start time", constLabels: {}, variableLabels: [project_name project_id repo_name repo_id artifact_name artifact_id report_id]} is invalid: "harbor_harbor-registry_artifacts_vulnerabilities_scan_start" is not a valid metric name

goroutine 1 [running]:
github.com/prometheus/client_golang/prometheus.(*Registry).MustRegister(0xc0002186e0, 0xc000509180, 0x1, 0x1)
/root/go/pkg/mod/github.com/prometheus/[email protected]/prometheus/registry.go:401 +0xad
github.com/prometheus/client_golang/prometheus.MustRegister(...)
/root/go/pkg/mod/github.com/prometheus/[email protected]/prometheus/registry.go:178
main.main()
/root/docker-file/harbor-exporter/harbor_exporter-master/harbor_exporter.go:473 +0x24f5
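
The invalid name appears to come from the Harbor instance name ("harbor-registry") being embedded in the metric name; the hyphen is not allowed in Prometheus metric names. A minimal sketch of sanitizing such a prefix before building descriptors (the helper is hypothetical, not the exporter's actual fix):

package main

import (
    "fmt"
    "regexp"
)

// invalidMetricChars matches characters outside the Prometheus metric name
// alphabet [a-zA-Z0-9_:].
var invalidMetricChars = regexp.MustCompile(`[^a-zA-Z0-9_:]`)

// sanitizeMetricPrefix replaces invalid characters so a value like
// "harbor-registry" can safely be used as part of a metric name.
func sanitizeMetricPrefix(name string) string {
    return invalidMetricChars.ReplaceAllString(name, "_")
}

func main() {
    fmt.Println(sanitizeMetricPrefix("harbor-registry")) // harbor_registry
    // The descriptor would then be harbor_harbor_registry_artifacts_..., which is valid.
}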

Duplicate values while scraping the metrics through Exporter

level=error ts=2020-03-17T13:58:20.891Z caller=harbor_exporter.go:111 msg="error gathering metrics:8 error(s) occurred:\n* collected metric "harbor_quotas_count_total" { label:<name:"repo_id" value:"0" > label:<name:"repo_name" value:"" > label:<name:"type" value:"hard" > gauge:<value:-1 > } was collected before with the same name and label values\n* collected metric "harbor_quotas_count_total" { label:<name:"repo_id" value:"0" > label:<name:"repo_name" value:"" > label:<name:"type" value:"used" > gauge:<value:0 > } was collected before with the same name and label values\n* collected metric

Question: What's the best way to deploy?

I have got a harbor cluster deployed outside of k8s.
I am running multiple harbor nodes fronted by a load balancer, using the same storage.

Should I be running this exporter within k8s, or run it on each harbor node (either as a separate systemd unit or bundled with harbor's docker-compose)?

Kindly share some insights, considering system availability and performance.

Getting error: unsupported protocol scheme

exporter version : v0.3.1, branch=master (kubernetes)
harbor version : Version v1.10.1-f3e11715

level=info ts=2020-05-26T15:32:03.751Z caller=harbor_exporter.go:300 msg="Starting harbor_exporter" version="(version=v0.3.1, branch=master, revision=5755c2d97a2fd5309fa883ad11aeea74bca24942)"
level=info ts=2020-05-26T15:32:03.752Z caller=harbor_exporter.go:301 build_context="(go=go1.13.9, user=root, date=2020-04-03-14:14:06-UTC)"
level=info ts=2020-05-26T15:32:03.865Z caller=harbor_exporter.go:343 msg="Listening on address" address=:9107
level=error ts=2020-05-26T15:32:16.692Z caller=harbor_exporter.go:105 msg="Error handling request for /api/scans/all/metrics" err="Get registry-harbor-core/api/scans/all/metrics: unsupported protocol scheme \"\""

include the tag name in the artifacts_* metrics

Happy to see the new artifacts_* metrics, great stuff 👍🏼

I was wondering if it would be worth adding the tag name as a label? The artifact_name includes the artifact's digest, but tags are more human-readable, I'd say.

Thoughts?

Wrong trimming of artifact name

Collecting metrics for a Harbor repository with a name starting with 'c' doesn't work.
A repo name:
dockerhub-proxy/c4po/harbor-exporter:0.6.2
causes a request to:
/api/v2.0/projects/dockerhub-proxy/repositories/4po%2Fharbor-exporter/artifacts?with_tag=true&with_scan_overview=true
and not the expected:
/api/v2.0/projects/dockerhub-proxy/repositories/c4po%2Fharbor-exporter/artifacts?with_tag=true&with_scan_overview=true

Therefore metrics for that repository will be missing. The offending line seems to be this one:

"/repositories/" + url.PathEscape(strings.TrimLeft(repoData[i].Name, projectName+"/")) +

TrimPrefix should be used instead of TrimLeft.
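
A minimal demonstration of the difference: TrimLeft treats its second argument as a set of characters and strips any leading run of them, while TrimPrefix removes the exact prefix once.

package main

import (
    "fmt"
    "strings"
)

func main() {
    repoName := "dockerhub-proxy/c4po/harbor-exporter"
    projectName := "dockerhub-proxy"

    // The leading 'c' of "c4po" is in the cutset "dockerhub-proxy/", so TrimLeft strips it too.
    fmt.Println(strings.TrimLeft(repoName, projectName+"/"))   // 4po/harbor-exporter
    fmt.Println(strings.TrimPrefix(repoName, projectName+"/")) // c4po/harbor-exporter
}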

unsupported protocol scheme "" when using the yaml for a kubernetes cluster

Hello,

overview:

I'm trying to export some metrics from Harbor to Prometheus. I deployed Harbor using the Helm chart on a Kubernetes cluster, so I used the yaml file provided to deploy the harbor exporter on my Kubernetes cluster. I only get two metrics related to Harbor. The Harbor instance is configured with HTTPS.

Environment:

OS : Centos 7
Kubernetes version : 1.16.7
Chart Helm version : harbor-1.3.1
Harbor version: 1.10.1
Helm version: 2
Harbor-exporter: v0.3.1

Expected Behavior:

When deploying the yaml file for harbor exporter in the namespace where my harbor instance is located and changing every prefix in the yaml file, I should get all the metrics listed in the documentation of the harbor exporter.

Actual behavior:

I deploy the harbor exporter and I get some Go metrics and only two metrics regarding Harbor: harbor_harbor_up and harbor_exporter_build_info.

Steps to reproduce the problem:

First deploy Harbor using the Helm chart on a kubernetes cluster. Then change the prefix and the namespace in the yaml file of the harbor exporter and deploy it in the same namespace as the Harbor instance. Get the logs of the harbor-exporter pod.

Logs:

level=error ts=2020-04-16T10:10:51.200Z caller=harbor_exporter.go:105 msg="Error handling request for /api/systeminfo/volumes" err="Get harboor-harbor-core/api/systeminfo/volumes: unsupported protocol scheme \"\""
level=error ts=2020-04-16T10:10:51.200Z caller=metrics_systemvolumes.go:19 unexpectedendofJSONinput=(MISSING)
level=error ts=2020-04-16T10:10:51.200Z caller=harbor_exporter.go:105 msg="Error handling request for /api/projects" err="Get harboor-harbor-core/api/projects: unsupported protocol scheme \"\""
level=error ts=2020-04-16T10:10:51.200Z caller=metrics_repositories.go:45 unexpectedendofJSONinput=(MISSING)
level=error ts=2020-04-16T10:10:51.200Z caller=harbor_exporter.go:105 msg="Error handling request for /api/replication/policies" err="Get harboor-harbor-core/api/replication/policies: unsupported protocol scheme \"\""
level=error ts=2020-04-16T10:10:51.200Z caller=metrics_replications.go:30 msg="Error retrieving replication policies" err="unexpected end of JSON input"

Some projects are not visible in the vulnerability stats.

I have a lot of docker images, so I have a lot of scan reports.

From what I can see, the exporter does not gather ALL vulnerabilities: I have an image that has more than 100 vulns, and curling the metrics endpoint does not show it to me.

Could it be that the gathering of stats is never able to retrieve all my reports?

can't scrape metrics from harbor-exporter; it looks like a timeout, no response

run log:
level=info ts=2021-04-01T04:33:57.757Z caller=harbor_exporter.go:447 CacheEnabled=true
level=info ts=2021-04-01T04:33:57.757Z caller=harbor_exporter.go:448 CacheDuration=20s
level=info ts=2021-04-01T04:33:57.757Z caller=harbor_exporter.go:450 msg="Starting harbor_exporter" version="(version=v0.6.4, branch=master, revision=052f98a8fabafcfc116b359914da1f978b10203d)"
level=info ts=2021-04-01T04:33:57.757Z caller=harbor_exporter.go:451 build_context="(go=go1.13.15, user=root, date=2021-03-23-15:28:44-UTC)"
level=info ts=2021-04-01T04:33:57.757Z caller=harbor_exporter.go:462 metrics_group=repositories collect=true
level=info ts=2021-04-01T04:33:57.757Z caller=harbor_exporter.go:462 metrics_group=replication collect=true
level=info ts=2021-04-01T04:33:57.757Z caller=harbor_exporter.go:462 metrics_group=systeminfo collect=true
level=info ts=2021-04-01T04:33:57.757Z caller=harbor_exporter.go:462 metrics_group=artifacts collect=true
level=info ts=2021-04-01T04:33:57.757Z caller=harbor_exporter.go:462 metrics_group=health collect=true
level=info ts=2021-04-01T04:33:57.757Z caller=harbor_exporter.go:462 metrics_group=scans collect=true
level=info ts=2021-04-01T04:33:57.757Z caller=harbor_exporter.go:462 metrics_group=statistics collect=true
level=info ts=2021-04-01T04:33:57.757Z caller=harbor_exporter.go:462 metrics_group=quotas collect=true
level=info ts=2021-04-01T04:33:57.770Z caller=harbor_exporter.go:280 msg="check v1 with /api/systeminfo" code=404
level=info ts=2021-04-01T04:33:57.860Z caller=harbor_exporter.go:291 msg="check v2 with /api/v2.0/systeminfo" code=200
level=info ts=2021-04-01T04:33:57.860Z caller=harbor_exporter.go:500 msg="Listening on address" address=:9107

prometheus error:
context deadline exceeded

harbor version: 2.1.4

only first page of harbor results are fetched

Harbor APIs are paginated. If the number of projects/repos/etc exceeds the default page size then not all metrics will be exported.

A quick fix might be to set an excessively large default page size (or expose it as an environment variable), but that might have performance impacts. A more rigorous solution will need to loop over each page of results, as in the sketch below.
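
A minimal sketch of that more rigorous approach, paging until a short page is returned (fetchPage is a hypothetical helper standing in for the real HTTP call):

package main

import "fmt"

type project struct {
    Name string `json:"name"`
}

// fetchPage is assumed to call e.g. /projects?page=N&page_size=M and decode
// one page of results; the real implementation would do the HTTP request.
func fetchPage(page, pageSize int) ([]project, error) {
    return nil, nil
}

// fetchAllProjects keeps requesting pages until a page comes back shorter
// than the page size, meaning there are no more results.
func fetchAllProjects(pageSize int) ([]project, error) {
    var all []project
    for page := 1; ; page++ {
        items, err := fetchPage(page, pageSize)
        if err != nil {
            return nil, err
        }
        all = append(all, items...)
        if len(items) < pageSize {
            break
        }
    }
    return all, nil
}

func main() {
    projects, _ := fetchAllProjects(100)
    fmt.Println("projects:", len(projects))
}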

volume free/used metrics

Hello!
I tried to use this solution to get free/used space, but it doesn't show any results.
Could you please help me to sort out this issue?
I use version 0.5.9. Some metrics, e.g. repo count, up/down status, and scan quantity, work fine.

Repositories and statistics metrics get zero values

Well, I'm using harbor v2.1.3 with the harbor exporter v0.6.1. The statistics and repositories APIs work well, but I get this:

# HELP harbor_project_count_total projects number relevant to the user
# TYPE harbor_project_count_total gauge
harbor_project_count_total{type="private_project"} 0
harbor_project_count_total{type="public_project"} 0
harbor_project_count_total{type="total_project"} 0
# HELP harbor_repo_count_total repositories number relevant to the user
# TYPE harbor_repo_count_total gauge
harbor_repo_count_total{type="private_repo"} 0
harbor_repo_count_total{type="public_repo"} 0
harbor_repo_count_total{type="total_repo"} 0
harbor_repositories_pull_total{repo_id="1",repo_name="postgres"} 0
harbor_repositories_pull_total{repo_id="2",repo_name="mongodb"} 0
harbor_repositories_pull_total{repo_id="3",repo_name="express-boilerplate"} 0
harbor_repositories_tags_total{repo_id="1",repo_name="postgres"} 0
harbor_repositories_tags_total{repo_id="2",repo_name="mongodb"} 0
harbor_repositories_tags_total{repo_id="3",repo_name="express-boilerplate"} 0

After looking into metrics_statistics.go for these metrics, it all comes down to this line:

if err := json.Unmarshal(body, &data); err != nil {

It seems to be some kind of issue with the JSON parsing; it may be the same for harbor_repositories_pull_total and harbor_repositories_tags_total.

Thanks

Docker container crashed

Hi,

I tried to run the service as a Docker container, but it throws the error message below and the container dies when I hit ipaddress:9107

level=info ts=2020-02-13T08:35:01.260Z caller=harbor_exporter.go:261 msg="Starting harbor_exporter" version="(version=v0.2, branch=master, revision=70e0d756ff79908745b4395439df31cb57c1f7eb)"
level=info ts=2020-02-13T08:35:01.260Z caller=harbor_exporter.go:262 build_context="(go=go1.13.7, user=root, date=2020-02-13-07:48:34-UTC)"
level=info ts=2020-02-13T08:35:01.327Z caller=harbor_exporter.go:304 msg="Listening on address" address=:9107

panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x18 pc=0x877144]
goroutine 39 [running]:
main.HarborClient.request(0xc00018c570, 0x7ffe685d2f21, 0x1c, 0xc000022030, 0xe, 0xc000022050, 0xa, 0x1dcd6500, 0x0, 0xa48680, ...)
/src/harbor_exporter.go:146 +0x3a4
main.(*Exporter).collectScanMetric(0xc0000a18c0, 0xc000104060, 0xc0003aaea8)
/src/metrics_scan.go:19 +0xac
main.(*Exporter).Collect(0xc0000a18c0, 0xc000104060)
/src/harbor_exporter.go:221 +0x3c
github.com/prometheus/client_golang/prometheus.(*Registry).Gather.func1()
/go/pkg/mod/github.com/prometheus/[email protected]/prometheus/registry.go:443 +0x19d
created by github.com/prometheus/client_golang/prometheus.(*Registry).Gather
/go/pkg/mod/github.com/prometheus/[email protected]/prometheus/registry.go:454 +0x57d

Version 0.5.3 crashes during operation

Here's the error I see after the upgrade to 0.5.3:

level=info ts=2020-09-09T10:30:07.803Z caller=harbor_exporter.go:333 CacheEnabled=false
level=info ts=2020-09-09T10:30:07.803Z caller=harbor_exporter.go:334 CacheDuration=20s
level=info ts=2020-09-09T10:30:07.803Z caller=harbor_exporter.go:336 msg="Starting harbor_exporter" version="(version=v0.5.3, branch=master, revision=738e855f8164aefa8e65b8fa996054c984fc552c)"
level=info ts=2020-09-09T10:30:07.804Z caller=harbor_exporter.go:337 build_context="(go=go1.13.15, user=root, date=2020-09-07-01:52:25-UTC)"
level=info ts=2020-09-09T10:30:07.804Z caller=harbor_exporter.go:348 metrics_group=quotas collect=true
level=info ts=2020-09-09T10:30:07.804Z caller=harbor_exporter.go:348 metrics_group=repositories collect=true
level=info ts=2020-09-09T10:30:07.804Z caller=harbor_exporter.go:348 metrics_group=replication collect=true
level=info ts=2020-09-09T10:30:07.804Z caller=harbor_exporter.go:348 metrics_group=scans collect=true
level=info ts=2020-09-09T10:30:07.804Z caller=harbor_exporter.go:348 metrics_group=statistics collect=true
level=info ts=2020-09-09T10:30:08.607Z caller=harbor_exporter.go:211 msg="check v1 with /api/systeminfo" code=404
level=info ts=2020-09-09T10:30:08.617Z caller=harbor_exporter.go:222 msg="check v2 with /api/v2.0/systeminfo" code=200
level=info ts=2020-09-09T10:30:08.617Z caller=harbor_exporter.go:386 msg="Listening on address" address=:9107
panic: send on closed channel

goroutine 75 [running]:
main.(*HarborExporter).Collect.func1(0xc00006b020, 0xc00005ea80, 0xc00015a000)
	/src/harbor_exporter.go:266 +0x90
created by main.(*HarborExporter).Collect
	/src/harbor_exporter.go:264 +0x232

0.5.2 works fine.

Harbor version: 2.0.1

Issues with replication status

There are some issues with the replication status: it is alerting far too often.
We found two situations where there is an alert:

  1. While a replication execution is in progress, it is not yet successful. If, each time the metric is retrieved, another replication happens to be in progress, and so on, the overall status of the Harbor replications stays unsuccessful.
    A fix would be to check the second-to-last execution of a replication policy when the last one is still in progress.
  2. When a replication policy is manual or disabled, it is possible that the last execution (maybe a long time ago) happened to fail. Currently all policies are taken into account for the metrics, so the overall status would be that the replications fail.
    This is not very logical; it is not necessary to monitor manual or disabled replications. A sketch of both checks follows after this list.
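
A minimal sketch of both checks, assuming the collector can see a policy's trigger/enabled state and its executions ordered newest-first (field names are assumptions, loosely based on the /replication/executions payload shown in another issue on this page):

package main

import "fmt"

type policy struct {
    ID      int64
    Enabled bool
    Trigger string // "manual", "scheduled", ...
}

type execution struct {
    Status     string // "Succeed", "Failed", "InProgress", ...
    InProgress int64
}

// relevantExecution returns the execution whose status should drive the
// replication metric, or false if the policy should not be monitored at all.
func relevantExecution(p policy, execs []execution) (execution, bool) {
    if !p.Enabled || p.Trigger == "manual" {
        return execution{}, false // skip manual or disabled policies entirely
    }
    for _, e := range execs { // assumed newest-first
        if e.Status == "InProgress" || e.InProgress > 0 {
            continue // still running: fall back to the previous execution
        }
        return e, true
    }
    return execution{}, false
}

func main() {
    execs := []execution{{Status: "InProgress", InProgress: 1}, {Status: "Succeed"}}
    e, ok := relevantExecution(policy{ID: 140, Enabled: true, Trigger: "scheduled"}, execs)
    fmt.Println(ok, e.Status) // true Succeed
}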

Application Crashes

The application starts up correctly, but when I do curl http://localhost:9107/metrics

The application crashes. It might be because we are using OIDC but I am not sure.

Error Message + Stack Trace:

level=info ts=2020-09-07T21:55:29.794Z caller=harbor_exporter.go:333 CacheEnabled=false
level=info ts=2020-09-07T21:55:29.794Z caller=harbor_exporter.go:334 CacheDuration=20s
level=info ts=2020-09-07T21:55:29.795Z caller=harbor_exporter.go:336 msg="Starting harbor_exporter" version="(version=v0.5.3, branch=master, revision=738e855f8164aefa8e65b8fa996054c984fc552c)"
level=info ts=2020-09-07T21:55:29.795Z caller=harbor_exporter.go:337 build_context="(go=go1.13.15, user=root, date=2020-09-07-01:52:25-UTC)"
level=info ts=2020-09-07T21:55:29.795Z caller=harbor_exporter.go:348 metrics_group=scans collect=true
level=info ts=2020-09-07T21:55:29.795Z caller=harbor_exporter.go:348 metrics_group=statistics collect=true
level=info ts=2020-09-07T21:55:29.795Z caller=harbor_exporter.go:348 metrics_group=quotas collect=true
level=info ts=2020-09-07T21:55:29.795Z caller=harbor_exporter.go:348 metrics_group=repositories collect=true
level=info ts=2020-09-07T21:55:29.797Z caller=harbor_exporter.go:348 metrics_group=replication collect=true
level=info ts=2020-09-07T21:55:30.041Z caller=harbor_exporter.go:211 msg="check v1 with /api/systeminfo" code=404
level=info ts=2020-09-07T21:55:30.244Z caller=harbor_exporter.go:222 msg="check v2 with /api/v2.0/systeminfo" code=200
level=info ts=2020-09-07T21:55:30.244Z caller=harbor_exporter.go:386 msg="Listening on address" address=:9107
panic: send on closed channel

goroutine 44 [running]:
main.(*HarborExporter).Collect.func1(0xc0001aed80, 0xc0001d4d20, 0xc0001a8000)
	/src/harbor_exporter.go:266 +0x90
created by main.(*HarborExporter).Collect
	/src/harbor_exporter.go:264 +0x232

grafana/harbor-overview.json

Good day! I use the JSON dashboard for Grafana and the storage panel does not work; the harbor_quotas_size_bytes metric is not there. How can I fix this?

401 invalid credentials

kubernetes: 1.14

error:
level=error ts=2020-05-25T07:43:23.053Z caller=harbor_exporter.go:109 msg="Error handling request for /api/scans/all/metrics" http-statuscode="404 Not Found"
level=error ts=2020-05-25T07:43:23.053Z caller=metrics_scan.go:23 unexpectedendofJSONinput=(MISSING)
level=error ts=2020-05-25T07:43:23.059Z caller=harbor_exporter.go:109 msg="Error handling request for /api/statistics" http-statuscode="401 Unauthorized"
level=error ts=2020-05-25T07:43:23.060Z caller=metrics_statistics.go:25 unexpectedendofJSONinput=(MISSING)
level=error ts=2020-05-25T07:43:23.066Z caller=harbor_exporter.go:109 msg="Error handling request for /api/quotas" http-statuscode="401 Unauthorized"
level=error ts=2020-05-25T07:43:23.066Z caller=metrics_quotas.go:35 unexpectedendofJSONinput=(MISSING)
level=error ts=2020-05-25T07:43:23.071Z caller=harbor_exporter.go:109 msg="Error handling request for /api/systeminfo/volumes" http-statuscode="401 Unauthorized"
level=error ts=2020-05-25T07:43:23.071Z caller=metrics_systemvolumes.go:19 unexpectedendofJSONinput=(MISSING)
level=error ts=2020-05-25T07:43:23.446Z caller=harbor_exporter.go:109 msg="Error handling request for /api/replication/policies" http-statuscode="401 Unauthorized"
level=error ts=2020-05-25T07:43:23.446Z caller=metrics_replications.go:30 msg="Error retrieving replication policies" err="unexpected end of JSON input"

user: admin
The password is correct.

Help me please, thanks.

Possible memory leak leading to recurring OOMKilled

Hey there @c4po!

We've been testing the exporter in our sandbox environment for a few weeks now, and it looks like there might be a memory leak causing an OOMKilled condition.

Our Harbor setup runs in K8s (GKE), harbor-exporter runs in a separate pod with the following resource allocations:

    Limits:
      cpu:     400m
      memory:  512Mi
    Requests:
      cpu:      100m
      memory:   64Mi

This should be reasonable, especially considering the sandbox deployment has 2 projects and a handful of repos (no replications).

Looking at the memory usage graph, memory usage seems to grow pretty quickly over time until the limit is reached and the pod is OOMKilled.

I didn't try to debug the root cause yet, going to try and enable metric caching and see if that helps, will report back.

If there's any additional info you need, please let me know.

Collect metrics periodically instead of on scrape

Hi,

Collecting metrics takes a long time for Harbor installations with a large amount of content. We are at a point where Prometheus scrapes time out when the exporter invalidates the cached metrics, because collecting them takes too long. This might be the problem in #92, too.

I propose implementing continuous, periodic metrics collection that is independent of scrape requests. Scrapes receive the last set of collected metrics. This way scrape requests can be answered quickly at all times. Additionally, load on Harbor can be reduced because metrics collection doesn't have to happen as fast as possible anymore. A rough sketch of the idea is below.
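
A rough sketch of what that could look like, assuming a background loop gathers a private registry on a timer and scrapes serve the last snapshot (the names and the interval are illustrative):

package main

import (
    "net/http"
    "sync"
    "time"

    "github.com/prometheus/client_golang/prometheus"
    "github.com/prometheus/client_golang/prometheus/promhttp"
    dto "github.com/prometheus/client_model/go"
)

// snapshotGatherer serves the most recently collected metric families,
// so a scrape never has to wait on the Harbor API.
type snapshotGatherer struct {
    mu   sync.RWMutex
    last []*dto.MetricFamily
}

func (s *snapshotGatherer) Gather() ([]*dto.MetricFamily, error) {
    s.mu.RLock()
    defer s.mu.RUnlock()
    return s.last, nil
}

func (s *snapshotGatherer) refresh(reg *prometheus.Registry) {
    mfs, err := reg.Gather() // runs the (slow) Harbor collectors
    if err != nil {
        return // keep serving the previous snapshot on error
    }
    s.mu.Lock()
    s.last = mfs
    s.mu.Unlock()
}

func main() {
    reg := prometheus.NewRegistry()
    // reg.MustRegister(<harbor collectors>) -- omitted in this sketch

    snap := &snapshotGatherer{}
    snap.refresh(reg) // prime the snapshot once at startup
    go func() {
        for range time.Tick(5 * time.Minute) { // collection interval, independent of scrapes
            snap.refresh(reg)
        }
    }()

    http.Handle("/metrics", promhttp.HandlerFor(snap, promhttp.HandlerOpts{}))
    http.ListenAndServe(":9107", nil)
}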

@c4po I am happy to implement this solution if it's a contribution you will accept.

Best
Nick

Add possibility to collect only specific metrics

I noticed that Harbor uses significantly more resources while harbor_exporter is running. A few times the harbor-core container ended up in a CrashLoopBackOff. Adding more resources might help, and increasing the interval in the ServiceMonitor also helps, but in the end that's just attacking symptoms.
The root cause, of course, is that harbor_exporter sends a burst of API calls on every interval. Were the metrics built into Harbor, you wouldn't need the overhead of API calls; the collection of the metrics could be made more efficient and spread over time. But well, there have been many requests for metrics in Harbor over almost 3 years and no one's building it...

So, some suggestions to improve harbor_exporter until Harbor exposes metrics itself:

  1. Add configuration options to enable or disable certain collectors. Then you can run the exporter for just the metrics you're interested in. Or you can run multiple instances of the exporter, each with a subset of the metrics and make separate servicemonitors for each exporter.
  2. Add query parameters to the /metrics endpoint so the servicemonitor can specify which metrics it wants. Then you can run only one exporter and define multiple servicemonitors. For example one that retrieves quota metrics every 30 sec (because you might need that often and it uses almost no resources in Harbor) and one that retrieves repository metrics (tags, pulls, stars) every 10 minutes (because you need that data, but know that it requires significant resources from your Harbor).

Option 2 seems the best, but for option 1 I already know how I could implement it myself. And there might be better options still. A rough sketch of option 2 is below.
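
A rough sketch of option 2, assuming a collect[] query parameter (similar to node_exporter's) and a per-request registry; the collector names and constructors are hypothetical:

package main

import (
    "net/http"

    "github.com/prometheus/client_golang/prometheus"
    "github.com/prometheus/client_golang/prometheus/promhttp"
)

// available maps a collector name to its constructor (hypothetical).
var available = map[string]func() prometheus.Collector{
    // "health": newHealthCollector,
    // "quotas": newQuotaCollector,
    // "repositories": newRepositoryCollector,
}

func metricsHandler(w http.ResponseWriter, r *http.Request) {
    wanted := r.URL.Query()["collect[]"]
    if len(wanted) == 0 { // no filter given: collect everything
        for name := range available {
            wanted = append(wanted, name)
        }
    }

    // Build a per-request registry containing only the requested collectors.
    reg := prometheus.NewRegistry()
    for _, name := range wanted {
        if newCollector, ok := available[name]; ok {
            reg.MustRegister(newCollector())
        }
    }
    promhttp.HandlerFor(reg, promhttp.HandlerOpts{}).ServeHTTP(w, r)
}

func main() {
    // e.g. /metrics?collect[]=quotas scraped every 30s,
    //      /metrics?collect[]=repositories scraped every 10m.
    http.HandleFunc("/metrics", metricsHandler)
    http.ListenAndServe(":9107", nil)
}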

Interested to hear your opinion!

artifacts collector fails when there are repos without artifacts

Hey folks 👋🏼

Just upgraded to 0.6.0 and I'm very excited about the new artifacts collector; however, it seems to have a small issue where it doesn't handle repositories that have no artifacts/images:

level=error ts=2021-01-26T14:41:24.282Z caller=harbor_exporter.go:263 msg="Error handling request for /projects/dockerhub/repositories/library/vault/artifacts?with_tag=true&with_scan_overview=true&page=1&page_size=100" http-statuscode="404 Not Found"
level=error ts=2021-01-26T14:41:24.282Z caller=metric_artifacts.go:296 unexpectedendofJSONinput=(MISSING)
level=error ts=2021-01-26T14:41:24.282Z caller=metric_artifacts.go:217 unexpectedendofJSONinput=(MISSING)

Above is a proxy cache project with a bunch of repositories where images were deleted by the default retention rule (although we have similar conditions in non-proxy-cache projects).

I haven't looked at the code, but I suppose it doesn't account for the possibility that a repo might not have any artifacts/images.

I'll take a look at the code, see if I can spot anything obvious.

/cc @ErmakovDmitriy, @c4po

Exporter OOMKilled all the time

Docker image tags tested: debug, latest, v0.5.3, v0.5.2

Harbor version: v2.1.0

Memory Limits: 256Mi

The pod is always restarting due to OOMKill after some time of scraping. Maybe there is a leak to be caught here? There are no logs showing any error or anything abnormal.

Feature request: add start_time, end_time & duration labels in replication_tasks metric

Hey there 👋🏼, I was thinking it would be useful to add a few more labels to the replication_tasks metric, specifically:

  • start_time
  • end_time
  • duration (end_time - start_time)

The replications/executions API used by the collector already returns this information:

[
  {
    "id": 44743,
    "policy_id": 140,
    "status": "Succeed",
    "status_text": "",
    "total": 1,
    "failed": 0,
    "succeed": 1,
    "in_progress": 0,
    "stopped": 0,
    "trigger": "scheduled",
    "start_time": "2021-04-27T13:50:01.183509Z",
    "end_time": "2021-04-27T13:50:30Z"
  },
]

In our case, we would like to use this information:

  1. To detect replication tasks that take a long time to complete, possibly due to improper or too-permissive name/tag filters, or potentially an issue with the upstream registry
  2. To track replications as they get executed over time; we had some issues with replication policies no longer being scheduled for some reason, and this would be a good way to detect if we ever hit that again
  3. An easy way to identify how "recent" the replicated images are
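
A minimal sketch of deriving the proposed duration from the start_time and end_time fields shown above (label/metric wiring omitted):

package main

import (
    "fmt"
    "time"
)

func main() {
    // Timestamps taken from the execution payload above.
    start, err := time.Parse(time.RFC3339, "2021-04-27T13:50:01.183509Z")
    if err != nil {
        panic(err)
    }
    end, err := time.Parse(time.RFC3339, "2021-04-27T13:50:30Z")
    if err != nil {
        panic(err)
    }
    fmt.Println("duration:", end.Sub(start)) // duration: 28.816491s
}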

Caching not configurable via env vars

Currently, caching can only be configured via command-line parameters.
To keep all configuration in one place for deployments with Docker, it would be great to make these settings configurable via environment variables as well (see the sketch below).
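
A minimal sketch of the kingpin pattern already used for other flags (e.g. web.listen-address with EXPORTER_PORT above), applied to the cache settings; the flag and environment variable names here are assumptions, not the exporter's actual ones:

package main

import (
    "fmt"

    "gopkg.in/alecthomas/kingpin.v2"
)

var (
    // Hypothetical flag and env-var names; the point is the Envar() call.
    cacheEnabled = kingpin.Flag("cache.enabled", "Enable metrics caching.").
            Envar("HARBOR_CACHE_ENABLED").Default("false").Bool()
    cacheDuration = kingpin.Flag("cache.duration", "How long to cache collected metrics.").
            Envar("HARBOR_CACHE_DURATION").Default("20s").Duration()
)

func main() {
    kingpin.Parse()
    fmt.Println("cache enabled:", *cacheEnabled, "duration:", *cacheDuration)
}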
