datadog / integrations-extras

Community developed integrations and plugins for the Datadog Agent.

License: BSD 3-Clause "New" or "Revised" License

Ruby 0.48% Python 98.52% Dockerfile 0.22% Shell 0.26% PHP 0.19% Go 0.29% Mako 0.05%

integrations-extras's Introduction

Datadog Agent and API Integrations


Collecting data is cheap; not having it when you need it can be very expensive. So we recommend instrumenting as much of your systems and applications as possible. This integrations repository will help you do that by making it easier to create and share new integrations for Datadog.

Note: Integrations in this repo are not included with the Agent, and are not currently packaged.

Building Integrations

For more information about how to build a new integration, please see the guide at docs.datadoghq.com.

Also see the agent integrations developer documentation for more information on guidelines, tutorials, and metadata.

Community Maintenance

Please note that integrations in this repository are maintained by the community. The current maintainer is listed in manifest.json and will address the pull requests and issues opened for that integration. Additionally, Datadog will assist on a best-effort basis, and will support the current maintainer whenever possible. When submitting a new integration, please indicate in the PR that you're willing to become the maintainer. For current maintainers, we understand circumstances change. If you're no longer able to maintain an integration, please notify us so we can find a new maintainer or mark the integration as orphaned. If you have any questions about the process, don't hesitate to contact us.

Submitting Your Integration

When you have completed development of your integration, submit a pull request for Datadog to review. Once we've reviewed it, we will either approve and merge your pull request or provide feedback and the next steps required for approval.

Reporting Issues

For more information on integrations, please reference our documentation and knowledge base. You can also visit our help page to connect with us.


integrations-extras's Issues

About building and distributing integrations (request)

Thank you very much for accepting my PR.

I immediately tried to install it in each environment, but I ran into a problem.

Our environment and that of our customers are independent, and there is no common build server or download repository.

Of course, we could build and upload to repositories in each environment, but it isn't economical to maintain build servers and download repositories for every environment, so I'm hoping for an official distribution.

It seems that a source archive is currently distributed for each tag.

https://github.com/DataDog/integrations-extras/releases

I think many people would be happier with a whl distribution than a source distribution, as follows:

https://github.com/withgod/integrations-extras/releases/tag/20201223
(I can use this URL, but it would be nice if you could do it officially.)

Please consider this!

Filebeat 6.3.2 and Datadog 6

I don't think this integration is working quite right: it only returns the registry file information and the GAUGE_METRIC_NAMES, but none of the INCREMENT_METRIC_NAMES seem to have any response when I run the check:
datadog-agent check filebeat --check-rate

It appears that everything is getting flagged in this block:

    def _should_keep_metric(self, name):
        if name not in self._should_keep_metrics:
            self._should_keep_metrics[name] = self._config.should_keep_metric(name)
        return self._should_keep_metrics[name]

And I should be matching at least some metrics:


instances:
  # The absolute path to the registry file used by filebeat
  # See https://www.elastic.co/guide/en/beats/filebeat/current/migration-registry-file.html
  - registry_file_path: /var/lib/filebeat/registry
    # If filebeat has been started with the `--httpprof [HOST]:PORT` option, then
    # the DD agent can gather data about the metrics filebeat exposes to
    # http://<host>:<port>/debug/vars
    # See https://www.elastic.co/guide/en/beats/filebeat/current/command-line-options.html
    # For autodiscovery, use http://%%host%%:%%port%%/stats
    stats_endpoint: http://localhost:5066/stats
    # If given, this should be a list of regular expressions that stipulates
    # which variables should be reported to Datadog - a filebeat metric
    # reported by the HTTP profiler will only be reported if it matches at
    # least one of those regexes
    only_metrics:
      - 'filebeat.harvester.closed'
      - ^filebeat
      - ^publish\.events$
    # timeout, in seconds (defaults to 2)
    timeout: 2
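
For reference, a minimal sketch of how a regex-based only_metrics filter of this shape would behave (hypothetical; the real logic lives in the check's FilebeatCheckInstanceConfig.should_keep_metric):

import re

# Hypothetical sketch: each only_metrics entry is treated as a regex, and a
# metric is kept if it matches at least one of them.
ONLY_METRICS = [r'filebeat.harvester.closed', r'^filebeat', r'^publish\.events$']
PATTERNS = [re.compile(p) for p in ONLY_METRICS]

def should_keep_metric(name):
    return any(p.search(name) for p in PATTERNS)

print(should_keep_metric('publish.events'))  # True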

could not convert string to float: neo4j-kernel, version: 3.3.5,69f246fab87bc58a931d27cfa5ec5c35f21b95fa

Output of the info page

# curl -H 'Accept: application/json; charset=UTF-8' -H 'Authorization: redacted' http://localhost:7474/db/manage/server/version
{
  "edition" : "community",
  "version" : "3.3.5"
}

# datadog-agent status

Getting the status from the agent.

==============
Agent (v6.4.1)
==============

  Status date: 2018-08-09 20:18:43.761257 UTC
  Pid: 104633
  Python Version: 2.7.9
  Logs:
  Check Runners: 1
  Log Level: info

  Paths
  =====
    Config File: /etc/datadog-agent/datadog.yaml
    conf.d: /etc/datadog-agent/conf.d
    checks.d: /etc/datadog-agent/checks.d

  Clocks
  ======
    NTP offset: -0.000640376 s
    System UTC time: 2018-08-09 20:18:43.761257 UTC

  Host Info
  =========
    bootTime: 2018-08-06 22:30:37.000000 UTC
    kernelVersion: 3.16.0-4-amd64
    os: linux
    platform: debian
    platformFamily: debian
    platformVersion: 8.10
    procs: 205
    uptime: 251266

....

=========
Collector
=========

  Running Checks
  ==============

    neo4j (unversioned)
    -------------------
      Total Runs: 1
      Metric Samples: 0, Total: 0
      Events: 0, Total: 0
      Service Checks: 1, Total: 1
      Average Execution Time : 21ms
      Error: could not convert string to float: neo4j-kernel, version: 3.3.5,69f246fab87bc58a931d27cfa5ec5c35f21b95fa
      Traceback (most recent call last):
        File "/opt/datadog-agent/embedded/lib/python2.7/site-packages/datadog_checks/checks/base.py", line 303, in run
          self.check(copy.deepcopy(self.instances[0]))
        File "/etc/datadog-agent/checks.d/neo4j.py", line 137, in check
          self.gauge(self.display.get(doc['row'][0].lower(),""), doc['row'][1], tags=tags)
        File "/opt/datadog-agent/embedded/lib/python2.7/site-packages/datadog_checks/checks/base.py", line 132, in gauge
          self._submit_metric(aggregator.GAUGE, name, value, tags=tags, hostname=hostname, device_name=device_name)
        File "/opt/datadog-agent/embedded/lib/python2.7/site-packages/datadog_checks/checks/base.py", line 129, in _submit_metric
          aggregator.submit_metric(self, self.check_id, mtype, name, float(value), tags, hostname)
      ValueError: could not convert string to float: neo4j-kernel, version: 3.3.5,69f246fab87bc58a931d27cfa5ec5c35f21b95fa

Additional environment details (Operating System, Cloud provider, etc):

I added self.log.info("doc: %s" % (doc)) to see the original value we were trying to convert:

2018-08-09 22:44:11 UTC | INFO | (datadog_agent.go:137 in LogMessage) | (neo4j.py:136) | doc: {u'meta': [None, None], u'row': [u'KernelVersion', u'neo4j-kernel, version: 3.3.5,*****
******************************b95fa']}

Steps to reproduce the issue:

Followed setup instructions from here:
https://docs.datadoghq.com/integrations/neo4j/

Additional information you deem important (e.g. issue happens only occasionally):

https://github.com/DataDog/integrations-core/blob/master/datadog_checks_base/datadog_checks/checks/base.py (line 129):

aggregator.submit_metric(self, self.check_id, mtype, ensure_bytes(name), float(value), tags, hostname)

The check tries to convert the metric value to a float, but here the original value is a string.

To work around this, I disabled a few problematic metrics:

--- /etc/datadog-agent/checks.d/neo4j.py  2018-08-08 09:53:14.712361122 +0000
+++ /etc/datadog-agent/checks.d/neo4j.py  2018-08-09 22:45:51.643572539 +0000
@@ -16,8 +16,8 @@

     # Neo4j metrics to send
     keys = set([
-        'kernelversion',
-        'storeid',
+        'off_kernelversion',
+        'off_storeid',
         'storecreationdate',
         'storelogversion',
         'kernelstarttime',
@@ -133,6 +133,7 @@
             self.SERVICE_CHECK_NAME, AgentCheck.OK, tags=service_check_tags)

         for doc in stats['results'][0]['data']:
+            #self.log.info("doc: %s" % (doc))
             if doc['row'][0].lower() in self.keys:
                 self.gauge(self.display.get(doc['row'][0].lower(),""), doc['row'][1], tags=tags)

Filebeat instance cache is broken (which breaks increment metrics)

Output of the info page

====================
Collector (v 5.30.1)
====================

  Status date: 2019-01-08 06:09:58 (10s ago)
  Pid: 4468
  Platform: Linux-3.13.0-163-generic-x86_64-with-Ubuntu-14.04-trusty
  Python Version: 2.7.15, 64bit
  Logs: <stderr>, /var/log/datadog/collector.log
<snip>
    filebeat (custom)
    -----------------
      - instance #0 [OK]
      - Collected 1 metric, 0 events & 0 service checks
<snip>

Additional environment details (Operating System, Cloud provider, etc):
Filebeat version 6.5.1

Steps to reproduce the issue:

  1. Install and configure filebeat check with:
init_config:

instances:
  - registry_file_path: /var/lib/filebeat/registry
    stats_endpoint: http://localhost:5066/stats

Describe the results you received:
Metrics in GAUGE_METRIC_NAMES are reported to Datadog, but none of the metrics in INCREMENT_METRIC_NAMES are.

Describe the results you expected:
All listed metrics in the above variables and present in http://localhost:5066/stats are reported.

Additional information you deem important (e.g. issue happens only occasionally):
The issue is here:

def check(self, instance):
    instance_key = hash_mutable(instance)
    if instance_key in self.instance_cache:
        config = self.instance_cache['config']
        profiler = self.instance_cache['profiler']
    else:
        self.instance_cache['config'] = config = FilebeatCheckInstanceConfig(instance)
        self.instance_cache['profiler'] = profiler = FilebeatCheckHttpProfiler(config)

Line 213 checks for instance_key in self.instance_cache, but that key is never actually put in that dictionary. The only keys in that dictionary are 'config' and 'profiler', added on lines 217 and 218.

The result of this is that the FilebeatCheckInstanceConfig and FilebeatCheckHttpProfiler are re-created every check, which means that the _previous_increment_values property on FilebeatCheckHttpProfiler is always empty, which means no increment metrics are ever reported.

This appears to have been introduced in #250.
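
A minimal sketch of one possible fix, keying the cached state by instance_key so it survives across check runs (assuming the hash_mutable helper and the two classes referenced above):

def check(self, instance):
    instance_key = hash_mutable(instance)
    if instance_key not in self.instance_cache:
        # Build the per-instance state once and store it under instance_key.
        config = FilebeatCheckInstanceConfig(instance)
        profiler = FilebeatCheckHttpProfiler(config)
        self.instance_cache[instance_key] = {'config': config, 'profiler': profiler}
    cached = self.instance_cache[instance_key]
    config = cached['config']
    profiler = cached['profiler']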

gnatsd_streaming integration tests broken

The integration tests for gnatsd_streaming are broken. This is probably because they are pinned to the latest Docker image, although I've tried older images and that does not seem to solve the issue.

no eviction metrics

  Checks
  ======

    network (1.6.0)
    ---------------
      - instance #0 [OK]
      - Collected 20 metrics, 0 events & 0 service checks

    aerospike (custom)
    ------------------
      - instance #0 [OK]
      - Collected 484 metrics, 0 events & 1 service check


The integration doesn't send aerospike.evicted_objects or the aerospike.xdr (write, read, error) metrics.

It worked before we updated this check (we had been using the code from 2016).

float() argument must be a string or a number

Output of the info page

2019-03-11 15:46:08 UTC | INFO | (pkg/collector/runner/runner.go:264 in work) | Running check neo4j
2019-03-11 15:46:08 UTC | WARN | (pkg/collector/py/datadog_agent.go:148 in LogMessage) | (base.py:579) | Metric: 'neo4j.kernel.version' has non float value: u'neo4j-kernel, version: 3.5.1,unknown-commit'. Only float values can be submitted as metrics.
2019-03-11 15:46:08 UTC | WARN | (pkg/collector/py/datadog_agent.go:148 in LogMessage) | (base.py:579) | Metric: 'neo4j.storeid' has non float value: u'54eb24e20ec052a0'. Only float values can be submitted as metrics.
2019-03-11 15:46:08 UTC | ERROR | (pkg/collector/runner/runner.go:295 in work) | Error running check neo4j: [{"message": "float() argument must be a string or a number", "traceback": "Traceback (most recent call last):\n  File \"/opt/datadog-agent/embedded/lib/python2.7/site-packages/datadog_checks/base/checks/base.py\", line 774, in run\n    self.check(copy.deepcopy(self.instances[0]))\n  File \"/etc/datadog-agent/checks.d/neo4j.py\", line 138, in check\n    self.gauge(self.display.get(doc['row'][0].lower(),\"\"), doc['row'][1], tags=tags)\n  File \"/opt/datadog-agent/embedded/lib/python2.7/site-packages/datadog_checks/base/checks/base.py\", line 585, in gauge\n    self._submit_metric(aggregator.GAUGE, name, value, tags=tags, hostname=hostname, device_name=device_name)\n  File \"/opt/datadog-agent/embedded/lib/python2.7/site-packages/datadog_checks/base/checks/base.py\", line 571, in _submit_metric\n    value = float(value)\nTypeError: float() argument must be a string or a number\n"}]
2019-03-11 15:46:08 UTC | INFO | (pkg/collector/runner/runner.go:330 in work) | Done running check neo4j

Additional environment details (Operating System, Cloud provider, etc):
Debian 9, with neo4j integration (and running ongdb/neo4j version 3.5.1).

Steps to reproduce the issue:

  1. install the neo4j check with a valid yaml configuration (you need neo4j 3.5.1 too I suppose)
  2. run the neo4j check
  3. you should see the described error in your logs

Describe the results you received:

no metrics were shown in the datadog metric explorer for neo4j

Describe the results you expected:

metrics showing up

Additional information you deem important (e.g. issue happens only occasionally):

I went ahead and extended the exception handling in the faulty code in neo4j/check.py (added a TypeError) and that has gotten the metrics into Datadog as expected. I'm not sure it's a proper fix, but it got me going.

doc['row'][1] can throw TypeError as well.

self.gauge(self.display.get(doc['row'][0].lower(),""), doc['row'][1], tags=tags)

If doc['row'][1] is a list, then float(value) will fail in base.py.
For example, in my case it came back as
... { "row": [ "MemoryPools", [] ], "meta": [ null ] } ...
from the HTTP call to db/data/transaction/commit, and hence it throws a TypeError.

This should be handled as well.
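
A minimal sketch of defensive handling that covers both failure modes (a hypothetical patch, not the integration's actual fix):

for doc in stats['results'][0]['data']:
    key, value = doc['row'][0].lower(), doc['row'][1]
    if key not in self.keys:
        continue
    try:
        # Rejects strings like 'neo4j-kernel, version: ...' as well as lists.
        numeric_value = float(value)
    except (TypeError, ValueError):
        self.log.debug("Skipping non-numeric value for %s: %r", key, value)
        continue
    self.gauge(self.display.get(key, ""), numeric_value, tags=tags)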

nvml check spams logs on non-GPU machines.

On non-GPU machines that lack the libnvidia-ml.so.1 library, the nvml check fails with a very ugly stack trace that only spams the logs.

2021-02-10 20:33:54 UTC | CORE | ERROR | (pkg/collector/runner/runner.go:292 in work) | Error running check nvml: [{"message": "NVML Shared Library Not Found", "traceback": "Traceback (most recent call last):
  File \"/opt/datadog-agent/embedded/lib/python3.8/site-packages/pynvml/nvml.py\", line 734, in _load_nvml_library
    nvml_lib = CDLL(\"libnvidia-ml.so.1\")
  File \"/opt/datadog-agent/embedded/lib/python3.8/ctypes/__init__.py\", line 373, in __init__
    self._handle = _dlopen(self._name, mode)
OSError: libnvidia-ml.so.1: cannot open shared object file: No such file or directory

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File \"/opt/datadog-agent/embedded/lib/python3.8/site-packages/datadog_checks/base/checks/base.py\", line 876, in run
    self.check(instance)
  File \"/opt/datadog-agent/embedded/lib/python3.8/site-packages/datadog_checks/nvml/nvml.py\", line 81, in check
    with NvmlInit():
  File \"/opt/datadog-agent/embedded/lib/python3.8/site-packages/datadog_checks/nvml/nvml.py\", line 26, in __enter__
    NvmlCheck.N.nvmlInit()
  File \"/opt/datadog-agent/embedded/lib/python3.8/site-packages/pynvml/nvml.py\", line 742, in nvmlInit
    _load_nvml_library()
  File \"/opt/datadog-agent/embedded/lib/python3.8/site-packages/pynvml/nvml.py\", line 736, in _load_nvml_library
    check_return(NVML_ERROR_LIBRARY_NOT_FOUND)
  File \"/opt/datadog-agent/embedded/lib/python3.8/site-packages/pynvml/nvml.py\", line 366, in check_return
    raise NVMLError(ret)
pynvml.nvml.NVMLError_LibraryNotFound: NVML Shared Library Not Found
"}]

It's much easier to use the same image and setup on all nodes, regardless of the hardware they might or might not have.
What's the quickest and cleanest way for a custom check to signal at runtime that it should be disabled? Presumably, this would occur in the constructor.

One way would be to mark the disabled state as true in the constructor and make check() consult that state on every call before doing anything.
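
A minimal sketch of that approach (hypothetical; class and attribute names are illustrative):

import pynvml
from datadog_checks.base import AgentCheck

class NvmlCheckSketch(AgentCheck):
    def __init__(self, name, init_config, instances):
        super(NvmlCheckSketch, self).__init__(name, init_config, instances)
        self._disabled = False
        try:
            # Probe once for the NVML library; disable quietly if it is absent.
            pynvml.nvmlInit()
            pynvml.nvmlShutdown()
        except pynvml.NVMLError:
            self._disabled = True
            self.log.info("NVML unavailable on this host; nvml check disabled")

    def check(self, instance):
        if self._disabled:
            return
        # ... normal metric collection would go here ...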

Allow root users to run lighthouse/puppeteer integration.

Output of the info page

 lighthouse (1.0.0)
    ------------------
      Instance ID: lighthouse:Career_Explorer_HomePage:1f0b752b4ed13d19 [ERROR]
      Configuration Source: file:/etc/datadog-agent/conf.d/lighthouse.d/conf.yaml
      Total Runs: 4
      Metric Samples: Last Run: 0, Total: 0
      Events: Last Run: 0, Total: 0
      Service Checks: Last Run: 0, Total: 0
      Average Execution Time : 26.326s
      Last Execution Date : 2020-04-15 16:15:28.000000 UTC
      Last Successful Execution Date : Never
      Error: ('', 'Unable to connect to Chrome\n', 1)
      Traceback (most recent call last):
        File "/opt/datadog-agent/embedded/lib/python3.8/site-packages/datadog_checks/base/checks/base.py", line 713, in run
          self.check(instance)
        File "/opt/datadog-agent/embedded/lib/python3.8/site-packages/datadog_checks/lighthouse/lighthouse.py", line 29, in check
          raise CheckException(json_string, error_message, exit_code)
      datadog_checks.base.errors.CheckException: ('', 'Unable to connect to Chrome\n', 1)

From trying to launch lighthouse manually:

  ChromeLauncher Waiting for browser................................................................................... +501ms
  ChromeLauncher Waiting for browser..................................................................................... +502ms
  ChromeLauncher Waiting for browser....................................................................................... +502ms
  ChromeLauncher Waiting for browser......................................................................................... +501ms
  ChromeLauncher Waiting for browser........................................................................................... +502ms
  ChromeLauncher Waiting for browser............................................................................................. +501ms
  ChromeLauncher Waiting for browser............................................................................................... +501ms
  ChromeLauncher Waiting for browser................................................................................................. +502ms
  ChromeLauncher Waiting for browser................................................................................................... +500ms
  ChromeLauncher Waiting for browser..................................................................................................... +502ms
  ChromeLauncher Waiting for browser....................................................................................................... +501ms
  ChromeLauncher:error connect ECONNREFUSED 127.0.0.1:32849 +1ms
  ChromeLauncher:error Logging contents of /tmp/lighthouse.Nbp10Pb/chrome-err.log +0ms
  ChromeLauncher:error [1453:1453:0415/161429.284687:ERROR:zygote_host_impl_linux.cc(89)] Running as root without --no-sandbox is not supported. See https://crbug.com/638180.

Additional environment details (Operating System, Cloud provider, etc):

  • Datadog agent image.
  • Docker

Steps to reproduce the issue:

  1. Install Datadog Agent via Dockerfile pulling from datadog/agent:latest.
  2. Install node.js/npm/puppeteer and lighthouse globally.
  3. Follow the remaining configuration steps.

Describe the results you received:

I'm unable to successfully start lighthouse and make a connection to Chromium.
To my understanding, based on what I've googled, this has to do with running the Datadog Agent as a privileged user. As the root user I need to pass a --no-sandbox argument to Puppeteer, but there is no configuration option to do this via the Datadog Agent, and no environment variable that enables it.

The only workaround I currently know of is to run my entire Docker image as an unprivileged user. Unfortunately, that is far too complex a solution for the benefit this integration provides. I'd like a simpler path forward, so a configuration option for this would be helpful.

Describe the results you expected:

Ideally I'd like this to work in the simplest way possible.

Additional information you deem important (e.g. issue happens only occasionally):

The installation instructions are missing quite a few potential rabbit holes when using the dockerized Datadog agent. NPM and Node are rife with permission issues on install, and it'd be great if those steps were clarified to help customers install more quickly.

can't see metrics for hbase_regionserver integration

Output of the info page

 sudo /etc/init.d/datadog-agent status
Datadog Agent (supervisor) is running all child processes  [  OK  ]

sudo /etc/init.d/datadog-agent status -v (only the hbase section shown)

 hbase_regionserver (5.16.0)
    ---------------------------
      - instance #hbase_regionserver-127.0.0.1-9010 [OK] collected 13 metrics
      - Collected 13 metrics, 0 events & 0 service checks

Additional environment details (Operating System, Cloud provider, etc):
cloud provider - AWS
OS : CentOS release 6.8 (Final)

Steps to reproduce the issue:

  1. Just trying to get the HBase RegionServer metrics.

Describe the results you received:
Empty metrics

Describe the results you expected:
All the HBase metrics appearing in my Datadog account.

Additional information you deem important (e.g. issue happens only occasionally):

Datadog Filebeat integration: Registry denied for datadog to read

Output of the info page

Redirecting to /bin/systemctl status datadog-agent.service
โ— datadog-agent.service - Datadog Agent
   Loaded: loaded (/usr/lib/systemd/system/datadog-agent.service; enabled; vendor preset: disabled)
   Active: active (running) since Mon 2020-04-27 20:25:32 UTC; 13h ago
 Main PID: 20068 (agent)
   CGroup: /system.slice/datadog-agent.service
           โ””โ”€20068 /opt/datadog-agent/bin/agent/agent run -p /opt/datadog-agent/run/agent.pid

Apr 28 09:58:49 HOSTNAME sudo[27519]: dd-agent : TTY=unknown ; PWD=/ ; USER=asterisk ; COMMAND=/sbin/asterisk -rx core ...annels
Apr 28 09:58:56 HOSTNAME agent[20068]: 2020-04-28 09:58:56 UTC | CORE | ERROR | (pkg/collector/python/datadog_agent.go:....json'
Apr 28 09:58:56 HOSTNAME agent[20068]: 2020-04-28 09:58:56 UTC | CORE | ERROR | (pkg/collector/python/datadog_agent.go:...l/6455
Apr 28 09:59:04 HOSTNAME sudo[27590]: dd-agent : TTY=unknown ; PWD=/ ; USER=asterisk ; COMMAND=/sbin/asterisk -rx core ...annels
Apr 28 09:59:11 HOSTNAME agent[20068]: 2020-04-28 09:59:11 UTC | CORE | ERROR | (pkg/collector/python/datadog_agent.go:....json'
Apr 28 09:59:11 HOSTNAME agent[20068]: 2020-04-28 09:59:11 UTC | CORE | ERROR | (pkg/collector/python/datadog_agent.go:...l/6455
Apr 28 09:59:19 HOSTNAME sudo[27664]: dd-agent : TTY=unknown ; PWD=/ ; USER=asterisk ; COMMAND=/sbin/asterisk -rx core ...annels
Apr 28 09:59:26 HOSTNAME agent[20068]: 2020-04-28 09:59:26 UTC | CORE | ERROR | (pkg/collector/python/datadog_agent.go:....json'
Apr 28 09:59:26 HOSTNAME agent[20068]: 2020-04-28 09:59:26 UTC | CORE | ERROR | (pkg/collector/python/datadog_agent.go:...l/6455
Apr 28 09:59:34 HOSTNAME sudo[27736]: dd-agent : TTY=unknown ; PWD=/ ; USER=asterisk ; COMMAND=/sbin/asterisk -rx core ...annels
Hint: Some lines were ellipsized, use -l to show in full.

Additional environment details (Operating System, Cloud provider, etc):
Cloud provider: AWS
OS: CentOS 7
Datadog agent version: Agent 6.13.0 - Commit: df8e880 - Serialization version: 4.7.1 - Go version: go1.11.5

Steps to reproduce the issue:

  1. Install Datadog agent
  2. Setup filebeat integration
  3. Tail -f /var/log/datadog-agent/agent.log

Describe the results you received:
The metrics are being sent to Datadog, but for some reason the logs show a permission error: the agent is not allowed to read the filebeat registry.

2020-04-28 09:59:11 UTC | CORE | ERROR | (pkg/collector/python/datadog_agent.go:106 in LogMessage) | (filebeat.py:245) | Cannot read the registry log file at /var/lib/filebeat/registry/filebeat/data.json: [Errno 13] Permission denied: '/var/lib/filebeat/registry/filebeat/data.json'
2020-04-28 09:59:11 UTC | CORE | ERROR | (pkg/collector/python/datadog_agent.go:106 in LogMessage) | (filebeat.py:249) | You might be interesting in having a look at https://github.com/elastic/beats/pull/6455
2020-04-28 09:59:26 UTC | CORE | ERROR | (pkg/collector/python/datadog_agent.go:106 in LogMessage) | (filebeat.py:245) | Cannot read the registry log file at /var/lib/filebeat/registry/filebeat/data.json: [Errno 13] Permission denied: '/var/lib/filebeat/registry/filebeat/data.json'
2020-04-28 09:59:26 UTC | CORE | ERROR | (pkg/collector/python/datadog_agent.go:106 in LogMessage) | (filebeat.py:249) | You might be interesting in having a look at https://github.com/elastic/beats/pull/6455

Describe the results you expected:
The confusing part is that filebeat metrics are being sent to Datadog despite the error appearing in the logs. We've tried to set the file permissions via our automation (SCM) but did not succeed, and we've even used the registry file permission setting in the Filebeat yml configuration.

Additional information you deem important (e.g. issue happens only occasionally):
We've seen the discussion in the following issue, elastic/beats#6455, but could not get rid of the error.

[nextcloud] not reporting any metrics

Output of the info page

===============
Agent (v6.11.3)
===============

  Status date: 2019-06-23 06:39:10.856875 AWST
  Agent start: 2019-06-23 06:33:30.917023 AWST
  Pid: 29150
  Python Version: 2.7.16
  Check Runners: 4
  Log Level: info

  Paths
  =====
    Config File: /etc/datadog-agent/datadog.yaml
    conf.d: /etc/datadog-agent/conf.d
    checks.d: /etc/datadog-agent/checks.d

  Clocks
  ======
    NTP offset: -1.58ms
    System UTC time: 2019-06-23 06:39:10.856875 AWST

  Host Info
  =========
    bootTime: 2019-06-16 04:21:33.000000 AWST
    kernelVersion: 4.9.0-9-amd64
    os: linux
    platform: debian
    platformFamily: debian
    platformVersion: 9.9
    procs: 172
    uptime: 170h11m59s
    virtualizationRole: host
    virtualizationSystem: kvm

  Hostnames
  =========
    hostname: do-nextcloud
    socket-fqdn: do-nextcloud.localdomain.
    socket-hostname: do-nextcloud
    hostname provider: os
    unused hostname providers:
      aws: not retrieving hostname from AWS: the host is not an ECS instance, and other providers already retrieve non-default hostnames
      configuration/environment: hostname is empty
      gce: unable to retrieve hostname from GCE: status code 404 trying to GET http://169.254.169.254/computeMetadata/v1/instance/hostname

=========
Collector
=========

  Running Checks
  ==============
    
    cpu
    ---
      Instance ID: cpu [OK]
      Total Runs: 23
      Metric Samples: Last Run: 6, Total: 132
      Events: Last Run: 0, Total: 0
      Service Checks: Last Run: 0, Total: 0
      Average Execution Time : 0s
      
    
    disk (2.1.0)
    ------------
      Instance ID: disk:e5dffb8bef24336f [OK]
      Total Runs: 22
      Metric Samples: Last Run: 112, Total: 2,464
      Events: Last Run: 0, Total: 0
      Service Checks: Last Run: 0, Total: 0
      Average Execution Time : 57ms
      
    
    file_handle
    -----------
      Instance ID: file_handle [OK]
      Total Runs: 23
      Metric Samples: Last Run: 5, Total: 115
      Events: Last Run: 0, Total: 0
      Service Checks: Last Run: 0, Total: 0
      Average Execution Time : 0s
      
    
    io
    --
      Instance ID: io [OK]
      Total Runs: 22
      Metric Samples: Last Run: 91, Total: 1,939
      Events: Last Run: 0, Total: 0
      Service Checks: Last Run: 0, Total: 0
      Average Execution Time : 0s
      
    
    load
    ----
      Instance ID: load [OK]
      Total Runs: 23
      Metric Samples: Last Run: 6, Total: 138
      Events: Last Run: 0, Total: 0
      Service Checks: Last Run: 0, Total: 0
      Average Execution Time : 0s
      
    
    memory
    ------
      Instance ID: memory [OK]
      Total Runs: 22
      Metric Samples: Last Run: 17, Total: 374
      Events: Last Run: 0, Total: 0
      Service Checks: Last Run: 0, Total: 0
      Average Execution Time : 0s
      
    
    network (1.10.0)
    ----------------
      Instance ID: network:e0204ad63d43c949 [OK]
      Total Runs: 23
      Metric Samples: Last Run: 26, Total: 598
      Events: Last Run: 0, Total: 0
      Service Checks: Last Run: 0, Total: 0
      Average Execution Time : 0s
      
    
    nextcloud (0.0.1)
    -----------------
      Instance ID: nextcloud:10a92fa811c0c87b [OK]
      Total Runs: 23
      Metric Samples: Last Run: 0, Total: 0
      Events: Last Run: 0, Total: 0
      Service Checks: Last Run: 1, Total: 23
      Average Execution Time : 24ms
      
    
    ntp
    ---
      Instance ID: ntp:b4579e02d1981c12 [OK]
      Total Runs: 22
      Metric Samples: Last Run: 1, Total: 22
      Events: Last Run: 0, Total: 0
      Service Checks: Last Run: 1, Total: 22
      Average Execution Time : 0s
      
    
    uptime
    ------
      Instance ID: uptime [OK]
      Total Runs: 23
      Metric Samples: Last Run: 1, Total: 23
      Events: Last Run: 0, Total: 0
      Service Checks: Last Run: 0, Total: 0
      Average Execution Time : 0s
      
========
JMXFetch
========

  Initialized checks
  ==================
    no checks
    
  Failed checks
  =============
    no checks
    
=========
Forwarder
=========

  Transactions
  ============
    CheckRunsV1: 22
    Dropped: 0
    DroppedOnInput: 0
    Events: 0
    HostMetadata: 0
    IntakeV1: 3
    Metadata: 0
    Requeued: 0
    Retried: 0
    RetryQueueSize: 0
    Series: 0
    ServiceChecks: 0
    SketchSeries: 0
    Success: 47
    TimeseriesV1: 22

  API Keys status
  ===============
    API key ending with e8c6e: API Key valid

==========
Endpoints
==========
  https://app.datadoghq.com - API Key ending with:
      - e8c6e

==========
Logs Agent
==========

  Logs Agent is not running

=========
Aggregator
=========
  Checks Metric Sample: 6,214
  Dogstatsd Metric Sample: 264
  Event: 1
  Events Flushed: 1
  Number Of Flushes: 22
  Series Flushed: 5,225
  Service Check: 250
  Service Checks Flushed: 264

=========
DogStatsD
=========
  Event Packets: 0
  Event Parse Errors: 0
  Metric Packets: 264
  Metric Parse Errors: 0
  Service Check Packets: 0
  Service Check Parse Errors: 0
  Udp Packet Reading Errors: 0
  Udp Packets: 265
  Uds Origin Detection Errors: 0
  Uds Packet Reading Errors: 0
  Uds Packets: 0

Additional environment details (Operating System, Cloud provider, etc):

Debian 9, Digitalocean

Steps to reproduce the issue:

  1. install datadog-agent
  2. install nextcloud integration (following steps from bind9 integration)

Describe the results you received:
Seems like the integration is installed, but according to the status page no metrics are being reported, only the service check.

Describe the results you expected:
Metrics as described in the nextcloud metadata.csv.

Additional information you deem important (e.g. issue happens only occasionally):

Permission error when running ddev

Hi!

Cannot run ddev. Getting this:

Traceback (most recent call last):
  File "/usr/bin/ddev", line 7, in <module>
    from datadog_checks.dev.tooling.cli import ddev
  File "/home/ec2-user/.local/lib/python2.7/site-packages/datadog_checks/dev/tooling/cli.py", line 35
    echo_warning(f'Unable to create config file located at {CONFIG_FILE}. Please check your permissions.')
                ^
SyntaxError: invalid syntax

Did the following:

  1. pip install "datadog-checks-dev[cli]"
  2. ddev -e release build aws_pricing
  3. Traceback (most recent call last):
    File "/usr/bin/ddev", line 7, in
    from datadog_checks.dev.tooling.cli import ddev
    File "/home/ec2-user/.local/lib/python2.7/site-packages/datadog_checks/dev/tooling/cli.py", line 35
    echo_warning(f'Unable to create config file located at {CONFIG_FILE}. Please check your permissions.')

Here is where I put it:
/home/ec2-user/dd/integrations-extras-master

Amazon Linux 2 AMI

problem with setup_env

Here is what I see when running rake setup_env:

Downloading https://pypi.io/packages/source/s/setuptools/setuptools-27.2.0.zip
Extracting in /var/folders/pk/q2qft9p95312_pm_q5z9_mk00000gn/T/tmptTZWQY
Traceback (most recent call last):
  File "/Users/mattw/Projects/DatadogRepos/integrations-extras-pagespeed/venv/ez_setup.py", line 426, in <module>
    sys.exit(main())
  File "/Users/mattw/Projects/DatadogRepos/integrations-extras-pagespeed/venv/ez_setup.py", line 423, in main
    return _install(archive, _build_install_args(options))
  File "/Users/mattw/Projects/DatadogRepos/integrations-extras-pagespeed/venv/ez_setup.py", line 56, in _install
    with archive_context(archive_filename):
  File "/usr/local/Cellar/python/2.7.12/Frameworks/Python.framework/Versions/2.7/lib/python2.7/contextlib.py", line 17, in __enter__
    return self.gen.next()
  File "/Users/mattw/Projects/DatadogRepos/integrations-extras-pagespeed/venv/ez_setup.py", line 107, in archive_context
    with ContextualZipFile(filename) as archive:
  File "/Users/mattw/Projects/DatadogRepos/integrations-extras-pagespeed/venv/ez_setup.py", line 91, in __new__
    return zipfile.ZipFile(*args, **kwargs)
  File "/usr/local/Cellar/python/2.7.12/Frameworks/Python.framework/Versions/2.7/lib/python2.7/zipfile.py", line 770, in __init__
    self._RealGetContents()
  File "/usr/local/Cellar/python/2.7.12/Frameworks/Python.framework/Versions/2.7/lib/python2.7/zipfile.py", line 811, in _RealGetContents
    raise BadZipfile, "File is not a zip file"
zipfile.BadZipfile: File is not a zip file

[Filebeat] conf.yaml.example is invalid

The file conf.yaml.example for the filebeat integration is invalid.

If I understand the source code correctly, it should be something like this:

init_config:

instances:
  # Full path to the filebeat registry file
  - registry_file_path: /path/to/filebeat/registry

How to set 'instances:' lists in aws_pricing.d/conf.yaml? (integration-extras/aws_pricing)

Hi,
I'm trying to set up a custom integration (aws_pricing) on an AWS Linux instance.
The aws_pricing integration installed successfully with the following shell command: sudo -u dd-agent -s /bin/bash -c 'datadog-agent integration install -t datadog-aws_pricing==1.0.0'
but errors occurred from conf.d/aws_pricing.d/conf.yaml when restarting the datadog-agent.
conf.yaml:

init_config:

instances:
  - i-04fbxxxxxxxxxxxxx
  - i-0654xxxxxxxxxxxxx
  - i-0da4xxxxxxxxxxxxx

error message:

  Config Errors
  ==============
    aws_pricing
    -----------
      yaml: unmarshal errors:
  line 4: cannot unmarshal !!str `i-04fb0...` into integration.RawMap
  line 5: cannot unmarshal !!str `i-06540...` into integration.RawMap
  line 6: cannot unmarshal !!str `i-0da45...` into integration.RawMap

How should the conf.yaml file be set up (see the sketch below)?
Thanks.

integration-extras/aws_pricing
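
The unmarshal errors indicate that each entry under instances must be a YAML mapping, not a bare string. A minimal sketch of a valid shape (the option shown is the generic min_collection_interval; check aws_pricing's conf.yaml.example for the integration's real options):

init_config:

instances:
  # Each instance must be a mapping (key: value), not a bare string like
  # "- i-04fbxxxxxxxxxxxxx"; bare strings are what integration.RawMap rejects.
  - min_collection_interval: 3600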

[nvml] broken on p2 (K80) instances

The check is making calls without attempting to catch exceptions. This fails on old K80 machines such as p2 instances, because some metrics are not supported:

2020-06-08 17:32:11 UTC | CORE | ERROR | (pkg/collector/runner/runner.go:292 in work) | Error running check nvml: [{"message": "Not Supported", "traceback": "Traceback (most recent call last):
  File \"/opt/datadog-agent/embedded/lib/python3.8/site-packages/datadog_checks/base/checks/base.py\", line 820, in run
    self.check(instance)
  File \"/opt/datadog-agent/embedded/lib/python3.8/site-packages/datadog_checks/nvml/nvml.py\", line 81, in check
    self.gather(instance)
  File \"/opt/datadog-agent/embedded/lib/python3.8/site-packages/datadog_checks/nvml/nvml.py\", line 94, in gather
    self.gather_gpu(handle, tags)
  File \"/opt/datadog-agent/embedded/lib/python3.8/site-packages/datadog_checks/nvml/nvml.py\", line 122, in gather_gpu
    consumption = NvmlCheck.N.nvmlDeviceGetTotalEnergyConsumption(handle)
  File \"/opt/datadog-agent/embedded/lib/python3.8/site-packages/pynvml/nvml.py\", line 1252, in nvmlDeviceGetTotalEnergyConsumption
    check_return(ret)
  File \"/opt/datadog-agent/embedded/lib/python3.8/site-packages/pynvml/nvml.py\", line 366, in check_return
    raise NVMLError(ret)
pynvml.nvml.NVMLError_NotSupported: Not Supported
"}]

Filebeat metric names

Hi!

Not all metric names from the Filebeat integration are under the same prefix. We have metrics:
filebeat.*
libbeat.*
publish.events (a very generic name, with no prefix)
registrar.* (if I see this in the Metrics Explorer, I have no idea that the metric comes from Filebeat)

My proposal: move all metrics under the "filebeat." prefix.
AFAIK every core integration follows this rule.

Filebeat using deprecated AgentCheck.increment and .decrement functions

Output of the info page

$ k exec -ti datadog-agent-snsfn bash
root@datadog-agent-snsfn:/# s6-svstat /var/run/s6/services/agent/
up (pid 349) 143 seconds

Additional environment details (Operating System, Cloud provider, etc):

AWS EKS

Steps to reproduce the issue:

  1. Install the Filebeat check in the Datadog agent pod in Kubernetes
  2. Register Filebeat check by adding Datadog autodiscovery annotations to Filebeat pod
  3. Stream logs of Datadog agent pod

Describe the results you received:

Filebeat integration prints the following deprecation notice:

datadog-agent 2020-03-24 19:43:53 UTC | CORE | WARN | (pkg/collector/python/datadog_agent.go:118 in LogMessage) | filebeat:e00f8fb4ebe0e9e6 | (base.py:536) | DEPRECATION NOTICE: `AgentCheck.increment`/`AgentCheck.decrement` are deprecated, please use `AgentCheck.gauge` or `AgentCheck.count` instead, with a different metric name

Describe the results you expected:

The Filebeat integration should be migrated to use the newer gauge and count functions, as there may be issues with continued use of increment and decrement.
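
A minimal sketch of what such a migration could look like (hypothetical; the metric naming would need to follow the integration's actual conventions):

from datadog_checks.base import AgentCheck

class FilebeatCheckSketch(AgentCheck):
    def __init__(self, *args, **kwargs):
        super(FilebeatCheckSketch, self).__init__(*args, **kwargs)
        self._previous = {}

    def submit_counter(self, name, current, tags=None):
        # Replace AgentCheck.increment with AgentCheck.count by submitting
        # the delta observed since the previous run.
        previous = self._previous.get(name)
        if previous is not None and current >= previous:
            self.count(name + '.count', current - previous, tags=tags)
        self._previous[name] = current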

Additional information you deem important (e.g. issue happens only occasionally):

Add offset to filebeat integration

It would be useful to add something like this to the gauge:

self.gauge('filebeat.registry.offset', offset,
           tags=['source:{0}'.format(source)])

The stats API only gives you the compressed output bytes, and you might want the non-compressed byte counts as well.

Unit of measure: Celsius

I am currently working on an integration for a hardware RAID controller that supports temperature monitoring, and I would find it useful to be able to use a temperature unit in metadata.csv's unit_name metadata.

Such a unit should be expressed as °C, or Celsius.

Cannot install logstash integration

Hello there,

While trying to install the Logstash integration, I'm getting:

Error: Some errors prevented moving datadog-logstash configuration files: open /opt/datadog-agent/embedded/lib/python2.7/site-packages/datadog_checks/logstash/data: no such file or directory

The installation does work, but this confuses Ansible or whatever is invoking the install.

Output of the info page

โ— datadog-agent.service - Datadog Agent
   Loaded: loaded (/lib/systemd/system/datadog-agent.service; enabled; vendor preset: enabled)
  Drop-In: /etc/systemd/system/datadog-agent.service.d
           โ””โ”€datadog-configure.conf
   Active: active (running) since Thu 2019-09-26 06:59:51 UTC; 31min ago
 Main PID: 2611 (agent)
    Tasks: 12
   Memory: 39.5M
      CPU: 11.177s
   CGroup: /system.slice/datadog-agent.service
           โ””โ”€2611 /opt/datadog-agent/bin/agent/agent run -p /opt/datadog-agent/run/agent.pid

    logstash (0.0.1)
    ----------------
      Instance ID: logstash:cebf59fff195f96e [OK]
      Configuration Source: file:/etc/datadog-agent/conf.d/logstash.d/logstash.yaml
      Total Runs: 1
      Metric Samples: Last Run: 219, Total: 219
      Events: Last Run: 0, Total: 0
      Service Checks: Last Run: 1, Total: 1
      Average Execution Time : 1.979s

Additional environment details (Operating System, Cloud provider, etc):

AWS, Ubuntu 16.04

Steps to reproduce the issue:

  1. Build the wheel in the python:3-buster Docker image
  2. Deploy the whl
  3. datadog-agent integration install -w datadog_logstash-0.0.1-py2.py3-none-any.whl

Describe the results you received:

Error: Some errors prevented moving datadog-logstash configuration files: open /opt/datadog-agent/embedded/lib/python2.7/site-packages/datadog_checks/logstash/data: no such file or directory

Describe the results you expected:

datadog-agent integration install should have succeeded.

[CI] `hbase_master` and `hbase_regionserver` tests broken

Hi @everpeace!

Currently the CI is broken for the hbase tests, failing with the following error:

testCustomJMXMetric (test_hbase_master.TestHbase_master) ... 2017-07-05 14:07:21,923 | ERROR | App | Cannot connect to instance localhost:10101. java.io.IOException: Failed to retrieve RMIServer stub: javax.naming.CommunicationException [Root exception is java.rmi.ConnectIOException: error during JRMP connection establishment; nested exception is: 
	java.io.EOFException]
java.io.IOException: java.io.IOException: Failed to retrieve RMIServer stub: javax.naming.CommunicationException [Root exception is java.rmi.ConnectIOException: error during JRMP connection establishment; nested exception is: 
	java.io.EOFException]
	at org.datadog.jmxfetch.Connection.connectWithTimeout(Connection.java:117)
	at org.datadog.jmxfetch.Connection.createConnection(Connection.java:61)
	at org.datadog.jmxfetch.RemoteConnection.<init>(RemoteConnection.java:56)
	at org.datadog.jmxfetch.ConnectionFactory.createConnection(ConnectionFactory.java:29)
	at org.datadog.jmxfetch.Instance.getConnection(Instance.java:162)
	at org.datadog.jmxfetch.Instance.init(Instance.java:173)
	at org.datadog.jmxfetch.App.init(App.java:511)
	at org.datadog.jmxfetch.App.main(App.java:115)

Are you aware of this issue? Was it working earlier and did it stop all of a sudden? If you know what may be going on, and could fix, that would be great. I know debugging Travis issues is a pain sometimes, so please let me know if I can help with anything. Also, if you don't have the bandwidth right now, let us know and I'll try to chip in where I may.

Thank you!

Python 2.7 Docker image SyntaxError: invalid syntax

Additional environment details (Operating System, Cloud provider, etc):
According to the official documentation, I can install extras via Docker:
https://docs.datadoghq.com/agent/guide/community-integrations-installation-with-docker-agent/?tab=docker

Steps to reproduce the issue:

  1. Create Dockerfile with content:
FROM python:2.7 AS wheel_builder
WORKDIR /wheels
RUN pip install "datadog-checks-dev[cli]"
RUN git clone https://github.com/DataDog/integrations-extras.git
RUN ddev -d config set extras ./integrations-extras
RUN ddev -e release build gnatsd

FROM datadog/agent:6.16.1
COPY --from=wheel_builder /wheels/integrations-extras/gnatsd/dist/ /dist
RUN mkdir -p /dd
RUN /bin/cp /etc/datadog-agent/datadog-kubernetes.yaml /dd/datadog.yaml

RUN agent integration install -r -c /dd -w /dist/*.whl
RUN rm -r /dd
  2. Build the image: docker build -t moleksyuk/dockers:datadog-gnatsd-6.16.1 .

Describe the results you received:

Sending build context to Docker daemon  112.4MB
Step 1/12 : FROM python:2.7 AS wheel_builder
 ---> 3c1fea91a131
Step 2/12 : WORKDIR /wheels
 ---> Using cache
 ---> 30f234577989
Step 3/12 : RUN pip install "datadog-checks-dev[cli]"
 ---> Using cache
 ---> 2177ba4a5fc4
Step 4/12 : RUN git clone https://github.com/DataDog/integrations-extras.git
 ---> Using cache
 ---> 059ce9e60d2b
Step 5/12 : RUN ddev -d config set extras ./integrations-extras
 ---> Running in 4fbc43393f8f
Traceback (most recent call last):
  File "/usr/local/bin/ddev", line 5, in <module>
    from datadog_checks.dev.tooling.cli import ddev
  File "/usr/local/lib/python2.7/site-packages/datadog_checks/dev/tooling/cli.py", line 35
    echo_warning(f'Unable to create config file located at `{CONFIG_FILE}`. Please check your permissions.')
                                                                                                          ^
SyntaxError: invalid syntax
The command '/bin/sh -c ddev -d config set extras ./integrations-extras' returned a non-zero code: 1

Describe the results you expected:
The Docker image is built successfully with Python 2.7.

Additional information you deem important (e.g. issue happens only occasionally):
If you replace FROM python:2.7 AS wheel_builder with FROM python:3.7 AS wheel_builder, everything works fine (the f-string syntax used by ddev requires Python 3.6+).

AttributeError: 'module' object has no attribute 'tracemalloc_enabled'

Output of the info page

[root@101b0f8edcb1 riak]# datadog-agent status
Getting the status from the agent.

===============
Agent (v6.13.0)
===============

  Status date: 2019-09-07 23:05:48.877481 UTC
  Agent start: 2019-09-07 23:04:39.704862 UTC
  Pid: 2125
  Go Version: go1.11.5
  Python Version: 2.7.16
  Check Runners: 4
  Log Level: info

  Paths
  =====
    Config File: /etc/datadog-agent/datadog.yaml
    conf.d: /etc/datadog-agent/conf.d
    checks.d: /etc/datadog-agent/checks.d

  Clocks
  ======
    NTP offset: -29.517ms
    System UTC time: 2019-09-07 23:05:48.877481 UTC

  Host Info
  =========
    bootTime: 2019-09-07 08:00:36.000000 UTC
    kernelVersion: 4.9.184-linuxkit
    os: linux
    platform: amazon
    platformFamily: rhel
    platformVersion: 2018.03
    procs: 69
    uptime: 15h4m6s
    virtualizationRole: guest
    virtualizationSystem: docker

  Hostnames
  =========
    hostname: 101b0f8edcb1
    socket-fqdn: 101b0f8edcb1
    socket-hostname: 101b0f8edcb1
    hostname provider: os
    unused hostname providers:
      aws: not retrieving hostname from AWS: the host is not an ECS instance, and other providers already retrieve non-default hostnames
      configuration/environment: hostname is empty
      gce: unable to retrieve hostname from GCE: Get http://169.254.169.254/computeMetadata/v1/instance/hostname: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)

=========
Collector
=========



  Running Checks
  ==============

    cpu
    ---
      Instance ID: cpu [OK]
      Total Runs: 4
      Metric Samples: Last Run: 6, Total: 18
      Events: Last Run: 0, Total: 0
      Service Checks: Last Run: 0, Total: 0
      Average Execution Time : 0s


    disk (2.4.0)
    ------------
      Instance ID: disk:e5dffb8bef24336f [ERROR]
      Total Runs: 4
      Metric Samples: Last Run: 0, Total: 0
      Events: Last Run: 0, Total: 0
      Service Checks: Last Run: 0, Total: 0
      Average Execution Time : 1ms
      Error: 'module' object has no attribute 'tracemalloc_enabled'
      Traceback (most recent call last):
        File "/opt/datadog-agent/embedded/lib/python2.7/site-packages/datadog_checks/base/checks/base.py", line 545, in run
          elif 'profile_memory' in self.init_config or datadog_agent.tracemalloc_enabled():
      AttributeError: 'module' object has no attribute 'tracemalloc_enabled'

    file_handle
    -----------
      Instance ID: file_handle [OK]
      Total Runs: 4
      Metric Samples: Last Run: 5, Total: 20
      Events: Last Run: 0, Total: 0
      Service Checks: Last Run: 0, Total: 0
      Average Execution Time : 8ms


    io
    --
      Instance ID: io [OK]
      Total Runs: 4
      Metric Samples: Last Run: 65, Total: 215
      Events: Last Run: 0, Total: 0
      Service Checks: Last Run: 0, Total: 0
      Average Execution Time : 1ms


    load
    ----
      Instance ID: load [OK]
      Total Runs: 5
      Metric Samples: Last Run: 6, Total: 30
      Events: Last Run: 0, Total: 0
      Service Checks: Last Run: 0, Total: 0
      Average Execution Time : 0s


    memory
    ------
      Instance ID: memory [OK]
      Total Runs: 4
      Metric Samples: Last Run: 17, Total: 68
      Events: Last Run: 0, Total: 0
      Service Checks: Last Run: 0, Total: 0
      Average Execution Time : 1ms


    network (1.11.0)
    ----------------
      Instance ID: network:e0204ad63d43c949 [ERROR]
      Total Runs: 5
      Metric Samples: Last Run: 0, Total: 0
      Events: Last Run: 0, Total: 0
      Service Checks: Last Run: 0, Total: 0
      Average Execution Time : 3ms
      Error: 'module' object has no attribute 'tracemalloc_enabled'
      Traceback (most recent call last):
        File "/opt/datadog-agent/embedded/lib/python2.7/site-packages/datadog_checks/base/checks/base.py", line 545, in run
          elif 'profile_memory' in self.init_config or datadog_agent.tracemalloc_enabled():
      AttributeError: 'module' object has no attribute 'tracemalloc_enabled'

    ntp
    ---
      Instance ID: ntp:b4579e02d1981c12 [OK]
      Total Runs: 4
      Metric Samples: Last Run: 1, Total: 4
      Events: Last Run: 0, Total: 0
      Service Checks: Last Run: 1, Total: 4
      Average Execution Time : 2.585s


    riak_repl (0.0.3)
    -----------------
      Instance ID: riak_repl:1eac7da8b03c73b3 [ERROR]
      Total Runs: 5
      Metric Samples: Last Run: 0, Total: 0
      Events: Last Run: 0, Total: 0
      Service Checks: Last Run: 0, Total: 0
      Average Execution Time : 1ms
      Error: 'module' object has no attribute 'tracemalloc_enabled'
      Traceback (most recent call last):
        File "/opt/datadog-agent/embedded/lib/python2.7/site-packages/datadog_checks/base/checks/base.py", line 545, in run
          elif 'profile_memory' in self.init_config or datadog_agent.tracemalloc_enabled():
      AttributeError: 'module' object has no attribute 'tracemalloc_enabled'

    uptime
    ------
      Instance ID: uptime [OK]
      Total Runs: 5
      Metric Samples: Last Run: 1, Total: 5
      Events: Last Run: 0, Total: 0
      Service Checks: Last Run: 0, Total: 0
      Average Execution Time : 0s

========
JMXFetch
========

  Initialized checks
  ==================
    no checks

  Failed checks
  =============
    no checks

=========
Forwarder
=========

  Transactions
  ============
    CheckRunsV1: 4
    Dropped: 0
    DroppedOnInput: 0
    Events: 0
    HostMetadata: 0
    IntakeV1: 2
    Metadata: 0
    Requeued: 0
    Retried: 0
    RetryQueueSize: 0
    Series: 0
    ServiceChecks: 0
    SketchSeries: 0
    Success: 10
    TimeseriesV1: 4

  API Keys status
  ===============
    API key ending with 9b6bc: API Key valid

==========
Endpoints
==========
  https://app.datadoghq.com - API Key ending with:
      - 9b6bc

==========
Logs Agent
==========

  Logs Agent is not running

=========
Aggregator
=========
  Checks Metric Sample: 441
  Dogstatsd Metric Sample: 1
  Event: 1
  Events Flushed: 1
  Number Of Flushes: 4
  Series Flushed: 261
  Service Check: 45
  Service Checks Flushed: 42

=========
DogStatsD
=========
  Event Packets: 0
  Event Parse Errors: 0
  Metric Packets: 0
  Metric Parse Errors: 0
  Service Check Packets: 0
  Service Check Parse Errors: 0
  Udp Bytes: 0
  Udp Packet Reading Errors: 0
  Udp Packets: 1
  Uds Bytes: 0
  Uds Origin Detection Errors: 0
  Uds Packet Reading Errors: 0
  Uds Packets: 0

Additional environment details (Operating System, Cloud provider, etc):

Steps to reproduce the issue:

Describe the results you received:
I have been getting this error from the riak_repl check after upgrading to datadog-agent version 6.13.0. More specifically, it seems to stem from breaking changes in datadog-checks-base version 9.3.0. When the riak_repl check is enabled, it errors and also causes the disk and network checks to generate the same error.

I would like to understand the change to datadog-checks-base and what I need to modify in riak_repl to fix this issue.

Describe the results you expected:
no errors

Additional information you deem important (e.g. issue happens only occasionally):

[eventstore] Unable to install integration

I cannot get the eventstore integration installed correctly.
According to the README, it should just be a matter of installing dd-check-eventstore, but I cannot find it anywhere: not in the Datadog apt repository, nor on pip. Hence, running apt-get install dd-check-eventstore or pip install dd-check-eventstore renders no results.

As a fallback, I cloned this repository and ran python setup.py build and python setup.py install. But all that seems to do is place an .egg file in my Python lib folder and nothing else.

After a chat with Nicholas Devlin on Slack, he mentioned I should be seeing an eventstore.py file in the Agent's checks.d directory. This directory is empty for me. More details below.

Additional environment details (Operating System, Cloud provider, etc):

  • Google Cloud GCE VM instance running Ubuntu 18.04.2 LTS
  • Python 2.7
  • Agent 6.10.1 - Commit: 5e1bec3 - Serialization version: 4.7.1

Steps to reproduce the issue:

  1. Clone this git repository
  2. cd integrations-extras/eventstore
  3. python setup.py build
  4. python setup.py install

Describe the results you received:

When copying the files eventstore.py and metrics.py from the build dir manually, I get the following error when restarting the agent and running datadog-agent status:

Python Check Loader:
        Traceback (most recent call last):
  File "/etc/datadog-agent/checks.d/eventstore.py", line 13, in <module>
    from .metrics import ALL_METRICS
ValueError: Attempted relative import in non-package

Describe the results you expected:

The integration to be compiled/installed correctly.

/cc @xorima as package maintainer
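
For what it's worth, the relative import in the traceback only works when the check is installed as a package; a file copied flat into checks.d runs as a top-level module. A hypothetical workaround, assuming metrics.py sits next to eventstore.py in checks.d:

# In checks.d/eventstore.py (hypothetical workaround):
try:
    from .metrics import ALL_METRICS  # installed as a package (e.g. from a wheel)
except (ImportError, ValueError):
    from metrics import ALL_METRICS   # copied flat into checks.d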

Can't create a Development environment on Ubuntu 16.04

Hi guys,

Tried to follow the steps from your guide: https://docs.datadoghq.com/guides/integration_sdk/
I'm getting the following error:

 rake setup_env
rake aborted!
Gem::ConflictError: Unable to activate datadog-sdk-testing-0.7.4, because rake-12.1.0 conflicts with rake (~> 11.0)
/var/lib/gems/2.3.0/gems/datadog-sdk-testing-0.7.4/lib/tasks/sdk.rake:11:in `<top (required)>'
/mnt/datadog/rakefile:20:in `load'
/mnt/datadog/rakefile:20:in `<top (required)>'
/var/lib/gems/2.3.0/gems/rake-12.1.0/exe/rake:27:in `<top (required)>'
Gem::ConflictError: Unable to activate datadog-sdk-testing-0.7.4, because rake-12.1.0 conflicts with rake (~> 11.0)
/var/lib/gems/2.3.0/gems/datadog-sdk-testing-0.7.4/lib/tasks/sdk.rake:11:in `<top (required)>'
/mnt/datadog/rakefile:20:in `load'
/mnt/datadog/rakefile:20:in `<top (required)>'
/var/lib/gems/2.3.0/gems/rake-12.1.0/exe/rake:27:in `<top (required)>'
LoadError: cannot load such file -- ci/default
/var/lib/gems/2.3.0/gems/datadog-sdk-testing-0.7.4/lib/tasks/sdk.rake:11:in `<top (required)>'
/mnt/datadog/rakefile:20:in `load'
/mnt/datadog/rakefile:20:in `<top (required)>'
/var/lib/gems/2.3.0/gems/rake-12.1.0/exe/rake:27:in `<top (required)>'
(See full trace by running task with --trace)

Development environment details:
Vagrant box: bento/ubuntu-16.04

Commands to reproduce it:

  • sudo -i
  • apt-get install ruby-full
  • cd ~/integrations-extras
  • gem install bundler
  • bundle install
  • rake setup_env

[Filebeat] Error when parsing the registry file

I get the following error when I try to use the filebeat integration.

Error shown when running agent status:

filebeat
    --------
      Total Runs: 2
      Metrics: 0, Total Metrics: 0
      Events: 0, Total Events: 0
      Service Checks: 0, Total Service Checks: 0
      Average Execution Time : 0ms
      Error: 'list' object has no attribute 'itervalues'
      Traceback (most recent call last):
        File "/opt/datadog-agent/bin/agent/dist/checks/__init__.py", line 332, in run
          self.check(copy.deepcopy(self.instances[0]))
        File "/etc/datadog-agent/checks.d/filebeat.py", line 30, in check
          for item in registry_contents.itervalues():
      AttributeError: 'list' object has no attribute 'itervalues'

System

Google Kubernetes Engine - Kubernetes 1.10.2-gk.3
OS: Container-Optimized OS (Chrome OS)
Datadog agent: 6.2.1-jmx

My registry file

The registry file that the filebeat integration tried to parse.

[
  {
    "FileStateOS": {
      "device": 2049,
      "inode": 8062
    },
    "offset": 123177,
    "source": "/var/lib/docker/containers/3395bc9e8f76c040e2a7cb3d10ac00415084e62bad3ebd11bee8aaa6286e0612/3395bc9e8f76c040e2a7cb3d10ac00415084e62bad3ebd11bee8aaa6286e0612-json.log",
    "timestamp": "2018-06-07T12:48:41.184566453Z",
    "ttl": -1,
    "type": "docker"
  },
  {
    "FileStateOS": {
      "device": 2049,
      "inode": 415428
    },
    "offset": 869,
    "source": "/var/lib/docker/containers/df4b8d396dfad06a164e89dad7a1f8369201496518f8e8703cc11e05b30ab828/df4b8d396dfad06a164e89dad7a1f8369201496518f8e8703cc11e05b30ab828-json.log",
    "timestamp": "2018-06-07T09:58:43.408651902Z",
    "ttl": -2,
    "type": "docker"
  },
  {
    "FileStateOS": {
      "device": 2049,
      "inode": 8082
    },
    "offset": 0,
    "source": "/var/lib/docker/containers/7d07de469a0fbe34f43abce50c22537a1673207899e94875e4876eda8bb49504/7d07de469a0fbe34f43abce50c22537a1673207899e94875e4876eda8bb49504-json.log",
    "timestamp": "2018-06-07T12:46:32.196587691Z",
    "ttl": -1,
    "type": "docker"
  },
  {
    "FileStateOS": {
      "device": 2049,
      "inode": 8155
    },
    "offset": 0,
    "source": "/var/lib/docker/containers/ab9ccfe73d0c2779f8f7985d17551a083221157ed4e467707d4cd2d15612e128/ab9ccfe73d0c2779f8f7985d17551a083221157ed4e467707d4cd2d15612e128-json.log",
    "timestamp": "2018-06-07T09:39:56.976326151Z",
    "ttl": -2,
    "type": "docker"
  },
  {
    "FileStateOS": {
      "device": 2049,
      "inode": 410824
    },
    "offset": 376359,
    "source": "/var/lib/docker/containers/cd55e2c8163d620ba31d7c547c948075d7a71b7e3841576c02fa76b0a4034c17/cd55e2c8163d620ba31d7c547c948075d7a71b7e3841576c02fa76b0a4034c17-json.log",
    "timestamp": "2018-06-07T12:05:38.478044287Z",
    "ttl": -2,
    "type": "docker"
  },
  {
    "FileStateOS": {
      "device": 2049,
      "inode": 410890
    },
    "offset": 1581,
    "source": "/var/lib/docker/containers/fe61754255ee3d70587a24684405010365a55968a7fa5578e57bf6713b38772e/fe61754255ee3d70587a24684405010365a55968a7fa5578e57bf6713b38772e-json.log",
    "timestamp": "2018-06-07T12:46:33.193958253Z",
    "ttl": -1,
    "type": "docker"
  },
  {
    "FileStateOS": {
      "device": 2049,
      "inode": 411014
    },
    "offset": 1148421,
    "source": "/var/lib/docker/containers/42ec4df378748a7975399ab4eacebf1a10426e4bcbfc370781a77dc441c5d64d/42ec4df378748a7975399ab4eacebf1a10426e4bcbfc370781a77dc441c5d64d-json.log",
    "timestamp": "2018-06-07T12:50:50.193160614Z",
    "ttl": -1,
    "type": "docker"
  },
  {
    "FileStateOS": {
      "device": 2049,
      "inode": 411064
    },
    "offset": 1610,
    "source": "/var/lib/docker/containers/d97a48995bfc3324ec87bd8517bb624209eb1629a2d880730326aa7626154463/d97a48995bfc3324ec87bd8517bb624209eb1629a2d880730326aa7626154463-json.log",
    "timestamp": "2018-06-07T12:46:32.184737588Z",
    "ttl": -1,
    "type": "docker"
  },
  {
    "FileStateOS": {
      "device": 2049,
      "inode": 421663
    },
    "offset": 81940,
    "source": "/var/lib/docker/containers/84d7a8b98a0d99713a04e3f97bc7a914067166d78e8ace55da8bf05cf8ab0d58/84d7a8b98a0d99713a04e3f97bc7a914067166d78e8ace55da8bf05cf8ab0d58-json.log",
    "timestamp": "2018-06-07T12:50:55.72736397Z",
    "ttl": -1,
    "type": "docker"
  },
  {
    "FileStateOS": {
      "device": 2049,
      "inode": 673791
    },
    "offset": 149914,
    "source": "/var/lib/docker/containers/5517b5b6319e111f3d9feb7ccf5bbf08c645062dc345c62c9e765a6eadb3b75b/5517b5b6319e111f3d9feb7ccf5bbf08c645062dc345c62c9e765a6eadb3b75b-json.log",
    "timestamp": "2018-06-07T12:46:06.2639707Z",
    "ttl": -2,
    "type": "docker"
  }
]
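
The registry above is a JSON array, while the check's registry_contents.itervalues() call assumes a dict (the older filebeat format, keyed by source path). A minimal sketch of a format-tolerant fix (a hypothetical helper, not the shipped check code):

import json

def iter_registry_entries(registry_path):
    # Older filebeat wrote a dict keyed by source path; filebeat 6.x
    # writes a plain JSON array, which breaks dict-only iteration.
    with open(registry_path) as f:
        contents = json.load(f)
    if isinstance(contents, dict):
        return list(contents.values())  # old format: {source: entry, ...}
    return contents                     # new format: [entry, ...]

for item in iter_registry_entries("/var/lib/filebeat/registry"):  # example path
    print(item["source"], item["offset"])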

rake version

I think you should pin the rake version to < 11.0, because this error

NoMethodError: undefined method `last_comment' for #<Rake::Application:0x00000002af9b30>

appears everywhere, even in the CI tests, where it causes failures.

Bad hbase_master Metrics Configuration

In the metrics configuration for the hbase_master integration, there is a problem with string quoting under Agent 6. When the integration is installed, this causes problems by interfering with JMX. Instructions for fixing the problem are in the file itself, but not in any supporting documentation:

#must wrap true and false in quotes for Agent 6

The error message that is generated is not particularly informative either:

2021-01-06 00:33:40 UTC | CORE | ERROR | (/root/.gimme/versions/go1.11.5.linux.amd64/src/net/http/server.go:1747 in func1) | Error from the agent http API server: http: panic serving 127.0.0.1:59930: interface conversion: interface {} is bool, not string
goroutine 57 [running]:
net/http.(*conn).serve.func1(0xc0000cb2c0)
	/root/.gimme/versions/go1.11.5.linux.amd64/src/net/http/server.go:1746 +0xd0
panic(0x2c1eb20, 0xc0002fa5a0)
	/root/.gimme/versions/go1.11.5.linux.amd64/src/runtime/panic.go:513 +0x1b9
github.com/DataDog/datadog-agent/pkg/util.GetJSONSerializableMap(0x2bf6ec0, 0xc00068b9b0, 0xc0004a4ba0, 0xb)
	/.omnibus/src/datadog-agent/src/github.com/DataDog/datadog-agent/pkg/util/common.go:124 +0x76d
github.com/DataDog/datadog-agent/pkg/util.GetJSONSerializableMap(0x2bf6ec0, 0xc00068b980, 0xc0004a4b50, 0x10)
	/.omnibus/src/datadog-agent/src/github.com/DataDog/datadog-agent/pkg/util/common.go:124 +0x173
github.com/DataDog/datadog-agent/pkg/util.GetJSONSerializableMap(0x2bf6ec0, 0xc00068b8f0, 0xc0004a4c00, 0x6)
	/.omnibus/src/datadog-agent/src/github.com/DataDog/datadog-agent/pkg/util/common.go:124 +0x173
github.com/DataDog/datadog-agent/pkg/util.GetJSONSerializableMap(0x2bf6ec0, 0xc00068b8c0, 0xc000969318, 0x7)
	/.omnibus/src/datadog-agent/src/github.com/DataDog/datadog-agent/pkg/util/common.go:124 +0x173
github.com/DataDog/datadog-agent/pkg/util.GetJSONSerializableMap(0x2bf6ec0, 0xc00068b890, 0x2bfc680, 0xc00068bc80)
	/.omnibus/src/datadog-agent/src/github.com/DataDog/datadog-agent/pkg/util/common.go:124 +0x173
github.com/DataDog/datadog-agent/pkg/util.GetJSONSerializableMap(0x2a50540, 0xc0000f9d60, 0xc000867c00, 0x17)
	/.omnibus/src/datadog-agent/src/github.com/DataDog/datadog-agent/pkg/util/common.go:137 +0x2ef
github.com/DataDog/datadog-agent/pkg/util.GetJSONSerializableMap(0x2bf6ec0, 0xc00068b3b0, 0xd80, 0x2a4f240)
	/.omnibus/src/datadog-agent/src/github.com/DataDog/datadog-agent/pkg/util/common.go:124 +0x173
github.com/DataDog/datadog-agent/cmd/agent/api/agent.getJMXConfigs(0x322ae60, 0xc00025e540, 0xc0004b3200)
	/.omnibus/src/datadog-agent/src/github.com/DataDog/datadog-agent/cmd/agent/api/agent/agent_jmx.go:58 +0x37a
github.com/DataDog/datadog-agent/cmd/agent/api/agent.componentConfigHandler(0x322ae60, 0xc00025e540, 0xc0004b3200)
	/.omnibus/src/datadog-agent/src/github.com/DataDog/datadog-agent/cmd/agent/api/agent/agent.go:107 +0x1e0
net/http.HandlerFunc.ServeHTTP(0x3010098, 0x322ae60, 0xc00025e540, 0xc0004b3200)
	/root/.gimme/versions/go1.11.5.linux.amd64/src/net/http/server.go:1964 +0x44
github.com/DataDog/datadog-agent/cmd/agent/api.validateToken.func1(0x322ae60, 0xc00025e540, 0xc0004b3200)
	/.omnibus/src/datadog-agent/src/github.com/DataDog/datadog-agent/cmd/agent/api/server.go:114 +0x99
net/http.HandlerFunc.ServeHTTP(0xc0000f95c0, 0x322ae60, 0xc00025e540, 0xc0004b3200)
	/root/.gimme/versions/go1.11.5.linux.amd64/src/net/http/server.go:1964 +0x44
github.com/DataDog/datadog-agent/vendor/github.com/gorilla/mux.(*Router).ServeHTTP(0xc000333ea0, 0x322ae60, 0xc00025e540, 0xc0004b3200)
	/.omnibus/src/datadog-agent/src/github.com/DataDog/datadog-agent/vendor/github.com/gorilla/mux/mux.go:162 +0xf1
net/http.serverHandler.ServeHTTP(0xc0002fe270, 0x322ae60, 0xc00025e540, 0xc0004b2e00)
	/root/.gimme/versions/go1.11.5.linux.amd64/src/net/http/server.go:2741 +0xab
net/http.(*conn).serve(0xc0000cb2c0, 0x322cd20, 0xc000686340)
	/root/.gimme/versions/go1.11.5.linux.amd64/src/net/http/server.go:1847 +0x646
created by net/http.(*Server).Serve
	/root/.gimme/versions/go1.11.5.linux.amd64/src/net/http/server.go:2851 +0x2f5

This error can be fixed by applying the quoting instructions from the comment above in metrics.yaml, but it took some investigation to realize this. Either metrics.yaml should be corrected so that this error does not occur, or the documentation should be updated for users of Agent 6.
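
For illustration, a minimal sketch (assuming PyYAML; the exclude_tags key is hypothetical) of why the quoting matters: YAML parses a bare true as a boolean, while the Agent's JMX config serializer expects strings, hence the "interface conversion: interface {} is bool, not string" panic:

import yaml  # PyYAML

unquoted = yaml.safe_load('exclude_tags: true')    # value parsed as a Python bool
quoted = yaml.safe_load('exclude_tags: "true"')    # value parsed as a string

print(type(unquoted["exclude_tags"]))  # <class 'bool'> -> Agent 6 panics while serializing
print(type(quoted["exclude_tags"]))    # <class 'str'>  -> serializes cleanly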

A developer for SIOS Technology is unable to install a Python wheel

Email 1:
Hello,

I'm Tadashi, a developer for SIOS Technology. I'm trying to create a Datadog integration, but I've been failing at the last step.

I succeeded in creating a wheel by following https://docs.datadoghq.com/developers/integrations/new_check_howto/?tab=configurationtemplate,
but the next step, the install command, failed.

The error message is the following.

sudo -u dd-agent datadog-agent integration install -w datadog_appkeeper-1.0.0-py2.py3-none-any.whl

For your security, only use this to install wheels containing an Agent integration and coming from a known source. The Agent cannot perform any verification on local wheels.
WARNING: Requirement 'datadog_appkeeper-1.0.0-py2.py3-none-any.whl' looks like a filename, but the file does not exist
Processing ./datadog_appkeeper-1.0.0-py2.py3-none-any.whl
ERROR: Could not install packages due to an EnvironmentError: [Errno 13] Permission denied: '/home/ec2-user/dd/integrations-extras/appkeeper/dist/datadog_appkeeper-1.0.0-py2.py3-none-any.whl'
Consider using the --user option or check the permissions.

It looks like the datadog-agent integration install command internally calls pip (or something similar), which outputs this message.
If this were plain pip, the --user option might be the answer, but the datadog-agent integration install command doesn't seem to have such an option.

Could you let me know how to resolve this issue?

One more thing: if you have a technical FAQ, please let me know the URL.

Email 2:
Thank you for your reply.

First, to create your integration tile, you need to clone the Integrations Extras repository
(https://github.com/DataDog/integrations-extras.git) to your Git.

Next you need to install Datadog's Developer Toolkit to create the proper scaffolding, so that you have the files that
you need--such as the README.md, metadata.csv, manifest.json, setup.py, etc.

I have already done these steps (cloned the integrations-extras repository, installed the Toolkit, created all the assets).
Then, following the documentation, I did the following:

  1. Built the wheel

This succeeded, so I think I created all the assets correctly.

  2. Installed the wheel

This is where I failed. Why?

Finally, you just need to take those files and commit them to a new branch
of the Integrations Extras repository on your Git, and then push
that to the Integrations Extras GitHub repository.
You should then see your branch in
https://github.com/DataDog/integrations-extras/compare and you can create a pull request.

Does this mean that I do not need to build and install the wheel in my local environment?
Of course I will verify the result with unit tests and integration tests, but is it okay to submit a PR before actually confirming that everything works?

Regards,
Tadashi ([email protected])
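
For what it's worth, a likely cause (an assumption based on the EACCES path above, not a confirmed diagnosis): the install command runs pip as the dd-agent user, which typically cannot read files under another user's home directory. A small diagnostic sketch:

import os

# Path taken from the error message above; run this as dd-agent
# (e.g. via `sudo -u dd-agent python check_access.py`, a hypothetical script name)
# to see which part of the path is unreadable.
wheel = "/home/ec2-user/dd/integrations-extras/appkeeper/dist/datadog_appkeeper-1.0.0-py2.py3-none-any.whl"
for path in (os.path.dirname(wheel), wheel):
    print(path, "readable:", os.access(path, os.R_OK))

# If this prints False, copying the wheel to a world-readable location
# (such as /tmp) before installing should avoid the permission error.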

`rake setup_env` issue

I'm getting an error setting up the environment. Looks like an underlying gem is failing?

geoff:integrations-extras geoffrey$ rake setup_env
/Users/geoffrey/.rbenv/versions/2.2.2/lib/ruby/gems/2.2.0/gems/datadog-sdk-testing-0.4.2/lib/tasks/sdk.rake
rake aborted!
NoMethodError: undefined method `last_comment' for #<Rake::Application:0x007fa970958b18>
/Users/geoffrey/dev/integrations-extras/Rakefile:21:in `load'
/Users/geoffrey/dev/integrations-extras/Rakefile:21:in `<top (required)>'
/Users/geoffrey/.rbenv/versions/2.2.2/bin/bundle:23:in `load'
/Users/geoffrey/.rbenv/versions/2.2.2/bin/bundle:23:in `<main>'
(See full trace by running task with --trace)

`setup_env` fails when system Python is not `2.7`

Using Archlinux, where /usr/bin/python is a symlink to /usr/bin/python3, the setup_env step is failing with the following error:

$> rake setup_env
Downloading https://pypi.io/packages/source/s/setuptools/setuptools-32.3.1.zip
Extracting in /tmp/tmp_yh69rdr
Now working in /tmp/tmp_yh69rdr/setuptools-32.3.1
Installing Setuptools
warning: no files found matching '*' under directory 'setuptools/_vendor'
Cloning into '/home/user/Work/integrations-extras/embedded/dd-agent'...
remote: Counting objects: 699, done.
remote: Compressing objects: 100% (641/641), done.
remote: Total 699 (delta 53), reused 311 (delta 21), pack-reused 0
Receiving objects: 100% (699/699), 6.97 MiB | 2.02 MiB/s, done.
Resolving deltas: 100% (53/53), done.
Command "python setup.py egg_info" failed with error code 1 in /tmp/pip-build-eh4fu0j5/supervisor/
sh: /home/user/Work/integrations-extras/venv/lib/python2.7/site-packages/datadog-agent.pth: No such file or directory

Apparently this is caused by the datadog-sdk-testing Ruby gem calling python directly rather than a more explicit version: https://github.com/DataDog/datadog-sdk-testing/blob/master/lib/tasks/sdk.rake#L22

This could be solved in the aforementioned gem by using python2 instead of python for each Python call.

AFAIK other Linux distributions are also moving to python3 as the system default, so this issue might impact them as well.
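
A quick way to confirm the mismatch before running rake setup_env (a diagnostic sketch, not part of the SDK):

import subprocess

# Print whichever interpreter the bare `python` command resolves to; the
# SDK gem expects 2.7, but on Arch `python` is a symlink to python3.
out = subprocess.check_output(["python", "--version"], stderr=subprocess.STDOUT)
print(out.decode().strip())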

Issues running on filebeat 6.3.0

It would appear that the check doesn't work on filebeat 6.3.0, now that the registry permissions can be fixed. You now get this error in the output:

2018-06-20 18:07:24 UTC | ERROR | (runner.go:276 in work) | Error running check filebeat: [{"message": "'list' object has no attribute 'itervalues'", "traceback": "Traceback (most recent call last):\n File \"/opt/datadog-agent/bin/agent/dist/checks/__init__.py\", line 332, in run\n self.check(copy.deepcopy(self.instances[0]))\n File \"/etc/datadog-agent/checks.d/filebeat.py\", line 30, in check\n for item in registry_contents.itervalues():\nAttributeError: 'list' object has no attribute 'itervalues'\n"}]

Example of my registry:

[ { "FileStateOS": { "device": 51713, "inode": 33246 }, "type": "log", "ttl": -1, "timestamp": "2018-06-20T17:36:57.928067509Z", "offset": 0, "source": "/var/log/ansible-pull.log" }, { "FileStateOS": { "device": 51713, "inode": 161944 }, "type": "log", "ttl": -1, "timestamp": "2018-06-20T18:07:33.128583217Z", "offset": 9173, "source": "/var/log/datadog/trace-agent.log" } } ]

Management of Community Integrations

Hi,

I am currently one of the community maintainers for your integrations (specifically eventstore). I want to start a conversation about how we know when issues or PRs are opened for the integrations we maintain, and how new contributors know whom to contact about issues with their integrations.

I would like to suggest we move to using the CODEOWNERS file as an easy way to find out who maintains which areas (see the sketch below).

What are your thoughts on this?
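
For concreteness, a hypothetical CODEOWNERS sketch (@xorima is mentioned above as the eventstore maintainer; the second handle is a placeholder):

# .github/CODEOWNERS (hypothetical sketch): GitHub auto-requests review
# from the matching owner when files under a path change.
/eventstore/ @xorima
/riak_repl/  @some-maintainer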

nvml metrics missing pod labels

Steps to reproduce the issue:

  1. Run check in a Kubernetes cluster

Describe the results you received:
Metrics are annotated with labels coming from the node (added by the agent), as well as the GPU number and pod name/namespace (added by the check itself).

Describe the results you expected:
The above, plus some/all the pod's labels and others that the core Kubernetes checks usually set (kube_service, etc.)

Additional information you deem important (e.g. issue happens only occasionally):
The pod name is nice, but it's hard to aggregate numbers without any of the pod's labels.
Yes, I know this is not trivial: it would require talking to the kubelet, which nothing in extras does yet. Can we call any of the core check code?

Filebeat and dd-agent permission issues

So, I'm trying to use the filebeat integration and am running into a permissions problem. Filebeat runs as root by default, because it basically needs to read every log file, while the dd-agent runs as dd-agent. I'm curious how people are dealing with the permissions on the registry file, since it's root/root 0600. I'd love to use this integration, but you can't just change that permission, since filebeat will revert it a few seconds later.

[neo4j] No connection adapters were found

Output of the info page

=== Service Checks ===
[
  {
    "check": "neo4j.can_connect",
    "host_name": "i-REDACTED",
    "timestamp": 1555432622,
    "status": 2,
    "message": "Unable to fetch Neo4j stats: No connection adapters were found for 'REDACTED'",
    "tags": [
      "server_name:neo4j-pdoqa",
      "url:REDACTED"
    ]
  }
]
=========
Collector
=========

  Running Checks
  ==============

    neo4j (unversioned)
    -------------------
      Instance ID: neo4j:51fb41f6fab9b43e [ERROR]
      Total Runs: 1
      Metric Samples: Last Run: 0, Total: 0
      Events: Last Run: 0, Total: 0
      Service Checks: Last Run: 1, Total: 1
      Average Execution Time : 1ms
      Error: Unable to fetch Neo4j stats: No connection adapters were found for 'REDACTED'
      Traceback (most recent call last):
        File "/opt/datadog-agent/embedded/lib/python2.7/site-packages/datadog_checks/base/checks/base.py", line 774, in run
          self.check(copy.deepcopy(self.instances[0]))
        File "/etc/datadog-agent/checks.d/neo4j.py", line 106, in check
          raise CheckException(msg)
      CheckException: Unable to fetch Neo4j stats: No connection adapters were found for 'REDACTED'

Additional environment details (Operating System, Cloud provider, etc):
Linux CentOS

Steps to reproduce the issue:

  1. sudo datadog-agent check neo4j

Describe the results you received:
Error seen above.

Describe the results you expected:
No errors and integration working with DD.

Additional information you deem important (e.g. issue happens only occasionally):
Had a 401 error before, but that issue was fixed. The REDACTED info is just the URL used. I'm not sure what 'No connection adapters were found' means; see the sketch below.
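
The message comes from the requests library: it raises InvalidSchema ("No connection adapters were found for ...") when a URL lacks a recognized http:// or https:// scheme, which suggests the configured Neo4j URL is missing its scheme. A minimal sketch reproducing this (the URL is hypothetical, standing in for the redacted one):

import requests

try:
    # Scheme-less URL: "localhost:" is parsed as the (unknown) scheme
    requests.get("localhost:7474/db/data/")
except requests.exceptions.InvalidSchema as exc:
    print(exc)  # No connection adapters were found for 'localhost:7474/db/data/'

# Prefixing the configured URL with http:// (or https://) should avoid this.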
