awslabs / aws-embedded-metrics-node

Amazon CloudWatch Embedded Metric Format Client Library

Home Page: https://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/CloudWatch_Embedded_Metric_Format.html

License: Apache License 2.0

TypeScript 95.58% Shell 3.02% Dockerfile 0.66% JavaScript 0.73%

aws-embedded-metrics-node's Issues

Measuring a request

Sorry, I have two questions RE AWS EMF and this library:

Why isn't it part of the AWS JS SDK?
Use case: Measure a fetch. Lambda functions typically depend on external services and I am wondering if EMF logging is a good approach to tracking the performance of these dependencies. Or is that best left to some AWS X-ray instrumentation?

Are the metrics created standard or high resolution?

The AWS Publishing custom metrics page mentions standard and high resolution metrics. I have had a look through the documentation and source code for this package and can find no mention of either.

I want to write a blog post about this package, but I want to be clear to any reader what the granularity of the resulting metrics will be.

Thanks

Better support for FireLens

Add a FireLens environment that checks for FLUENT_HOST and uses that to configure the sink
FireLens does not need the LogGroup/LogStream fields in the metadata object since it is configured on the agent itself

aws-embedded-metrics-node/src/sinks/AgentSink.ts

Lines 92 to 95 in 6132122

context.meta.LogGroupName = this.logGroupName;
if (this.logStreamName) {
  context.meta.LogStreamName = this.logStreamName;
}

We could do this by subclassing AgentSink into GenericAgentSink and CloudWatchAgentSink or by delegating configuration of the AgentSink to the environment.

Alternatively, we could just make LogGroup entirely optional for agents.

Support for default dimensions

Hi, sorry in advance if this is already something supported by the library and I couldn't find it.

What?

I'm trying to override the default dimensions to remove things that aren't useful to me, such as LogGroup and ServiceType.

I can't find a way to actually do this. I see from the Configuration in the README that I can set namespace which is also useful, but not enough.

I thought there'd be a similar way to handle Dimensions, but there isn't. Ideally, I'd like to be able to create a single MetricsLogger in my application and wire it in as needed. That logger should always have the dimensions I want as a baseline, and some methods might add more.

I recognize that this wouldn't work for an environment variable configuration, but how about something like:

const { Configuration } = require("aws-embedded-metrics");

Configuration.dimensions = [{
    version: 'N+1' // Latest Lambda Version specific
  },
  {}  // Whole fleet metrics
];

This would allow me to alarm on both the "latest" Lambda version specific metrics, and the fleet as a whole as I'd be emitting metrics for both at the same rate. This is especially useful during deployments as I want to see how the newest version is doing.

An alternative would be to open up the constructor of the MetricsLogger. I see that it is technically public today, but we don't have access to the EnvironmentProvider (again, as far as I am able to tell).

Let me know what you think, and if there's already a way to do this I apologize in advance.

Runtime Validations

allowed:

putDimensions({ Method: "GET", StatusCode: "200" })

not allowed:

putDimensions({ Method: "GET", StatusCode: 200 })
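
A minimal sketch of what such a runtime check could look like. `validateDimensions` is a hypothetical helper, not part of the library; it only assumes, per the two examples above, that every dimension value must be a string.

```javascript
// Hypothetical helper: reject any dimension whose value is not a string.
function validateDimensions(dimensions) {
  for (const [key, value] of Object.entries(dimensions)) {
    if (typeof value !== 'string') {
      throw new TypeError(
        `Dimension "${key}" must be a string, got ${typeof value}`
      );
    }
  }
  return dimensions;
}

// Allowed: every value is a string.
validateDimensions({ Method: 'GET', StatusCode: '200' });

// Not allowed: StatusCode is a number, so this throws a TypeError.
try {
  validateDimensions({ Method: 'GET', StatusCode: 200 });
} catch (err) {
  console.log(err.message);
}
```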

Public API for getting metrics that have been set

I have a use case where I want to get access to one or all of the metrics that have been set with putMetric before sending a response back to the user. Specifically, I'm planning on putting some of the metric values into server timing headers so that I can tie client-side behavior back to server-side metrics.

I've found that, while there is no public getMetric method, I can get access to previously set metrics by inspecting the data here

My question is whether this is an okay, somewhat stable method to use? Or if this is exclusively private/internal data that I should not be touching? If the latter, is there interest in adding formal getMetric or getProperty methods?

TcpClient socket writable typo

Hi! 👋

It seems that there is a typo in this library for a method on Socket.

Here is the diff that solved my problem:

diff --git a/node_modules/aws-embedded-metrics/lib/sinks/connections/TcpClient.js b/node_modules/aws-embedded-metrics/lib/sinks/connections/TcpClient.js
index 48c7a37..f49168b 100644
--- a/node_modules/aws-embedded-metrics/lib/sinks/connections/TcpClient.js
+++ b/node_modules/aws-embedded-metrics/lib/sinks/connections/TcpClient.js
@@ -78,7 +78,7 @@ class TcpClient {
     }
     waitForOpenConnection() {
         return __awaiter(this, void 0, void 0, function* () {
-            if (!this.socket.writeable || this.socket.readyState !== 'open') {
+            if (!this.socket.writable || this.socket.readyState !== 'open') {
                 yield this.establishConnection();
             }
         });

Add EKS example

Add full EKS example

metricScope returning something that's not a function

'use strict';
const { metricScope } = require("aws-embedded-metrics");

const myFunc = metricScope(metrics =>
  async () => {
    var domains = ["google.com"];
    domains.forEach(function(domain) {
      var https = require('https');
      var options = {
        host: domain,
        port: 443
      };

      var req = https.request(options, function(res) {
        req.end();
        const days = Math.floor((new Date(res.connection.getPeerCertificate().valid_to) - new Date()) / 86400000)
        console.log(days);
        console.log()
      });
    });
  });

exports.handler = myFunc();

metricScope is not returning a function, and I get this error:

{
"errorType": "Runtime.HandlerNotFound",
"errorMessage": "index.handler is not a function",
"trace": [
"Runtime.HandlerNotFound: index.handler is not a function",
" at Object.module.exports.load (/var/runtime/UserFunction.js:150:11)",
" at Object. (/var/runtime/index.js:43:30)",
" at Module._compile (internal/modules/cjs/loader.js:1015:30)",
" at Object.Module._extensions..js (internal/modules/cjs/loader.js:1035:10)",
" at Module.load (internal/modules/cjs/loader.js:879:32)",
" at Function.Module._load (internal/modules/cjs/loader.js:724:14)",
" at Function.executeUserEntryPoint [as runMain] (internal/modules/run_main.js:60:12)",
" at internal/main/run_main_module.js:17:47"
]
}
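
A likely cause, sketched with a stand-in for `metricScope`: `myFunc` is already the handler, so `myFunc()` invokes it at load time and exports the resulting Promise instead of a function. The `fakeMetricScope` below is an assumption that only mimics the wrapping shape; it is not the library's implementation.

```javascript
// Stand-in that mimics the shape of metricScope: it takes a factory
// `metrics => handler` and returns an async wrapper around the handler.
// This is an illustrative assumption, not the library's actual code.
function fakeMetricScope(factory) {
  return async (...args) => {
    const metrics = { putMetric() {} }; // placeholder logger
    return factory(metrics)(...args);
  };
}

const myFunc = fakeMetricScope(metrics => async () => 'done');

const wrong = myFunc(); // invoked at load time: a Promise
const right = myFunc;   // the wrapper itself: a function

console.log(typeof wrong); // "object"
console.log(typeof right); // "function"
```

Under that assumption, `exports.handler = myFunc;` would export a function.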

Refactor environment mapping out of the detector

See #19

Support resetting custom dimensions and related enhancements

Related issue:

These are the equivalent enhancements for resetting custom dimensions in the Node library. Similar APIs and features should be implemented in all libraries for consistency. Specifically, the enhancements include:

Provide a resetDimensions(useDefault: boolean) API for resetting dimensions
Provide a feature flag to control whether flush() should preserve custom dimensions after each execution
Provide a setDimensions(useDefault: boolean, ...dimensionSets: Array<Record<string, string>>) API to control the behavior of default dimensions while setting new dimensions.

Allow namespace to be specifiable via environment variables

"no exported member 'mockLogger'" error when mocking the library in TypeScript

Context

I'm using this library to send metrics via the metricScope and I need to test the functionality. To do so I'm following the example described here https://github.com/awslabs/aws-embedded-metrics-node/blob/master/examples/testing/tests/module.jest.test.js.

The project is written in TypeScript and I use jest for unit testing.

Issue

This example https://github.com/awslabs/aws-embedded-metrics-node/blob/master/examples/testing/tests/module.jest.test.js fails in a TypeScript context with the following error:

Module '"aws-embedded-metrics"' has no exported member 'mockLogger'.ts(2305)

Question

How do I expose the mockLogger to the test function without making TypeScript complain?
Is there a different approach to mock the logger and expose it to the test?

Jest and ts-jest versions are incompatible

package.json contains:

    "jest": "^24.8.0",
    "npm-pack-zip": "^1.2.7",
    "prettier": "^1.19.1",
    "ts-jest": "^26.1.1",

npm 7's install fails, even with --production:

npm ERR! code ERESOLVE
npm ERR! ERESOLVE unable to resolve dependency tree
npm ERR! 
npm ERR! While resolving: [email protected]
npm ERR! Found: [email protected]
npm ERR! node_modules/jest
npm ERR!   dev jest@"^24.8.0" from the root project
npm ERR! 
npm ERR! Could not resolve dependency:
npm ERR! peer jest@">=26 <27" from [email protected]
npm ERR! node_modules/ts-jest
npm ERR!   dev ts-jest@"^26.1.1" from the root project
npm ERR! 
npm ERR! Fix the upstream dependency conflict, or retry
npm ERR! this command with --force, or --legacy-peer-deps
npm ERR! to accept an incorrect (and potentially broken) dependency resolution.

You need to bump Jest to 26 to fix this.

Allow Local environment detection (or have Local be the default environment)

We're using this library with our Lambda function to create embedded metrics, and it works fantastically for that use case, but we've been struggling a bit to get it to play smoothly during local development. I'm aware that we can set AWS_EMF_ENVIRONMENT to Local at runtime to make the library use stdout, and we've added that to a few of our package.json scripts to make it easier for everyone. There are still a few cases where there's no simple route to setting this, or where somebody does something a little different and forgets to add that environment variable (we work in a large monorepo project, so few of the people contributing know much detail about EMF). This leads to confusing error messages such as UnhandledPromiseRejectionWarning: Error: connect ECONNREFUSED 0.0.0.0:25888 when running in Node 12, and for anyone running Node 14 it seems to crash the local server outright.

Looking into the library, it looks like Agent is set up as the default environment, and the Local environment probe method is intentionally a no-op, which means that the only way to use the library in local environment mode is via this environment variable.

Is there a historical reason why it is this way? Is there no other way to detect whether the environment should be Agent, so that the default environment could be switched around, or some other way to auto-detect that the environment should be Local (e.g. if NODE_ENV !== 'production')?
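
A minimal sketch of the NODE_ENV-based fallback suggested above. `resolveEmfEnvironment` is a hypothetical helper; only the `AWS_EMF_ENVIRONMENT` variable and the `Local` value come from the issue text. Returning `undefined` stands for "leave detection to the library".

```javascript
// Hypothetical helper: choose an environment override from process env.
// An explicit AWS_EMF_ENVIRONMENT always wins; otherwise fall back to
// "Local" whenever NODE_ENV is not "production".
function resolveEmfEnvironment(env) {
  if (env.AWS_EMF_ENVIRONMENT) {
    return env.AWS_EMF_ENVIRONMENT;
  }
  return env.NODE_ENV !== 'production' ? 'Local' : undefined;
}

console.log(resolveEmfEnvironment({ NODE_ENV: 'development' })); // Local
console.log(resolveEmfEnvironment({ NODE_ENV: 'production' }));  // undefined
```

How the result is applied (environment variable or configuration) is left out of the sketch.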

[Feature] "MetricsContext#add" (reduce metrics duplication)

Describe the user story

Currently, all metrics are filtered against all defined dimensions, resulting in metrics associated to unrelated dimensions with a consequent duplication, unless you flush the metric logger multiple times, resulting in separate JSON logs in CloudWatch.

As a developer I'd like to filter a group of metrics against different dimensions within the same metric logger context (same JSON payload)

Use Case

I'm currently working on a project where the server sends events rather than individual metrics. This event is a single JSON object containing all relevant metrics, dimensions and properties measured during the course of it (the metric logger is flushed only once per event instance).

As an example, let's take a pretty common event for web applications and call it page-request, which is triggered any time a web page is requested by a user. Let's assume the collected metrics and dimensions are the following:

RequestCount
Counts the number of HTTP requests the server receives from the user. This metric is used to calculate the RPS.

Dimensions: PageType
Unit: Count
Aggregations: Sum

ResponseTime
The response time in milliseconds.

Dimensions: PageType
Unit: Milliseconds
Aggregations: Avg, 50th percentile, 95th percentile, 99th percentile

UpstreamRequestCount
Counts the number of HTTP requests the app performs towards its upstream services.

Dimensions: Client
Unit: Count
Aggregations: Sum

Where:

PageType: is the type of page requested by the user (e.g. home, player, etc)
Client: the name of the upstream service

From the example, RequestCount and ResponseTime share the same dimension, whereas UpstreamRequestCount is applied to a different one.

Let's write an example of a metric logger that is called once, immediately after the HTTP response has been sent to the user:

const namespace = config.get('namespace');

export const logPageRequest = metricScope(metrics => {
  return async pageRequestEvent => {
    const {
      requestCount,
      responseTime,
      upstreamRequestCount,
      pageType,
      client
    } = pageRequestEvent;
    metrics.setNamespace(namespace);
    metrics.putMetric('RequestCount', requestCount, Unit.Count);
    metrics.putMetric('ResponseTime', responseTime, Unit.Milliseconds);
    metrics.putMetric('UpstreamRequestCount', upstreamRequestCount, Unit.Count);
    metrics.setDimensions(
      { PageType: pageType },
      { Client: client }
    );
  };
});

This example generates the following JSON log

and the following metrics are extracted

We can see that UpstreamRequestCount is also applied to the PageType dimension and RequestCount and ResponseTime to Client, effectively generating unnecessary new metrics (3 in this example).
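
The duplication can be reproduced with a small sketch. It assumes, as described above, that every metric in a payload is published under every dimension set; `expandMetrics` is a hypothetical helper, not part of the library.

```javascript
// Every metric is paired with every dimension set in the same payload.
function expandMetrics(metricNames, dimensionSets) {
  const result = [];
  for (const dims of dimensionSets) {
    for (const name of metricNames) {
      result.push(`${name} by ${dims.join(',')}`);
    }
  }
  return result;
}

const published = expandMetrics(
  ['RequestCount', 'ResponseTime', 'UpstreamRequestCount'],
  [['PageType'], ['Client']]
);

// Only three of these combinations are wanted.
const wanted = new Set([
  'RequestCount by PageType',
  'ResponseTime by PageType',
  'UpstreamRequestCount by Client',
]);
const unnecessary = published.filter(m => !wanted.has(m));

console.log(published.length);   // 6
console.log(unnecessary.length); // 3
```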

Describe the outcome you'd like

According to the previous example, I'd like to filter the UpstreamRequestCount by Client only and RequestCount and ResponseTime by PageType, resulting in the following metrics

Here, the PageType group contains only the metrics that we want to apply, same thing for Client.

According to the EMF specification it is possible to add multiple CloudWatchMetrics objects

"CloudWatchMetrics": [
  {
    ... ...
  },
  {
    ... ...
  }
]

in order to define different groups of metrics that we want to apply to different dimensions. If we consider the previous example once again, we need to generate a JSON payload like the following

where the two metrics sharing the same dimensions are defined within the same CloudWatchMetrics object.

Generally speaking, each CloudWatchMetrics object contains metrics that are filtered by the same group of dimensions.
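
A minimal sketch of how a payload with one CloudWatchMetrics object per dimension group could be assembled. `buildPayload` is a hypothetical helper, and the namespace, dimension values and metric values are made up for illustration; the `_aws`, `Timestamp`, `CloudWatchMetrics`, `Namespace`, `Dimensions`, `Metrics`, `Name` and `Unit` field names follow the EMF specification.

```javascript
// Hypothetical helper: one CloudWatchMetrics entry per group, where each
// group pairs a set of metrics with the dimensions they are filtered by.
function buildPayload(namespace, groups, timestamp = Date.now()) {
  const payload = {
    _aws: { Timestamp: timestamp, CloudWatchMetrics: [] },
  };
  for (const { metrics, dimensions } of groups) {
    payload._aws.CloudWatchMetrics.push({
      Namespace: namespace,
      Dimensions: [Object.keys(dimensions)],
      Metrics: metrics.map(({ Name, Unit }) => ({ Name, Unit })),
    });
    // Dimension and metric values live at the root of the document.
    Object.assign(payload, dimensions);
    for (const { Name, Value } of metrics) {
      payload[Name] = Value;
    }
  }
  return payload;
}

const payload = buildPayload('example-namespace', [
  {
    metrics: [
      { Name: 'RequestCount', Value: 1, Unit: 'Count' },
      { Name: 'ResponseTime', Value: 120, Unit: 'Milliseconds' },
    ],
    dimensions: { PageType: 'home' },
  },
  {
    metrics: [{ Name: 'UpstreamRequestCount', Value: 3, Unit: 'Count' }],
    dimensions: { Client: 'catalog' },
  },
]);

console.log(JSON.stringify(payload, null, 2));
```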

Describe the solution you are proposing

To do so, I'm proposing to add a new method to the MetricsContext interface called add. The method will accept only one parameter which is an object defined in the next section.

Syntax

{
    "Name": String,
    "Value": Number,
    "Unit": String,
    "Metrics": [ MetricItem, ... ],
    "Dimensions": Object
}

Properties

Name

The metric name.

Required: only if Metrics is undefined or an empty array, optional otherwise.
Type: String

Value

The metric value.

Required: only if Metrics is undefined or an empty array, optional otherwise.
Type: Number

Unit

The metric unit (e.g. Unit.Count, Unit.Milliseconds, etc.)

Required: only if Metrics is undefined or an empty array, optional otherwise.
Type: String

Metrics

An array of objects (see MetricItem type).

Required: only if Name, Value and Unit are undefined, optional otherwise.
Type: Array

Dimensions

The dimensions to filter the defined metrics by. This object is a map of key/value pairs that stores the name and value of each dimension. Each property value must be of type String.

Required: yes
Type: Object

Types

MetricItem

{
  "Name": String,
  "Value": Number,
  "Unit": String
}

Considering the previous example, our metric logger will look like

const namespace = config.get('namespace');

export const logPageRequest = metricScope(metrics => {
  return async pageRequestEvent => {
    const {
      requestCount,
      responseTime,
      upstreamRequestCount,
      pageType,
      client
    } = pageRequestEvent;
    metrics.setNamespace(namespace);
    metrics.add({
      Metrics: [
        {
          Name: 'RequestCount',
          Value: requestCount,
          Unit: Unit.Count
        },
        {
          Name: 'ResponseTime',
          Value: responseTime,
          Unit: Unit.Milliseconds
        }
      ],
      Dimensions: { PageType: pageType }
    });
  };
});

When we have only one metric, we can use either of the following two forms:

    metrics.add({
      Metrics: [
        {
          Name: 'UpstreamRequestCount',
          Value: upstreamRequestCount,
          Unit: Unit.Count
        }
      ],
      Dimensions: { Client: client }
    });

    metrics.add({
      Name: 'UpstreamRequestCount',
      Value: upstreamRequestCount,
      Unit: Unit.Count,
      Dimensions: { Client: client }
    });

Any other considerations about the solution

We could have modified the LogSerializer to optionally generate the multiple CloudWatchMetrics objects by means of a flag, but currently there is no association between group of metrics sharing the same dimensions. To achieve that, we could have modified the internal data structure by adding a mapping between them, resulting in a new method anyway to allow the user to express this relationship via the public API.

By creating a brand new method we keep the current data structure as is and the API back compatible with the previous version. The new method will have a separate data structure to allow the LogSerializer to easily understand how to serialise it.

Add a CircularBuffer in AgentSink

Description

Currently, if the agent is down or has not started, metrics can be dropped. It's currently up to the caller of logger.flush to handle retries. There are 2 options:

Backpressure the caller of logger.flush. This could negatively impact request latencies.
On error, enqueue to a circular buffer. The trick here is we will need to retry this queue on an interval which changes the model from an async/await to a purely async one. This is a departure from the current design and will need to be turned on via feature flag.

The symptoms of this are:

The first metrics during initialization of the app may not appear
The following error message will be in your app logs:

(node:1) UnhandledPromiseRejectionWarning: Error: connect ECONNREFUSED 172.17.0.2:25888
    at TCPConnectWrap.afterConnect [as oncomplete] (net.js:1106:14)

Tasks

Add type AgentSinkOptions with
- RetryStrategy parameter where the default value is None for backwards compatibility with a single option to start with: ExponentialBackoffRetryStrategy (see also: https://aws.amazon.com/blogs/architecture/exponential-backoff-and-jitter/)
- AsyncBehavior parameter that controls whether the call should block or not. In the former case we keep the current behavior and in the latter we return immediately, enqueuing to the retry buffer on failure.
Change AgentSink's constructor to constructor(options: AgentSinkOptions, serializer: ISerializer).
Add RetryStrategies which the AgentSink uses based on its configuration. NoRetry propagates errors back to the caller of flush which maintains current behavior today. ExponentialRetry (which can be configured by the application) will block flush on the first attempt, enqueuing to a CircularBuffer (whose size is also configurable) on failures.
On startup, setInterval will be set to check the size of the CircularBuffer and retry failed requests asynchronously.
Add shutdown method to gracefully shutdown and block on any outstanding requests.
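
The tasks above can be sketched in a few lines. Everything here is hypothetical and only illustrates the idea: `CircularBuffer`, `backoffDelay` and `retryPending` are made-up names, `send` stands for whatever function writes to the agent, and the delay parameters are arbitrary defaults.

```javascript
// Fixed-capacity ring buffer that overwrites the oldest entry when full.
class CircularBuffer {
  constructor(capacity) {
    this.slots = new Array(capacity);
    this.head = 0; // index of the oldest item
    this.size = 0;
  }
  enqueue(item) {
    const capacity = this.slots.length;
    const tail = (this.head + this.size) % capacity;
    this.slots[tail] = item;
    if (this.size === capacity) {
      this.head = (this.head + 1) % capacity; // oldest item was overwritten
    } else {
      this.size += 1;
    }
  }
  peek() {
    return this.size > 0 ? this.slots[this.head] : undefined;
  }
  dequeue() {
    if (this.size === 0) return undefined;
    const item = this.slots[this.head];
    this.slots[this.head] = undefined;
    this.head = (this.head + 1) % this.slots.length;
    this.size -= 1;
    return item;
  }
}

// Exponential backoff with full jitter, capped at maxMs.
function backoffDelay(attempt, baseMs = 100, maxMs = 10000) {
  const ceiling = Math.min(maxMs, baseMs * 2 ** attempt);
  return Math.floor(Math.random() * ceiling);
}

// Drain the buffer in order; stop at the first failure and leave the
// failed message at the head so the next interval retries it first.
async function retryPending(send, buffer) {
  while (buffer.size > 0) {
    try {
      await send(buffer.peek());
    } catch (err) {
      return false;
    }
    buffer.dequeue();
  }
  return true;
}

const pending = new CircularBuffer(2);
['a', 'b', 'c'].forEach(message => pending.enqueue(message));
console.log(pending.peek()); // "b", because "a" was overwritten
```

A timer started with `setInterval` could call `retryPending` and use `backoffDelay` to space out attempts after consecutive failures.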

Example Usage

AWS_EMF_AGENT_RETRY_STRATEGY="ExponentialBackoff"
// or
Configuration.agentRetryStrategy = RetryStrategy.ExponentialBackoff;
// or 
Configuration.agentRetryStrategy = (...) => customRetryStrategy();

// ...
await logger.flush();
// execution control is returned when logs have been successfully flushed or enqueued for retry

Open Question

Should we change logger.flush() to enqueue and return immediately? This would allow us to make flush() a synchronous operation in all cases.

`handler(...) is not a function` error on local dev server

I will be trying to run this on EC2 (it seems from the examples that this can be done). However, when I run my dev server locally I see the error handler(...) is not a function.

I can see the event logged to the console via stdout, but I get the handler error when trying to load a page in the browser.

This is my setup:

## CW Custom Metrics Configuration
AWS_EMF_SERVICE_NAME=AppName
AWS_EMF_LOG_GROUP_NAME=AppServer
AWS_EMF_ENVIRONMENT=Local

// custom CW metrics
const { metricScope, Unit } = require('aws-embedded-metrics');

const sendCustomMetric = metricScope(metrics => {
  async (metric, status, pagetype, url) => {
    metrics.putDimensions({ PageType: pagetype, StausCode: status });
    metrics.putMetric(metric, 1, Unit.Count);
    metrics.setProperty('URL', url);
  };
});

await sendCustomMetric('200_Response', status, pageType, url);
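
One thing that stands out in the snippet: the arrow function passed to `metricScope` uses a block body without `return`, so it yields `undefined` instead of the inner async function. The sketch below shows the difference with a stand-in (`fakeMetricScope` is an assumption that only mimics the wrapping shape, not the library's code).

```javascript
// Stand-in: calls the factory to obtain the handler, then invokes it.
function fakeMetricScope(factory) {
  return (...args) => {
    const metrics = { putMetric() {} }; // placeholder logger
    const handler = factory(metrics);
    return handler(...args); // TypeError if the factory returned undefined
  };
}

// Block body without `return`: the inner function is created and discarded.
const broken = fakeMetricScope(metrics => {
  async () => {};
});

// Explicit `return` (or a concise arrow body) hands the handler back.
const fixed = fakeMetricScope(metrics => {
  return async () => 'ok';
});

try {
  broken();
} catch (err) {
  console.log(err instanceof TypeError); // true
}
```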

Is it net.Socket.writeable or net.Socket.writable?

aws-embedded-metrics-node/src/sinks/connections/TcpClient.ts

Line 82 in 5c02526

if (!this.socket.writeable || this.socket.readyState !== 'open') {

I was just looking at this code for something else and haven't had time to confirm this issue, but it looks like we have a typo here ("writeable" Vs. "writable").

putMetric doesn't flush after 100 elements are added

The README mentions that "If more metric values are added than are supported by the format, the logger will be flushed to allow for new metric values to be captured." but this doesn't seem true if I trust the output of my program using the library and the code I could read

This pushes the output above 100 metrics, and the extra entries are ignored.
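
The behaviour the README describes could look roughly like the sketch below. `AutoFlushingLogger` is hypothetical and is not the library's implementation; the limit of 100 comes from the issue text, and flushed batches are collected in an array instead of being written to a sink.

```javascript
const MAX_METRICS = 100;

class AutoFlushingLogger {
  constructor() {
    this.metrics = new Map();
    this.flushed = []; // each entry is one flushed batch
  }
  putMetric(name, value) {
    // Flush first if adding a new key would exceed the limit.
    if (!this.metrics.has(name) && this.metrics.size >= MAX_METRICS) {
      this.flush();
    }
    this.metrics.set(name, value);
  }
  flush() {
    if (this.metrics.size > 0) {
      this.flushed.push(Object.fromEntries(this.metrics));
      this.metrics = new Map();
    }
  }
}

const logger = new AutoFlushingLogger();
for (let i = 0; i < 250; i++) {
  logger.putMetric(`metric${i}`, i);
}
logger.flush();

console.log(logger.flushed.map(batch => Object.keys(batch).length));
// [ 100, 100, 50 ]
```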

metrics.setProperty("RequestId", context.requestId) utility

https://github.com/kaihendry/yt-aws-emf/blob/main/hello-world/app.js#L48

Hi, if I set the requestID in my EMF, how do I trace a putMetric back to the aws request ID?

RFC: Typed Metric Interfaces

Counters

Expose 3 methods for publishing counters.

metrics.increment('Increment');
metrics.decrement('Decrement');
metrics.count('Count', 10);

{ "Increment": 1, "Decrement": -1, "Count": 10 }

Multiple calls to the same key will be recorded as separate entries. This allows us to preserve the sample count.

metrics.increment('Key');
metrics.increment('Key');
metrics.decrement('Key');

{ "Key": [ 1, 1, -1 ] }

Alternatively, we can use the PMD syntax:

{
  "Key": {
    "type": "dist",
    "buckets": "explicit",
    "values": [ 1, -1 ],
    "counts": [ 2, 1 ]
  }
}
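
A minimal sketch of how the recorded samples could be collapsed into that form. `toDistribution` is a hypothetical helper; the `type`, `buckets`, `values` and `counts` field names are taken from the example above.

```javascript
// Collapse recorded samples into parallel "values" and "counts" arrays.
function toDistribution(samples) {
  const tally = new Map();
  for (const sample of samples) {
    tally.set(sample, (tally.get(sample) || 0) + 1);
  }
  return {
    type: 'dist',
    buckets: 'explicit',
    values: [...tally.keys()],
    counts: [...tally.values()],
  };
}

// increment, increment, decrement on the same key
console.log(toDistribution([1, 1, -1]));
```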

Gauges

metrics.gauge('key', 10);

Timers

metrics.time('key', 10);
timedMetricScope('operation', metrics => {
   // do things and track how long it takes...
})

Histograms: TBD

metrics.histo('key', 10);

{ 
  "Key": {
    "type": "dist",
    "buckets": "explicit",
    "values": [ 1 ],
    "counts": [ 10 ]
  } 
}

Programmatic environment override doesn't work

Summary

According to #19, you can now override the environment detection either programmatically or with an environment variable:

const { Configuration } = require("aws-embedded-metrics");
Configuration.environmentOverride = "Local";

export AWS_EMF_ENVIRONMENT=Local

When I export the environment variable it works and I can see the agent logging to stdout. When I try the first approach, I still get the following error:

(node:783) UnhandledPromiseRejectionWarning: Error: connect ECONNREFUSED 0.0.0.0:25888
    at TCPConnectWrap.afterConnect [as oncomplete] (net.js:1141:16)
(node:783) UnhandledPromiseRejectionWarning: Unhandled promise rejection. This error originated either by throwing inside of an async function without a catch block, or by rejecting a promise which was not handled with .catch(). To terminate the node process on unhandled promise rejection, use the CLI flag `--unhandled-rejections=strict` (see https://nodejs.org/api/cli.html#cli_unhandled_rejections_mode). (rejection id: 13)
(node:783) [DEP0018] DeprecationWarning: Unhandled promise rejections are deprecated. In the future, promise rejections that are not handled will terminate the Node.js process with a non-zero exit code.

Details

I've got the following helper module

import { metricScope, Unit, Configuration } from 'aws-embedded-metrics';
import config from 'config';

const environment = config.get('environment');
const namespace = config.get('namespace');

Configuration.logGroupName = namespace;
Configuration.environmentOverride = 'Local';

export const pageRequestLogger = metrics => {
  return async pageRequestEvent => {
    const {
      requestCount,
      errorCount,
      responseTime,
      pageType,
      event,
      processPid,
      request,
      response,
      log,
      logTrace
    } = pageRequestEvent;
    metrics.setNamespace(namespace);
    metrics.putMetric('RequestCount', requestCount, Unit.Count);
    metrics.putMetric('ErrorCount', errorCount, Unit.Count);
    metrics.putMetric('ResponseTime', responseTime, Unit.Milliseconds);
    metrics.setDimensions({ PageType: pageType });

    metrics.setProperty('event', event);
    metrics.setProperty('processPid', processPid);
    metrics.setProperty('request', request);
    metrics.setProperty('response', response);
    metrics.setProperty('log', log);
    metrics.setProperty('logTrace', logTrace);
  };
};

export const logPageRequest = metricScope(pageRequestLogger);

Environment

The code runs in a CentOS 7 Docker container
Node: 12.18.0
NPM: 6.14.4

Disallow duplicate dimension sets.

There is no need for duplicate dimension sets like the following.

{
  "Dimensions": [
    [ "A", "B" ],
    [ "A", "B" ],
    [ "B", "A" ]
  ]
}

This would re-create the same metric 3 times and is equivalent to:

{
  "Dimensions": [ [ "A", "B"] ]
}
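
A minimal sketch of such a de-duplication, treating two sets as equal when they contain the same keys in any order. `dedupeDimensionSets` is a hypothetical helper, not the library's implementation.

```javascript
// Remove dimension sets that contain the same keys, regardless of order.
function dedupeDimensionSets(dimensionSets) {
  const seen = new Set();
  const unique = [];
  for (const set of dimensionSets) {
    const signature = JSON.stringify([...set].sort());
    if (!seen.has(signature)) {
      seen.add(signature);
      unique.push(set);
    }
  }
  return unique;
}

console.log(dedupeDimensionSets([['A', 'B'], ['A', 'B'], ['B', 'A']]));
// [ [ 'A', 'B' ] ]
```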

This is needed for #14. It allows for things like the following while also allowing for re-use of the logger instance.

const doWork = metricScope(metrics => () => {
  metrics.putDimensions(dimensions);
  // ...
});

// act
doWork();
doWork();

logger.putMetric(metricKey, 0);
await logger.flush();

logger.putMetric(metricKey, 1);
await logger.flush();

Memory leak in TcpClient

It looks like there's a memory leak in TcpClient#sendMessage, making this SDK dangerous for long-running processes.

I wrote some HTTP client code that uses this SDK and makes a request to a simple server every 100ms. I launched the client on ECS, and I saw the memory grow at a steady rate until the task crashed. For a period I ran a revised version of the task that does not use the EMF SDK (~16:00-20:00 in the graph below), and during that period memory did not grow — so I know that the EMF SDK is the culprit.

I ran the server and client locally with node --inspect to see if I could track down the leak. What I found is JSArrayBufferData growing with every snapshot and never cleaning up. Looking at the list of retainers, I see that the TcpClient seems to be assigning an event listener that is never cleaned.

This seems to be the responsible code:

aws-embedded-metrics-node/src/sinks/connections/TcpClient.ts

Lines 47 to 58 in 8bc9002

const onSendError = (err: Error): void => {
  LOG('Failed to write', err);
  reject(err);
};

const wasFlushedToKernel = this.socket.once('error', onSendError).write(message, (err?: Error) => {
  if (!err) {
    LOG('Write succeeded');
    resolve();
  } else {
    onSendError(err);
  }
});

this.socket.once('error', onSendError), specifically.

I don't understand the purpose served by that listener. If this.socket.write fails, there's already code to run onSendError. Can we remove that listener completely? Or is there some edge case that it is meant to address?

Another option is to add this.socket.removeListener('error', onSendError) inside the callback for this.socket.write.

Both of those fixes appeared to cure the memory leak in my local runs. JSArrayBufferData stopped growing indefinitely.
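
The accumulation can be reproduced without a network connection. The sketch below uses a plain EventEmitter as a stand-in for the socket (`FakeSocket`, `sendLeaky` and `sendWithCleanup` are hypothetical names); it assumes the write always succeeds, so the 'error' event never fires and the `once` listener is never consumed.

```javascript
const { EventEmitter } = require('node:events');

// Stand-in for a long-lived socket: write() always succeeds here, so the
// 'error' event never fires and a once('error') listener is never removed.
class FakeSocket extends EventEmitter {
  write(message, callback) {
    callback();
    return true;
  }
}

function sendLeaky(socket, message) {
  return new Promise((resolve, reject) => {
    const onSendError = err => reject(err);
    socket.once('error', onSendError).write(message, err => {
      if (err) onSendError(err);
      else resolve();
    });
  });
}

function sendWithCleanup(socket, message) {
  return new Promise((resolve, reject) => {
    const onSendError = err => reject(err);
    socket.once('error', onSendError).write(message, err => {
      socket.removeListener('error', onSendError); // detach after the write
      if (err) onSendError(err);
      else resolve();
    });
  });
}

const leaky = new FakeSocket();
leaky.setMaxListeners(0); // silence the built-in leak warning for the demo
const clean = new FakeSocket();
for (let i = 0; i < 50; i++) {
  sendLeaky(leaky, 'x');
  sendWithCleanup(clean, 'x');
}

console.log(leaky.listenerCount('error')); // 50
console.log(clean.listenerCount('error')); // 0
```

Each leftover listener keeps its closure alive, which is consistent with the retained buffers seen in the heap snapshots.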

Here's my client code, for reference:

'use strict';

require('https').globalAgent.keepAlive = true;
const http = require('http');
const { metricScope } = require('aws-embedded-metrics');

const POLL_TIME = 100;

const serverHost = process.env.ServerHost;
const serverPort = process.env.ServerPort;

const sendRequest = metricScope((metrics) => async () => {
  metrics.setProperty('Role', 'Client');
  metrics.putMetric('ReqCount', 1);

  return new Promise((resolve, reject) => {
    const start = process.hrtime.bigint();
    const handleRes = (res) => {
      res.on('data', () => {});
      res.once('error', (error) => {
        metrics.putMetric('ResError', 1);
        metrics.setProperty('StatusCode', res.statusCode);
        metrics.setProperty('ResErrorData', error);
        reject(error);
      });
      res.on('end', () => {
        const end = process.hrtime.bigint();
        const elapsedMs = Number(end - start) * 1e-6;
        const elapsedMsRounded = Math.round(elapsedMs * 100) / 100; // Round to 2 decimal places.
        metrics.putMetric('ResponseTime', elapsedMsRounded, 'Milliseconds');
        metrics.putMetric('ResSuccess', 1);
        metrics.setProperty('StatusCode', res.statusCode);
        resolve();
      });
    };

    const baseReqOptions = {
      method: 'GET',
      host: serverHost,
      port: serverPort
    };
    metrics.setProperty('ReqOptions', baseReqOptions);

    const req = http.request({ ...baseReqOptions }, handleRes);

    req.once('error', (error) => {
      metrics.putMetric('ReqError', 1);
      metrics.setProperty('ReqErrorData', error);
      reject(error);
    });
    req.end();
  });
});

async function main() {
  setInterval(() => {
    sendRequest().catch((error) => console.error(error));
  }, POLL_TIME);
}

exports.main = main;

if (require.main === module) {
  main().catch((error) => {
    console.error(error);
    process.exit(1);
  });

  process.on('SIGTERM', () => process.exit(0));
}

cc @rclark @springmeyer @danmactough

Produce metrics with different dimensions from single log

With the raw embedded metrics format, a single structured log can produce multiple metrics with different dimensions. Here's an example:

{
  "_aws": {
    "Timestamp": 1574109732004,
    "CloudWatchMetrics": [
      {
        "Namespace": "lambda-function-metrics",
        "Dimensions": [["dimension1"]],
        "Metrics": [
          {
            "Name": "metric1"
          }
        ]
      },
      {
        "Namespace": "lambda-function-metrics",
        "Dimensions": [["dimension2"]],
        "Metrics": [
          {
            "Name": "metric2"
          }
        ]
      }
    ]
  },
  "dimension1": "value1",
  "dimension2": "value2",
  "metric1": 100,
  "metric2": 200
}

Currently, this is not possible with the EMF SDK for Node.js. I would need to create a new MetricsLogger for each distinct list of dimensions. This issue is a feature request to add an interface to the SDK that enables this kind of configuration.

The reason I'd prefer to produce these metrics from a single log line, instead of multiple, is so that I can create just one structured log per "unit of work," as described in this great AWS Builder's Library article.

Incompatible with Node 14: "Socket is closed" error

When I try to use this library in a Docker container running Node 14, I hit the following error:

events.js:291
      throw er; // Unhandled 'error' event
      ^
Error [ERR_SOCKET_CLOSED]: Socket is closed
    at Socket._writeGeneric (net.js:774:8)
    at Socket._write (net.js:796:8)
    at writeOrBuffer (_stream_writable.js:352:12)
    at Socket.Writable.write (_stream_writable.js:303:10)
    at /usr/local/src/prauthoxy-platform/node_modules/aws-embedded-metrics/lib/sinks/connections/TcpClient.js:57:56
    at new Promise (<anonymous>)
    at TcpClient.<anonymous> (/usr/local/src/prauthoxy-platform/node_modules/aws-embedded-metrics/lib/sinks/connections/TcpClient.js:52:19)
    at Generator.next (<anonymous>)
    at fulfilled (/usr/local/src/prauthoxy-platform/node_modules/aws-embedded-metrics/lib/sinks/connections/TcpClient.js:18:58)
    at processTicksAndRejections (internal/process/task_queues.js:93:5)
Emitted 'error' event on Socket instance at:
    at emitErrorNT (internal/streams/destroy.js:106:8)
    at errorOrDestroy (internal/streams/destroy.js:167:7)
    at onwriteError (_stream_writable.js:391:3)
    at processTicksAndRejections (internal/process/task_queues.js:82:21) {
  code: 'ERR_SOCKET_CLOSED'
}

The exact same code works fine on Node 12.

[RFC] Remove LogGroupName as default dimension

This issue is to gather feedback on whether users want LogGroup as a default dimension on their metrics. It was originally intended to enable deep-linking from metrics to the EMF events, but we no longer believe this is the correct approach to creating that linkage. Removing it is a breaking change, so we want to hear your feedback.

aws-embedded-metrics-node/src/logger/MetricsLogger.ts

Lines 142 to 147 in 2ec8a84

const defaultDimensions = {
  // LogGroup name will entirely depend on the environment since there
  // are some cases where the LogGroup cannot be configured (e.g. Lambda)
  LogGroup: environment.getLogGroupName(),
  ServiceName: Configuration.serviceName || environment.getName(),
  ServiceType: Configuration.serviceType || environment.getType(),

Unable to set service type, service name and log stream name

Problems

The variables AWS_EMF_SERVICE_TYPE and AWS_EMF_SERVICE_NAME are not included in the logged metric
The log stream in AWS_EMF_LOG_STREAM_NAME is ignored

Config

Running on ECS with Fargate (version 1.3.0)
The agent is in a sidecar
Using [email protected], Node 12

These are the container definitions:

ContainerDefinitions:
      # [...]
    - Name: app
      Image: redacted
      LogConfiguration:
          LogDriver: awslogs
          Options:
              awslogs-group: !Ref ContainerLogGroup
              awslogs-region: !Ref AWS::Region
              awslogs-stream-prefix: app
      Environment:
          - Name: AWS_EMF_SERVICE_NAME
            Value: !Sub ${AWS::StackName}-app
          - Name: AWS_EMF_SERVICE_TYPE
            Value: "NodeJS-API"
          - Name: AWS_EMF_LOG_GROUP_NAME
            Value: !Ref ContainerLogGroup
          - Name: AWS_EMF_LOG_STREAM_NAME
            Value: metrics
          - Name: AWS_EMF_NAMESPACE
            Value: !Sub ${AWS::StackName}
          - Name: AWS_EMF_ENABLE_DEBUG_LOGGING
            Value: true
    - Name: agent
      Image: amazon/cloudwatch-agent:latest
      LogConfiguration:
          LogDriver: awslogs
          Options:
              awslogs-group: !Ref ContainerLogGroup
              awslogs-region: !Ref AWS::Region
              awslogs-stream-prefix: agent
      Secrets:
          - Name: CW_CONFIG_CONTENT
            ValueFrom: !Ref CloudWatchAgentConfigArn

This is the agent config:

{
  "logs": {
    "metrics_collected": {
      "emf": {}
    }
  }
}

This is what the agent sidecar logs:

// Log stream: agent/agent/227b1b1f66744e318edb2d5e9bb57e2d

2020/11/09 16:31:03 I! 2020/11/09 16:31:03 E! ec2metadata is not available
--
2020/11/09 16:31:03 I! attempt to access ECS task metadata to determine whether I'm running in ECS.
I! Detected the instance is ECS
2020/11/09 16:31:03 Reading json config file path: /opt/aws/amazon-cloudwatch-agent/bin/default_linux_config.json ...
/opt/aws/amazon-cloudwatch-agent/bin/default_linux_config.json does not exist or cannot read. Skipping it.
Cannot access /etc/cwagentconfig: lstat /etc/cwagentconfig: no such file or directory2020/11/09 16:31:03 unable to scan config dir /etc/cwagentconfig with error: lstat /etc/cwagentconfig: no such file or directory
2020/11/09 16:31:03 Reading json config from from environment variable CW_CONFIG_CONTENT.
Valid Json input schema.
I! detect region from ecs
No csm configuration found.
No metric configuration found.
Configuration validation first phase succeeded
 
2020/11/09 16:31:03 I! Config has been translated into TOML /opt/aws/amazon-cloudwatch-agent/etc/amazon-cloudwatch-agent.toml
2020-11-09T16:31:03Z I! Starting AmazonCloudWatchAgent 1.247346.0
2020-11-09T16:31:03Z I! Loaded inputs: socket_listener socket_listener
2020-11-09T16:31:03Z I! Loaded aggregators:
2020-11-09T16:31:03Z I! Loaded processors:
2020-11-09T16:31:03Z I! Loaded outputs: cloudwatchlogs
2020-11-09T16:31:03Z I! Tags enabled:
2020-11-09T16:31:03Z I! [agent] Config: Interval:1m0s, Quiet:false, Hostname:"", Flush Interval:1s
2020-11-09T16:31:03Z I! [inputs.socket_listener] Listening on udp://[::]:25888
2020-11-09T16:31:03Z I! [inputs.socket_listener] Listening on tcp://[::]:25888
2020-11-09T16:31:03Z I! [logagent] starting
2020-11-09T16:31:03Z I! [logagent] found plugin cloudwatchlogs is a log backend

This is what the app container logs:

// Log stream: app/app/227b1b1f66744e318edb2d5e9bb57e2d

Received default dimensions {
  LogGroup: 'redactedLogGroupName',
  ServiceName: 'my-service-name',
  ServiceType: 'NodeJS-API'
}
Sending {} events to socket. 1
opening connection with socket in state:  closed
TcpClient connected. { host: '0.0.0.0', port: 25888, protocol: 'tcp:' }
Write succeeded

However, the metric gets logged to another log stream (not the one in AWS_EMF_LOG_STREAM_NAME) and it does not include the AWS_EMF_SERVICE_TYPE or AWS_EMF_SERVICE_NAME.

// Log stream: arn_aws_ecs_eu-north-1_redactedAccountId_task/redactedClusterName/227b1b1f66744e318edb2d5e9bb57e2d

{
    "Endpoint": "POST /authentication/v5/login/email",
    "Method": "POST",
    "Path": "/authentication/v5/login/email",
    "StatusCode": 200,
    "IP": "redactedIp",
    "UserAgent": "Amazon CloudFront",
    "containerId": "ip-redactedIp.eu-north-1.compute.internal",
    "createdAt": "2020-11-09T16:31:04.591709491Z",
    "startedAt": "2020-11-09T16:31:05.226439Z",
    "image": "redactedAccountId.dkr.ecr.eu-north-1.amazonaws.com/redactedImage",
    "cluster": "arn:aws:ecs:eu-north-1:redactedAccountId:cluster/redactedClusterName",
    "taskArn": "arn:aws:ecs:eu-north-1:redactedAccountId:task/redactedClusterName/227b1b1f66744e318edb2d5e9bb57e2d",
    "_aws": {
        "Timestamp": 1604996779353,
        "LogGroupName": "redactedLogGroupName",
        "CloudWatchMetrics": [
            {
                "Dimensions": [
                    [
                        "Endpoint"
                    ]
                ],
                "Metrics": [
                    {
                        "Name": "Latency",
                        "Unit": "Milliseconds"
                    },
                    {
                        "Name": "Success",
                        "Unit": "Count"
                    }
                ],
                "Namespace": "redactedNamespace"
            }
        ]
    },
    "Latency": 1415.994263,
    "Success": 1
}

Automatically flush for > 100 metrics per key

Currently, if putMetric is called more than 100 times for the same key, it fails silently on the backend. We should flush automatically on the client side when this limit is reached.
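One way to picture the proposed behaviour is a small buffer that flushes itself before any key would exceed the limit. This is a sketch under assumptions: `AutoFlushingBuffer` and its `flushFn` callback are hypothetical and do not reflect how MetricsLogger is implemented; only the limit of 100 values per key comes from this issue.

```typescript
type Batch = Map<string, number[]>;

// Hypothetical buffer that flushes itself before a key would exceed the limit.
class AutoFlushingBuffer {
  private values: Batch = new Map();
  private readonly flushFn: (batch: Batch) => void;
  private readonly limit: number;

  constructor(flushFn: (batch: Batch) => void, limit: number = 100) {
    this.flushFn = flushFn;
    this.limit = limit;
  }

  put(key: string, value: number): void {
    const existing = this.values.get(key) ?? [];
    if (existing.length >= this.limit) {
      // The key is full: emit everything collected so far, then start over.
      this.flush();
      this.values.set(key, [value]);
      return;
    }
    existing.push(value);
    this.values.set(key, existing);
  }

  flush(): void {
    if (this.values.size === 0) return;
    this.flushFn(this.values);
    this.values = new Map();
  }
}
```

With this shape, 250 calls for one key would yield batches of 100, 100 and 50 values once the caller performs the final flush.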

Collect ECS Agent Metadata

https://docs.aws.amazon.com/AmazonECS/latest/developerguide/ecs-agent-introspection.html

"Cannot call write after a stream was destroyed" on ECS

I noticed many messages like the following on a production ECS system after upgrading from Node 14 to Node 16.15.0.
Roughly 1/3 of all writes fail with this message.

{ "message": "Cannot call write after a stream was destroyed", "name": "Error", "stack": "Error [ERR_STREAM_DESTROYED]: Cannot call write after a stream was destroyed\n at new NodeError (node:internal/errors:372:5)\n at _write (node:internal/streams/writable:321:11)\n at Socket.Writable.write (node:internal/streams/writable:334:10)\n at /thingregistry/node_modules/aws-embedded-metrics/lib/sinks/connections/TcpClient.js:58:56\n at new Promise (<anonymous>)\n at TcpClient.<anonymous> (/thingregistry/node_modules/aws-embedded-metrics/lib/sinks/connections/TcpClient.js:53:19)\n at Generator.next (<anonymous>)\n at fulfilled (/thingregistry/node_modules/aws-embedded-metrics/lib/sinks/connections/TcpClient.js:19:58)\n at runMicrotasks (<anonymous>)\n at processTicksAndRejections (node:internal/process/task_queues:96:5)", "code": "ERR_STREAM_DESTROYED" }

I could not find any obvious issue in the code.

MetricsLogger.new does not copy shouldUseDefaultDimensions value

The createCopyWithContext function does not copy over the shouldUseDefaultDimensions property to the new context, resulting in the default dimensions being included when they should not be.

Either the default dimensions should be copied to the new context depending on the value of shouldUseDefaultDimensions:

public createCopyWithContext(): MetricsContext {
  return new MetricsContext(
    this.namespace,
    Object.assign({}, this.properties),
    Object.assign([], this.dimensions),
    this.shouldUseDefaultDimensions ? this.defaultDimensions : {},
  );
}

Or shouldUseDefaultDimensions should also be copied into the new context:

public createCopyWithContext(): MetricsContext {
  return new MetricsContext(
    this.namespace,
    Object.assign({}, this.properties),
    Object.assign([], this.dimensions),
    this.defaultDimensions,
    this.shouldUseDefaultDimensions,
  );
}

Undocumented release 3.0.0

I noticed that an undocumented change was released as a new major version on npm in the last 24 hours.

dist
.tarball: https://bahnhub.tech.rz.db.de:443/artifactory/api/npm/default-npm-3rdparty/aws-embedded-metrics/-/aws-embedded-metrics-3.0.0.tgz
.shasum: 0ecbd9e1411195ceef289109853ea4ac9e71626d
.integrity: sha512-4SsOynlnrdT9C8NzQFLqyIz/5g/sYPYsKF1yh+VcnIIMliXHt1CHZS4Gw0Gd0IDZLPS+sWrL6jyAzPvsaog7sg==

Security-wise the change does not look critical, but a release containing a change that is not mentioned in the release notes still seems like a red flag. https://github.com/awslabs/aws-embedded-metrics-node/releases

Add option to disable agent communication during unit tests

I'd like to block the SDK from trying to communicate with a CloudWatch Agent during certain unit tests. I don't see a documented way to do this right now.

Ideally, I'd also be able to run assertions against the information that the SDK would have logged. (Did the dimensions, properties, and metrics logged during this test match my expectations?) So one possible approach to this problem could be to "report" logs to some in-memory object that can be inspected, instead of the CloudWatch Agent. At a minimum, though, I'd like to be able to turn off Agent communication, instead of having to mock the module's API whenever I want to run my code without an Agent.
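The in-memory approach could look roughly like the sketch below. Every name in it (`RecordedEvent`, `TestSink`, `InMemorySink`, `recordLogin`) is hypothetical and chosen for illustration; it does not describe an existing aws-embedded-metrics API, only the idea of handing application code a sink that records events instead of opening a socket.

```typescript
// Hypothetical shapes for illustration only.
interface RecordedEvent {
  namespace: string;
  dimensions: Record<string, string>;
  metrics: Record<string, number[]>;
}

interface TestSink {
  accept(event: RecordedEvent): Promise<void>;
}

// Stores events in memory so a unit test can inspect them afterwards.
class InMemorySink implements TestSink {
  readonly events: RecordedEvent[] = [];

  async accept(event: RecordedEvent): Promise<void> {
    this.events.push(event);
  }
}

// Application code under test depends only on the sink interface.
function recordLogin(sink: TestSink, latencyMs: number): Promise<void> {
  return sink.accept({
    namespace: 'MyApp',
    dimensions: { Endpoint: 'login' },
    metrics: { Latency: [latencyMs] },
  });
}
```

A test would pass an `InMemorySink` to the code under test and then assert on `sink.events`, with no agent running and no module mocking.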

String metric values are not rejected, but do not work.

I've found that it's possible to accidentally call putMetric with a string value. The value is saved and output to CloudWatch as a quoted string, which of course doesn't work as a metric.

The TypeScript code type-checks that putMetric is called with a number, but this doesn't help if you're writing JavaScript, and of course doesn't help at runtime.

The problem is subtle: if you accidentally push a numeric value that is typed as a string, e.g. putMetric('someKey', '200'), it just ends up quoted in the generated JSON, which is hard to spot until the metric fails to appear.

The library should explicitly convert all values it is sent with Number(), so that string values passed by accident are handled correctly when they can be easily converted to a number.
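A runtime guard along these lines would cover both TypeScript and plain JavaScript callers. This is a sketch, not the library's code: `toMetricValue` is a hypothetical helper, and rejecting non-numeric strings with a TypeError is an assumption about the desired behaviour.

```typescript
// Hypothetical helper: coerce numeric strings, reject everything else.
function toMetricValue(value: unknown): number {
  // Only non-empty strings are passed through Number(); '' would otherwise become 0.
  const candidate =
    typeof value === 'string' && value.trim() !== '' ? Number(value) : value;
  if (typeof candidate !== 'number' || !Number.isFinite(candidate)) {
    throw new TypeError(`Metric value must be a finite number, got: ${String(value)}`);
  }
  return candidate;
}
```

With this guard, `toMetricValue('200')` yields the number 200, while values such as `'abc'`, `''`, `NaN` or `null` throw instead of being serialized as unusable metrics.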

Missing metric

https://github.com/kaihendry/yt-aws-emf/blob/fc846351857cad52341e12a9eebeb1d405ab3742/hello-world/app.js#L40

START RequestId: 6210176a-a746-425b-843b-ac8e3bba7eb4 Version: $LATEST
2021-10-27T14:14:17.648Z	6210176a-a746-425b-843b-ac8e3bba7eb4	INFO	{"level":"info","msg":"Starting request","context":{"callbackWaitsForEmptyEventLoop":true,"functionVersion":"$LATEST","functionName":"HelloWorldFunction","memoryLimitInMB":"128","logGroupName":"aws/lambda/HelloWorldFunction","logStreamName":"$LATEST","invokedFunctionArn":"","awsRequestId":"6210176a-a746-425b-843b-ac8e3bba7eb4"}}
2021-10-27T14:14:17.652Z	6210176a-a746-425b-843b-ac8e3bba7eb4	INFO	https://httpstat.us/200?sleep=0
2021-10-27T14:14:18.037Z	6210176a-a746-425b-843b-ac8e3bba7eb4	INFO	{"level":"info","msg":"called","urlWithParams":"https://httpstat.us/200?sleep=0","duration":383}
} ServiceType: 'AWS::Lambda::Function'6a-a746-425b-843b-ac8e3bba7eb4	INFO	Received default dimensions {
2021-10-27T14:14:18.044Z	6210176a-a746-425b-843b-ac8e3bba7eb4	INFO	{"url":"https://httpstat.us/200?sleep=0","status":"200","executionEnvironment":"AWS_Lambda_nodejs14.x","memorySize":"128","functionVersion":"$LATEST","logStreamId":"$LATEST","_aws":{"Timestamp":1635344057646,"CloudWatchMetrics":[{"Dimensions":[["url","status"]],"Metrics":[{"Name":"Size","Unit":"Bytes"},{"Name":"Success","Unit":"Milliseconds"}],"Namespace":"yt-emf1"}]},"Size":26,"Success":383}
END RequestId: 6210176a-a746-425b-843b-ac8e3bba7eb4
REPORT RequestId: 6210176a-a746-425b-843b-ac8e3bba7eb4	Init Duration: 0.21 ms	Duration: 755.33 ms	Billed Duration: 800 ms	Memory Size: 128 MB	Max Memory Used: 128 MB	
{"statusCode":200,"body":"{\"message\":{\"code\":200,\"description\":\"OK\"}}"

I do not understand why Size is missing from my namespace. Any ideas? Thank you in advance.

Calling flush would reset the namespace as well

Documentation for flush()

flush()
Flushes the current MetricsContext to the configured sink and resets all properties, dimensions and metric values. The namespace and default dimensions will be preserved across flushes.

The namespace is set on the context, and at the end of flush the context is reset to empty, so the namespace is lost as well.

Either the documentation or the behavior of flush should be updated.

Canary dockerfile needs updating

Canary currently runs on Node.js v10 but needs to be running on v16.

`DefaultEnvironment` fails to flush logs as it tries to use `AgentSink`

Hi,

This is more of a question than a bug.

Context:
We have added embedded metrics to our Lambda stack, which uses serverless and serverless-offline to run locally.

Issue:
When run locally (without AWS_LAMBDA_FUNCTION_NAME set), the library falls back to DefaultEnvironment and kills the serverless-offline thread when .flush() is called, because of an unhandled promise rejection on another thread. It seems that the default environment expects an AgentSink to be present. Is that expected?

Steps to reproduce:

  const defaultEnv = new DefaultEnvironment();
  const defaultLogger = createLogger(() => Promise.resolve(defaultEnv));
  defaultLogger.flush()

Application crashes when there's no connection to CloudwatchAgent

When running in a test environment (similar to a setup on EC2 with a CloudWatch Agent), the whole Node.js application crashes when it is unable to push metrics to the Agent, and I'm unable to catch the error.

Steps to reproduce

Set up an environment similar to an EC2 setup
Do not start a CloudWatchAgent
Try to push some metrics to the Agent (which of course fails)

Expected behavior

The library emits an error that can be caught and handled appropriately, OR it handles the error itself with a log statement. (Both are fine with me, but in general it is probably better to let the caller handle the error.)

Actual behavior

You get an UnhandledPromiseRejectionWarning: Error: connect ECONNREFUSED 0.0.0.0:25888
The moment the Agent starts, you will see an Error: read ECONNRESET and the application exits

Root Cause
I tried to narrow down the problem and it seems that

aws-embedded-metrics-node/src/logger/MetricsLogger.ts

Line 50 in 2ec8a84

sink.accept(this.context);

is missing an await. At least, if I add an await here, the exception is caught and the log (https://github.com/awslabs/aws-embedded-metrics-node/blob/master/src/logger/MetricScope.ts#L34) is emitted.
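The effect of the missing await can be reproduced without the library. In the sketch below, `SinkLike`, `refusingSink`, `flushAwaited` and `flushNotAwaited` are hypothetical stand-ins; the only claim is the general JavaScript behaviour that a try/catch cannot observe a rejection from a promise that is never awaited.

```typescript
// Hypothetical stand-in for a sink whose accept() rejects, e.g. no agent listening.
interface SinkLike {
  accept(context: unknown): Promise<void>;
}

const refusingSink: SinkLike = {
  accept: async () => {
    throw new Error('connect ECONNREFUSED 0.0.0.0:25888');
  },
};

// With await, the rejection surfaces inside the try block and is caught.
async function flushAwaited(sink: SinkLike): Promise<string> {
  try {
    await sink.accept({});
    return 'flushed';
  } catch (err) {
    return `caught: ${(err as Error).message}`;
  }
}

// Without await, the promise floats: the function returns 'flushed' and the
// rejection later becomes an unhandled rejection that no caller can catch.
async function flushNotAwaited(sink: SinkLike): Promise<string> {
  try {
    sink.accept({});
    return 'flushed';
  } catch (err) {
    return `caught: ${(err as Error).message}`;
  }
}
```

Calling `flushNotAwaited(refusingSink)` is what takes the process down on Node versions that terminate on unhandled rejections, which matches the crash described above.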

System Overview

Running the CloudWatchAgent in a docker container
Node v12.16.1
aws-embedded-metrics-node 2.0.3

Error [ERR_STREAM_DESTROYED]: Cannot call write after a stream was destroyed

This issue might still be related to #120. @markkuhn

After we bumped to the latest v2.0.6, we still see the same error in CloudWatch.

The issue occurs only in our stacks on Node v16; stacks on Node v14 are unaffected.

Agent endpoint when outside AWS?

I'm exploring embedded metrics and would like to have some metrics sourced from a CLI tool that we have built for our dev community (deployment metrics, CLI tool usage, etc.).

The default AgentSink is presumably auto-resolved on EC2 instances, and the library detects a Lambda environment in order to write to STDOUT. However, is it feasible to submit these metrics from outside AWS services?

I'm able to submit metrics using the aws-cli, but with this library I keep getting connection refused, for obvious reasons: TCP Client received error Error: connect ECONNREFUSED 0.0.0.0:25888

For context I'm on Direct Connect.

Would there need to be a custom Sink that uses the put-log-events API to achieve this? Or is there a tcp endpoint that I could configure?

Noop If Empty Metrics?

I'm using the following code to have only one dimension.

metrics.setDimensions();

In some cases for my Lambda there will not be any metrics. However, I have noticed in my log that an empty metric is published. For example:

{"executionEnvironment":"AWS_Lambda_nodejs12.x","memorySize":"1792","functionVersion":"$LATEST","logStreamId":"2020/07/12/[$LATEST]9a0092bb2cf76b6b90c46bf429a32aef","traceId":"Root=1-dc99d00f-c079a84d433534434534ef0d;Parent=91ed514f1e5c03b2;Sampled=1","_aws":{"Timestamp":1594571647673,"CloudWatchMetrics":[{"Dimensions":[],"Metrics":[],"Namespace":"MyApp"}]}}

Should this library check for dimension/metric presence before sending the output?
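Such a check could be as small as the sketch below. `serializeIfNotEmpty` is a hypothetical function, not library code, and it makes one debatable assumption: an event without metrics is dropped entirely, properties included.

```typescript
// Hypothetical serializer that returns undefined when there is nothing to publish.
function serializeIfNotEmpty(
  namespace: string,
  dimensionKeys: string[],
  metrics: Record<string, number>,
  properties: Record<string, unknown> = {},
): string | undefined {
  const metricNames = Object.keys(metrics);
  if (metricNames.length === 0) {
    // No metrics were recorded, so skip the log line instead of emitting
    // an event with "Metrics": [].
    return undefined;
  }
  return JSON.stringify({
    ...properties,
    _aws: {
      Timestamp: Date.now(),
      CloudWatchMetrics: [
        {
          Namespace: namespace,
          Dimensions: [dimensionKeys],
          Metrics: metricNames.map((name) => ({ Name: name })),
        },
      ],
    },
    ...metrics,
  });
}
```

If the properties are still useful as a structured log on their own, the alternative design is to emit the event without the CloudWatchMetrics entry rather than skipping it.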

Add extended pre-release integration tests

See #32. Prior to releasing new versions, we need to run an extended bake test to validate there are no performance regressions.

[Feature] [development] new configuration to filter properties out of the serialised JSON log on the stdout.

Describe the user story

Currently, when I set AWS_EMF_ENVIRONMENT=Local and run the app, the agent logs to the stdout which is the expected behaviour.

For small logs this is fine, but when it comes to bigger projects the local sandbox terminal is flooded with tons of serialised mammoth JSON objects.

As a developer, I'd like to propose a new configuration that defines which properties of the JSON object can be logged by the agent so that I can dynamically customise the amount of details printed on the stdout. This functionality should be only available in development (i.e. when AWS_EMF_ENVIRONMENT is set to Local).

Use case

Let's say we have a log structure like the following:

{
  "PageType": "player",
  "event": {
    "id": "9bac0a47-1623-410d-bcd2-f03aa1283669",
    "source": "server",
    "trigger": "user",
    "type": "page-request"
  },
  "logTrace": [
    "[UpstreamName.apifeed]: Empty data, returning with graceful degradation."
  ],
  "processPid": 872,
  "requestPath": "/path/to/the/resource",
  "requestHeaders": {

  },
  "requestHeadersList": [

  ],
  "hasCookie": false,
  "cookieLength": 0,
  "cookieList": [],
  "responseStatus": 200,
  "responseHeaders": {

  },
  "upstreams": [
    {
      "name": "UpstreamName",
      "endpoint": "apifeed",
      "attempts": [
        {
          "cache": {
            "hit": true,
            "miss": false,
            "stale": false,
            "error": false,
            "timeout": false,
            "revalidate": false,
            "revalidateError": false
          },
          "response": {
            "headers": {

            },
            "body": {

            },
            "status": 200,
            "time": 86
          },
          "id": 1
        }
      ],
      "attemptCount": 1,
      "retryCount": 0,
      "requestCount": 0,
      "requestErrorCount": 0,
      "response5xxCount": 0,
      "response4xxCount": 0,
      "response3xxCount": 0,
      "response2xxCount": 0,
      "response1xxCount": 0,
      "responseInvalidCount": 0,
      "cacheAudit": [
        [
          "hit"
        ]
      ],
      "cacheHitCount": 1,
      "cacheMissCount": 0,
      "cacheStaleCount": 0,
      "cacheErrorCount": 0,
      "cacheTimeoutCount": 0,
      "cacheRevalidateCount": 0,
      "cacheRevalidateErrorCount": 0,
      "responseTime": 10
    }
  ],
  "imageId": "ami-someid",
  "instanceId": "i-someid",
  "instanceType": "some.instancetype",
  "privateIP": "127.0.0.1",
  "availabilityZone": "some-aws-region",
  "_aws": {
    "Timestamp": 2693848470655,
    "LogGroupName": "/example/live/player/app",
    "CloudWatchMetrics": [
      {
        "Dimensions": [
          [
            "PageType"
          ]
        ],
        "Metrics": [
          {
            "Name": "RequestCount",
            "Unit": "Count"
          },
          {
            "Name": "ResponseTime",
            "Unit": "Milliseconds"
          },
          {
            "Name": "ErrorCount",
            "Unit": "Count"
          },
          {
            "Name": "PageNotFoundCount",
            "Unit": "Count"
          }
        ],
        "Namespace": "/example/live/player/app"
      }
    ]
  },
  "RequestCount": 1,
  "ResponseTime": 211,
  "ErrorCount": 0,
  "PageNotFoundCount": 0
}

and I'm running the app locally. This object is clearly large to print out, and since it is serialised and logged every time the user performs a new request, you can imagine how busy the terminal will look.

[UPDATE] The following section has been "quoted" to highlight that an amendment to this request has been added in the comments. The section remains here in the description for visibility.

Let's say I only want to log some textual information and nothing else. Among all these properties, which are useful in production but noisy in development, there is an array called logTrace. I'd like to be able to do something similar to:

// in process
const { Configuration } = require("aws-embedded-metrics");
Configuration.somePropertyName = ['logTrace'];

// environment
AWS_EMF_SOME_PROPERTY_NAME="logTrace"

and have the terminal print something like the following:

{ "logTrace": [ "[UpstreamName.apifeed]: Empty data, returning with graceful degradation." ] }

Details

The new configuration should have the following requirements:

We can define multiple properties (comma separated for the environment variable, or using an array for the in-code variable)

We can define nested properties a-la lodash#get (e.g. "event.type") by using the dot-notation.

The agent will print a flattened object where all properties appear at the root and the key is the name of the property. If a selected key is nested (e.g. event.type) the property will use the dot-notation as a key name.

The new configuration is only active when AWS_EMF_ENVIRONMENT=Local, so that it only applies in development.

If only one property is selected and its value is a string, only the string is logged. This way, a property that you already use for plain-text logging will print much as it did before.

I'm not particularly opinionated on the name of the config. I can give a couple of examples but I'm open to other suggestions: AWS_EMF_ALLOWED_PROPERTIES, AWS_EMF_FILTERED_PROPERTIES, AWS_EMF_LOCAL_LOG_STRUCTURE, AWS_EMF_LOCAL_LOG_PROPERTIES (or some permutation). The last two environment variables suggest what they are about, and only work locally.
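The requirements above can be sketched as a small filter. Everything here is hypothetical: `parseAllowedProperties`, `getPath` and `filterForLocalLog` are illustrative names, the comma-separated format follows the proposal, and the dot-notation lookup is a simplified stand-in for lodash#get.

```typescript
// Splits a comma-separated setting such as "logTrace, event.type" into paths.
function parseAllowedProperties(raw: string | undefined): string[] {
  return (raw ?? '')
    .split(',')
    .map((part) => part.trim())
    .filter((part) => part.length > 0);
}

// Resolves a dot-notation path; returns undefined if any segment is missing.
function getPath(source: unknown, path: string): unknown {
  return path.split('.').reduce<unknown>((current, key) => {
    if (current !== null && typeof current === 'object') {
      return (current as Record<string, unknown>)[key];
    }
    return undefined;
  }, source);
}

// Produces the line to print locally: a flattened object keyed by path,
// or the bare string when a single string property is selected.
function filterForLocalLog(event: Record<string, unknown>, allowed: string[]): string {
  const picked: Record<string, unknown> = {};
  for (const path of allowed) {
    const value = getPath(event, path);
    if (value !== undefined) picked[path] = value;
  }
  const keys = Object.keys(picked);
  const onlyKey = keys[0];
  if (keys.length === 1 && onlyKey !== undefined) {
    const onlyValue = picked[onlyKey];
    if (typeof onlyValue === 'string') return onlyValue;
  }
  return JSON.stringify(picked);
}
```

Applied to the example event with the setting `logTrace`, this prints only the logTrace array; with `event.type` alone it prints the plain string `page-request`.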

putMetric to support addition of custom metrics in CloudWatch

I have a use case where I want to enable high-resolution metrics in CloudWatch. This is done by setting the StorageResolution parameter of the put-metric-data API to 1 (high resolution) rather than the default of 60 (standard resolution). Currently putMetric only accepts the parameters (metricName, value, unit), so support for publishing high-resolution custom metrics is missing.

Links - Publishing Custom Metrics, put-metric-data api
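For illustration, the request amounts to letting a metric definition carry a resolution. The sketch below assumes that the Embedded Metric Format accepts an optional StorageResolution field (1 or 60) on each metric definition; `metricDefinition` and the `Resolution` type are hypothetical names, not the library's API.

```typescript
// 1 = high resolution, 60 = standard resolution (the default).
type Resolution = 1 | 60;

interface MetricDefinition {
  Name: string;
  Unit: string;
  StorageResolution?: Resolution;
}

// Hypothetical helper: only writes StorageResolution when it differs from the default.
function metricDefinition(
  name: string,
  unit: string,
  resolution: Resolution = 60,
): MetricDefinition {
  const definition: MetricDefinition = { Name: name, Unit: unit };
  if (resolution === 1) {
    definition.StorageResolution = 1;
  }
  return definition;
}
```

A putMetric signature with an optional fourth parameter would map naturally onto this, leaving existing three-argument calls unchanged.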

Allow setting Timestamp of a metric.

There is currently no way to set MetricsContext.meta.Timestamp via metricScope.

Assuming the rules for PutMetricData also apply, it is valid to submit metrics whose timestamp is not the current time (new Date()).

I am currently working around this by reimplementing metricScope(), instantiating MetricsContext and MetricsLogger directly, and passing MetricsContext to the given handler, so it can do context.meta.Timestamp = ....

For my purposes passing a single Date to metricScope would suffice.

I'd be happy to put together a PR if you think setting the Timestamp should be possible.
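If a timestamp parameter were added, a validity check might accompany it. The sketch below assumes the PutMetricData window of up to two weeks in the past and up to two hours in the future also applies to EMF events, which this issue itself only assumes; `isAcceptableTimestamp` is a hypothetical helper.

```typescript
const TWO_WEEKS_MS = 14 * 24 * 60 * 60 * 1000;
const TWO_HOURS_MS = 2 * 60 * 60 * 1000;

// Returns true when the timestamp falls inside the assumed accepted window.
function isAcceptableTimestamp(timestamp: Date, now: Date = new Date()): boolean {
  const delta = timestamp.getTime() - now.getTime();
  return delta >= -TWO_WEEKS_MS && delta <= TWO_HOURS_MS;
}
```

A handler could call this before assigning `timestamp.getTime()` to the event's Timestamp field, and fall back to the current time otherwise.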

ES Module and Validator package

Are there plans to publish an ESM, tree-shakable version of the package?

EMF is ~300 kB uncompressed, with 200 kB just for the new validator dependency, most of which is unused.

With SDK v3 being supported in Node 18 on Lambda, EMF is now 90% of my bundle.

K8s Environment Detection

https://kubernetes.io/docs/concepts/services-networking/service/#discovering-services

Example environment variables for an EKS pod named eks-demo:

  KUBERNETES_SERVICE_PORT: '443',
  KUBERNETES_PORT: 'tcp://10.100.0.1:443',
  EKS_DEMO_PORT_80_TCP_PORT: '80',
  NODE_VERSION: '10.16.0',
  HOSTNAME: 'eks-demo-55f57f865b-l7tcs',
  EKS_DEMO_PORT_80_TCP_PROTO: 'tcp',
  YARN_VERSION: '1.16.0',
  SHLVL: '1',
  HOME: '/root',
  EKS_DEMO_PORT_80_TCP: 'tcp://10.100.122.110:80',
  AWS_EMF_ENABLE_DEBUG_LOGGING: 'true',
  KUBERNETES_PORT_443_TCP_ADDR: '10.100.0.1',
  PATH:
   '/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin',
  AWS_EMF_AGENT_ENDPOINT: 'tcp://127.0.0.1:25888',
  KUBERNETES_PORT_443_TCP_PORT: '443',
  KUBERNETES_PORT_443_TCP_PROTO: 'tcp',
  EKS_DEMO_SERVICE_HOST: '10.100.122.110',
  KUBERNETES_PORT_443_TCP: 'tcp://10.100.0.1:443',
  KUBERNETES_SERVICE_PORT_HTTPS: '443',
  KUBERNETES_SERVICE_HOST: '10.100.0.1',
  EKS_DEMO_SERVICE_PORT: '80',
  EKS_DEMO_PORT: 'tcp://10.100.122.110:80',
  PWD: '/app/src'
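Based on the variables listed above, detection could key off the service-discovery variables that Kubernetes injects into every pod. This is a sketch only: `looksLikeKubernetes` is a hypothetical function, and treating the presence of both KUBERNETES_SERVICE_HOST and KUBERNETES_SERVICE_PORT as sufficient is an assumption.

```typescript
type EnvVars = Record<string, string | undefined>;

// Returns true when the Kubernetes API service variables are present and non-empty.
function looksLikeKubernetes(env: EnvVars): boolean {
  const host = env['KUBERNETES_SERVICE_HOST'];
  const port = env['KUBERNETES_SERVICE_PORT'];
  return Boolean(host) && Boolean(port);
}
```

In an application this would be called with `process.env`; the function takes the environment as an argument so that it can be tested without touching the real process environment.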
