nginx / agent

NGINX Agent provides an administrative entry point to remotely manage, configure and collect metrics and events from NGINX instances

Home Page: https://docs.nginx.com/nginx-agent/

License: Apache License 2.0

Makefile 2.09% Go 93.66% Shell 2.13% Dockerfile 2.08% HTML 0.03%
nginx agent api metrics metrics-gathering nginx-configuration nginx-server good-first-issue monitoring-tool observability

agent's People

Contributors

achawla2012, adubhlaoich, aphralg, craigell, dareste, dean-coakley, defanator, dekobon, dependabot[bot], dhurley, edarzins, github-actions[bot], jcahilltorre, jputrino, mohamed-gougam, mrajagopal, mtbchef, nginx-nickc, nickchen, nkashiv, ochriso, oliveromahony, pdabelf5, sanathkumarbs, spencerugbo, sylwang, travisamartin, u5surf, wicklander-bryant, yluf5

agent's Issues

Add option to ignore sensitive info while parsing nginx configs

Currently the agent uploads all the NGINX config files found in the config_dirs path.

However, a user may not be comfortable sending their NGINX configs in case they contain sensitive information the user wants to protect.

The proposed enhancement is to add an option that sanitizes the contents of the configs before uploading them to the controller, using Crossplane's ignore directives for directives such as:

  • 'ssl_certificate_key'
  • 'ssl_client_certificate'
  • 'ssl_password_file'
  • 'ssl_stapling_file'
  • 'ssl_trusted_certificate'
  • 'auth_basic_user_file'
  • 'secure_link_secret'
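
As a rough illustration of the idea, the sketch below drops lines whose first token is one of the listed directives. It does not use Crossplane; it is a line-based simplification that assumes one directive per line, and the names sensitiveDirectives and sanitizeConfig are hypothetical.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// sensitiveDirectives holds the directive names listed in the proposal.
var sensitiveDirectives = map[string]bool{
	"ssl_certificate_key":     true,
	"ssl_client_certificate":  true,
	"ssl_password_file":       true,
	"ssl_stapling_file":       true,
	"ssl_trusted_certificate": true,
	"auth_basic_user_file":    true,
	"secure_link_secret":      true,
}

// sanitizeConfig (hypothetical helper) removes every line whose first
// token is a sensitive directive. It assumes one directive per line;
// a real implementation would work on a parsed config instead.
func sanitizeConfig(conf string) string {
	var out strings.Builder
	scanner := bufio.NewScanner(strings.NewReader(conf))
	for scanner.Scan() {
		line := scanner.Text()
		fields := strings.Fields(line)
		if len(fields) > 0 && sensitiveDirectives[fields[0]] {
			continue // skip the sensitive directive entirely
		}
		out.WriteString(line)
		out.WriteString("\n")
	}
	return out.String()
}

func main() {
	conf := "server {\n    listen 443 ssl;\n    ssl_certificate_key /etc/ssl/private/site.key;\n}\n"
	fmt.Print(sanitizeConfig(conf))
}
```

Dropping the whole line rather than masking the value keeps the example short; whether the controller should still see that the directive exists is a separate design question.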

How do I retrieve the NGINX config?

I see there is an API endpoint for applying NGINX configuration (PUT /nginx/config/), but how do I go about getting the current configuration? NIM is able to do this, so I assume it's possible somehow?

How to forward requests to NGINX from Windows IIS

Hello,

I am currently hosting a website through IIS on Windows Server 2019. This website uses ports 80 and 443.

On the same server I am also attempting to serve a Django application, for which I am using waitress + nginx.

`# mysite_nginx.conf

# configuration of the server
server {
    # the port your site will be served on
    listen      80;
    # the domain name it will serve for
    server_name <server external ip>; # substitute your machine's IP address or FQDN
    charset     utf-8;

    # max upload size
    client_max_body_size 75M;   # adjust to taste

    # Finally, send all non-media requests to the Django server.
    location / {
        proxy_pass http://<server-internal-IP>:8000; # See output from runserver.py
    }
}`

I have done the configuration and everything is set up, however when I try to launch nginx I get the following error:
nginx: [emerg] bind() to 0.0.0.0:80 failed (10013: An attempt was made to access a socket in a way forbidden by its access permissions)

This makes sense, since ports 80 and 443 are both being used by said website. While browsing for solutions, I found mentions of a way to "forward" or "upstream" the request to whatever port nginx is using.

However, I don't quite understand how to let the server know when to forward the request (when I want something from my Django application) and when not to (when I'm just browsing my website).

Thanks in advance

Uninstalling agent duplicates messages

Description:
When uninstalling the agent, the output of the uninstall displays duplicate messages.

Reproduce:
Uninstall agent using sudo apt purge nginx-agent

Expected Result:
Logs show no duplicate messages.

Actual Result:
Logs show duplicate messages, see output below.

Output of sudo apt purge nginx-agent:

Reading package lists... Done
Building dependency tree       
Reading state information... Done
The following packages will be REMOVED:
  nginx-agent*
0 upgraded, 0 newly installed, 1 to remove and 61 not upgraded.
After this operation, 26.2 MB disk space will be freed.
Do you want to continue? [Y/n] y
(Reading database ... 68488 files and directories currently installed.)
Removing nginx-agent (2.25.0) ...
Stop and disable nginx-agent service
Running daemon-reload
Removing run directory
(Reading database ... 68486 files and directories currently installed.)
Purging configuration files for nginx-agent (2.25.0) ...
Stop and disable nginx-agent service
Running daemon-reload
Removing run directory
dpkg: warning: while removing nginx-agent, directory '/var/log/nginx-agent' not empty so not removed
dpkg: warning: while removing nginx-agent, directory '/etc/nginx-agent' not empty so not removed

"Failed to get disk metrics, permission denied" log message is unhelpful

This message is output from src/core/metrics/sources/disk.go:43. What appears to be going on here is that we loop through each mount point and collect disk usage metrics for each one. However, if the operation fails for a given mount point, we do not output which mount point the error was encountered for.

As such, I'd suggest a change where we replace line 43 with the following:

c.logger.Log(fmt.Sprintf("Failed to get disk metrics for mount point %s, %v", part.Mountpoint, err))

If everyone is amenable, I'll submit a PR for this.
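
A minimal sketch of the suggested behaviour follows, assuming a loop over mount points that logs the failing mount point and keeps going. The partition type, usageFunc, and fakeUsage are stand-ins invented for the example, not the agent's actual code.

```go
package main

import (
	"errors"
	"fmt"
)

// partition is a stand-in for the mount point record used in the loop.
type partition struct{ Mountpoint string }

// usageFunc stands in for whatever call reads disk usage for a mount point.
type usageFunc func(mountpoint string) (usedBytes uint64, err error)

// collectDiskUsage gathers usage per mount point. A failure on one mount
// point produces a warning naming that mount point and does not stop
// collection for the others.
func collectDiskUsage(parts []partition, usage usageFunc) (map[string]uint64, []string) {
	results := make(map[string]uint64)
	var warnings []string
	for _, part := range parts {
		used, err := usage(part.Mountpoint)
		if err != nil {
			warnings = append(warnings, fmt.Sprintf(
				"Failed to get disk metrics for mount point %s, %v", part.Mountpoint, err))
			continue
		}
		results[part.Mountpoint] = used
	}
	return results, warnings
}

// fakeUsage simulates a permission error on one mount point.
func fakeUsage(mountpoint string) (uint64, error) {
	if mountpoint == "/restricted" {
		return 0, errors.New("permission denied")
	}
	return 1024, nil
}

func main() {
	parts := []partition{{"/"}, {"/restricted"}}
	results, warnings := collectDiskUsage(parts, fakeUsage)
	fmt.Println(results)
	for _, w := range warnings {
		fmt.Println(w)
	}
}
```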

Nginx agent returns invalid metrics after the first correct result.

NGINX Agent works perfectly when I first install it and fetch metrics from http://localhost:8081/metrics/, but just seconds later I get the error response below when fetching metrics again:

"
An error has occurred while serving metrics:
106 error(s) occurred:

  • collected metric "nginx_upstream_response_failed" { ...} was collected before with the same name and label values
    ...
    "

I tried to search for help but found barely any answers. Is this a bug, or is it caused by my system settings? I installed NGINX Agent by strictly following the installation tutorial.

How to configure multiple grpcs proxies in the same location with different headers?

environment:
macos: 12.7
nginx: 1.25.3

/usr/local/etc/nginx/nginx.conf :

   location /yeying.api.user.User/ {
       grpc_set_header Content-Type application/grpc;
       grpc_socket_keepalive on;
       if ($http_app_name = "store") {
           grpc_pass grpcs://localhost:9202;
           grpc_ssl_trusted_certificate /path1/cert/ca-cert.pem;
           grpc_ssl_certificate /path1/cert/client-cert.pem;
           grpc_ssl_certificate_key /path1/cert/client-key.pem;
           grpc_ssl_verify on;
           grpc_ssl_server_name on; 
           break;
       }

       if ($http_app_name = "robot") {
           grpc_pass grpcs://www.robot.pub:9002;
           grpc_set_header Te trailers;
           grpc_ssl_trusted_certificate /path2/cert/ca-cert.pem;
           grpc_ssl_certificate /path2/cert/client-cert.pem;
           grpc_ssl_certificate_key /path2/cert/client-key.pem;
           grpc_ssl_verify on;
           grpc_ssl_server_name on;
           break;
       }

       if ($http_app_name = "node") {
           grpc_pass grpcs://localhost:9103;
           grpc_ssl_trusted_certificate /path3/cert/ca-cert.pem;
           grpc_ssl_certificate /path3/cert/client-cert.pem;
           grpc_ssl_certificate_key /path3/cert/client-key.pem;
           grpc_ssl_verify on;
           grpc_ssl_server_name on;
           break;
       }
    }

This fails with the error:
023/12/19 09:15:23 [emerg] 44181#0: "grpc_ssl_trusted_certificate" directive is not allowed here in /usr/local/etc/nginx/nginx.conf:65

Agent doesn't handle hostnames with special characters

I have an NGINX instance with hostname test-example.com
I tried deploying the nginx-agent package using the curl script from my NMS host, and the agent starts successfully. I can see instance details under Instance Manager -> Instances, but when I try to edit the configuration, it gives this error:

Oops. Something went wrong.
URIError: malformed URI sequence

If I change the hostname to something without special characters (e.g. testexample.com), everything works fine.

Fix Intermittently Failing Unit Test

The unit test on the pipeline fails intermittently with the error below. It seems to be the commander_test.go that is failing.

Example of pipeline failing: https://github.com/nginx/agent/actions/runs/4436873229/jobs/7785851614

Error:

time="2023-03-16T12:12:39Z" level=info msg="Commander received meta:<message_id:\"1234\" > , <nil>"
Recv Command: <nil>
Recv Command Error: rpc error: code = Canceled desc = context canceled
time="2023-03-16T12:12:39Z" level=info msg="Commander received <nil>, rpc error: code = Canceled desc = grpc: the client connection is closing"
time="2023-03-16T12:12:39Z" level=error msg="Commander Channel Recv: error communicating with bufnet, code=Canceled, message=grpc: the client connection is closing"
time="2023-03-16T12:12:39Z" level=info msg="Commander Channel Recv: retrying to connect to bufnet"
time="2023-03-16T12:12:39Z" level=warning msg="Error closing old grpc connection: rpc error: code = Canceled desc = grpc: the client connection is closing"
time="2023-03-16T12:12:39Z" level=error msg="Unable to create command channel: rpc error: code = Unavailable desc = connection error: desc = \"transport: Error while dialing closed\""
time="2023-03-16T12:12:39Z" level=info msg="Commander retrying to connect to bufnet"
Recv Command: meta:<message_id:"1234" > 
Recv Command: <nil>
Recv Command Error: rpc error: code = Canceled desc = context canceled

Agent doesn't fail on Start when there is a typo in nginx-agent.conf

I'm using Agent with NGINX Management Suite. When I have a typo (syntax error) in nginx-agent.conf, restarting Agent with "systemctl restart nginx-agent" doesn't report that there is an error.
For example:

nap_monitoring:
  collector_buffer_size: 50000
  processor_buffer_size: 50000
  syslog_ip: "127.0.0.1     ### Note the missing " at the end
  syslog_port: 514

Looking at systemctl status nginx-agent also doesn't show any indication that there is a syntax error:

[centos@nap-centos ~]$ sudo systemctl status nginx-agent

● nginx-agent.service - NGINX Agent
   Loaded: loaded (/etc/systemd/system/nginx-agent.service; enabled; vendor preset: disabled)
   Active: activating (auto-restart) (Result: exit-code) since Tue 2023-01-17 11:08:03 UTC; 4s ago
     Docs: https://github.com/nginx/agent#readme
  Process: 2441 ExecStop=/bin/sleep 3 (code=exited, status=0/SUCCESS)
  Process: 2440 ExecStop=/bin/kill -2 $MAINPID (code=exited, status=0/SUCCESS)
  Process: 2448 ExecStart=/usr/bin/nginx-agent (code=exited, status=1/FAILURE)
  Process: 2445 ExecStartPre=/bin/mkdir -p /var/log/nginx-agent (code=exited, status=0/SUCCESS)
  Process: 2444 ExecStartPre=/bin/mkdir -p /var/run/nginx-agent (code=exited, status=0/SUCCESS)
 Main PID: 2448 (code=exited, status=1/FAILURE) 

Access log pattern is not being set if access_log directive is set before log_format

For example:

http {
    include       mime.types;
    default_type  application/octet-stream;
    access_log  /opt/nginx/logs/access.log main;

    log_format 'main' '$remote_addr - $remote_user [$time_local] "$request" '
                  '$status $body_bytes_sent "$http_referer" '
                  '"$http_user_agent" "$http_x_forwarded_for" '
                  '"$bytes_sent" "$request_length" "$request_time" '
                  '"$gzip_ratio" $server_protocol ';
...

This will result in the log_format pattern not being properly compiled.
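
One possible approach, sketched below, is to resolve formats in two passes so that directive order no longer matters: first collect every log_format, then match access_log entries against them. The directive type and resolveAccessLogs function are illustrative only and do not reflect the agent's real data structures.

```go
package main

import (
	"fmt"
	"strings"
)

// directive is a simplified stand-in for a parsed nginx directive.
type directive struct {
	Name string
	Args []string
}

// resolveAccessLogs maps each access_log path to its format pattern.
// Pass 1 collects all log_format definitions; pass 2 resolves access_log
// entries, so an access_log that appears before its log_format still works.
func resolveAccessLogs(directives []directive) map[string]string {
	formats := make(map[string]string)
	for _, d := range directives {
		if d.Name == "log_format" && len(d.Args) >= 2 {
			// nginx concatenates the string parameters of log_format.
			formats[d.Args[0]] = strings.Join(d.Args[1:], "")
		}
	}
	resolved := make(map[string]string)
	for _, d := range directives {
		if d.Name == "access_log" && len(d.Args) >= 2 {
			if pattern, ok := formats[d.Args[1]]; ok {
				resolved[d.Args[0]] = pattern
			}
		}
	}
	return resolved
}

func main() {
	directives := []directive{
		{Name: "access_log", Args: []string{"/opt/nginx/logs/access.log", "main"}},
		{Name: "log_format", Args: []string{"main", "$remote_addr - $remote_user ", "$status"}},
	}
	fmt.Println(resolveAccessLogs(directives))
}
```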

agent is not reading some system statistics correctly in rootless container

We are observing the following errors when running nginx-agent in a rootless container:

time="2023-01-31T08:40:56Z" level=warning msg="Log level is info"
time="2023-01-31T08:40:56Z" level=info msg="setting displayName to nginx-agent-0b9264a946b3"
time="2023-01-31T08:40:56Z" level=info msg="NGINX Agent v2.22.1 at a0f380fa with pid 4, clientID=bf225f89-f4a9-3768-b747-d3ba66c8c177 name=nginx-agent-0b9264a946b3 features=[features_registration features_nginx-config features_nginx-ssl-config features_nginx-counting features_nginx-config-async features_metrics features_metrics-throttle features_dataplane-status features_process-watcher features_file-watcher features_activity-events features_agent-api]"
time="2023-01-31T08:40:56Z" level=info msg="Attempting to run command: /usr/sbin/nginx with args -V"
time="2023-01-31T08:40:56Z" level=info msg="Agent API not configured"
time="2023-01-31T08:40:56Z" level=info msg="Commander initializing"
time="2023-01-31T08:40:56Z" level=info msg="FileWatcher initializing"
time="2023-01-31T08:40:56Z" level=info msg="FileWatchThrottle initializing"
time="2023-01-31T08:40:56Z" level=info msg="MetricsSender initializing"
time="2023-01-31T08:40:56Z" level=info msg="NginxBinary initializing"
time="2023-01-31T08:40:56Z" level=info msg="OneTimeRegistration initializing"
time="2023-01-31T08:40:56Z" level=info msg="Registering bf225f89-f4a9-3768-b747-d3ba66c8c177"
time="2023-01-31T08:40:56Z" level=info msg="Metrics initializing"
time="2023-01-31T08:40:56Z" level=info msg="MetricsThrottle initializing"
time="2023-01-31T08:40:56Z" level=info msg="DataPlaneStatus initializing"
time="2023-01-31T08:40:56Z" level=info msg="Metrics waiting for handshake to be completed"
time="2023-01-31T08:40:56Z" level=info msg="MetricsThrottle waiting for report ready"
time="2023-01-31T08:40:56Z" level=info msg="ProcessWatcher initializing"
time="2023-01-31T08:40:56Z" level=info msg="Extensions initializing"
time="2023-01-31T08:40:56Z" level=info msg="Events initializing"
time="2023-01-31T08:40:56Z" level=info msg="NGINX Counter initializing { false unix:/var/run/nginx-agent/nginx.sock 6}"
time="2023-01-31T08:40:56Z" level=info msg="OneTimeRegistration completed"
time="2023-01-31T08:40:56Z" level=warning msg="The NGINX API is not configured. Please configure it to collect NGINX metrics."
time="2023-01-31T08:40:56Z" level=info msg="Commander received agent_connect_response:<agent_config:<configs:<configs:<system_id:\"bf225f89-f4a9-3768-b747-d3ba66c8c177\" nginx_id:\"b636d4376dea15405589692d3c5d3869ff3a9b26b0e7bb4bb1aa7e658ace1437\" > > > status:<statusCode:CONNECT_OK > > , <nil>"
time="2023-01-31T08:40:56Z" level=info msg="config command &{agent_config:<configs:<configs:<system_id:\"bf225f89-f4a9-3768-b747-d3ba66c8c177\" nginx_id:\"b636d4376dea15405589692d3c5d3869ff3a9b26b0e7bb4bb1aa7e658ace1437\" > > > status:<statusCode:CONNECT_OK > }"
time="2023-01-31T08:40:56Z" level=info msg="Upload: Sending data chunk data 0 (messageId=69c14f0f-eacd-4876-a219-bc82c9bc0c05)"
time="2023-01-31T08:40:56Z" level=info msg="Upload: Sending data chunk data 1 (messageId=69c14f0f-eacd-4876-a219-bc82c9bc0c05)"
time="2023-01-31T08:40:56Z" level=info msg="Upload sending done 69c14f0f-eacd-4876-a219-bc82c9bc0c05 (chunks=2)"
time="2023-01-31T08:41:11Z" level=warning msg="Unable to collect container.cpu metrics, open /sys/fs/cgroup/cpu.max: no such file or directory"
time="2023-01-31T08:41:26Z" level=warning msg="Unable to collect container.cpu metrics, open /sys/fs/cgroup/cpu.max: no such file or directory"
time="2023-01-31T08:41:41Z" level=warning msg="Unable to collect container.cpu metrics, open /sys/fs/cgroup/cpu.max: no such file or directory"
time="2023-01-31T08:41:56Z" level=warning msg="Unable to collect container.cpu metrics, open /sys/fs/cgroup/cpu.max: no such file or directory"
time="2023-01-31T08:42:11Z" level=warning msg="Unable to collect container.cpu metrics, open /sys/fs/cgroup/cpu.max: no such file or directory"
time="2023-01-31T08:42:26Z" level=warning msg="Unable to collect container.cpu metrics, open /sys/fs/cgroup/cpu.max: no such file or directory"
time="2023-01-31T08:42:41Z" level=warning msg="Unable to collect container.cpu metrics, open /sys/fs/cgroup/cpu.max: no such file or directory"
time="2023-01-31T08:42:56Z" level=warning msg="Unable to collect container.cpu metrics, open /sys/fs/cgroup/cpu.max: no such file or directory"
time="2023-01-31T08:43:11Z" level=warning msg="Unable to collect container.cpu metrics, open /sys/fs/cgroup/cpu.max: no such file or directory"
time="2023-01-31T08:43:26Z" level=warning msg="Unable to collect container.cpu metrics, open /sys/fs/cgroup/cpu.max: no such file or directory"
time="2023-01-31T08:43:41Z" level=warning msg="Unable to collect container.cpu metrics, open /sys/fs/cgroup/cpu.max: no such file or directory"
time="2023-01-31T08:43:56Z" level=warning msg="Unable to collect container.cpu metrics, open /sys/fs/cgroup/cpu.max: no such file or directory"
time="2023-01-31T08:44:11Z" level=warning msg="Unable to collect container.cpu metrics, open /sys/fs/cgroup/cpu.max: no such file or directory"
time="2023-01-31T08:44:26Z" level=warning msg="Unable to collect container.cpu metrics, open /sys/fs/cgroup/cpu.max: no such file or directory"
time="2023-01-31T08:44:41Z" level=warning msg="Unable to collect container.cpu metrics, open /sys/fs/cgroup/cpu.max: no such file or directory"
time="2023-01-31T08:44:56Z" level=warning msg="Unable to collect container.cpu metrics, open /sys/fs/cgroup/cpu.max: no such file or directory"
time="2023-01-31T08:45:11Z" level=warning msg="Unable to collect container.cpu metrics, open /sys/fs/cgroup/cpu.max: no such file or directory"
time="2023-01-31T08:45:26Z" level=warning msg="Unable to collect container.cpu metrics, open /sys/fs/cgroup/cpu.max: no such file or directory"
time="2023-01-31T08:45:41Z" level=warning msg="Unable to collect container.cpu metrics, open /sys/fs/cgroup/cpu.max: no such file or directory"
time="2023-01-31T08:45:56Z" level=warning msg="Unable to collect container.cpu metrics, open /sys/fs/cgroup/cpu.max: no such file or directory"
time="2023-01-31T08:46:11Z" level=warning msg="Unable to collect container.cpu metrics, open /sys/fs/cgroup/cpu.max: no such file or directory"
time="2023-01-31T08:46:26Z" level=warning msg="Unable to collect container.cpu metrics, open /sys/fs/cgroup/cpu.max: no such file or directory"
time="2023-01-31T08:46:41Z" level=warning msg="Unable to collect container.cpu metrics, open /sys/fs/cgroup/cpu.max: no such file or directory"
time="2023-01-31T08:46:56Z" level=warning msg="Unable to collect container.cpu metrics, open /sys/fs/cgroup/cpu.max: no such file or directory"
time="2023-01-31T08:47:11Z" level=warning msg="Unable to collect container.cpu metrics, open /sys/fs/cgroup/cpu.max: no such file or directory"
time="2023-01-31T08:47:26Z" level=warning msg="Unable to collect container.cpu metrics, open /sys/fs/cgroup/cpu.max: no such file or directory"
time="2023-01-31T08:47:41Z" level=warning msg="Unable to collect container.cpu metrics, open /sys/fs/cgroup/cpu.max: no such file or directory"
time="2023-01-31T08:47:56Z" level=warning msg="Unable to collect container.cpu metrics, open /sys/fs/cgroup/cpu.max: no such file or directory"
time="2023-01-31T08:48:11Z" level=warning msg="Unable to collect container.cpu metrics, open /sys/fs/cgroup/cpu.max: no such file or directory"
time="2023-01-31T08:48:26Z" level=warning msg="Unable to collect container.cpu metrics, open /sys/fs/cgroup/cpu.max: no such file or directory"
[..]

The /sys/fs/cgroup/cpu.max is definitely absent inside a container. Some details below:

host (Fedora 37):
$ ps waux | grep nginx-agent
builder   227608  0.0  0.0   6952  1992 ?        Ss   16:22   0:00 /usr/bin/conmon --api-version 1 -c a5dfd793fa73a7f3c6543e677f4540184473995e2aa80bc199a5cd824a657e1e -u a5dfd793fa73a7f3c6543e677f4540184473995e2aa80bc199a5cd824a657e1e -r /usr/bin/crun -b /run/user/9999/containers/storage/overlay-containers/a5dfd793fa73a7f3c6543e677f4540184473995e2aa80bc199a5cd824a657e1e/userdata -p /run/user/9999/containers/overlay-containers/a5dfd793fa73a7f3c6543e677f4540184473995e2aa80bc199a5cd824a657e1e/userdata/pidfile -n e2e-nginx-agent --exit-dir /run/user/9999/libpod/tmp/exits --full-attach -s -l journald --log-level warning --runtime-arg --log-format=json --runtime-arg --log --runtime-arg=/run/user/9999/containers/overlay-containers/a5dfd793fa73a7f3c6543e677f4540184473995e2aa80bc199a5cd824a657e1e/userdata/oci-log --conmon-pidfile /run/user/9999/containers/overlay-containers/a5dfd793fa73a7f3c6543e677f4540184473995e2aa80bc199a5cd824a657e1e/userdata/conmon.pid --exit-command /usr/bin/podman --exit-command-arg --root --exit-command-arg /run/user/9999/containers/storage --exit-command-arg --runroot --exit-command-arg /run/user/9999/containers --exit-command-arg --log-level --exit-command-arg warning --exit-command-arg --cgroup-manager --exit-command-arg systemd --exit-command-arg --tmpdir --exit-command-arg /run/user/9999/libpod/tmp --exit-command-arg --network-config-dir --exit-command-arg  --exit-command-arg --network-backend --exit-command-arg netavark --exit-command-arg --volumepath --exit-command-arg /run/user/9999/containers/storage/volumes --exit-command-arg --runtime --exit-command-arg crun --exit-command-arg --storage-driver --exit-command-arg overlay --exit-command-arg --storage-opt --exit-command-arg overlay.mount_program=/usr/bin/fuse-overlayfs --exit-command-arg --storage-opt --exit-command-arg overlay.mountopt=nodev,metacopy=on --exit-command-arg --events-backend --exit-command-arg journald --exit-command-arg container --exit-command-arg cleanup --exit-command-arg 
a5dfd793fa73a7f3c6543e677f4540184473995e2aa80bc199a5cd824a657e1e
builder   227614 23.5  0.0 731564 22388 ?        Sl   16:22   0:17 nginx-agent
builder   227709  0.1  0.1 2211424 39792 pts/0   Sl+  16:23   0:00 podman logs -f e2e-nginx-agent
builder   228032  0.0  0.0   6040  1780 pts/2    S+   16:24   0:00 grep --color=auto nginx-agent

$ cat /proc/227614/cgroup 
0::/user.slice/user-9999.slice/user@9999.service/user.slice/libpod-a5dfd793fa73a7f3c6543e677f4540184473995e2aa80bc199a5cd824a657e1e.scope/container

container (image created by "make image" from agent's repository):
root@nginx-agent-51c5a52f01de:/agent# ls -l /sys/fs/cgroup/
total 0
-r--r--r--. 1 root root 0 Jan 31 16:22 cgroup.controllers
-r--r--r--. 1 root root 0 Jan 31 16:22 cgroup.events
-rw-r--r--. 1 root root 0 Jan 31 16:22 cgroup.freeze
--w-------. 1 root root 0 Jan 31 16:22 cgroup.kill
-rw-r--r--. 1 root root 0 Jan 31 16:22 cgroup.max.depth
-rw-r--r--. 1 root root 0 Jan 31 16:22 cgroup.max.descendants
-rw-r--r--. 1 root root 0 Jan 31 16:22 cgroup.pressure
-rw-r--r--. 1 root root 0 Jan 31 16:22 cgroup.procs
-r--r--r--. 1 root root 0 Jan 31 16:22 cgroup.stat
-rw-r--r--. 1 root root 0 Jan 31 16:22 cgroup.subtree_control
-rw-r--r--. 1 root root 0 Jan 31 16:22 cgroup.threads
-rw-r--r--. 1 root root 0 Jan 31 16:22 cgroup.type
-rw-r--r--. 1 root root 0 Jan 31 16:22 cpu.pressure
-r--r--r--. 1 root root 0 Jan 31 16:22 cpu.stat
-rw-r--r--. 1 root root 0 Jan 31 16:22 io.pressure
-rw-r--r--. 1 root root 0 Jan 31 16:22 irq.pressure
-r--r--r--. 1 root root 0 Jan 31 16:22 memory.current
-r--r--r--. 1 root root 0 Jan 31 16:22 memory.events
-r--r--r--. 1 root root 0 Jan 31 16:22 memory.events.local
-rw-r--r--. 1 root root 0 Jan 31 16:22 memory.high
-rw-r--r--. 1 root root 0 Jan 31 16:22 memory.low
-rw-r--r--. 1 root root 0 Jan 31 16:22 memory.max
-rw-r--r--. 1 root root 0 Jan 31 16:22 memory.min
-r--r--r--. 1 root root 0 Jan 31 16:22 memory.numa_stat
-rw-r--r--. 1 root root 0 Jan 31 16:22 memory.oom.group
-r--r--r--. 1 root root 0 Jan 31 16:22 memory.peak
-rw-r--r--. 1 root root 0 Jan 31 16:22 memory.pressure
--w-------. 1 root root 0 Jan 31 16:22 memory.reclaim
-r--r--r--. 1 root root 0 Jan 31 16:22 memory.stat
-r--r--r--. 1 root root 0 Jan 31 16:22 memory.swap.current
-r--r--r--. 1 root root 0 Jan 31 16:22 memory.swap.events
-rw-r--r--. 1 root root 0 Jan 31 16:22 memory.swap.high
-rw-r--r--. 1 root root 0 Jan 31 16:22 memory.swap.max
-r--r--r--. 1 root root 0 Jan 31 16:22 memory.zswap.current
-rw-r--r--. 1 root root 0 Jan 31 16:22 memory.zswap.max
-r--r--r--. 1 root root 0 Jan 31 16:22 pids.current
-r--r--r--. 1 root root 0 Jan 31 16:22 pids.events
-rw-r--r--. 1 root root 0 Jan 31 16:22 pids.max
-r--r--r--. 1 root root 0 Jan 31 16:22 pids.peak

Everything is fine when nginx-agent is running under real root, either under Docker or privileged podman.

However, the linked issue is that the agent seems to fail to collect some other metrics because /sys/fs/cgroup/cpu.max is absent. In particular, we are not seeing any disk stats in this scenario, even though disk devices are 100% accessible to unprivileged users in a container. For comparison, the Amplify agent, which uses psutil to collect system metrics, is able to do this when running unprivileged.
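
One way to tolerate the missing file can be sketched as follows, under the assumption that an absent cpu.max should be read as "no CPU limit information" rather than as an error that affects other collectors. The function name readCPUMax is hypothetical; the two-field "quota period" layout is the cgroup v2 format of cpu.max.

```go
package main

import (
	"errors"
	"fmt"
	"io/fs"
	"os"
	"strings"
)

// readCPUMax (hypothetical helper) reads a cgroup v2 cpu.max file, whose
// content has the form "<quota> <period>", for example "max 100000".
// When the file does not exist, it reports ok=false with a nil error so
// that callers can carry on collecting unrelated metrics.
func readCPUMax(path string) (quota, period string, ok bool, err error) {
	data, err := os.ReadFile(path)
	if errors.Is(err, fs.ErrNotExist) {
		return "", "", false, nil
	}
	if err != nil {
		return "", "", false, err
	}
	fields := strings.Fields(string(data))
	if len(fields) != 2 {
		return "", "", false, fmt.Errorf("unexpected cpu.max content %q", string(data))
	}
	return fields[0], fields[1], true, nil
}

func main() {
	quota, period, ok, err := readCPUMax("/sys/fs/cgroup/cpu.max")
	switch {
	case err != nil:
		fmt.Println("cpu.max could not be read:", err)
	case !ok:
		fmt.Println("cpu.max is absent, skipping container CPU limit metrics")
	default:
		fmt.Println("quota:", quota, "period:", period)
	}
}
```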

agent can not handle syslog entries for error_log and access_log in nginx configuration

When either error_log or access_log are configured with syslog logging, agent treats syslog arguments as a file name and throws the error:

time="2023-01-27T05:45:46Z" level=warning msg="NGINX Access log syslog:server=127.0.0.1,nohostname,tag=nginx_access is not readable or is disabled. Please make it readable and enabled in order for NGINX metrics to be collected."
time="2023-01-27T05:45:46Z" level=warning msg="NGINX Error log syslog:server=127.0.0.1,nohostname,tag=nginx_error is not readable or is disabled. Please make it readable and enabled in order for NGINX metrics to be collected."
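
A sketch of how log destinations might be classified before the agent tries to open them as files is shown below. The type and function names are made up for the example. It covers only the syslog: prefix from this report and the off value that the nginx documentation defines for access_log; other special error_log destinations are not handled.

```go
package main

import (
	"fmt"
	"strings"
)

// logTarget describes where an access_log or error_log directive writes.
type logTarget int

const (
	targetFile     logTarget = iota // a regular file path
	targetSyslog                    // "syslog:server=..." destination
	targetDisabled                  // "access_log off;"
)

// classifyLogTarget (hypothetical helper) inspects the first argument of
// a log directive. "off" only disables logging for access_log; for
// error_log the same word would be treated as a file name.
func classifyLogTarget(directive, arg string) logTarget {
	switch {
	case directive == "access_log" && arg == "off":
		return targetDisabled
	case strings.HasPrefix(arg, "syslog:"):
		return targetSyslog
	default:
		return targetFile
	}
}

func main() {
	fmt.Println(classifyLogTarget("access_log", "syslog:server=127.0.0.1,nohostname,tag=nginx_access") == targetSyslog)
	fmt.Println(classifyLogTarget("access_log", "off") == targetDisabled)
	fmt.Println(classifyLogTarget("error_log", "/var/log/nginx/error.log") == targetFile)
}
```

Only targets classified as regular files would then be checked for readability and tailed for metrics.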

nginx-agent packages do not have logrotate configuration

nginx-agent Linux packages do not have any logrotate configuration bundled in, which makes it inconvenient to use in long-running environments:

root@xxx:/var/log/nginx-agent# head agent.log
time="2023-03-23T14:58:47Z" level=info msg="setting displayName to xxx"
time="2023-03-23T14:58:47Z" level=info msg="NGINX Agent v2.24.1 at 2c252a29 with pid 674405, clientID=a96bc6ad-2dc2-3734-b917-af370470bd1d name=xxx features=[features_registration features_nginx-config features_nginx-ssl-config features_nginx-counting features_nginx-config-async features_metrics features_metrics-throttle features_dataplane-status features_process-watcher features_file-watcher features_activity-events features_agent-api]"
time="2023-03-23T14:58:47Z" level=debug msg="Commander connecting to 127.0.0.1:9001"
time="2023-03-23T14:58:47Z" level=debug msg="Creating commander client"
time="2023-03-23T14:58:47Z" level=debug msg="Metric Reporter connecting to 127.0.0.1:9001"
time="2023-03-23T14:58:47Z" level=debug msg="Creating metric reporter client"
time="2023-03-23T14:58:47Z" level=debug msg="Commander receive loop starting"
time="2023-03-23T14:58:47Z" level=debug msg="Error getting default network interface, interface with default destination not found"
time="2023-03-23T14:58:47Z" level=debug msg="Reading CPU information for dataplane host"
time="2023-03-23T14:58:47Z" level=info msg="Attempting to run command: /usr/sbin/nginx with args -V"

root@xxx:/var/log/nginx-agent# tail agent.log
time="2023-05-09T05:47:28Z" level=info msg="ProcessWatcher is wrapping up"
time="2023-05-09T05:47:28Z" level=info msg="Extensions is wrapping up"
time="2023-05-09T05:47:28Z" level=info msg="Events is wrapping up"
time="2023-05-09T05:47:28Z" level=info msg="Commander received <nil>, rpc error: code = Canceled desc = context canceled"
time="2023-05-09T05:47:28Z" level=error msg="Commander Channel Recv: error communicating with 127.0.0.1:9001, code=Canceled, message=context canceled"
time="2023-05-09T05:47:28Z" level=info msg="Commander Channel Recv: retrying to connect to 127.0.0.1:9001"
time="2023-05-09T05:47:28Z" level=debug msg="Creating commander client"
time="2023-05-09T05:47:28Z" level=info msg="Agent API is wrapping up"
time="2023-05-09T05:47:28Z" level=warning msg="Unable to accept from NGINX counter socket"
time="2023-05-09T05:47:28Z" level=info msg="NGINX Counter is wrapping up"

root@xxx:/var/log/nginx-agent# du -hs agent.log
447M	agent.log

Adding a simple one (like e.g. https://github.com/nginxinc/nginx-amplify-agent/blob/master/etc/logrotate.d/amplify-agent) + explicit dependency on logrotate package would make things better here.

Agent segfaults intermittently when it receives a SIGTERM or SIGINT

On Linux, when I issue a TERM or INT signal to NGINX Agent, I frequently see that Agent has segfaulted instead of exiting gracefully. Upon error, the following is frequently outputted:

panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0x11a4b4c]

goroutine 81 [running]:
main.handleSignals.func1()
        /home/elijah/Development/Open_Source/nginx-agent/main.go:139 +0x24c
created by main.handleSignals
        /home/elijah/Development/Open_Source/nginx-agent/main.go:125 +0x251

As best as I can tell, this is caused by not checking if cmder is nil before main.go:133.
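
The kind of guard described above might look like the following sketch. The commander type and shutdown function are hypothetical stand-ins, since the actual types in main.go are not shown here.

```go
package main

import "fmt"

// commander is a stand-in for the client that the signal handler closes.
type commander struct{ closed bool }

// Close marks the commander as closed.
func (c *commander) Close() { c.closed = true }

// shutdown is what a signal handler could call on SIGTERM or SIGINT.
// It returns false without touching cmder when cmder was never created,
// which avoids a nil pointer dereference during early shutdown.
func shutdown(cmder *commander) bool {
	if cmder == nil {
		return false
	}
	cmder.Close()
	return true
}

func main() {
	fmt.Println(shutdown(nil))          // handler ran before the commander existed
	fmt.Println(shutdown(&commander{})) // normal shutdown path
}
```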

gogo protobuf is deprecated

I have insight into why it was included in the project, but it's now tech debt and isn't necessary.

A great presentation by the project creator on lessons learned: https://www.youtube.com/watch?v=HTIltI0NuNg

I've updated another project to no longer use it, and I would be willing to do the same here and submit a PR if there is interest in having the work done.

path to agent-dynamic.conf should be configurable

If the agent-dynamic.conf file does not exist in one of the default paths, such as /var/lib/nginx-agent/, the Agent will fail to start up, with the following error:

$ ./build/nginx-agent --config-dirs /opt/nginx-agent/
WARN[0000] Unable to read dynamic config (/var/lib/nginx-agent/agent-dynamic.conf), got the following error: stat /var/lib/nginx-agent/agent-dynamic.conf: no such file or directory 
INFO[0000] Writing the following file to disk: /var/lib/nginx-agent/agent-dynamic.conf 
FATA[0000] Unable to load properties from config files (/home/elijah/Development/Open_Source/nginx-agent/nginx-agent.conf, /var/lib/nginx-agent/agent-dynamic.conf) - error attempting to create directory for dynamic config (/var/lib/nginx-agent/), got the following error: mkdir /var/lib/nginx-agent/: permission denied 

Perhaps this is by design, but it is unclear if the path of agent-dynamic.conf is allowed to be configurable.
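
If the path were made configurable, the lookup could be as small as the sketch below. The environment variable name NGINX_AGENT_DYNAMIC_CONFIG_PATH and the function dynamicConfigPath are assumptions made for illustration; no such setting is confirmed to exist in the agent.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"strings"
)

// defaultDynamicConfigPath is the location reported in the error output.
const defaultDynamicConfigPath = "/var/lib/nginx-agent/agent-dynamic.conf"

// dynamicConfigPath (hypothetical helper) returns the override when one
// is given and falls back to the default location otherwise.
func dynamicConfigPath(override string) string {
	if strings.TrimSpace(override) != "" {
		return filepath.Clean(override)
	}
	return defaultDynamicConfigPath
}

func main() {
	// NGINX_AGENT_DYNAMIC_CONFIG_PATH is a made-up variable name used
	// only to show where an override could come from.
	fmt.Println(dynamicConfigPath(os.Getenv("NGINX_AGENT_DYNAMIC_CONFIG_PATH")))
}
```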

Agent populates Incorrect OS release information in System details

Agent populates release information in system details with incorrect information.

Actual:

"release": {
        "name": "debian", // (Mismatch)
        "version_id": "20.04",
        "version": "5.15.0-1028-aws", // (Mismatch)
        "codename": "linux", // (Mismatch)
        "id": "ubuntu"
      },

Expected:

"release": {
        "name": "Ubuntu",
        "version_id": "20.04",
        "version": "20.04.5 LTS (Focal Fossa)",
        "codename": "focal",
        "id": "ubuntu"
      },
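
The expected values match what a Linux distribution publishes in its os-release file, so one way to obtain them is sketched below. This assumes the fields map onto the NAME, VERSION_ID, VERSION, VERSION_CODENAME, and ID keys; the function names and the sample content are illustrative, not the agent's implementation.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// parseOSRelease reads KEY=value lines in os-release format, skipping
// blank lines and comments and stripping surrounding quotes.
func parseOSRelease(content string) map[string]string {
	fields := make(map[string]string)
	scanner := bufio.NewScanner(strings.NewReader(content))
	for scanner.Scan() {
		line := strings.TrimSpace(scanner.Text())
		if line == "" || strings.HasPrefix(line, "#") {
			continue
		}
		key, value, found := strings.Cut(line, "=")
		if !found {
			continue
		}
		fields[key] = strings.Trim(value, `"'`)
	}
	return fields
}

// releaseInfo maps os-release keys onto the field names used in the
// "release" object above (assumed mapping).
func releaseInfo(fields map[string]string) map[string]string {
	return map[string]string{
		"name":       fields["NAME"],
		"version_id": fields["VERSION_ID"],
		"version":    fields["VERSION"],
		"codename":   fields["VERSION_CODENAME"],
		"id":         fields["ID"],
	}
}

func main() {
	// Sample content written for this example.
	sample := "NAME=\"Ubuntu\"\nVERSION=\"20.04.5 LTS (Focal Fossa)\"\nID=ubuntu\nVERSION_ID=\"20.04\"\nVERSION_CODENAME=focal\n"
	fmt.Println(releaseInfo(parseOSRelease(sample)))
}
```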

agent can not handle "access_log off" in nginx configuration

If access_log off; is present in nginx configuration, agent treats the off part as a file name and throws the error:

time="2023-01-27T05:45:46Z" level=warning msg="NGINX Access log off is not readable or is disabled. Please make it readable and enabled in order for NGINX metrics to be collected."

off is a special value that disables the access log in the given block: https://nginx.org/en/docs/http/ngx_http_log_module.html#access_log
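
One possible way to handle this is to check for the special value before treating the first argument as a path. The Go sketch below assumes the directive arguments are already available as a string slice; accessLogTarget is a hypothetical helper, not a function from the Agent:

```go
package main

import "fmt"

// accessLogTarget interprets the arguments of an access_log directive.
// It returns the log path and whether logging is enabled.
// Hypothetical helper for illustration only.
func accessLogTarget(args []string) (path string, enabled bool) {
	if len(args) == 0 {
		return "", false
	}
	// "off" is a special value, not a file name.
	if args[0] == "off" {
		return "", false
	}
	return args[0], true
}

func main() {
	for _, args := range [][]string{
		{"off"},
		{"/var/log/nginx/access.log", "main"},
	} {
		path, enabled := accessLogTarget(args)
		fmt.Printf("args=%v path=%q enabled=%v\n", args, path, enabled)
	}
}
```

With a check like this, a disabled access log can be skipped silently instead of being reported as an unreadable file.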

Nginx agent restarts after new installation

When I check the status of the agent, it shows:

$ systemctl status nginx-agent
• nginx-agent.service - NGINX Agent
   Loaded: loaded (/etc/systemd/system/nginx-agent.service; enabled; vendor preset: disabled)
   Active: activating (auto-restart) (Result: exit-code) since Tue 2023-02-28 21:26:48 CST; 2s ago
     Docs: https://github.com/nginx/agent#readme
  Process: 10364 ExecStart=/usr/bin/nginx-agent (code=exited, status=2)
  Process: 10361 ExecStartPre=/bin/mkdir -p /var/log/nginx-agent (code=exited, status=0/SUCCESS)
  Process: 10359 ExecStartPre=/bin/mkdir -p /var/run/nginx-agent (code=exited, status=0/SUCCESS)
 Main PID: 10364 (code=exited, status=2)

The agent.log file only shows entries from when the agent sends data to Instance Manager:
time="2023-02-28T21:34:06-06:00" level=info msg="setting displayName to -----"
time="2023-02-28T21:34:06-06:00" level=info msg="Upload: Sending data chunk data 0 (messageId=83e8ff10-a4f0-4ba2-90da-a0add60139e5)"
time="2023-02-28T21:34:06-06:00" level=info msg="Upload: Sending data chunk data 1 (messageId=83e8ff10-a4f0-4ba2-90da-a0add60139e5)"
time="2023-02-28T21:34:06-06:00" level=info msg="Upload: Sending data chunk data 2 (messageId=83e8ff10-a4f0-4ba2-90da-a0add60139e5)"
time="2023-02-28T21:34:06-06:00" level=info msg="Upload: Sending data chunk data 3 (messageId=83e8ff10-a4f0-4ba2-90da-a0add60139e5)"
time="2023-02-28T21:34:06-06:00" level=info msg="Upload: Sending data chunk data 4 (messageId=83e8ff10-a4f0-4ba2-90da-a0add60139e5)"
time="2023-02-28T21:34:06-06:00" level=info msg="Upload: Sending data chunk data 5 (messageId=83e8ff10-a4f0-4ba2-90da-a0add60139e5)"
time="2023-02-28T21:34:06-06:00" level=info msg="Upload: Sending data chunk data 6 (messageId=83e8ff10-a4f0-4ba2-90da-a0add60139e5)"
time="2023-02-28T21:34:06-06:00" level=info msg="Upload: Sending data chunk data 7 (messageId=83e8ff10-a4f0-4ba2-90da-a0add60139e5)"
time="2023-02-28T21:34:06-06:00" level=info msg="Upload: Sending data chunk data 8 (messageId=83e8ff10-a4f0-4ba2-90da-a0add60139e5)"
time="2023-02-28T21:34:06-06:00" level=info msg="Upload: Sending data chunk data 9 (messageId=83e8ff10-a4f0-4ba2-90da-a0add60139e5)"
time="2023-02-28T21:34:06-06:00" level=info msg="Upload: Sending data chunk data 10 (messageId=83e8ff10-a4f0-4ba2-90da-a0add60139e5)"
time="2023-02-28T21:34:06-06:00" level=info msg="Upload: Sending data chunk data 11 (messageId=83e8ff10-a4f0-4ba2-90da-a0add60139e5)"
time="2023-02-28T21:34:06-06:00" level=info msg="Upload: Sending data chunk data 12 (messageId=83e8ff10-a4f0-4ba2-90da-a0add60139e5)"
time="2023-02-28T21:34:06-06:00" level=info msg="Upload: Sending data chunk data 13 (messageId=83e8ff10-a4f0-4ba2-90da-a0add60139e5)"
time="2023-02-28T21:34:06-06:00" level=info msg="Upload sending done 83e8ff10-a4f0-4ba2-90da-a0add60139e5 (chunks=14)"

Prometheus metric values do not match values from the NGINX Plus metrics API

I have NGINX Plus running in a Docker container, with nginx-agent installed:

root@0873e2df46e4:/# nginx -v
nginx version: nginx/1.23.2 (nginx-plus-r28)
root@0873e2df46e4:/# nginx-agent -v
nginx-agent version v2.22.1-a0f380fa

With the NGINX Plus metrics API, I can see a request count of 8 for a server zone:

root@0873e2df46e4:/# curl localhost:8080/api//6/http/server_zones/
{"status_page":{"processing":0,"requests":8,"responses":{"1xx":0,"2xx":8,"3xx":0,"4xx":0,"5xx":0,"total":8},"discarded":0,"received":4844,"sent":204308}}

But when accessing the nginx-agent metrics endpoint, I get a request count of 0:

root@0873e2df46e4:/# curl -s localhost:8081/metrics/ | grep status_page | grep request_count
plus_http_request_count{display_name="0873e2df46e4",hostname="0873e2df46e4",instance_group="",nginx_id="b636d4376dea15405589692d3c5d3869ff3a9b26b0e7bb4bb1aa7e658ace1437",server_zone="status_page",system_id="ad4e5cea-3309-39d4-ad48-4688716211e6",system_tags=""} 0

Below are the configuration files I'm using

# nginx-agent.conf
log:
  level: debug
  path: /var/log/nginx-agent/
nginx:
  exclude_logs: ""
  socket: "unix:/var/run/nginx-agent/nginx.sock"
dataplane:
  status:
    poll_interval: 30s
    report_interval: 24h
metrics:
  bulk_size: 20
  report_interval: 1m
  collection_interval: 15s
  mode: aggregated
config_dirs: "/etc/nginx:/usr/local/etc/nginx:/usr/share/nginx/modules:/etc/nms"
api:
  port: 8081

# nginx.conf
user  nginx;
worker_processes  auto;
error_log  /var/log/nginx/error.log notice;
pid    /var/run/nginx.pid;
events {
  worker_connections  1024;
}
http {
  log_format  main  '$remote_addr - $remote_user [$time_local] "$request" '
            '$status $body_bytes_sent "$http_referer" '
            '"$http_user_agent" "$http_x_forwarded_for"';
  access_log  /var/log/nginx/access.log  main;

  upstream httpbin {
    server httpbin.org;
  }

  server {
    status_zone status_page;
    listen     80 default_server;

    location / {
      proxy_pass http://httpbin/;
    }
  }

  server {
    listen     8080;

    location /api/ {
      api write=on;
      allow all;
    }
  }
}

Securing access to the agent API

Just a couple of quick questions. We have a number of load balancers that we frequently add customers to and remove customers from, so this looks great for us: we can just generate an NGINX conf and upload it via the agent tool!

Could you outline the steps to secure the Agent? Or is it simply a matter of installing the agent on an instance that already has SSH keys associated with it, cloning the repo, and pointing the config at my local SSH keys to connect?

I couldn't find this detailed anywhere in the docs, so I wanted to double-check that the agent is secure by default before we start testing it.

Bypass Nginx Cache

Hello,
I am hosting a web app with Nginx and serving static files from the nginx directory. This works and the site is served successfully.
When I build the React application for a new push and save the files in the nginx directory, the site shows a blank white page, presumably because of the cookies saved in the browser. When I press CTRL + F5 (which refreshes the page and clears its cached content), I can see the site.

  1. Is there a way I can remove browser cookies through nginx? If yes, how?
  2. Is there a way I can pass requests to the origin server every time? If yes, how?

If there is any other way, please let me know.
Thanks in advance

"include" directive is not allowed within an "if" block

The following valid nginx config is mistakenly marked as illegal by the Agent:

server {
    listen       80 default_server;
    server_name  localhost;

    location / {

        if ($request_method = 'OPTIONS') 
        {
          include conf.d/some_file;
          return 204;
        }

        root   /usr/share/nginx/html;
        index  index.html index.htm;
    }
}

The agent fails to write the configuration with:

time="2023-10-17T08:47:26Z" level=info msg="Updating NGINX config"
time="2023-10-17T08:47:26Z" level=info msg="Attempting to run command: /usr/sbin/nginx with args -t -c /etc/nginx/nginx.conf"
time="2023-10-17T08:47:26Z" level=info msg="Config validated:\nnginx: the configuration file /etc/nginx/nginx.conf syntax is ok\nnginx: configuration file /etc/nginx/nginx.conf test is successful\n"
time="2023-10-17T08:47:26Z" level=error msg="Config apply failed (write): error running nginx -t -c /etc/nginx/nginx.conf:\n error reading config from /etc/nginx/nginx.conf, error: \"include\" directive is not allowed here in /etc/nginx/conf.d/default.conf:11"

This is caused by a bug in nginx-go-crossplane (see nginxinc/nginx-go-crossplane#72). Once we have a new nginx-go-crossplane version, Agent will have to point to that one to solve this validation issue.

NGINX agent cannot find the default network interface on Linux hosts.

Description:
The NGINX agent is incorrectly reporting the loopback interface as the default destination for network traffic.
When the agent reads /proc/net/route it always skips over the first line of route data. This causes it to miss the default route in many cases and it will report the default interface as loopback.

Reproduce:
Run the agent on any Linux host with debug logging level. You will see the error message:
time="2023-05-18T05:25:01Z" level=debug msg="Error getting default network interface, interface with default destination not found"

Expected Result:
There is no error in the agent log about not being able to find the default destination and the agent reports the correct default interface.

Actual Result:
There is an error message about not finding the default destination and the agent reports 'lo' as the default interface.
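
The behaviour described above can be avoided by skipping only the header line of /proc/net/route and then checking every route entry. A minimal Go sketch, assuming the default route is the entry whose Destination column is 00000000 (defaultInterface and the sample route table are illustrative, not the Agent's code):

```go
package main

import (
	"bufio"
	"errors"
	"fmt"
	"strings"
)

// defaultInterface scans the contents of /proc/net/route and returns the
// interface of the first route whose destination is 0.0.0.0.
// Hypothetical helper for illustration only.
func defaultInterface(routeTable string) (string, error) {
	scanner := bufio.NewScanner(strings.NewReader(routeTable))
	// Skip only the header line ("Iface Destination Gateway ...").
	if !scanner.Scan() {
		return "", errors.New("route table is empty")
	}
	for scanner.Scan() {
		fields := strings.Fields(scanner.Text())
		if len(fields) < 2 {
			continue // malformed or blank line
		}
		// Column 0 is the interface, column 1 the destination in hex.
		if fields[1] == "00000000" {
			return fields[0], nil
		}
	}
	if err := scanner.Err(); err != nil {
		return "", err
	}
	return "", errors.New("interface with default destination not found")
}

func main() {
	// Made-up sample in the /proc/net/route layout; the default route is
	// deliberately the first data line, the case the issue describes.
	sample := "Iface\tDestination\tGateway\tFlags\tRefCnt\tUse\tMetric\tMask\tMTU\tWindow\tIRTT\n" +
		"eth0\t00000000\t0100000A\t0003\t0\t0\t0\t00000000\t0\t0\t0\n" +
		"eth0\t0000000A\t00000000\t0001\t0\t0\t0\t00FFFFFF\t0\t0\t0\n"
	iface, err := defaultInterface(sample)
	fmt.Println(iface, err)
}
```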

Update GetNetOverflow function to use nstat instead of netstat

nstat is installed on Linux by default, so we can use it instead of relying on netstat.

To get the network overflow, run the following command:

$ nstat -az | grep -i TcpExtListenOverflows

which will return the following output:

TcpExtListenOverflows           0                  0.0

The NGINX Agent should parse this output and return the value as a float64:

func GetNetOverflow() (float64, error) {

Ensure consistent output across Rocky Linux 9 (x86_64), AlmaLinux 8 (x86_64), Ubuntu 22, RHEL 8, SUSE 15, and FreeBSD 13 (with the output pasted in the PR).
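
The parsing step could look roughly like the Go sketch below. It assumes the nstat output has the three-column shape shown above (name, value, rate); parseListenOverflows is a hypothetical helper, and running the nstat command itself is left out so the parsing can be tested in isolation:

```go
package main

import (
	"errors"
	"fmt"
	"strconv"
	"strings"
)

// overflowCounter is the nstat counter name used in this issue.
const overflowCounter = "TcpExtListenOverflows"

// parseListenOverflows extracts the TcpExtListenOverflows counter from
// `nstat -az` output and returns it as a float64.
// Hypothetical helper for illustration only.
func parseListenOverflows(nstatOutput string) (float64, error) {
	for _, line := range strings.Split(nstatOutput, "\n") {
		fields := strings.Fields(line)
		// Expected shape: <name> <value> <rate>
		if len(fields) < 2 || fields[0] != overflowCounter {
			continue
		}
		return strconv.ParseFloat(fields[1], 64)
	}
	return 0, errors.New(overflowCounter + " not found in nstat output")
}

func main() {
	// Output line copied from the issue text.
	out := "TcpExtListenOverflows           0                  0.0\n"
	value, err := parseListenOverflows(out)
	fmt.Println(value, err)
}
```

Matching on the counter name rather than on a line position keeps the parser independent of any header or neighbouring counters in the output.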

Print a JSON representation of the NginxConfig proto message in debug logs

At present, the debug and trace log levels can print the NginxConfig protobuf message that is sent to the agent when a change is submitted:

log.Debugf("WriteConfig start %v", config)

However, the string representation of this object is produced by the proto.CompactTextString() function, which yields serialized output that is hard to parse and read (especially for binary message fields) and makes debugging hard.

The proposed enhancement is to print a JSON-formatted version of the message, which would facilitate the debugging process.
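
To illustrate the idea without pulling in the protobuf libraries, the Go sketch below uses the standard encoding/json package on a simplified stand-in struct. Both nginxConfigStandIn and debugJSON are hypothetical; a real change would more likely go through a protobuf-aware JSON marshaler so that field names and well-known types are rendered according to the proto definition:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// nginxConfigStandIn is a hypothetical, simplified stand-in for the
// NginxConfig message; the real message has different fields.
type nginxConfigStandIn struct {
	Action   string `json:"action"`
	Contents []byte `json:"contents"` // binary field, rendered as base64
}

// debugJSON returns an indented JSON rendering of v for log output and
// falls back to the default %v formatting if marshalling fails.
func debugJSON(v any) string {
	b, err := json.MarshalIndent(v, "", "  ")
	if err != nil {
		return fmt.Sprintf("%v", v)
	}
	return string(b)
}

func main() {
	cfg := nginxConfigStandIn{Action: "apply", Contents: []byte("hi")}
	fmt.Printf("WriteConfig start %v\n", cfg)            // current style
	fmt.Printf("WriteConfig start %s\n", debugJSON(cfg)) // proposed style
}
```

Since building the JSON string has a cost, it would be reasonable to do it only when the debug or trace level is actually enabled.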

Increase coverage in NGINX config unit tests

There are a few areas where the unit tests for parsing nginx configs with ssl directives and cert files could be improved:

  1. Should test ssl directives other than just ssl_certificate, since several of them are supposed to add aux files.

  2. Should provide a way to ensure the various cert metadata are being determined properly

    1. Currently, the tests use the same exact code as the production code to parse the Validity.NotBefore, Validity.NotAfter, SerialNumber, Fingerprint, SubjectKeyIdentifier, AuthorityKeyIdentifier, which means the values are not being compared against known expected values; they are being compared to themselves

    2. See here where these properties are simply being applied to the expected data before asserting equality:

      config_helpers_test.go

  3. Consider testing supported algorithms other than just RSA, such as DSA, ECDSA, EdDSA and/or ECIES
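
As a rough illustration of points 2 and 3, the Go sketch below generates a self-signed ECDSA certificate from values chosen up front and then checks the parsed certificate against those same fixed values, so the assertion no longer compares parser output to itself. All names here (knownCert, sampleKnown, newECDSACert, checkCert) are hypothetical and not taken from config_helpers_test.go:

```go
package main

import (
	"crypto/ecdsa"
	"crypto/elliptic"
	"crypto/rand"
	"crypto/x509"
	"crypto/x509/pkix"
	"fmt"
	"math/big"
	"time"
)

// knownCert holds values fixed by the test before any parsing happens.
type knownCert struct {
	Serial    *big.Int
	NotBefore time.Time
	NotAfter  time.Time
}

// sampleKnown returns arbitrary but fixed expectations.
func sampleKnown() knownCert {
	return knownCert{
		Serial:    big.NewInt(12345),
		NotBefore: time.Date(2023, 1, 1, 0, 0, 0, 0, time.UTC),
		NotAfter:  time.Date(2024, 1, 1, 0, 0, 0, 0, time.UTC),
	}
}

// newECDSACert creates a self-signed ECDSA (P-256) certificate from the
// known values and returns its DER encoding.
func newECDSACert(k knownCert) ([]byte, error) {
	key, err := ecdsa.GenerateKey(elliptic.P256(), rand.Reader)
	if err != nil {
		return nil, err
	}
	tmpl := &x509.Certificate{
		SerialNumber: k.Serial,
		Subject:      pkix.Name{CommonName: "example.test"},
		NotBefore:    k.NotBefore,
		NotAfter:     k.NotAfter,
	}
	return x509.CreateCertificate(rand.Reader, tmpl, tmpl, &key.PublicKey, key)
}

// checkCert parses the DER bytes and compares each field with the known value.
func checkCert(der []byte, k knownCert) error {
	cert, err := x509.ParseCertificate(der)
	if err != nil {
		return err
	}
	switch {
	case cert.SerialNumber.Cmp(k.Serial) != 0:
		return fmt.Errorf("serial: got %v, want %v", cert.SerialNumber, k.Serial)
	case !cert.NotBefore.Equal(k.NotBefore):
		return fmt.Errorf("not before: got %v, want %v", cert.NotBefore, k.NotBefore)
	case !cert.NotAfter.Equal(k.NotAfter):
		return fmt.Errorf("not after: got %v, want %v", cert.NotAfter, k.NotAfter)
	case cert.PublicKeyAlgorithm != x509.ECDSA:
		return fmt.Errorf("algorithm: got %v, want ECDSA", cert.PublicKeyAlgorithm)
	}
	return nil
}

func main() {
	k := sampleKnown()
	der, err := newECDSACert(k)
	if err != nil {
		fmt.Println("create:", err)
		return
	}
	fmt.Println("check:", checkCert(der, k))
}
```

In an actual unit test, checkCert would be replaced by assertions on the metadata produced by the production parsing code, with the certificate written to a temporary file referenced from the test nginx config.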
