
Comments (22)

tecoboot commented on July 1, 2024

I tested with a key lifetime of 36. This results in more frequent nl errors (issue #22) and faster memory growth.

As a workaround, I'll use a key lifetime of 3600000.

from authsae.

tecoboot commented on July 1, 2024

Tested with valgrind. Full output in attached file.
Errors:

==2526== Invalid read of size 4
==2526== at 0x4238574: nlmsg_hdr (in /lib/i386-linux-gnu/libnl-3.so.200.19.0)
==2526== by 0x804D795: peer_created (meshd-nl80211.c:1196)
==2526== by 0x80534CC: create_candidate (sae.c:1354)
==2526== by 0x8054DBC: process_mgmt_frame (sae.c:1917)
==2526== by 0x804BF92: new_candidate_handler (meshd-nl80211.c:519)
==2526== by 0x804C9F1: event_handler (meshd-nl80211.c:785)
==2526== by 0x423A6F8: nl_recvmsgs_report (in /lib/i386-linux-gnu/libnl-3.so.200.19.0)
==2526== by 0x423AAD2: nl_recvmsgs (in /lib/i386-linux-gnu/libnl-3.so.200.19.0)
==2526== by 0x423AB11: nl_recvmsgs_default (in /lib/i386-linux-gnu/libnl-3.so.200.19.0)
==2526== by 0x804B46D: srv_handler_wrapper (meshd-nl80211.c:248)
==2526== by 0x8056E32: srv_main_loop (service.c:460)
==2526== by 0x804E6AF: main (meshd-nl80211.c:1579)
==2526== Address 0x4496e74 is 44 bytes inside a block of size 56 free'd
==2526== at 0x402A3A8: free (in /usr/lib/valgrind/vgpreload_memcheck-x86-linux.so)
==2526== by 0x42386AA: nlmsg_free (in /lib/i386-linux-gnu/libnl-3.so.200.19.0)
==2526== by 0x804E84B: send_nlmsg (nlutils.c:77)
==2526== by 0x804D784: peer_created (meshd-nl80211.c:1194)
==2526== by 0x80534CC: create_candidate (sae.c:1354)
==2526== by 0x8054DBC: process_mgmt_frame (sae.c:1917)
==2526== by 0x804BF92: new_candidate_handler (meshd-nl80211.c:519)
==2526== by 0x804C9F1: event_handler (meshd-nl80211.c:785)
==2526== by 0x423A6F8: nl_recvmsgs_report (in /lib/i386-linux-gnu/libnl-3.so.200.19.0)
==2526== by 0x423AAD2: nl_recvmsgs (in /lib/i386-linux-gnu/libnl-3.so.200.19.0)
==2526== by 0x423AB11: nl_recvmsgs_default (in /lib/i386-linux-gnu/libnl-3.so.200.19.0)
==2526== by 0x804B46D: srv_handler_wrapper (meshd-nl80211.c:248)
==2526==

meshd-nl80211.valgrind.15jan2016.txt

bcopeland commented on July 1, 2024

On Tue, Jan 05, 2016 at 06:47:08AM -0800, Teco Boot wrote:

> Tested with valgrind. Full output in attached file.
> Errors:
>
> ==2526== Invalid read of size 4
> ==2526== at 0x4238574: nlmsg_hdr (in /lib/i386-linux-gnu/libnl-3.so.200.19.0)
> ==2526== by 0x804D795: peer_created (meshd-nl80211.c:1196)
> [...]
> ==2526== Address 0x4496e74 is 44 bytes inside a block of size 56 free'd

Thanks, that's because of this:

ret = send_nlmsg(nlcfg.nl_sock, msg);
sae_debug(MESHD_DEBUG, "new peer candidate (seq num=%d)\n",
          nlmsg_hdr(msg)->nlmsg_seq);

send_nlmsg already freed msg so we shouldn't access it again. The order
of that debug and send should be swapped.

Regarding the memory leak: have you checked whether we are removing old
keys from the kernel? I.e. could we be amassing lots of no longer
useful keys after rekey operations rather than leaking something in
authsae itself?

Bob Copeland %% http://bobcopeland.com/

fhuberts commented on July 1, 2024

Teco, you should run with the following extra valgrind options:

--track-origins=yes -v --leak-check=full --show-reachable=yes

tecoboot commented on July 1, 2024

Still having errors.
Attached file with verbose output.
valgrind-15jan2016-2.txt

==10885== Invalid read of size 4
==10885== at 0x4238574: nlmsg_hdr (in /lib/i386-linux-gnu/libnl-3.so.200.19.0)
==10885== by 0x804B5BA: tx_frame (meshd-nl80211.c:282)
==10885== by 0x804B64C: meshd_write_mgmt (meshd-nl80211.c:296)
==10885== by 0x8051E02: request_token (sae.c:949)
==10885== by 0x8055094: process_mgmt_frame (sae.c:1970)
==10885== by 0x804C96E: event_handler (meshd-nl80211.c:778)
==10885== by 0x423A6F8: nl_recvmsgs_report (in /lib/i386-linux-gnu/libnl-3.so.200.19.0)
==10885== by 0x423AAD2: nl_recvmsgs (in /lib/i386-linux-gnu/libnl-3.so.200.19.0)
==10885== by 0x423AB11: nl_recvmsgs_default (in /lib/i386-linux-gnu/libnl-3.so.200.19.0)
==10885== by 0x804B46D: srv_handler_wrapper (meshd-nl80211.c:248)
==10885== by 0x8056D73: srv_main_loop (service.c:460)
==10885== by 0x804E6C2: main (meshd-nl80211.c:1583)
==10885== Address 0x46ee014 is 44 bytes inside a block of size 56 free'd
==10885== at 0x402A3A8: free (in /usr/lib/valgrind/vgpreload_memcheck-x86-linux.so)
==10885== by 0x42386AA: nlmsg_free (in /lib/i386-linux-gnu/libnl-3.so.200.19.0)
==10885== by 0x804E85E: send_nlmsg (nlutils.c:77)
==10885== by 0x804B5A9: tx_frame (meshd-nl80211.c:280)
==10885== by 0x804B64C: meshd_write_mgmt (meshd-nl80211.c:296)
==10885== by 0x8051E02: request_token (sae.c:949)
==10885== by 0x8055094: process_mgmt_frame (sae.c:1970)
==10885== by 0x804C96E: event_handler (meshd-nl80211.c:778)
==10885== by 0x423A6F8: nl_recvmsgs_report (in /lib/i386-linux-gnu/libnl-3.so.200.19.0)
==10885== by 0x423AAD2: nl_recvmsgs (in /lib/i386-linux-gnu/libnl-3.so.200.19.0)
==10885== by 0x423AB11: nl_recvmsgs_default (in /lib/i386-linux-gnu/libnl-3.so.200.19.0)
==10885== by 0x804B46D: srv_handler_wrapper (meshd-nl80211.c:248)

bcopeland commented on July 1, 2024

On Tue, Jan 05, 2016 at 07:35:37AM -0800, Teco Boot wrote:

> Still having errors.
> Attached file with verbose output.
> valgrind-15jan2016-2.txt
>
> ==10885== Invalid read of size 4

Same thing here... and everywhere... ugh.

ret = send_nlmsg(nlcfg->nl_sock, msg);
sae_debug(MESHD_DEBUG, "tx frame (seq num=%d)\n",
        nlmsg_hdr(msg)->nlmsg_seq);

Bob Copeland %% http://bobcopeland.com/

fhuberts commented on July 1, 2024

The output now nicely shows the leaks as well (besides the invalid reads).

bcopeland commented on July 1, 2024

I'm not sure we can just reorder them -- I don't think nlmsg_seq is assigned until it is sent. So, I guess it would make sense to finally pull out the nlmsg_free() from inside send_nlmsg().
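
For illustration, a minimal sketch of that ownership change, assuming a stand-in message type rather than libnl's struct nl_msg. The names `fake_msg`, `fake_msg_alloc`, `fake_send` and `send_and_log` are hypothetical and not part of authsae or libnl. The send helper assigns the sequence number but no longer frees the message, so the caller can log it and then free it exactly once.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-in for a netlink message; NOT libnl's struct nl_msg. */
struct fake_msg {
    unsigned int seq;   /* sequence number, only assigned when sent */
};

static unsigned int next_seq = 1;

/* Allocate an unsent message; seq stays 0 until fake_send() runs. */
static struct fake_msg *fake_msg_alloc(void)
{
    return calloc(1, sizeof(struct fake_msg));
}

/* Send the message and assign its sequence number.
 * Unlike the send_nlmsg() discussed above, this does NOT free msg. */
static int fake_send(struct fake_msg *msg)
{
    if (msg == NULL)
        return -1;
    msg->seq = next_seq++;
    return 0;
}

/* The caller owns msg: send, log while it is still valid, then free.
 * Returns the logged sequence number, or 0 on failure. */
static unsigned int send_and_log(void)
{
    struct fake_msg *msg = fake_msg_alloc();
    unsigned int seq = 0;

    if (msg == NULL)
        return 0;
    if (fake_send(msg) == 0) {
        seq = msg->seq;   /* safe: msg has not been freed yet */
        printf("tx frame (seq num=%u)\n", seq);
    }
    free(msg);            /* single, explicit release by the owner */
    return seq;
}
```

Calling `send_and_log()` twice logs sequence numbers 1 and 2 without touching freed memory. The trade-off is that every caller now has to free the message itself.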

tecoboot commented on July 1, 2024

But this "Invalid read" reported by valgrind is another issue, right? It is not the memory leak.

fhuberts commented on July 1, 2024

Indeed, this is the result of reading memory that has already been freed.

fhuberts commented on July 1, 2024

@bcopeland For context: Teco and I work together

tecoboot commented on July 1, 2024

The number of entries under /sys/kernel/debug/ieee80211/phy0/keys/ does not increase over time. I see two older keys, perhaps the multicast and management frame keys. The other keys, with higher numbers, are refreshed.

root@R-221:# ll /sys/kernel/debug/ieee80211/phy0/keys/
total 0
drwxr-xr-x 2 root root 0 Jan 5 16:07 287
drwxr-xr-x 2 root root 0 Jan 5 16:07 288
drwxr-xr-x 2 root root 0 Jan 5 16:07 289
drwxr-xr-x 2 root root 0 Jan 5 16:07 290
drwxr-xr-x 2 root root 0 Jan 5 16:07 291
drwxr-xr-x 2 root root 0 Jan 5 16:07 292
drwxr-xr-x 2 root root 0 Jan 5 16:07 299
drwxr-xr-x 2 root root 0 Jan 5 16:07 300
drwxr-xr-x 2 root root 0 Jan 5 16:07 301
drwxr-xr-x 2 root root 0 Jan 5 16:07 302
drwxr-xr-x 2 root root 0 Jan 5 16:07 303
drwxr-xr-x 2 root root 0 Jan 5 16:07 304
drwxr-xr-x 2 root root 0 Jan 5 16:07 305
drwxr-xr-x 2 root root 0 Jan 5 16:07 306
drwxr-xr-x 2 root root 0 Jan 5 16:07 307
drwxr-xr-x 2 root root 0 Jan 5 16:07 308
drwxr-xr-x 2 root root 0 Jan 5 16:07 309
drwxr-xr-x 2 root root 0 Jan 5 16:07 310
drwxr-xr-x 2 root root 0 Jan 5 16:07 311
drwxr-xr-x 2 root root 0 Jan 5 16:07 312
drwxr-xr-x 2 root root 0 Jan 5 16:07 313
drwxr-xr-x 2 root root 0 Jan 5 16:07 314
drwxr-xr-x 2 root root 0 Jan 5 16:07 315
drwxr-xr-x 2 root root 0 Jan 5 16:07 316
drwxr-xr-x 2 root root 0 Jan 5 16:03 84
drwxr-xr-x 2 root root 0 Jan 5 16:03 85
root@R-221:#

bcopeland commented on July 1, 2024

On Tue, Jan 05, 2016 at 08:01:48AM -0800, Teco Boot wrote:

> But this "Invalid read" reported by valgrind is another issue, right? It is not the memory leak.

This is one of the leaks:

==14221== 9,280 bytes in 80 blocks are definitely lost in loss record 105 of 106
==14221== at 0x40291CC: malloc (in /usr/lib/valgrind/vgpreload_memcheck-x86-linux.so)
==14221== by 0x408BD19: ??? (in /usr/lib/i386-linux-gnu/i586/libcrypto.so.1.0.0)
==14221== by 0x408C30C: CRYPTO_malloc (in /usr/lib/i386-linux-gnu/i586/libcrypto.so.1.0.0)
==14221== by 0x4105EED: EVP_MD_CTX_copy_ex (in /usr/lib/i386-linux-gnu/i586/libcrypto.so.1.0.0)
==14221== by 0x40994C7: HMAC_Init_ex (in /usr/lib/i386-linux-gnu/i586/libcrypto.so.1.0.0)
==14221== by 0x4099783: HMAC_Init (in /usr/lib/i386-linux-gnu/i586/libcrypto.so.1.0.0)
==14221== by 0x80525D1: assign_group_to_peer (sae.c:1112)
==14221== by 0x8054D33: process_mgmt_frame (sae.c:1918)
==14221== by 0x804BF92: new_candidate_handler (meshd-nl80211.c:519)
==14221== by 0x804CA04: event_handler (meshd-nl80211.c:789)
==14221== by 0x423A6F8: nl_recvmsgs_report (in /lib/i386-linux-gnu/libnl-3.so.200.19.0)
==14221== by 0x423AAD2: nl_recvmsgs (in /lib/i386-linux-gnu/libnl-3.so.200.19.0)

Looks like (at least) anywhere H_Init is used, there is no corresponding
HMAC_CTX_cleanup() call. And a lot of this stuff has changed in recent
openssl versions as well.

Bob Copeland %% http://bobcopeland.com/
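
To illustrate the pattern described above, here is a minimal sketch that assumes a mock context type instead of the real OpenSSL HMAC context. `mock_ctx`, `mock_init`, `mock_cleanup`, `rekey_once` and `live_allocations` are hypothetical names used only for this example. The point is that a context which allocates during init must be paired with a cleanup on every path, otherwise each rekey leaves one buffer behind.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical context that, like an HMAC context, owns heap memory after init. */
struct mock_ctx {
    unsigned char *state;
};

static int live_allocations = 0;   /* buffers allocated but not yet released */

/* Allocate internal state and copy the key into it. Returns 0 on success. */
static int mock_init(struct mock_ctx *ctx, const unsigned char *key, size_t len)
{
    if (key == NULL || len == 0)
        return -1;
    ctx->state = malloc(len);
    if (ctx->state == NULL)
        return -1;
    memcpy(ctx->state, key, len);
    live_allocations++;
    return 0;
}

/* Release whatever mock_init() allocated; safe to call more than once. */
static void mock_cleanup(struct mock_ctx *ctx)
{
    if (ctx->state != NULL) {
        free(ctx->state);
        ctx->state = NULL;
        live_allocations--;
    }
}

/* One "rekey" round: init, use, cleanup. Returns 0 on success. */
static int rekey_once(const unsigned char *key, size_t len)
{
    struct mock_ctx ctx = { NULL };

    if (mock_init(&ctx, key, len) != 0)
        return -1;
    /* ... the context would be used to derive keys here ... */
    mock_cleanup(&ctx);   /* omitting this leaks one buffer per call */
    return 0;
}
```

With the cleanup in place, `live_allocations` returns to zero after any number of `rekey_once()` calls. Without it, the counter grows by one per rekey, which is the shape of growth seen with a short key lifetime.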

fhuberts commented on July 1, 2024

I'm working on a patch to fix the invalid accesses

fhuberts commented on July 1, 2024

I've submitted 2 PRs

tecoboot commented on July 1, 2024

I would like to rerun the valgrind test with the fixes merged: my cmd_19 fix and Ferry's fixes. I expect the cmd_18 issue and the key update leak to remain.

FYI, with lifetime 30 and a handful of peers it takes less than 24 hours to crash.

fhuberts commented on July 1, 2024

I whipped up a patch to add ctx cleanups where they were missing.
According to the manpage on my system (openssl 1.0.2e) for HMAC_* this is needed to release resources.

This patch is utterly untested by me, but please try it to see if it makes a difference.

0001-Fix-memory-leaks.patch.txt

tecoboot commented on July 1, 2024

Results before patching: memory use grows by about 6 MB/hour on a single node with lifetime=36.
I'll test the proposed patches.

Jan 7 19:00:02 AHR-175-173 meshd-log: root 22310 1.9 1.7 5988 4344 pts/0 S 18:45 0:16 /usr/local/bin/meshd-nl80211 -c /usr/local/etc/authsae.cfg
Jan 7 20:00:02 AHR-175-173 meshd-log: root 22310 2.0 4.0 11996 10352 ? S 18:45 1:30 /usr/local/bin/meshd-nl80211 -c /usr/local/etc/authsae.cfg
Jan 7 21:00:01 AHR-175-173 meshd-log: root 22310 2.0 6.4 18076 16428 ? S 18:45 2:42 /usr/local/bin/meshd-nl80211 -c /usr/local/etc/authsae.cfg
Jan 7 22:00:02 AHR-175-173 meshd-log: root 22310 2.0 8.8 24140 22492 ? S 18:45 3:54 /usr/local/bin/meshd-nl80211 -c /usr/local/etc/authsae.cfg
Jan 7 23:00:02 AHR-175-173 meshd-log: root 22310 1.9 11.1 29980 28336 ? S 18:45 5:05 /usr/local/bin/meshd-nl80211 -c /usr/local/etc/authsae.cfg
Jan 8 00:00:02 AHR-175-173 meshd-log: root 22310 1.9 13.4 35880 34228 ? S Jan07 6:16 /usr/local/bin/meshd-nl80211 -c /usr/local/etc/authsae.cfg
Jan 8 01:00:02 AHR-175-173 meshd-log: root 22310 1.9 15.8 41932 40288 ? S Jan07 7:29 /usr/local/bin/meshd-nl80211 -c /usr/local/etc/authsae.cfg
Jan 8 02:00:02 AHR-175-173 meshd-log: root 22310 1.9 18.1 47864 46228 ? R Jan07 8:40 /usr/local/bin/meshd-nl80211 -c /usr/local/etc/authsae.cfg
Jan 8 03:00:02 AHR-175-173 meshd-log: root 22310 1.9 20.5 53932 52288 ? S Jan07 9:52 /usr/local/bin/meshd-nl80211 -c /usr/local/etc/authsae.cfg
Jan 8 04:00:02 AHR-175-173 meshd-log: root 22310 1.9 22.8 59844 58200 ? S Jan07 11:03 /usr/local/bin/meshd-nl80211 -c /usr/local/etc/authsae.cfg
Jan 8 05:00:02 AHR-175-173 meshd-log: root 22310 1.9 25.2 65908 64260 ? S Jan07 12:16 /usr/local/bin/meshd-nl80211 -c /usr/local/etc/authsae.cfg
Jan 8 06:00:02 AHR-175-173 meshd-log: root 22310 1.9 27.5 71808 70112 ? S Jan07 13:27 /usr/local/bin/meshd-nl80211 -c /usr/local/etc/authsae.cfg
Jan 8 07:00:02 AHR-175-173 meshd-log: root 22310 1.9 29.7 77840 75728 ? S Jan07 14:39 /usr/local/bin/meshd-nl80211 -c /usr/local/etc/authsae.cfg
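
As a rough check of that figure, the sketch below computes the average growth from the RSS column of the hourly samples above, assuming the values are in KiB as `ps` usually reports them. The function name `growth_kib_per_hour` and the array `rss_kib` are hypothetical and exist only for this example.

```c
#include <stdio.h>

/* RSS values (KiB) copied from the hourly log, 19:00 through 07:00. */
static const long rss_kib[] = {
    4344, 10352, 16428, 22492, 28336, 34228, 40288,
    46228, 52288, 58200, 64260, 70112, 75728
};

/* Average growth in KiB per hour between the first and last sample. */
static double growth_kib_per_hour(const long *kib, int n, double hours)
{
    if (n < 2 || hours <= 0.0)
        return 0.0;
    return (double)(kib[n - 1] - kib[0]) / hours;
}

/* Print the rate for the samples above (13 samples spanning 12 hours). */
static void print_growth(void)
{
    int n = (int)(sizeof rss_kib / sizeof rss_kib[0]);
    printf("%.1f KiB/hour\n", growth_kib_per_hour(rss_kib, n, 12.0));
}
```

The difference between the last and first sample is 71384 KiB over 12 hours, or a little under 5950 KiB per hour, which is consistent with the estimate of about 6 MB/hour.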

tecoboot commented on July 1, 2024

For the record, valgrind output (before recent patches).
valgrind-08jan2016-1.txt

tecoboot commented on July 1, 2024

Results after patching. It looks much better.
==29972== LEAK SUMMARY:
==29972== definitely lost: 0 bytes in 0 blocks
==29972== indirectly lost: 0 bytes in 0 blocks
==29972== possibly lost: 4,048 bytes in 201 blocks
==29972== still reachable: 39,036 bytes in 244 blocks
==29972== suppressed: 0 bytes in 0 blocks
==29972==
==29972== ERROR SUMMARY: 3 errors from 3 contexts (suppressed: 0 from 0)
==29972== ERROR SUMMARY: 3 errors from 3 contexts (suppressed: 0 from 0)

valgrind-08jan2016-2.txt

fhuberts commented on July 1, 2024

My patch is now PR #29.

tecoboot commented on July 1, 2024

I've had a handful of nodes up and running for a week or so, in stress-test mode (lifetime=36).
No memory leak, or at least none caused by key updates.

USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
root 25292 0.2 0.0 4436 2172 ? S Jan08 28:00 /usr/local/bin/meshd-nl80211 -c /usr/local/etc/authsae.cfg
