Comments (14)

justinfoote commented on July 2, 2024

Interesting!
Can you give me some more details about the test that is failing?

And do you see an error message in your newrelic_agent.log that starts like this:

[2019-10-02 11:29:41 -0700 {hostname} (91275)] ERROR : Exception during Tracer...

from newrelic-ruby-agent.

justinfoote commented on July 2, 2024

Hey @cyrilchampier, if this is still an issue, please let me know here. Otherwise, I'll close this issue tomorrow.

cyrilchampier commented on July 2, 2024

Hi, sorry, I was on holiday.
I do not remember how to reproduce this, but yes, I see this kind of log output:

[2019-10-31 14:06:48 +0100 dcyrils-macbook-pro.home (12006)] INFO : Installing notification based Action View instrumentation
[2019-10-31 14:06:48 +0100 dcyrils-macbook-pro.home (12006)] INFO : Installing notifications based Action Cable instrumentation
[2019-10-31 14:06:48 +0100 dcyrils-macbook-pro.home (12006)] INFO : Finished instrumentation
[2019-10-31 14:06:49 +0100 dcyrils-macbook-pro.home (12006)] INFO : Doing deferred dependency-detection before Rack startup
[2022-02-10 17:04:41 +0100 dcyrils-macbook-pro.home (12006)] WARN : Agent received a ForceDisconnectException from the server, disconnecting. (410: Gone)
[2022-02-10 17:05:20 +0100 dcyrils-macbook-pro.home (12006)] ERROR : Exception during Tracer.start_external_request_segment
[2022-02-10 17:05:20 +0100 dcyrils-macbook-pro.home (12006)] ERROR : RuntimeError: failed to get urandom
[2019-10-31 14:08:34 +0100 dcyrils-macbook-pro.home (12231)] INFO : newrelic_developer_mode does not have a corresponding configuration setting (developer_mode does not exist).
[2019-10-31 14:08:34 +0100 dcyrils-macbook-pro.home (12231)] INFO : Run `rake newrelic:config:docs` or visit https://newrelic.com/docs/ruby/ruby-agent-configuration to see a list of available configuration settings.
[2019-10-31 14:08:34 +0100 dcyrils-macbook-pro.home (12231)] INFO : Starting the New Relic agent version 6.7.0.359 in "test" environment.
[2019-10-31 14:08:34 +0100 dcyrils-macbook-pro.home (12231)] INFO : To prevent agent startup add a NEW_RELIC_AGENT_ENABLED=false environment variable or modify the "test" section of your newrelic.yml.
[2019-10-31 14:08:34 +0100 dcyrils-macbook-pro.home (12231)] INFO : Reading configuration from config/newrelic.yml (/Users/dcyril/src/doctolib)
[2019-10-31 14:08:34 +0100 dcyrils-macbook-pro.home (12231)] INFO : Environment: test

Please note the dates:
during this Capybara test, I use Timecop to change the current date.
Could that have an impact?

justinfoote commented on July 2, 2024

This line makes me think that the problem is in the availability of /dev/urandom:

[2022-02-10 17:05:20 +0100 dcyrils-macbook-pro.home (12006)] ERROR : RuntimeError: failed to get urandom

Looks like there's a bug in Ruby >= 2.5.1 that causes this error when using SecureRandom with high concurrency: https://bugs.ruby-lang.org/issues/14716

I can see the code path that causes the agent issue you're seeing, and I can try to put together a fix for it. I think the fix will simply expose the underlying error.

cyrilchampier commented on July 2, 2024

Maybe the fix would be to log a warning instead of raising?
From a user's point of view, I would like New Relic to be completely invisible.
I would like it to be unable to modify my app's (or test's) behaviour under any circumstances.
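
The "warn instead of raise" idea can be sketched as a small wrapper. This is only an illustration: `SafeInstrumentation` and `guard` are hypothetical names, not part of the New Relic agent, and the agent's real error handling may differ.

```ruby
require 'logger'

# Hypothetical helper for illustration; not part of the New Relic agent API.
module SafeInstrumentation
  LOGGER = Logger.new($stderr)

  # Runs the given instrumentation block. Any StandardError is logged as a
  # warning and swallowed, so the caller sees nil instead of an exception.
  def self.guard(label)
    yield
  rescue StandardError => e
    LOGGER.warn("#{label} failed: #{e.class}: #{e.message}")
    nil
  end
end

# Simulate the failure from the log above; the error is logged, not raised.
result = SafeInstrumentation.guard('segment start') do
  raise 'failed to get urandom'
end
```

With this shape the caller has to cope with a `nil` result, which is the complication raised in the next comment.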

justinfoote commented on July 2, 2024

I think the fix is likely a little more complicated than that. The problem occurs when we generate a random id for a segment. And it only occurs very, very rarely. (Which is why you've only seen it once, and we've never received a report of the problem before).

When this exception occurs, the segment initialize method returns nil. We could check whether the segment is nil in this case, but that leaves many other places in our code that assume that a segment will never be nil immediately after initialization.

I've taken some notes of the problem, and I'm going to discuss with the other engineers on the team before we decide the appropriate way to handle it. We'll update here when we have a plan.

Thanks again for bringing the problem to our attention!

cyrilchampier commented on July 2, 2024

thanks!

cyrilchampier commented on July 2, 2024

maybe this log can help you:

Errno::EMFILE: Failed to open TCP connection to 127.0.0.1:9517 (Too many open files - socket(2) for "127.0.0.1" port 9517)
from /Users/dcyril/.rbenv/versions/2.6.5/lib/ruby/2.6.0/net/http.rb:949:in `rescue in block in connect'
Caused by Errno::EMFILE: Too many open files - socket(2) for "127.0.0.1" port 9517
from /Users/dcyril/.rbenv/versions/2.6.5/lib/ruby/gems/2.6.0/gems/socksify-1.7.1/lib/socksify.rb:178:in `initialize'
[2] pry(#<ZipperIntegration::WithZipperExtension::ZipperMessagesListViewRevampedTest::zipper agenda mapping>)> Unexpected error while processing request: failed to get urandom
	/Users/dcyril/.rbenv/versions/2.6.5/lib/ruby/2.6.0/securerandom.rb:106:in `urandom'
	/Users/dcyril/.rbenv/versions/2.6.5/lib/ruby/2.6.0/securerandom.rb:106:in `gen_random_urandom'
	/Users/dcyril/.rbenv/versions/2.6.5/lib/ruby/2.6.0/securerandom.rb:138:in `random_bytes'
	/Users/dcyril/.rbenv/versions/2.6.5/lib/ruby/2.6.0/securerandom.rb:236:in `uuid'
	/Users/dcyril/.rbenv/versions/2.6.5/lib/ruby/gems/2.6.0/gems/actionpack-5.2.3/lib/action_dispatch/middleware/request_id.rb:40:in `internal_request_id'
	/Users/dcyril/.rbenv/versions/2.6.5/lib/ruby/gems/2.6.0/gems/actionpack-5.2.3/lib/action_dispatch/middleware/request_id.rb:35:in `make_request_id'
	/Users/dcyril/.rbenv/versions/2.6.5/lib/ruby/gems/2.6.0/gems/actionpack-5.2.3/lib/action_dispatch/middleware/request_id.rb:26:in `call'
	/Users/dcyril/.rbenv/versions/2.6.5/lib/ruby/gems/2.6.0/gems/newrelic_rpm-6.7.0.359/lib/new_relic/agent/instrumentation/middleware_tracing.rb:99:in `call'
	/Users/dcyril/.rbenv/versions/2.6.5/lib/ruby/gems/2.6.0/gems/rack-2.0.7/lib/rack/method_override.rb:22:in `call'
	/Users/dcyril/.rbenv/versions/2.6.5/lib/ruby/gems/2.6.0/gems/newrelic_rpm-6.7.0.359/lib/new_relic/agent/instrumentation/middleware_tracing.rb:99:in `call'
	/Users/dcyril/.rbenv/versions/2.6.5/lib/ruby/gems/2.6.0/gems/rack-2.0.7/lib/rack/runtime.rb:22:in `call'
	/Users/dcyril/.rbenv/versions/2.6.5/lib/ruby/gems/2.6.0/gems/newrelic_rpm-6.7.0.359/lib/new_relic/agent/instrumentation/middleware_tracing.rb:99:in `call'
	/Users/dcyril/.rbenv/versions/2.6.5/lib/ruby/gems/2.6.0/gems/activesupport-5.2.3/lib/active_support/cache/strategy/local_cache_middleware.rb:29:in `call'
	/Users/dcyril/.rbenv/versions/2.6.5/lib/ruby/gems/2.6.0/gems/newrelic_rpm-6.7.0.359/lib/new_relic/agent/instrumentation/middleware_tracing.rb:99:in `call'
	/Users/dcyril/.rbenv/versions/2.6.5/lib/ruby/gems/2.6.0/gems/actionpack-5.2.3/lib/action_dispatch/middleware/executor.rb:14:in `call'
	/Users/dcyril/.rbenv/versions/2.6.5/lib/ruby/gems/2.6.0/gems/newrelic_rpm-6.7.0.359/lib/new_relic/agent/instrumentation/middleware_tracing.rb:99:in `call'
	/Users/dcyril/.rbenv/versions/2.6.5/lib/ruby/gems/2.6.0/gems/actionpack-5.2.3/lib/action_dispatch/middleware/static.rb:127:in `call'
	/Users/dcyril/.rbenv/versions/2.6.5/lib/ruby/gems/2.6.0/gems/newrelic_rpm-6.7.0.359/lib/new_relic/agent/instrumentation/middleware_tracing.rb:99:in `call'
	/Users/dcyril/.rbenv/versions/2.6.5/lib/ruby/gems/2.6.0/gems/rack-2.0.7/lib/rack/sendfile.rb:111:in `call'
	/Users/dcyril/.rbenv/versions/2.6.5/lib/ruby/gems/2.6.0/gems/newrelic_rpm-6.7.0.359/lib/new_relic/agent/instrumentation/middleware_tracing.rb:99:in `call'
	/Users/dcyril/.rbenv/versions/2.6.5/lib/ruby/gems/2.6.0/gems/rack-cors-1.0.6/lib/rack/cors.rb:98:in `call'
	/Users/dcyril/.rbenv/versions/2.6.5/lib/ruby/gems/2.6.0/gems/newrelic_rpm-6.7.0.359/lib/new_relic/agent/instrumentation/middleware_tracing.rb:99:in `call'

justinfoote commented on July 2, 2024

This is interesting. Does this happen often for you? Does it always follow an error like Too many open files?

cyrilchampier commented on July 2, 2024

I would say once a week for 2-3 months now. But once it's stuck, my only option is to comment out these 2 lines:
https://github.com/newrelic/rpm/blob/c9a467d22e6d4b893be33be330d258a125cf813a/lib/new_relic/agent/instrumentation/net.rb#L46-L47

And I only recently noticed this "Too many open files" error, so I do not know whether it always comes first.

justinfoote commented on July 2, 2024

OK, I have a repro. The issue at play here is that your Ruby interpreter has used up all of its allotted file descriptors, which means it is unable to read from /dev/urandom. This can be reproduced with the following script:

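require 'securerandom'

# Keep opening descriptors (never closing them) until the process runs out,
# asking SecureRandom for bytes after each one.
file_descriptors = []
begin
  loop do
    file_descriptors << IO.sysopen(__FILE__)
    SecureRandom.hex(8)
  end
ensure
  # Report how many descriptors were open when an exception ended the loop.
  puts "failed at #{file_descriptors.length}"
end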
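
To see how close a process is to that limit without exhausting it, something like the following may help. It is a rough sketch: it assumes `/dev/fd` is available (as on macOS and Linux), and the count is approximate because listing the directory briefly uses a descriptor itself.

```ruby
# Soft and hard limits on open file descriptors for this process.
soft_limit, hard_limit = Process.getrlimit(Process::RLIMIT_NOFILE)

# Approximate number of descriptors currently open. Relying on /dev/fd is an
# assumption about the platform; nil is reported when it is not present.
open_count = Dir.exist?('/dev/fd') ? Dir.children('/dev/fd').length : nil

puts "soft limit: #{soft_limit}, hard limit: #{hard_limit}, open now: #{open_count.inspect}"
```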

The exception surfaces because we're using SecureRandom, which passes through to Random.urandom, which looks like this (note the call to rb_raise):

static VALUE
random_raw_seed(VALUE self, VALUE size)
{
    long n = NUM2ULONG(size);
    VALUE buf = rb_str_new(0, n);
    if (n == 0) return buf;
    /* A nonzero result means the random source could not be read. */
    if (fill_random_bytes(RSTRING_PTR(buf), n, FALSE))
        rb_raise(rb_eRuntimeError, "failed to get urandom");
    return buf;
}

We don't actually need this random number to be secure, though. In fact, we're not using SecureRandom to generate the guid of a Transaction. See: https://github.com/newrelic/rpm/blob/master/lib/new_relic/agent/transaction.rb#L913

This method uses Random.rand, which does not rely on urandom. I've used my repro to verify that Random.rand does not raise an exception when it runs out of file descriptors.

We'll refactor so that our Segment constructor can use the same generate_guid as the Transaction.
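
For illustration, a GUID generator built on Random.rand rather than SecureRandom might look roughly like this. It is a sketch, not the agent's actual generate_guid: the method name `generate_guid_sketch` and the 16-hex-digit default length are assumptions.

```ruby
# Hypothetical sketch; not copied from the agent source.
# Builds a hex id from Ruby's built-in PRNG via Random.rand, which, per the
# explanation above, does not rely on urandom. The result is not
# cryptographically secure, which is acceptable for a tracing identifier.
def generate_guid_sketch(length = 16)
  Array.new(length) { Random.rand(16).to_s(16) }.join
end

puts generate_guid_sketch
```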

Thanks for bringing this to our attention!

cyrilchampier commented on July 2, 2024

thanks for finding the reproduction!

rachelleahklein commented on July 2, 2024

Hi @cyrilchampier, just wanted to let you know this issue has been fixed in the recently released version 6.8.0 of the agent. Thank you again!

cyrilchampier commented on July 2, 2024

Thanks a lot!
