Comments (14)
Interesting!
Can you give me some more details about the test that is failing?
And do you see an error message in your newrelic_agent.log that starts like this:
[2019-10-02 11:29:41 -0700 {hostname} (91275)] ERROR : Exception during Tracer...
from newrelic-ruby-agent.
Hey @cyrilchampier, if this is still an issue, please let me know here. Otherwise, I'll close this issue tomorrow.
from newrelic-ruby-agent.
Hi, sorry, I was on holiday.
I do not remember how to reproduce this, but yes, I do see this kind of log:
[2019-10-31 14:06:48 +0100 dcyrils-macbook-pro.home (12006)] INFO : Installing notification based Action View instrumentation
[2019-10-31 14:06:48 +0100 dcyrils-macbook-pro.home (12006)] INFO : Installing notifications based Action Cable instrumentation
[2019-10-31 14:06:48 +0100 dcyrils-macbook-pro.home (12006)] INFO : Finished instrumentation
[2019-10-31 14:06:49 +0100 dcyrils-macbook-pro.home (12006)] INFO : Doing deferred dependency-detection before Rack startup
[2022-02-10 17:04:41 +0100 dcyrils-macbook-pro.home (12006)] WARN : Agent received a ForceDisconnectException from the server, disconnecting. (410: Gone)
[2022-02-10 17:05:20 +0100 dcyrils-macbook-pro.home (12006)] ERROR : Exception during Tracer.start_external_request_segment
[2022-02-10 17:05:20 +0100 dcyrils-macbook-pro.home (12006)] ERROR : RuntimeError: failed to get urandom
[2019-10-31 14:08:34 +0100 dcyrils-macbook-pro.home (12231)] INFO : newrelic_developer_mode does not have a corresponding configuration setting (developer_mode does not exist).
[2019-10-31 14:08:34 +0100 dcyrils-macbook-pro.home (12231)] INFO : Run `rake newrelic:config:docs` or visit https://newrelic.com/docs/ruby/ruby-agent-configuration to see a list of available configuration settings.
[2019-10-31 14:08:34 +0100 dcyrils-macbook-pro.home (12231)] INFO : Starting the New Relic agent version 6.7.0.359 in "test" environment.
[2019-10-31 14:08:34 +0100 dcyrils-macbook-pro.home (12231)] INFO : To prevent agent startup add a NEW_RELIC_AGENT_ENABLED=false environment variable or modify the "test" section of your newrelic.yml.
[2019-10-31 14:08:34 +0100 dcyrils-macbook-pro.home (12231)] INFO : Reading configuration from config/newrelic.yml (/Users/dcyril/src/doctolib)
[2019-10-31 14:08:34 +0100 dcyrils-macbook-pro.home (12231)] INFO : Environment: test
Please note the dates: during this Capybara test, I use Timecop to change the current date.
Could that have an impact?
from newrelic-ruby-agent.
This line makes me think that the problem is in the availability of /dev/urandom:
[2022-02-10 17:05:20 +0100 dcyrils-macbook-pro.home (12006)] ERROR : RuntimeError: failed to get urandom
Looks like there's a bug in Ruby >= 2.5.1 that causes this error when using SecureRandom with high concurrency: https://bugs.ruby-lang.org/issues/14716
I can see the code path that causes the agent issue you're seeing, and I can try to put together a fix for it. I think the fix will simply expose the underlying error.
from newrelic-ruby-agent.
Maybe the fix would be to warn instead of throw?
From a user's point of view, I would like New Relic to be completely invisible.
It should not be able to modify my app's (or my tests') behaviour in any case.
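The "warn instead of throw" idea could look roughly like the sketch below. It is only an illustration: `SafeInstrumentation` and `guard` are hypothetical names, not part of the New Relic agent, and the real fix may need to be different.

```ruby
require 'logger'

# Hypothetical helper, NOT part of the New Relic agent API.
module SafeInstrumentation
  LOGGER = Logger.new($stderr)

  # Runs an instrumentation step. If the step raises, the error is logged
  # as a warning and nil is returned instead of propagating to the app.
  def self.guard(label)
    yield
  rescue StandardError => e
    LOGGER.warn("#{label} failed: #{e.class}: #{e.message}")
    nil
  end
end

# Simulate the failure seen in the log above (assumed to be a RuntimeError).
segment = SafeInstrumentation.guard('start_external_request_segment') do
  raise 'failed to get urandom'
end

# The caller still has to cope with a nil segment.
puts "segment is nil: #{segment.nil?}"
```

Note that this only moves the problem: every caller then has to tolerate a `nil` result.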
from newrelic-ruby-agent.
I think the fix is likely a little more complicated than that. The problem occurs when we generate a random id for a segment. And it only occurs very, very rarely. (Which is why you've only seen it once, and we've never received a report of the problem before).
When this exception occurs, the segment initialize method returns nil. We could check whether the segment is nil in this case, but that leaves many other places in our code that assume that a segment will never be nil immediately after initialization.
I've taken some notes of the problem, and I'm going to discuss with the other engineers on the team before we decide the appropriate way to handle it. We'll update here when we have a plan.
Thanks again for bringing the problem to our attention!
from newrelic-ruby-agent.
thanks!
from newrelic-ruby-agent.
maybe this log can help you:
Errno::EMFILE: Failed to open TCP connection to 127.0.0.1:9517 (Too many open files - socket(2) for "127.0.0.1" port 9517)
from /Users/dcyril/.rbenv/versions/2.6.5/lib/ruby/2.6.0/net/http.rb:949:in `rescue in block in connect'
Caused by Errno::EMFILE: Too many open files - socket(2) for "127.0.0.1" port 9517
from /Users/dcyril/.rbenv/versions/2.6.5/lib/ruby/gems/2.6.0/gems/socksify-1.7.1/lib/socksify.rb:178:in `initialize'
[2] pry(#<ZipperIntegration::WithZipperExtension::ZipperMessagesListViewRevampedTest::zipper agenda mapping>)> Unexpected error while processing request: failed to get urandom
/Users/dcyril/.rbenv/versions/2.6.5/lib/ruby/2.6.0/securerandom.rb:106:in `urandom'
/Users/dcyril/.rbenv/versions/2.6.5/lib/ruby/2.6.0/securerandom.rb:106:in `gen_random_urandom'
/Users/dcyril/.rbenv/versions/2.6.5/lib/ruby/2.6.0/securerandom.rb:138:in `random_bytes'
/Users/dcyril/.rbenv/versions/2.6.5/lib/ruby/2.6.0/securerandom.rb:236:in `uuid'
/Users/dcyril/.rbenv/versions/2.6.5/lib/ruby/gems/2.6.0/gems/actionpack-5.2.3/lib/action_dispatch/middleware/request_id.rb:40:in `internal_request_id'
/Users/dcyril/.rbenv/versions/2.6.5/lib/ruby/gems/2.6.0/gems/actionpack-5.2.3/lib/action_dispatch/middleware/request_id.rb:35:in `make_request_id'
/Users/dcyril/.rbenv/versions/2.6.5/lib/ruby/gems/2.6.0/gems/actionpack-5.2.3/lib/action_dispatch/middleware/request_id.rb:26:in `call'
/Users/dcyril/.rbenv/versions/2.6.5/lib/ruby/gems/2.6.0/gems/newrelic_rpm-6.7.0.359/lib/new_relic/agent/instrumentation/middleware_tracing.rb:99:in `call'
/Users/dcyril/.rbenv/versions/2.6.5/lib/ruby/gems/2.6.0/gems/rack-2.0.7/lib/rack/method_override.rb:22:in `call'
/Users/dcyril/.rbenv/versions/2.6.5/lib/ruby/gems/2.6.0/gems/newrelic_rpm-6.7.0.359/lib/new_relic/agent/instrumentation/middleware_tracing.rb:99:in `call'
/Users/dcyril/.rbenv/versions/2.6.5/lib/ruby/gems/2.6.0/gems/rack-2.0.7/lib/rack/runtime.rb:22:in `call'
/Users/dcyril/.rbenv/versions/2.6.5/lib/ruby/gems/2.6.0/gems/newrelic_rpm-6.7.0.359/lib/new_relic/agent/instrumentation/middleware_tracing.rb:99:in `call'
/Users/dcyril/.rbenv/versions/2.6.5/lib/ruby/gems/2.6.0/gems/activesupport-5.2.3/lib/active_support/cache/strategy/local_cache_middleware.rb:29:in `call'
/Users/dcyril/.rbenv/versions/2.6.5/lib/ruby/gems/2.6.0/gems/newrelic_rpm-6.7.0.359/lib/new_relic/agent/instrumentation/middleware_tracing.rb:99:in `call'
/Users/dcyril/.rbenv/versions/2.6.5/lib/ruby/gems/2.6.0/gems/actionpack-5.2.3/lib/action_dispatch/middleware/executor.rb:14:in `call'
/Users/dcyril/.rbenv/versions/2.6.5/lib/ruby/gems/2.6.0/gems/newrelic_rpm-6.7.0.359/lib/new_relic/agent/instrumentation/middleware_tracing.rb:99:in `call'
/Users/dcyril/.rbenv/versions/2.6.5/lib/ruby/gems/2.6.0/gems/actionpack-5.2.3/lib/action_dispatch/middleware/static.rb:127:in `call'
/Users/dcyril/.rbenv/versions/2.6.5/lib/ruby/gems/2.6.0/gems/newrelic_rpm-6.7.0.359/lib/new_relic/agent/instrumentation/middleware_tracing.rb:99:in `call'
/Users/dcyril/.rbenv/versions/2.6.5/lib/ruby/gems/2.6.0/gems/rack-2.0.7/lib/rack/sendfile.rb:111:in `call'
/Users/dcyril/.rbenv/versions/2.6.5/lib/ruby/gems/2.6.0/gems/newrelic_rpm-6.7.0.359/lib/new_relic/agent/instrumentation/middleware_tracing.rb:99:in `call'
/Users/dcyril/.rbenv/versions/2.6.5/lib/ruby/gems/2.6.0/gems/rack-cors-1.0.6/lib/rack/cors.rb:98:in `call'
/Users/dcyril/.rbenv/versions/2.6.5/lib/ruby/gems/2.6.0/gems/newrelic_rpm-6.7.0.359/lib/new_relic/agent/instrumentation/middleware_tracing.rb:99:in `call'
from newrelic-ruby-agent.
This is interesting. Does this happen often for you? Does it always follow an error like "Too many open files"?
from newrelic-ruby-agent.
I would say once a week for 2-3 months now. But once it's stuck, my only option is to comment out these 2 lines:
https://github.com/newrelic/rpm/blob/c9a467d22e6d4b893be33be330d258a125cf813a/lib/new_relic/agent/instrumentation/net.rb#L46-L47
And I recently discovered this "Too many open files", so I do not know.
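To check whether a process is close to its descriptor limit, something like the following rough sketch may help. It assumes a Unix-like system; the `/proc/self/fd` and `/dev/fd` paths are assumptions about the platform, and the count is approximate because listing the directory itself uses a descriptor.

```ruby
# Soft and hard limits on open file descriptors for the current process.
soft_limit, hard_limit = Process.getrlimit(Process::RLIMIT_NOFILE)
puts "open file limit: soft=#{soft_limit}, hard=#{hard_limit}"

# Approximate count of descriptors currently open, if the platform
# exposes them as a directory (assumption: Linux or macOS).
fd_dir = ['/proc/self/fd', '/dev/fd'].find { |dir| Dir.exist?(dir) }
open_count = fd_dir ? Dir.children(fd_dir).length : nil
puts "open descriptors (approx.): #{open_count.inspect}"
```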
from newrelic-ruby-agent.
OK, I have a repro. The issue at play here is that your Ruby interpreter is using all of its allotted file descriptors, which means it is unable to read from /dev/urandom. This can be reproduced with the following script:
require 'securerandom'

file_descriptors = []
begin
  # Keep opening this file without closing it until descriptors run out.
  loop do
    file_descriptors << IO.sysopen(__FILE__)
    SecureRandom.hex(8)
  end
ensure
  puts "failed at #{file_descriptors.length}"
end
This is because we're using SecureRandom, which passes through to Random.urandom, which looks like this (note the call to rb_raise):
static VALUE
random_raw_seed(VALUE self, VALUE size)
{
    long n = NUM2ULONG(size);
    VALUE buf = rb_str_new(0, n);
    if (n == 0) return buf;
    if (fill_random_bytes(RSTRING_PTR(buf), n, FALSE))
        rb_raise(rb_eRuntimeError, "failed to get urandom");
    return buf;
}
We don't actually need this random number to be secure, though. In fact, we're not using SecureRandom to generate the guid of a Transaction. See: https://github.com/newrelic/rpm/blob/master/lib/new_relic/agent/transaction.rb#L913
This method uses Random.rand, which does not rely on urandom. I've used my repro to verify that Random.rand does not raise an exception when it runs out of file descriptors.
We'll refactor so that our Segment constructor can use the same generate_guid as the Transaction.
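For illustration, a non-cryptographic id generator along these lines might look like the sketch below. This is not the agent's actual generate_guid; the `HEX_DIGITS` constant and the 16-character default length are assumptions made for the example.

```ruby
# Hypothetical sketch; the agent's real generate_guid may differ.
HEX_DIGITS = ('0'..'9').to_a + ('a'..'f').to_a

# Builds a random hex string with Kernel#rand, i.e. Ruby's in-process
# pseudo-random generator, rather than SecureRandom. The result is NOT
# suitable for anything security-sensitive.
def generate_guid(length = 16)
  Array.new(length) { HEX_DIGITS[rand(16)] }.join
end

guid = generate_guid
puts guid
```

The trade-off is that ids become predictable in principle, which is acceptable only because they are used for correlation and not for security.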
Thanks for bringing this to our attention!
from newrelic-ruby-agent.
thanks for finding the reproduction!
from newrelic-ruby-agent.
Hi @cyrilchampier, just wanted to let you know this issue has been fixed in the recently released version 6.8.0 of the agent. Thank you again!
from newrelic-ruby-agent.
Thanks a lot!
from newrelic-ruby-agent.