getaddrinfo (sshkit issue, CLOSED)

Cervenka commented on June 25, 2024
getaddrinfo

from sshkit.

Comments (5)

creadone commented on June 25, 2024

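Solved. I need more sleep.

The quotes inside %w[] were the problem: %w splits on whitespace and keeps quote characters as literal text, so the hostname handed to sshkit included the single quotes and could not be resolved.

nodes = %w[ 'subdomain.domain.com' ] => nodes = %w[ subdomain.domain.com ]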
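The difference between the two forms can be checked in plain Ruby. A minimal sketch, using the placeholder hostname from the line above (the constant names are illustrative only):

```ruby
# %w[] splits on whitespace only; quote characters are kept as literal text.
QUOTED_NODES = %w[ 'subdomain.domain.com' ]
BARE_NODES   = %w[ subdomain.domain.com ]

p QUOTED_NODES  # => ["'subdomain.domain.com'"]
p BARE_NODES    # => ["subdomain.domain.com"]

# The quoted entry is two characters longer: the leading and trailing quote.
puts QUOTED_NODES.first.length - BARE_NODES.first.length  # => 2
```

Printing host entries with p (or inspect) rather than puts makes stray quotes like these visible.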

leehambley commented on June 25, 2024

We recently accepted some PRs to deal with large numbers of servers, so your ~100 count isn't exceptional in that regard.

Might I suggest you add a simple ping task and run it in a loop to get a harmless reproduction case? Then you can work through some debugging options, such as clearing your DNS cache beforehand, hard-coding the IPs into your /etc/hosts file, etc.

Your RUBY_VERSION can be significant here too. Older Rubies, as a rule, are less good at networking, but all Rubies have been very good for at least 3-4 years, if not since the 2.0 release.

Cervenka commented on June 25, 2024

TLDR: I think I will try to reproduce this issue in Ruby (without sshkit) next.

Thank you for your input so far!

Having the IPs hard-coded in /etc/hosts does help. That has been my workaround for a while.
When I previously tried to replicate the DNS resolution issue with other tools, I was not able to.

I have also had this bash script running since yesterday without issues:

while true
do
  date
  seq 1 100 | parallel --tag ping -c 1 www{}.oursite.com | grep 'Unknown'
  sleep 15
done

I did run into the same issue again just now when trying to deploy. The script above was running at the time, so the resolved IPs should still have been cached by the OS. Two lookups failed at the same time:

#<Thread:0x00007fa45e2d4f60@/Users/flo/.rvm/gems/ruby-2.6.3/gems/sshkit-1.20.0/lib/sshkit/runners/parallel.rb:10 run> terminated with exception (report_on_exception is true):
Traceback (most recent call last):
	17: from /Users/flo/.rvm/gems/ruby-2.6.3/gems/sshkit-1.20.0/lib/sshkit/runners/parallel.rb:12:in `block (2 levels) in execute'
	16: from /Users/flo/.rvm/gems/ruby-2.6.3/gems/sshkit-1.20.0/lib/sshkit/backends/abstract.rb:31:in `run'
	15: from /Users/flo/.rvm/gems/ruby-2.6.3/gems/sshkit-1.20.0/lib/sshkit/backends/abstract.rb:31:in `instance_exec'
	14: from /Users/flo/.rvm/gems/ruby-2.6.3/gems/capistrano-3.11.0/lib/capistrano/scm/tasks/git.rake:8:in `block (3 levels) in eval_rakefile'
	13: from /Users/flo/.rvm/gems/ruby-2.6.3/gems/sshkit-1.20.0/lib/sshkit/backends/abstract.rb:80:in `execute'
	12: from /Users/flo/.rvm/gems/ruby-2.6.3/gems/sshkit-1.20.0/lib/sshkit/backends/abstract.rb:148:in `create_command_and_execute'
	11: from /Users/flo/.rvm/gems/ruby-2.6.3/gems/sshkit-1.20.0/lib/sshkit/backends/abstract.rb:148:in `tap'
	10: from /Users/flo/.rvm/gems/ruby-2.6.3/gems/sshkit-1.20.0/lib/sshkit/backends/abstract.rb:148:in `block in create_command_and_execute'
	 9: from /Users/flo/.rvm/gems/ruby-2.6.3/gems/sshkit-1.20.0/lib/sshkit/backends/netssh.rb:130:in `execute_command'
	 8: from /Users/flo/.rvm/gems/ruby-2.6.3/gems/sshkit-1.20.0/lib/sshkit/backends/netssh.rb:177:in `with_ssh'
	 7: from /Users/flo/.rvm/gems/ruby-2.6.3/gems/sshkit-1.20.0/lib/sshkit/backends/connection_pool.rb:63:in `with'
	 6: from /Users/flo/.rvm/gems/ruby-2.6.3/gems/sshkit-1.20.0/lib/sshkit/backends/connection_pool.rb:63:in `call'
	 5: from /Users/flo/.rvm/gems/ruby-2.6.3/gems/net-ssh-5.2.0/lib/net/ssh.rb:246:in `start'
	 4: from /Users/flo/.rvm/gems/ruby-2.6.3/gems/net-ssh-5.2.0/lib/net/ssh.rb:246:in `new'
	 3: from /Users/flo/.rvm/gems/ruby-2.6.3/gems/net-ssh-5.2.0/lib/net/ssh/transport/session.rb:73:in `initialize'
	 2: from /Users/flo/.rvm/rubies/ruby-2.6.3/lib/ruby/2.6.0/socket.rb:631:in `tcp'
	 1: from /Users/flo/.rvm/rubies/ruby-2.6.3/lib/ruby/2.6.0/socket.rb:227:in `foreach'
/Users/flo/.rvm/rubies/ruby-2.6.3/lib/ruby/2.6.0/socket.rb:227:in `getaddrinfo': getaddrinfo: nodename nor servname provided, or not known (SocketError)
	1: from /Users/flo/.rvm/gems/ruby-2.6.3/gems/sshkit-1.20.0/lib/sshkit/runners/parallel.rb:11:in `block (2 levels) in execute'
/Users/flo/.rvm/gems/ruby-2.6.3/gems/sshkit-1.20.0/lib/sshkit/runners/parallel.rb:15:in `rescue in block (2 levels) in execute': Exception while executing as [email protected]: getaddrinfo: nodename nor servname provided, or not known (SSHKit::Runner::ExecuteError)
#<Thread:0x00007fa45e426fd0@/Users/flo/.rvm/gems/ruby-2.6.3/gems/sshkit-1.20.0/lib/sshkit/runners/parallel.rb:10 run> terminated with exception (report_on_exception is true):
Traceback (most recent call last):
	17: from /Users/flo/.rvm/gems/ruby-2.6.3/gems/sshkit-1.20.0/lib/sshkit/runners/parallel.rb:12:in `block (2 levels) in execute'
	16: from /Users/flo/.rvm/gems/ruby-2.6.3/gems/sshkit-1.20.0/lib/sshkit/backends/abstract.rb:31:in `run'
	15: from /Users/flo/.rvm/gems/ruby-2.6.3/gems/sshkit-1.20.0/lib/sshkit/backends/abstract.rb:31:in `instance_exec'
	14: from /Users/flo/.rvm/gems/ruby-2.6.3/gems/capistrano-3.11.0/lib/capistrano/scm/tasks/git.rake:8:in `block (3 levels) in eval_rakefile'
	13: from /Users/flo/.rvm/gems/ruby-2.6.3/gems/sshkit-1.20.0/lib/sshkit/backends/abstract.rb:80:in `execute'
	12: from /Users/flo/.rvm/gems/ruby-2.6.3/gems/sshkit-1.20.0/lib/sshkit/backends/abstract.rb:148:in `create_command_and_execute'
	11: from /Users/flo/.rvm/gems/ruby-2.6.3/gems/sshkit-1.20.0/lib/sshkit/backends/abstract.rb:148:in `tap'
	10: from /Users/flo/.rvm/gems/ruby-2.6.3/gems/sshkit-1.20.0/lib/sshkit/backends/abstract.rb:148:in `block in create_command_and_execute'
	 9: from /Users/flo/.rvm/gems/ruby-2.6.3/gems/sshkit-1.20.0/lib/sshkit/backends/netssh.rb:130:in `execute_command'
	 8: from /Users/flo/.rvm/gems/ruby-2.6.3/gems/sshkit-1.20.0/lib/sshkit/backends/netssh.rb:177:in `with_ssh'
	 7: from /Users/flo/.rvm/gems/ruby-2.6.3/gems/sshkit-1.20.0/lib/sshkit/backends/connection_pool.rb:63:in `with'
	 6: from /Users/flo/.rvm/gems/ruby-2.6.3/gems/sshkit-1.20.0/lib/sshkit/backends/connection_pool.rb:63:in `call'
	 5: from /Users/flo/.rvm/gems/ruby-2.6.3/gems/net-ssh-5.2.0/lib/net/ssh.rb:246:in `start'
	 4: from /Users/flo/.rvm/gems/ruby-2.6.3/gems/net-ssh-5.2.0/lib/net/ssh.rb:246:in `new'
	 3: from /Users/flo/.rvm/gems/ruby-2.6.3/gems/net-ssh-5.2.0/lib/net/ssh/transport/session.rb:73:in `initialize'
	 2: from /Users/flo/.rvm/rubies/ruby-2.6.3/lib/ruby/2.6.0/socket.rb:631:in `tcp'
	 1: from /Users/flo/.rvm/rubies/ruby-2.6.3/lib/ruby/2.6.0/socket.rb:227:in `foreach'
/Users/flo/.rvm/rubies/ruby-2.6.3/lib/ruby/2.6.0/socket.rb:227:in `getaddrinfo': getaddrinfo: nodename nor servname provided, or not known (SocketError)
	1: from /Users/flo/.rvm/gems/ruby-2.6.3/gems/sshkit-1.20.0/lib/sshkit/runners/parallel.rb:11:in `block (2 levels) in execute'
/Users/flo/.rvm/gems/ruby-2.6.3/gems/sshkit-1.20.0/lib/sshkit/runners/parallel.rb:15:in `rescue in block (2 levels) in execute': Exception while executing as [email protected]: getaddrinfo: nodename nor servname provided, or not known (SSHKit::Runner::ExecuteError)

I think I will try to reproduce this issue in Ruby (without sshkit) next.

Cervenka commented on June 25, 2024

I have tried to reproduce this in other ways (including the script below), but have only been able to trigger the issue when deploying with Capistrano (which uses sshkit).

I have also tried switching DNS servers. Hard-coding the hosts in /etc/hosts seems to fix the issue for me.

# frozen_string_literal: true

require 'socket'

loop do
  puts Time.now

  threads = []

  (1..100).each do |i|
    threads << Thread.new do
      addr = "www#{i}.oursite.com"
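      begin
        # Resolve the name only; no connection is opened.
        Socket.getaddrinfo(addr, 'https', nil, Socket::SOCK_STREAM)
      rescue SocketError => e
        # Failed lookups raise SocketError; log the host and the time.
        puts "#{addr} #{Time.now}", e, ''
      end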
    end
  end

  threads.each(&:join)

  sleep 20
end
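Because the failures in this thread are intermittent, one way to narrow things down might be to retry each lookup and log every failed attempt, which would show whether a second try succeeds. This is a minimal sketch, not something sshkit or net-ssh provides: resolve_with_retry, its keyword arguments, and DEFAULT_RESOLVER are hypothetical names, and the resolver is injectable so the retry logic can be exercised without real DNS.

```ruby
require 'socket'

# Default lookup: resolve the host and return its unique IP addresses.
DEFAULT_RESOLVER = lambda do |host|
  Addrinfo.getaddrinfo(host, nil, nil, :STREAM).map(&:ip_address).uniq
end

# Hypothetical helper: call the resolver up to `attempts` times and return
# the result of the first successful lookup. Each failure is logged to
# stderr; the last failure is re-raised once the attempts are used up.
def resolve_with_retry(host, attempts: 3, delay: 0.5, resolver: DEFAULT_RESOLVER)
  tries = 0
  begin
    tries += 1
    resolver.call(host)
  rescue SocketError => e
    warn "#{Time.now} #{host}: attempt #{tries} failed (#{e.message})"
    raise if tries >= attempts

    sleep delay
    retry
  end
end
```

Calling resolve_with_retry('www1.oursite.com') from inside the threads of the script above would log each failed attempt with its timestamp, so a lookup that fails once and then succeeds can be told apart from one that keeps failing.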

creadone commented on June 25, 2024

I have the same problem, but with a single server. I tried some tests:

Success

require 'socket'
Socket.getaddrinfo('subdomain.domain.com', 80, nil, Socket::SOCK_STREAM)

Success

require 'net/ssh'

Net::SSH.start('subdomain.domain.com', 'sshuser') do |ssh|
  ssh.exec 'touch ~/test.txt'
end

Fails with the same stack trace as Cervenka

require 'sshkit'
require 'sshkit/dsl'
include SSHKit::DSL

SSHKit::Backend::Netssh.configure do |ssh|
  ssh.connection_timeout = 5
  ssh.ssh_options = {
    user:         'sshuser',
    keys:         %w[~/.ssh/id_rsa],
    auth_methods: %w[ publickey ]
  }
end

nodes = %w[ 'subdomain.domain.com' ]

on nodes do |node|
  output = capture :ls, '-l'
  puts output
end

Also:

  1. Flushed the DNS cache and checked resolution in a loop: everything is OK, nothing suspicious.
  2. Tried with the IP instead of the hostname: fails with the same exception.
  3. Hard-coding the host in /etc/hosts does not fix it.

Do you have any ideas where to dig deeper?

ruby 3.1.1p18 (2022-02-18 revision 53f5fc4236) [x86_64-darwin21]
sshkit (1.21.3)
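With symptoms like these (a direct getaddrinfo call succeeds, but the same host fails through sshkit, even as an IP), one place to look is the exact string being handed to sshkit. A minimal sketch of a pre-flight check: suspicious_hosts and ALLOWED_HOST_CHARS are hypothetical names, not part of sshkit, and the allowed character set is an assumption that covers plain hostnames, IP addresses, and user@host:port specs.

```ruby
# Characters expected in a hostname, an IPv4/IPv6 address, or a
# user@host:port spec. Anything else (quotes, whitespace) is flagged.
ALLOWED_HOST_CHARS = /\A[A-Za-z0-9.\-_:@\[\]]+\z/

# Hypothetical helper: return the entries that contain unexpected characters.
def suspicious_hosts(hosts)
  hosts.reject { |host| host.match?(ALLOWED_HOST_CHARS) }
end

nodes = %w[ 'subdomain.domain.com' ]
suspicious_hosts(nodes).each do |host|
  # inspect shows the raw string, including any stray quotes.
  warn "Suspicious host entry: #{host.inspect}"
end
```

Running this against the nodes array from the failing example reports the entry with its embedded single quotes.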
