
em_aws's People

Contributors

johnkchow, joshmckin, kybishop

em_aws's Issues

Non-threadsafe libraries and EventMachine

I see you're using the AWS autoloader. I highly recommend you drop in:

AWS.eager_autoload!

as well; otherwise the AWS library itself isn't threadsafe. Since EventMachine can't run in a non-threaded environment (in a unicorn deployment, for instance, the reactor needs its own thread), this problem is basically impossible to avoid, and I hit a couple of heisenbugs because of it. The line of code doesn't change the functionality of the library in any way, but it's a little obscure and took me a while to find even after I'd figured out what the problem was.

Since it's something that must be in all the projects that use this library, it might save some people the confusion of the undefined method exceptions deep within the dark depths of Amazon's code. There might be an issue with you calling it too soon if people are using other patches to the aws-sdk, but I haven't seen any troubles.

Thanks for sharing the cool code. This isn't an issue so much as a suggestion (maybe the best solution is me issuing a pull request for a change to the readme?)
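For reference, a minimal configuration sketch following the suggestion above (the handler class comes from this gem; the eager-autoload call must run before any concurrent requests start):

```ruby
require 'aws-sdk'
require 'aws/core/http/em_http_handler'

# Load every AWS class up front; the default lazy autoloader is not
# threadsafe, so this must happen before any fibers/threads make requests.
AWS.eager_autoload!

AWS.config(
  access_key_id:     ENV['AWS_ACCESS_KEY_ID'],
  secret_access_key: ENV['AWS_SECRET_ACCESS_KEY'],
  http_handler:      AWS::Http::EMHttpHandler.new
)
```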

Connection Pooling

There's a comment that says it defaults the connection pool to a size of 5. Is this actually happening? I'm having an incredibly hard time finding where it's implemented; it isn't here and it isn't in em-http-request.

Method to determine when an async request has finished

Hi Josh,
great package. My question is about how people are generally using em_aws. In my application, I need to know when an upload to S3 has succeeded in order to kick-off the next task.

I would like to perform many uploads asynchronously and in parallel, then attach callbacks to kick off the next tasks.

Is this feasible with the library? It looks like line 152/fetch_response does not support this type of use...

The other option is to wrap the standard aws-sdk s3.write calls in EM.defer. You might have considered similar approaches; assuming I only need ~20 concurrent uploads, would that be a feasible solution in your opinion?
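A pure-fiber sketch of the shape this could take (all names are illustrative; with em_aws running inside EM.synchrony, each `work.call` would be an `s3_object.write` that suspends only its own fiber, so the uploads overlap):

```ruby
require 'fiber'

# Run each "upload" in its own fiber and fire a per-item callback when it
# completes. With em_aws, the write call yields to the reactor instead of
# blocking, so this pattern gives parallel uploads with completion callbacks.
def run_uploads(uploads, &on_done)
  uploads.map do |name, work|
    Fiber.new do
      result = work.call          # e.g. bucket.objects[name].write(data)
      on_done.call(name, result)  # runs once this upload has finished
    end
  end.each(&:resume)
end

done = []
run_uploads('a.txt' => -> { :ok }, 'b.txt' => -> { :ok }) { |name, _| done << name }
done # => ["a.txt", "b.txt"]
```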

Crash when using Sinatra's stream method.

Thanks for the gem. It works great in most of my cases, but one scenario, streaming a file in Sinatra, causes the following error:

(eval):11:in `yield': can't yield from root fiber (FiberError)
    from (eval):11:in `head'
    from /Users/philipwi/.rvm/gems/ruby-1.9.3-p0@storageserver/gems/em_aws-0.1.3/lib/aws/core/http/em_http_handler.rb:94:in `fetch_response'
    from /Users/philipwi/.rvm/gems/ruby-1.9.3-p0@storageserver/gems/em_aws-0.1.3/lib/aws/core/http/em_http_handler.rb:124:in `handle_it'
    from /Users/philipwi/.rvm/gems/ruby-1.9.3-p0@storageserver/gems/em_aws-0.1.3/lib/aws/core/http/em_http_handler.rb:99:in `handle'
    from /Users/philipwi/.rvm/gems/ruby-1.9.3-p0@storageserver/gems/aws-sdk-1.5.2/lib/aws/core/client.rb:224:in `block in make_sync_request'
    from /Users/philipwi/.rvm/gems/ruby-1.9.3-p0@storageserver/gems/aws-sdk-1.5.2/lib/aws/core/client.rb:235:in `retry_server_errors'
    from /Users/philipwi/.rvm/gems/ruby-1.9.3-p0@storageserver/gems/aws-sdk-1.5.2/lib/aws/core/client.rb:219:in `make_sync_request'
    from /Users/philipwi/.rvm/gems/ruby-1.9.3-p0@storageserver/gems/aws-sdk-1.5.2/lib/aws/core/client.rb:400:in `block (2 levels) in client_request'
    from /Users/philipwi/.rvm/gems/ruby-1.9.3-p0@storageserver/gems/aws-sdk-1.5.2/lib/aws/core/client.rb:289:in `log_client_request'
    from /Users/philipwi/.rvm/gems/ruby-1.9.3-p0@storageserver/gems/aws-sdk-1.5.2/lib/aws/core/client.rb:373:in `block in client_request'
    from /Users/philipwi/.rvm/gems/ruby-1.9.3-p0@storageserver/gems/aws-sdk-1.5.2/lib/aws/core/client.rb:271:in `return_or_raise'
    from /Users/philipwi/.rvm/gems/ruby-1.9.3-p0@storageserver/gems/aws-sdk-1.5.2/lib/aws/core/client.rb:372:in `client_request'
    from (eval):3:in `head_object'
    from /Users/philipwi/.rvm/gems/ruby-1.9.3-p0@storageserver/gems/aws-sdk-1.5.2/lib/aws/s3/s3_object.rb:94:in `head'
    from /Users/philipwi/.rvm/gems/ruby-1.9.3-p0@storageserver/gems/aws-sdk-1.5.2/lib/aws/s3/s3_object.rb:71:in `exists?'
    from /Users/philipwi/CloudDev/storage_server/storage.rb:86:in `stream'
    from /Users/philipwi/CloudDev/storage_server/myapp.rb:155:in `block (2 levels) in <class:StorageApp>'
    from /Users/philipwi/.rvm/gems/ruby-1.9.3-p0@storageserver/gems/sinatra-1.3.2/lib/sinatra/base.rb:296:in `block in stream'
    from /Users/philipwi/.rvm/gems/ruby-1.9.3-p0@storageserver/gems/sinatra-1.3.2/lib/sinatra/base.rb:264:in `call'
    from /Users/philipwi/.rvm/gems/ruby-1.9.3-p0@storageserver/gems/sinatra-1.3.2/lib/sinatra/base.rb:264:in `block in each'
    from /Users/philipwi/.rvm/gems/ruby-1.9.3-p0@storageserver/gems/eventmachine-1.0.0.beta.4/lib/eventmachine.rb:1012:in `call'
    from /Users/philipwi/.rvm/gems/ruby-1.9.3-p0@storageserver/gems/eventmachine-1.0.0.beta.4/lib/eventmachine.rb:1012:in `block in spawn_threadpool'

Do you have any ideas?
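For what it's worth, the error reproduces with any `Fiber.yield` on the root fiber, which is what em_aws's synchronous path does under the hood via EM::Synchrony. A plausible workaround (an assumption, not verified against this gem) is to give the AWS calls their own fiber inside the stream block:

```ruby
require 'fiber'

# Minimal reproduction: yielding from the root fiber raises FiberError,
# which is what happens when em_aws is called outside a fiber (e.g. from
# Sinatra's stream block running on EventMachine's threadpool).
error = begin
  Fiber.yield
rescue FiberError => e
  e
end
error.class # => FiberError

# Plausible fix (assumption): wrap the AWS calls in a new fiber, e.g.
#   stream do |out|
#     Fiber.new do
#       out << obj.read if obj.exists?
#       out.close
#     end.resume
#   end
```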

CRC32 integrity check failed with DynamoDB

Hi,

I have a problem with em_aws and DynamoDB. Here is what is happening:

require 'em-synchrony'
require 'aws-sdk'
require 'aws/core/http/em_http_handler'

AWS.config({
  access_key_id: .............,
  secret_access_key: ............,
  http_handler: AWS::Http::EMHttpHandler.new()
})

dynamo_db = AWS::DynamoDB.new
dynamo_db.tables.each {|table| puts table.name }

results in:


/Users/michalf/.rvm/gems/ruby-1.9.3-p327/gems/aws-sdk-1.8.1.1/lib/aws/core/client.rb:318:in `return_or_raise': CRC32 integrity check failed (AWS::DynamoDB::Errors::CRC32CheckFailed)
    from /Users/michalf/.rvm/gems/ruby-1.9.3-p327/gems/aws-sdk-1.8.1.1/lib/aws/core/client.rb:419:in `client_request'
    from (eval):3:in `list_tables'
    from /Users/michalf/.rvm/gems/ruby-1.9.3-p327/gems/aws-sdk-1.8.1.1/lib/aws/dynamo_db/table_collection.rb:121:in `_each_item'
    from /Users/michalf/.rvm/gems/ruby-1.9.3-p327/gems/aws-sdk-1.8.1.1/lib/aws/core/collection/with_limit_and_next_token.rb:54:in `_each_batch'
    from /Users/michalf/.rvm/gems/ruby-1.9.3-p327/gems/aws-sdk-1.8.1.1/lib/aws/core/collection.rb:82:in `each_batch'
    from /Users/michalf/.rvm/gems/ruby-1.9.3-p327/gems/aws-sdk-1.8.1.1/lib/aws/core/collection.rb:49:in `each'
    from testdynamodb.rb:12:in `<main>'

Somehow the CRC32 check fails. The above code runs OK if the dynamo_db_crc32 option is set to false in AWS.config:

AWS.config({
  access_key_id: .............,
  secret_access_key: ............,
  http_handler: AWS::Http::EMHttpHandler.new(),
  dynamo_db_crc32: false
})

It would be great if this issue could be fixed! Thanks,

Michal

PS em_aws is great! It made my projects so much simpler!

HTTP Handler does not respect request.read_timeout

This is important when long polling takes place, e.g. when using AWS::SimpleWorkflow::DecisionTaskCollection#poll, and probably with other services like SQS.

What happens right now is that the handler uses the default timeout settings, which makes it time out after ~15s:

[AWS SimpleWorkflow 0 15.294478 0 retries] poll_for_activity_task(:domain=>"streamza-dev1",:identity=>"mf:8254",:task_list=>{:name=>"download-list"})  

What should happen is that the handler respects the read_timeout on the request object and sets the inactivity_timeout on the EM::HttpRequest object accordingly, so the timeout on polling requests equals the requested read_timeout.

My very rough fix to this is (in em_http_handler.rb):

# Builds and attempts the request. Occasionally under load em-http-request
# returns a status of 0 with nil for header and body, in such situations
# we retry as many times as status_0_retries is set. If our retries exceed
# status_0_retries we assume there is a network error
def process_request(request,response,async=false,retries=0,&read_block)
  method = "a#{request.http_method}".downcase.to_sym  # aget, apost, aput, adelete, ahead
  opts = fetch_request_options(request)
  opts[:async] = (async || opts[:async])
  opts[:inactivity_timeout] = request.read_timeout  # pass the SDK's read_timeout through
  url = fetch_url(request)
  begin
    http_response = fetch_response(url,method,opts,&read_block)
    unless opts[:async]
      response.status = http_response.response_header.status.to_i
      if response.status == 0
        if retries <= status_0_retries.to_i
          # note: retries must be passed in the fourth position, after async
          process_request(request,response,async,(retries + 1),&read_block)
        else
          response.network_error = true
        end
      else
        response.headers = fetch_response_headers(http_response)
        response.body = http_response.response
      end
    end
  rescue Timeout::Error => error
    response.network_error = error
  rescue *EM_PASS_THROUGH_ERRORS => error
    raise error
  rescue Exception => error
    response.network_error = error
  end
  nil
end

and

def fetch_response(url,method,opts={},&read_block)
  inactivity_timeout = opts.delete :inactivity_timeout
  if @pool
    # NOTE: pooled connections are created elsewhere, so the timeout
    # is not applied on this branch yet (see the note about @pool below)
    @pool.run(url) do |connection|
      req = connection.send(method, {:keepalive => true}.merge(opts))
      req.stream(&read_block) if block_given?
      return EM::Synchrony.sync req unless opts[:async]
    end
  else
    req = EM::HttpRequest.new(url, :inactivity_timeout => inactivity_timeout).send(method,opts)
    req.stream(&read_block) if block_given?
    return EM::Synchrony.sync req unless opts[:async]
  end
  nil
end

There are just 3 lines changed. I have not figured out how to re-configure @pool, but I hope this gives a good hint.

HTTP Handler does not respect port settings

Hi,

It looks like HttpHandler does not respect port settings. This is important when working with e.g. fake_dynamo which runs locally on a port different than 443 or 80.

The current version does not work with an example config:

AWS.config({
  dynamo_db_endpoint: 'localhost',
  dynamo_db_port: '4567',
  use_ssl: false
})

Having peeked into the original handler I would suggest a simple solution in aws/core/http/em_http_handler.rb

def fetch_url(request)
  (request.use_ssl? ? "https" : "http") + "://#{request.host}:#{request.port}"
end

The port is already set on the request. I have not done extensive tests, but it seems to work for both original AWS services and local fake_dynamo.

S3Object#exists? fails when no object

Hi,

I came across the following bug. Given the config:

AWS.config({
  access_key_id: CONFIG.aws.api_key,
  secret_access_key: CONFIG.aws.secret,
  http_handler: AWS::Http::EMHttpHandler.new
})

the following fails if an object does NOT exist:

AWS::S3.new.buckets[mybucket].objects['this-does-not-exist'].exists?
AWS::Errors::Base: 
    from /Users/michalf/.rvm/gems/ruby-1.9.3-p327/gems/aws-sdk-1.8.1.3/lib/aws/core/client.rb:318:in `return_or_raise'
    from /Users/michalf/.rvm/gems/ruby-1.9.3-p327/gems/aws-sdk-1.8.1.3/lib/aws/core/client.rb:419:in `client_request'
    from (eval):3:in `head_object'
    from /Users/michalf/.rvm/gems/ruby-1.9.3-p327/gems/aws-sdk-1.8.1.3/lib/aws/s3/s3_object.rb:294:in `head'
    from /Users/michalf/.rvm/gems/ruby-1.9.3-p327/gems/aws-sdk-1.8.1.3/lib/aws/s3/s3_object.rb:271:in `exists?'
    from (irb):5
    from /Users/michalf/.rvm/rubies/ruby-1.9.3-p327/bin/irb:16:in `<main>'

I have not yet had time to look into this, but it seems critical to our project. It works with the default HTTP handler.

Thanks in advance!
Michal

Callback sample

Seeing this line in your sample code:

EM::Synchrony.sleep(2) # Let the pending fibers run

I gather this is to simulate the reactor doing other stuff while the async operation completes. But what if we want to chain a callback on the async operation -- do you have sample code of this working with your gem?
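In the fiber model there may be no need for an explicit callback object: inside a fiber, whatever follows the blocking-style call runs only after the operation completes. A pure-fiber illustration of that control flow, where `Fiber.yield` stands in for what `EM::Synchrony.sync` does inside em_aws:

```ruby
require 'fiber'

log = []
op = Fiber.new do
  log << :request_sent
  Fiber.yield            # em_aws suspends here until the response arrives
  log << :callback_ran   # everything after the call is the "callback"
end
op.resume                # start the request
log << :reactor_busy     # the reactor is free to do other work meanwhile
op.resume                # response arrived; the fiber picks up where it left off
log # => [:request_sent, :reactor_busy, :callback_ran]
```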

Did you know the aws library uses Kernel.sleep for exponential backoff?

I forked their library to switch it to:

if defined?(EM) && EM.reactor_running?
  fiber = Fiber.current
  EM::Timer.new(sleeps.shift) { fiber.resume }
  Fiber.yield
else
  Kernel.sleep....

But I thought I would mention in case you want to monkey patch, as it was killing the performance of my servers (predictably).
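A self-contained sketch of that patch as a helper (the `sleeps.shift` queue from the fork is replaced by an explicit `seconds` argument for illustration):

```ruby
require 'fiber'

# Reactor-friendly sleep: park the current fiber on an EM timer when the
# reactor is running, otherwise fall back to plain Kernel.sleep.
def non_blocking_sleep(seconds)
  if defined?(EM) && EM.reactor_running?
    fiber = Fiber.current
    EM::Timer.new(seconds) { fiber.resume }
    Fiber.yield
  else
    Kernel.sleep(seconds)
  end
end
```

Outside the reactor this is just Kernel.sleep; inside, only the calling fiber waits, so other requests keep flowing during the backoff.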

ThreadError: deadlock; recursive locking

Hey @JoshMcKin,

I'm using your em_aws (great idea, by the way) but I'm running into an issue. I'm using it with DynamoDB to do massive numbers of writes. On the initial query, it makes an HTTP request that requires a session authorization from AWS, which unfortunately involves some locking. Any subsequent requests made before the session authorization comes back result in the following exception:

[2012-05-11 23:26:17.954 #18573] ERROR -- : #<ThreadError: deadlock; recursive locking>
[2012-05-11 23:26:17.954 #18573] DEBUG -- :
<internal:prelude>:8:in `lock'
<internal:prelude>:8:in `synchronize'
/Users/johnchow/.rbenv/versions/1.9.2-p290/lib/ruby/gems/1.9.1/gems/aws-sdk-1.4.1/lib/aws/core/session_signer.rb:63:in `get_session'
/Users/johnchow/.rbenv/versions/1.9.2-p290/lib/ruby/gems/1.9.1/gems/aws-sdk-1.4.1/lib/aws/core/session_signer.rb:72:in `session'
/Users/johnchow/.rbenv/versions/1.9.2-p290/lib/ruby/gems/1.9.1/gems/aws-sdk-1.4.1/lib/aws/core/session_signer.rb:42:in `access_key_id'
/Users/johnchow/.rbenv/versions/1.9.2-p290/lib/ruby/gems/1.9.1/gems/aws-sdk-1.4.1/lib/aws/dynamo_db/request.rb:31:in `add_authorization!'
/Users/johnchow/.rbenv/versions/1.9.2-p290/lib/ruby/gems/1.9.1/gems/aws-sdk-1.4.1/lib/aws/core/client.rb:436:in `build_request'
/Users/johnchow/.rbenv/versions/1.9.2-p290/lib/ruby/gems/1.9.1/gems/aws-sdk-1.4.1/lib/aws/core/client.rb:375:in `block (3 levels) in client_request'
/Users/johnchow/.rbenv/versions/1.9.2-p290/lib/ruby/gems/1.9.1/gems/aws-sdk-1.4.1/lib/aws/core/response.rb:65:in `call'
/Users/johnchow/.rbenv/versions/1.9.2-p290/lib/ruby/gems/1.9.1/gems/aws-sdk-1.4.1/lib/aws/core/response.rb:65:in `rebuild_request'
/Users/johnchow/.rbenv/versions/1.9.2-p290/lib/ruby/gems/1.9.1/gems/aws-sdk-1.4.1/lib/aws/core/response.rb:60:in `initialize'
/Users/johnchow/.rbenv/versions/1.9.2-p290/lib/ruby/gems/1.9.1/gems/aws-sdk-1.4.1/lib/aws/core/client.rb:169:in `new'
/Users/johnchow/.rbenv/versions/1.9.2-p290/lib/ruby/gems/1.9.1/gems/aws-sdk-1.4.1/lib/aws/core/client.rb:169:in `new_response'
/Users/johnchow/.rbenv/versions/1.9.2-p290/lib/ruby/gems/1.9.1/gems/aws-sdk-1.4.1/lib/aws/core/client.rb:375:in `block (2 levels) in client_request'
/Users/johnchow/.rbenv/versions/1.9.2-p290/lib/ruby/gems/1.9.1/gems/aws-sdk-1.4.1/lib/aws/core/client.rb:287:in `log_client_request'
/Users/johnchow/.rbenv/versions/1.9.2-p290/lib/ruby/gems/1.9.1/gems/aws-sdk-1.4.1/lib/aws/core/client.rb:363:in `block in client_request'
/Users/johnchow/.rbenv/versions/1.9.2-p290/lib/ruby/gems/1.9.1/gems/aws-sdk-1.4.1/lib/aws/core/client.rb:275:in `return_or_raise'
/Users/johnchow/.rbenv/versions/1.9.2-p290/lib/ruby/gems/1.9.1/gems/aws-sdk-1.4.1/lib/aws/core/client.rb:362:in `client_request'
(eval):3:in `describe_table'
/Users/johnchow/.rbenv/versions/1.9.2-p290/lib/ruby/gems/1.9.1/gems/aws-sdk-1.4.1/lib/aws/dynamo_db/table.rb:497:in `get_resource'
/Users/johnchow/.rbenv/versions/1.9.2-p290/lib/ruby/gems/1.9.1/gems/aws-sdk-1.4.1/lib/aws/dynamo_db/table.rb:305:in `exists?'

Once the session is returned by AWS, the thing works like a charm.

I was wondering if you had any insight, maybe we could tackle this problem together?

Getting DynamoDB responses with status 0 for requests that work with the default net_http_handler

I'm sending some trivial dynamo fetch requests that are returning a status of 0. These requests work fine with a 200 response when using the default net_http_handler.

I am using ruby 2.0.0-p195 with aws-sdk 1.9.5 and em_aws 0.3.0

Here is some debug information about the requests being sent that I pulled from the #process_request method:

url: https://dynamodb.us-east-1.amazonaws.com:443
options: {:inactivity_timeout=>60, :connect_timeout=>10}
opts: {
  :inactivity_timeout => 0,
  :connect_timeout => 10, 
  :head=> {
    "content-type" => "application/x-amz-json-1.0",
    "x-amz-target" => "DynamoDB_20111205.GetItem",
    "content-length" => "150", 
    "user-agent" => "aws-sdk-ruby/1.9.5 ruby/2.0.0 x86_64-darwin12.3.0",
    "host" => "dynamodb.us-east-1.amazonaws.com",
    "x-amz-date"=>"20130608T182936Z",
    "x-amz-content-sha256" => "redacted",
    "authorization"=>"redacted"
  },
  :query => nil,
  :body => "{\"AttributesToGet\":[\"forward\"],\"TableName\":\"redacted\",\"Key\":{\"HashKeyElement\":{\"S\":\"progress is made on midnight oil\"}}}",
  :path=> "/",
  :async=>nil
}
method: apost

Both headers and body are the same with either http_handler, so I'm not sure why em_http_request is returning a status of 0. The response from em_http is not all that helpful either.

(apologies for the wall of text below, I tried to format it a little better than the plain old #inspect)

got response: #<EventMachine::HttpClient:0x007fa7427b6dd0
  @conn = #<EventMachine::HttpConnection:0x007fa74388a7b0
    @deferred=false,
    @middleware=[],
    @connopts= #<HttpConnectionOptions:0x007fa7427b2028
      @connect_timeout=10,
      @inactivity_timeout=60,
      @tls={},
      @proxy=nil,
      @host="dynamodb.us-east-1.amazonaws.com",
      @port=443
    >,
    @uri="https://dynamodb.us-east-1.amazonaws.com:443",
    @clients=[#<EventMachine::HttpClient:0x007fa7427b6dd0 ...>],
    @pending=[],
    @p= #<HTTP::Parser:0x007fa7427b4b98>,
    @conn= #<EventMachine::HttpStubConnection:0x007fa7427b5520
      @signature=7,
      @parent= #<EventMachine::HttpConnection:0x007fa74388a7b0 ...>,
      @deferred_status=:unknown,
      @callbacks=[#<Proc:0x007fa7427b4760@/Users/kbishop/.rvm/gems/ruby-2.0.0-p195@n/gems/em-http-request-1.0.3/lib/em-http/http_connection.rb:94>]
    >
  >,
  @req=#<HttpClientOptions:0x007fa74388a760
    @keepalive=false,
    @redirects=0,
    @followed=0,
    @method="POST",
    @headers = {
     "content-type"=>"application/x-amz-json-1.0",
     "x-amz-target"=>"DynamoDB_20111205.GetItem",
     "content-length"=>"150",
     "user-agent"=>"aws-sdk-ruby/1.9.5 ruby/2.0.0 x86_64-darwin12.3.0",
     "host"=>"dynamodb.us-east-1.amazonaws.com",
     "x-amz-date"=>"20130608T183008Z",
     "x-amz-content-sha256"=>"redacted",
     "authorization"=>"redacted"
    },
    @query=nil,
    @path="/",
    @file=nil,
    @body="{\"AttributesToGet\":[\"forward\"],
    \"TableName\":\"redacted\",
    \"Key\":{\"HashKeyElement\":{\"S\":\"progress is made on midnight oil\"}}}",
    @pass_cookies=true,
    @decoding=true,
    @uri=#<Addressable::URI:0x3fd3a13dabbc URI:https://dynamodb.us-east-1.amazonaws.com:443/>,
    @host="dynamodb.us-east-1.amazonaws.com",
    @port=443>,
    @stream=nil,
    @headers=nil,
    @cookies=[],
    @cookiejar=#<EventMachine::HttpClient::CookieJar:0x007fa7427b63f8 @jar=#<CookieJar::Jar:0x007fa7427b5b60 @domains={}>>,
    @response_header={},
    @state=:response_header,
    @response="",
    @error=nil,
    @content_decoder=nil,
    @content_charset=nil,
    @deferred_status=:failed,
    @callbacks=[#<Proc:0x007fa743890200@/Users/kbishop/.rvm/gems/ruby-2.0.0-p195@n/gems/em-synchrony-1.0.3/lib/em-synchrony.rb:64>],
    @errbacks=[],
    @deferred_timeout=nil,
    @deferred_args=[#<EventMachine::HttpClient:0x007fa7427b6dd0 ...>]
  >
