
libs3's Introduction

iRODS

The Integrated Rule-Oriented Data System (iRODS) is open source data management software used by research, commercial, and governmental organizations worldwide.

iRODS is released as a production-level distribution aimed at deployment in mission critical environments. It virtualizes data storage resources, so users can take control of their data, regardless of where and on what device the data is stored.

The development infrastructure supports exhaustive testing on supported platforms; plugin support for microservices, storage resources, authentication mechanisms, network protocols, rule engines, new API endpoints, and databases; and extensive documentation, training, and support services.

Core Competencies

  • iRODS implements data virtualization, allowing access to distributed storage assets under a unified namespace, and freeing organizations from getting locked in to single-vendor storage solutions.
  • iRODS enables data discovery using a metadata catalog that describes every data object, collection, and every storage resource in the iRODS Zone.
  • iRODS automates data workflows, with a rule engine framework that permits any action to be initiated by any trigger on any server or client in the Zone.
  • iRODS enables secure collaboration, so users only need to log in to their home Zone to access data hosted on a remote Zone.

History

iRODS has a 25+ year history of funded projects.

Funders have included DARPA, NSF, DOD, DOE, LC, NARA, NASA, NOAA, USPTO, and LLNL.

https://irods.org/history

License

iRODS is released under a 3-clause BSD License.

Reporting Security Vulnerabilities

See SECURITY.md for details.

Links to elsewhere...

libs3's People

Contributors

alexandersack, alexeip0, andreikop, benmcclelland, bester, bingmann, bji, chenji-kael, dalgaaf, earlephilhower, ellert, estadtherr, guillermomuntaner, jengelh, junjiexing, justinkylejames, konfan, ktdreyer, likema, martinprikryl, meinemitternacht, mutantkeyboard, sergeydobrodey, sivachandran, spielkind, twincitiesguy, vlibioulle, yehudasa


libs3's Issues

S3StatusErrorQuotaExceeded not listed as retryable.

S3_status_is_retryable() does not list S3StatusErrorQuotaExceeded as a retryable error.

With Oracle's S3-compatible storage, at least, this error occurs when there are too many simultaneous requests, a condition that is retryable; other providers may use the error differently.

Consider adding this to the list.
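A minimal sketch of the proposed change, expressed as a caller-side wrapper. The enum subset and both function names here are illustrative stand-ins, not libs3's actual definitions:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical subset of libs3's S3Status values, for illustration only. */
typedef enum {
    S3StatusOK = 0,
    S3StatusErrorSlowDown,
    S3StatusErrorQuotaExceeded
} S3Status;

/* Stand-in for the current S3_status_is_retryable(), which does not treat
 * S3StatusErrorQuotaExceeded as retryable. */
static bool stock_status_is_retryable(S3Status s)
{
    return s == S3StatusErrorSlowDown;
}

/* Proposed behavior: also retry on quota-exceeded, which Oracle's
 * S3-compatible endpoint returns when there are too many simultaneous
 * requests. */
bool status_is_retryable_with_quota(S3Status s)
{
    return stock_status_is_retryable(s) || s == S3StatusErrorQuotaExceeded;
}
```

In libs3 itself the equivalent change would be one more case in S3_status_is_retryable().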

Ceph S3 is complaining on range copy

When doing a range copy to Ceph S3, the server rejects the request because it lacks a Content-Length header. According to the AWS documentation, this header is not required.

The same request succeeds against AWS and MinIO.

Look at updating libs3 to (conditionally?) add it.

> PUT <snip>/f2?partNumber=2&uploadId=<snip>
Host: <snip>
User-Agent: Mozilla/4.0 (Compatible; s3; libs3 4.1; Linux x86_64)
Accept: */*
Range: bytes=67108864-134217727
Authorization: <snip>
x-amz-date: 20201207T185949Z
x-amz-copy-source: <snip>/f1
x-amz-copy-source-range: bytes=67108864-134217727
x-amz-content-sha256: <snip>
Expect: 100-continue
< HTTP/1.1 411 Length Required
< Content-Length: 278
< x-amz-request-id: <snip>
< Accept-Ranges: bytes
< Content-Type: application/xml
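One way to sketch the conditional fix: a copy-part request carries no body, but Ceph RGW (unlike AWS and MinIO) still demands Content-Length, so emit the header only for copy requests and leave every other request type untouched. The helper name below is hypothetical, not libs3's actual internals:

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Returns the extra header to append for a copy request, or NULL if none is
 * needed.  A copy request (x-amz-copy-source present) has no request body,
 * so an explicit zero-length Content-Length satisfies Ceph RGW's 411
 * response without affecting AWS-compatible servers. */
const char *content_length_header_for_copy(bool has_copy_source)
{
    return has_copy_source ? "Content-Length: 0" : NULL;
}
```

Whether to gate this further on a configuration flag (only for known-strict providers) is the "conditionally?" question above.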

Range is set improperly for S3_copy_object_range

Affected branches:

  • v4
  • master

The end of the range in the following should be params->startByte + params->byteCount - 1.

Because the end byte is off by one, bytes in the middle of the object are written more than once, and the copy then attempts to write one byte past the requested range.

libs3/src/request.c

Lines 399 to 403 in 59b6237

if (params->byteCount > 0) {
    headers_append(1, "x-amz-copy-source-range: bytes=%lld-%lld",
                   (unsigned long long) params->startByte,
                   (unsigned long long) (params->startByte + params->byteCount));
}
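A sketch of the corrected computation, using the header format from the snippet above (the standalone helper name is mine, not libs3's):

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Format the x-amz-copy-source-range header.  HTTP ranges are inclusive, so
 * a copy of byte_count bytes starting at start_byte must end at
 * start_byte + byte_count - 1, not start_byte + byte_count. */
int format_copy_source_range(char *buf, size_t cap,
                             unsigned long long start_byte,
                             unsigned long long byte_count)
{
    return snprintf(buf, cap, "x-amz-copy-source-range: bytes=%llu-%llu",
                    start_byte, start_byte + byte_count - 1);
}
```

For a 64 MiB part starting at byte 67108864, this yields bytes=67108864-134217727, matching the range in the Ceph trace above.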

Timeout when complete multipart upload takes a long time.

When I upload a very large file (>30GiB) to a MinIO server on my machine, I get timeouts on the complete-multipart-upload call while MinIO assembles the parts. In my case completing the multipart upload takes about 18 minutes, likely because of a non-production-grade MinIO setup.

Nevertheless, this indicates a need to allow the complete-multipart operation more time to finish.

All of the libs3 operations have a timeout parameter (ms) which is passed directly to the CURLOPT_TIMEOUT_MS curl option.

However, two other settings can cause a timeout in this scenario: CURLOPT_LOW_SPEED_LIMIT and CURLOPT_LOW_SPEED_TIME. libs3 sets the latter to 15 seconds. Because the server sends very little data in the response to the complete-multipart request (I believe it sends occasional whitespace as a keep-alive), the transfer rate falls below the limit and the 15-second timeout triggers.

To resolve this, I propose that the CURLOPT_LOW_SPEED_* settings only be applied when no timeout is given. (A timeout of zero means no timeout.) That is, if no timeout is set, use the low-speed options to detect a stalled transfer; if a timeout is set, rely on that timeout alone.
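The decision logic can be sketched as below. The struct and function are illustrative, not libs3's API, and the numeric fallback values (1 byte/s for 15 s) are assumptions mirroring the 15-second behavior described in this issue:

```c
#include <assert.h>

/* Which curl options to apply for a request. */
typedef struct {
    long timeout_ms;        /* CURLOPT_TIMEOUT_MS; 0 = unset */
    long low_speed_limit;   /* CURLOPT_LOW_SPEED_LIMIT, bytes/sec; 0 = unset */
    long low_speed_time;    /* CURLOPT_LOW_SPEED_TIME, seconds; 0 = unset */
} timeout_opts;

timeout_opts choose_timeout_opts(long requested_timeout_ms)
{
    timeout_opts o = {0, 0, 0};
    if (requested_timeout_ms > 0) {
        /* Caller gave a hard deadline: use it alone, so a slow-but-alive
         * response (like a long complete-multipart) is not killed early. */
        o.timeout_ms = requested_timeout_ms;
    } else {
        /* No deadline: fall back to stall detection. */
        o.low_speed_limit = 1;
        o.low_speed_time = 15;
    }
    return o;
}
```

The chosen options would then be passed to curl_easy_setopt() as today.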

Investigate increasing CURLOPT_LOW_SPEED_TIME for calls which do not send data

Some calls do not transfer data but may take time for the S3 provider to handle. A notable example is the S3_copy_object call.

We set this parameter to 15 seconds if the main timeout is set to 0 (no timeout). That may not be long enough for some providers.

There are two options:

  1. Simply increase the time for CURLOPT_LOW_SPEED_TIME.
  2. Use the default of 0 for CURLOPT_LOW_SPEED_TIME in this case.

The second option is consistent with the desire to avoid timeouts. However, it could allow a call to hang indefinitely, or at least until some transport-level timeout fires.

Add glacier support

Update the HEAD (head object) call to read and store the Glacier-related parameters.

  • x-amz-storage-class or x-goog-storage-class
  • x-amz-restore or x-goog-restore

Add RestoreObject support.
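For the x-amz-restore header, a value looks like `ongoing-request="false", expiry-date="Fri, 21 Dec 2012 00:00:00 GMT"`. A minimal parsing sketch (the function name is hypothetical; a full implementation would also capture expiry-date):

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Extract the ongoing-request flag from an x-amz-restore header value.
 * Returns true if the flag was found and parsed; *ongoing is set to the
 * flag's value.  Returns false for malformed or missing input. */
bool parse_restore_ongoing(const char *value, bool *ongoing)
{
    const char *p = strstr(value, "ongoing-request=\"");
    if (!p)
        return false;
    p += strlen("ongoing-request=\"");
    if (strncmp(p, "true\"", 5) == 0)  { *ongoing = true;  return true; }
    if (strncmp(p, "false\"", 6) == 0) { *ongoing = false; return true; }
    return false;
}
```

An ongoing-request of "false" together with an expiry-date means the object has been restored and is readable until that date.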
