Comments (6)
cc @martindurant, @mrocklin. Also cc @danielfrg based on conversation from earlier today.
from s3fs.
As one data point, there's another Python S3 filesystem here (with a bit of a different interface/design) that sets the bucket as a configuration parameter.
from s3fs.
Some thoughts on this.
- having bucket config as an optional mode seems like it would be more confusing than not, since the code would have two modes of operation
- at some point, listing the buckets that a user owns seems like a thing that is desired, as does getting information at the bucket level in general
- agree that almost all operations are intra-bucket, and maybe you are right that this matches most people's expectation (don't know)
- There does seem to be a tendency for people to want to supply URLs as 's3://bucket/path' even when dealing with the explicitly s3 interface here; this probably comes from the various CLI tools and Spark/Dask usage. Is it not reasonable to assume that the path that works in Dask works identically here?
- I could imagine this working by having a concept of `cwd`, i.e., we have a current location, and so paths that look relative (no leading `'/'`) are within it. Does this mean a `cwd` that can descend down the directory tree, and how do you form URLs with "s3:"? I don't know.
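To make the `cwd` idea concrete, here is a minimal sketch of how relative paths might be resolved. The function names `resolve` and `to_url`, and the rule that a leading `/` or an `s3://` prefix means "absolute", are assumptions for illustration only; nothing here is existing s3fs behaviour.

```python
import posixpath


def resolve(path, cwd=""):
    """Return 'bucket/key' for a path, using cwd for relative paths.

    Hypothetical rules (assumptions, not s3fs behaviour):
    - 's3://bucket/key' and '/bucket/key' are absolute
    - anything else is joined onto cwd, itself a 'bucket/prefix' string
    """
    if path.startswith("s3://"):
        return path[len("s3://"):]
    if path.startswith("/"):
        return path.lstrip("/")
    return posixpath.join(cwd, path)


def to_url(path, cwd=""):
    """Form the single shareable address for a resolved path."""
    return "s3://" + resolve(path, cwd)


# A relative path lands inside the current location.
print(to_url("2016/part-0.csv", cwd="mybucket/data"))
```

One open question this sketch does not answer is what `..` should mean, since S3 keys are flat and have no real directory tree.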
from s3fs.
Just my opinion here: most of the time I expect to supply a bucket when I try to access data on S3. It's also kind of similar to what boto does: you create a bucket object and then pass a key to get/set the file contents.
There are indeed some tools, like hadoop distcp, that allow URIs like s3://bucket/path/, but I would bet that they just parse the URI into bucket + path for the Java lib they use.
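The "parse the URI into bucket + path" step is small enough to sketch. This is only an illustration of the idea; `split_s3_url` is a made-up name, not a function from s3fs, boto, or Hadoop.

```python
def split_s3_url(url):
    """Split 's3://bucket/some/key' into ('bucket', 'some/key').

    Hypothetical helper. Plain string handling is used so that keys
    containing '?' or '#' are kept intact.
    """
    prefix = "s3://"
    if not url.startswith(prefix):
        raise ValueError(f"not an s3:// URL: {url!r}")
    bucket, _, key = url[len(prefix):].partition("/")
    if not bucket:
        raise ValueError(f"no bucket in URL: {url!r}")
    return bucket, key


bucket, key = split_s3_url("s3://bucket/path/")
print(bucket, key)
```

With something like this at the edge, a library can accept a single address from the user and still work with a bucket plus a key internally.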
from s3fs.
FWIW as a user I find the `s3://bucket/path` approach intuitive. Also when I share data I often want to share a single address, not a bucket and an address.
from s3fs.
> FWIW as a user I find the s3://bucket/path approach intuitive. Also when I share data I often want to share a single address, not a bucket and an address.
This would be doable by making the bucket part of the `S3FileSystem` config: tools like `dask` or `pandas` don't access the `S3FileSystem` object directly. Currently things like `host` go in as part of the configuration when parsing a URI, and the bucket could then be forwarded the same way using dask's normal filesystem init, removing our need for an s3fs/gcsfs special case.
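As a rough sketch of what "bucket as part of the config" could look like, the wrapper below binds one bucket to a filesystem-like object. `BucketScopedFS` and `FakeFS` are hypothetical names; the only assumption about the wrapped object is that it has `ls(path)` and `open(path, mode)` methods that take `bucket/key` paths. A stand-in is used here so the example runs without credentials.

```python
class BucketScopedFS:
    """Hypothetical wrapper: every key is interpreted inside one bucket."""

    def __init__(self, fs, bucket):
        self.fs = fs          # any object with ls(path) and open(path, mode)
        self.bucket = bucket  # taken from config, e.g. parsed from a URL

    def _full(self, key):
        key = key.lstrip("/")
        return f"{self.bucket}/{key}" if key else self.bucket

    def ls(self, key=""):
        return self.fs.ls(self._full(key))

    def open(self, key, mode="rb"):
        return self.fs.open(self._full(key), mode)


class FakeFS:
    """Stand-in that only records calls; not the real S3FileSystem."""

    def __init__(self):
        self.calls = []

    def ls(self, path):
        self.calls.append(("ls", path))
        return []

    def open(self, path, mode="rb"):
        self.calls.append(("open", path, mode))
        return None


fake = FakeFS()
scoped = BucketScopedFS(fake, "mybucket")
scoped.ls("data")
scoped.open("data/part-0.csv")
print(fake.calls)
```

The trade-off raised earlier in the thread still applies: a bucket-bound object cannot list the buckets a user owns, so that would need a separate entry point.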
from s3fs.