Comments (6)

jcrist avatar jcrist commented on August 11, 2024

cc @martindurant, @mrocklin. Also cc @danielfrg based on conversation from earlier today.

from s3fs.

jcrist avatar jcrist commented on August 11, 2024

As one data point, there's another Python S3 filesystem here (with a somewhat different interface/design) that sets the bucket as a configuration parameter.

martindurant avatar martindurant commented on August 11, 2024

Some thoughts on this.

  • having bucket config as an optional mode seems like it would be more confusing than not, since the code would have two modes of operation
  • at some point, listing the buckets that a user owns seems like a thing that is desired, and getting information at the bucket level in general
  • agree that almost all operations are intra-bucket, and maybe you are right that this matches most people's expectation (don't know)
  • There does seem to be a tendency for people to want to supply URLs as 's3://bucket/path' even when dealing with the explicitly s3 interface here; this probably comes from the various CLI tools and Spark/Dask usage. Is it not reasonable to assume that the path that works in Dask works identically here?
  • I could imagine this working by having a concept of cwd, i.e., we have a current location, and so paths that look relative (no leading '/') are within. Does this mean a cwd that can descend down the directory tree, and how do you form URLs with "s3:"? I don't know.
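To make the cwd idea in the last bullet concrete, here is a minimal sketch of how path resolution might look. `PathResolver` and its method are hypothetical names made up for illustration; nothing like this is confirmed to exist in s3fs, and it leaves the open question of descending the tree unanswered.

```python
import posixpath


class PathResolver:
    """Hypothetical helper, not part of s3fs: resolves paths against a cwd."""

    def __init__(self, cwd=""):
        # cwd is assumed to be "bucket" or "bucket/some/prefix"
        self.cwd = cwd.strip("/")

    def resolve(self, path):
        # Full URLs always carry their own bucket.
        if path.startswith("s3://"):
            return path[len("s3://"):].strip("/")
        # A leading '/' is treated as absolute: "/bucket/key".
        if path.startswith("/"):
            return path.strip("/")
        # Anything else looks relative and is placed under the cwd.
        return posixpath.join(self.cwd, path).strip("/")


resolver = PathResolver(cwd="mybucket")
print(resolver.resolve("data/file.csv"))
print(resolver.resolve("/otherbucket/file.csv"))
print(resolver.resolve("s3://otherbucket/file.csv"))
```

With this rule, the 's3://bucket/path' form that works in Dask keeps working unchanged, and only paths without a leading '/' depend on the current location.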

danielfrg avatar danielfrg commented on August 11, 2024

Just my opinion here, but most of the time I expect to supply a bucket when I try to access data on S3. It's also kinda similar to what boto does: you create a bucket object and then pass a key to get/set the file contents.

There are indeed some things like hadoop distcp that allow URIs like s3://bucket/path/, but I would bet that they just parse the URI into bucket + path for the Java lib they use.

mrocklin avatar mrocklin commented on August 11, 2024

FWIW as a user I find the s3://bucket/path approach intuitive. Also when I share data I often want to share a single address, not a bucket and an address.

jcrist avatar jcrist commented on August 11, 2024

> FWIW as a user I find the s3://bucket/path approach intuitive. Also when I share data I often want to share a single address, not a bucket and an address.

This would be doable by making the bucket part of the S3FileSystem config - tools like dask or pandas don't access the S3FileSystem object directly. Currently things like host go as part of the configuration when parsing a URI, which would then be forwarded using dask's normal filesystem init, removing our need for an s3fs/gcsfs special case.
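A rough sketch of what I mean, assuming the host part of the URI is what gets forwarded as the bucket. `BucketScopedFS` and `open_from_uri` are hypothetical names for illustration only; they are not the s3fs or dask API.

```python
from urllib.parse import urlsplit


class BucketScopedFS:
    """Hypothetical filesystem object whose bucket is fixed at construction."""

    def __init__(self, bucket):
        self.bucket = bucket

    def full_path(self, key):
        # Every key is interpreted relative to the configured bucket.
        return f"{self.bucket}/{key.lstrip('/')}"


def open_from_uri(uri):
    """Split a URI the way a caller such as dask might: host becomes config."""
    parts = urlsplit(uri)
    if parts.scheme != "s3":
        raise ValueError(f"expected an s3:// URI, got {uri!r}")
    fs = BucketScopedFS(bucket=parts.netloc)  # host forwarded as 'bucket'
    return fs, parts.path.lstrip("/")


fs, key = open_from_uri("s3://mybucket/data/part-0.csv")
print(fs.bucket, key, fs.full_path(key))
```

Users would still share and pass a single s3://bucket/path address; only the filesystem object itself is scoped to one bucket.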
