Comments (14)

raybellwaves avatar raybellwaves commented on July 24, 2024

I just created my own blob via the Azure free account and it worked fine:

>>> dd.read_parquet('abfs://tmp/tmp.parquet', storage_options=storage_options)
Dask DataFrame Structure:
                col1   col2
npartitions=2              
0              int64  int64
2                ...    ...
3                ...    ...
Dask Name: read-parquet, 2 tasks

I may have to look at how the other blob is set up.
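
For anyone trying to reproduce the read above, here is a minimal sketch of how the call might be assembled. It assumes key-based authentication with `account_name` and `account_key` in `storage_options`; the helper names (`make_storage_options`, `abfs_url`, `read_parquet_from_blob`) are hypothetical and are not part of adlfs or dask.

```python
def make_storage_options(account_name: str, account_key: str) -> dict:
    """Build a credentials dict for key-based auth (assumed auth method)."""
    return {"account_name": account_name, "account_key": account_key}


def abfs_url(container: str, blob_path: str) -> str:
    """Join a container name and blob path into an abfs:// URL."""
    return f"abfs://{container}/{blob_path.lstrip('/')}"


def read_parquet_from_blob(container: str, blob_path: str, storage_options: dict):
    """Lazily read a parquet blob into a Dask DataFrame."""
    # Imported here so the helpers above work without dask/adlfs installed.
    import dask.dataframe as dd

    return dd.read_parquet(
        abfs_url(container, blob_path), storage_options=storage_options
    )


# Placeholder credentials; substitute real values before reading.
storage_options = make_storage_options("<account-name>", "<account-key>")
url = abfs_url("tmp", "tmp.parquet")
```

With real credentials, `read_parquet_from_blob("tmp", "tmp.parquet", storage_options)` would issue the same call as the one shown above.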

from adlfs.

raybellwaves avatar raybellwaves commented on July 24, 2024

Blob that worked:

Location: East US, West US (Primary: Available, Secondary: Available)
Replication: Read-access geo-redundant storage (RA-GRS)

Blob that does not work:

Location: East US 2
Replication: Locally-redundant storage (LRS)

raybellwaves avatar raybellwaves commented on July 24, 2024

Closing as not related to adlfs. If I am able to 'fix' the storage account, I'll post it at https://stackoverflow.com/questions/61220615/dask-read-parquet-from-azure-blob-azurehttperror

raybellwaves avatar raybellwaves commented on July 24, 2024

I believe I got to the bottom of it.

When creating a blob to match the one that wasn't working, I switched on 'Hierarchical namespace' (Data Lake Storage Gen2) and reproduced the error.

raybellwaves avatar raybellwaves commented on July 24, 2024

image

raybellwaves avatar raybellwaves commented on July 24, 2024

Not sure if this is related to the azure-datalake-store requirement: https://github.com/dask/adlfs/blob/master/requirements.txt#L1

hayesgb avatar hayesgb commented on July 24, 2024

Thanks for digging into this @raybellwaves. I just looked at the configuration on one of the accounts I'm using with adlfs daily. There, Hierarchical namespace is turned on, but Replication is set to Geo-redundant Storage. Can you confirm whether that combination works in your test space?

raybellwaves avatar raybellwaves commented on July 24, 2024

I believe the configuration below matches yours, and I was not able to read the parquet file.

Screenshot from 2020-04-24 07-56-01
Screenshot from 2020-04-24 07-59-13

raybellwaves avatar raybellwaves commented on July 24, 2024

I can sort of see it in the blob logs, but not in much detail:

Screenshot from 2020-04-24 08-21-10

raybellwaves avatar raybellwaves commented on July 24, 2024

Wanted to be a bit more thorough on the blob configuration. I am unable to read when I turn on the hierarchical namespace.

Screenshot from 2020-05-04 22-11-54

hayesgb avatar hayesgb commented on July 24, 2024

Thanks for sharing this @raybellwaves. Just to add to it, my current instance is:

  • Region: East US 2
  • Replication: GRS
  • Performance: Standard
  • Secure Transfer: Enabled
  • Access Tier: Hot
  • Hierarchical Namespace: Enabled
  • NFSv3: Disabled

Any chance you can check the other parameters? Unfortunately, it will be at least another week before I can circle back to work on this.
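
When comparing accounts setting by setting, a small diff helper can make the differing parameters easier to spot. The sketch below is only an illustration: the dictionary keys and the `diff_settings` helper are hypothetical, and the values are limited to the region, replication, and hierarchical namespace settings mentioned in this thread.

```python
def diff_settings(a: dict, b: dict) -> dict:
    """Return {setting: (value_in_a, value_in_b)} for settings that differ."""
    keys = sorted(set(a) | set(b))
    return {k: (a.get(k), b.get(k)) for k in keys if a.get(k) != b.get(k)}


# Settings of the account listed above (the one used with adlfs daily).
working = {
    "region": "East US 2",
    "replication": "GRS",
    "hierarchical_namespace": True,
}

# Settings reported earlier in the thread for the account that failed.
failing = {
    "region": "East US 2",
    "replication": "LRS",
    "hierarchical_namespace": True,
}

differences = diff_settings(working, failing)
for name, (ok_value, bad_value) in differences.items():
    print(f"{name}: working={ok_value!r} failing={bad_value!r}")
```

A difference surfaced this way is a candidate to test, not a confirmed cause; the remaining parameters (performance, secure transfer, access tier, NFSv3) would need to be filled in for both accounts before drawing conclusions.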

raybellwaves avatar raybellwaves commented on July 24, 2024

No problem. Thanks for sharing your config. Will do.

raybellwaves avatar raybellwaves commented on July 24, 2024

Screenshot from 2020-05-05 07-29-52

raybellwaves avatar raybellwaves commented on July 24, 2024

This has been working fine during the last couple of days. Closing.
