Comments (14)
I just created my own blob via the Azure free account and it worked fine:
>>> dd.read_parquet('abfs://tmp/tmp.parquet', storage_options=storage_options)
Dask DataFrame Structure:
                col1   col2
npartitions=2
0              int64  int64
2                ...    ...
3                ...    ...
Dask Name: read-parquet, 2 tasks
I may have to look at how the other blob is set up.
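For anyone trying to reproduce the call above, here is a minimal sketch of how the arguments might be put together. The contents of `storage_options` were not shown in this thread, so the `account_name` / `account_key` pair below is an assumption (adlfs accepts these keys, but other credential types are possible), and the helper names `build_read_args` and `read_parquet_from_blob` are made up for illustration.

```python
def build_read_args(container, blob_path, account_name, account_key):
    """Build the URL and storage_options for reading from Azure Blob via adlfs.

    Hypothetical helper: the thread only shows the final read_parquet call,
    not how storage_options was constructed.
    """
    url = f"abfs://{container}/{blob_path}"
    # Assumption: authentication with an account key. Substitute whatever
    # credentials your storage account actually uses.
    storage_options = {"account_name": account_name, "account_key": account_key}
    return url, storage_options


def read_parquet_from_blob(container, blob_path, account_name, account_key):
    """Lazily open the parquet file as a Dask DataFrame (needs dask and adlfs)."""
    import dask.dataframe as dd  # imported here so the helper above has no dependencies

    url, storage_options = build_read_args(
        container, blob_path, account_name, account_key
    )
    return dd.read_parquet(url, storage_options=storage_options)


# Placeholder credentials; nothing is sent over the network here.
url, storage_options = build_read_args(
    "tmp", "tmp.parquet", "<account-name>", "<account-key>"
)
print(url)
```

Calling `read_parquet_from_blob` with real credentials should give the same kind of lazy DataFrame summary as in the output above, provided the storage account is reachable.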
from adlfs.
Blob that worked:
Location: East US, West US (Primary: Available, Secondary: Available)
Replication: Read-access geo-redundant storage (RA-GRS)
Blob that does not work:
Location: East US 2
Replication: Locally-redundant storage (LRS)
Closing as not related to adlfs. If I am able to 'fix' the storage account, I'll post it at https://stackoverflow.com/questions/61220615/dask-read-parquet-from-azure-blob-azurehttperror
I believe I got to the bottom of it.
When creating a blob to match the one that wasn't working, I switched on 'Hierarchical namespace' (Data Lake Storage Gen2) and reproduced the error.
Not sure if this is related to azure-datalake-store: https://github.com/dask/adlfs/blob/master/requirements.txt#L1
Thanks for digging into this @raybellwaves. I just looked at the configuration on one of the accounts I'm using with adlfs daily. There, Hierarchical namespace is turned on, but Replication is set to Geo-redundant storage. Can you confirm whether that combination works in your test space?
I believe the below matches your configuration and I was not able to read the parquet file.
I can kind of see it in the blob logs, but not in much detail.
Wanted to be a bit more thorough on the blob configuration. I am unable to read when I turn on the hierarchical namespace.
Thanks for sharing this @raybellwaves. Just to add to it, my current instance is:
- Region: East US 2
- Replication: GRS
- Performance: Standard
- Secure Transfer: Enabled
- Access Tier: Hot
- Hierarchical Namespace: Enabled
- NFSv3: Disabled
Any chance you can check the other parameters? Unfortunately, it will be at least another week before I can circle back to work on this.
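In case it helps with checking the other parameters, here is a rough sketch of diffing two sets of account settings. The first dict is the configuration listed above. The second contains only what was reported earlier in this thread for the account that fails (with Hierarchical Namespace assumed enabled, as implied by the reproduction), so the missing fields show up as `None`. The function name `diff_settings` is made up for illustration.

```python
def diff_settings(a, b):
    """Return {setting: (a_value, b_value)} for every setting whose values differ.

    Settings missing on one side are reported with None for that side.
    """
    keys = sorted(set(a) | set(b))
    return {k: (a.get(k), b.get(k)) for k in keys if a.get(k) != b.get(k)}


# Configuration listed above (reads work on this account).
reported_working = {
    "Region": "East US 2",
    "Replication": "GRS",
    "Performance": "Standard",
    "Secure Transfer": "Enabled",
    "Access Tier": "Hot",
    "Hierarchical Namespace": "Enabled",
    "NFSv3": "Disabled",
}

# Only the settings mentioned in this thread for the failing account;
# everything else is unknown and deliberately left out.
reported_failing = {
    "Region": "East US 2",
    "Replication": "LRS",
    "Hierarchical Namespace": "Enabled",
}

differences = diff_settings(reported_working, reported_failing)
for name, (left, right) in differences.items():
    print(f"{name}: {left!r} vs {right!r}")
```

Of the settings known on both sides, only Replication differs. The rest of the reported differences are gaps that still need to be filled in from the portal.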
No problem. Thanks for sharing your config. Will do.
This has been working fine during the last couple of days. Closing.