Comments (7)
Excellent - it works for my test files!
Thanks a lot!
from readstat.
Notes for myself:
It appears that a subheader pointer in bar.sas7bdat is pointing to compressed data, but the pointer itself indicates that the data is not compressed. (The byte at offset 5809096 is 0x00, whereas all of the other pointers in the block have the value 0x04.) In a way the issue is the converse of #37. I'm not sure if the compression bit is stored elsewhere or if a workaround is required here.
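To make the observation above concrete, here is a minimal Python sketch of reading the compression byte out of one subheader pointer. The field layout (offset, length, one compression byte, one type byte, then padding) is an assumption taken from public reverse-engineering notes on the sas7bdat format, not something confirmed in this thread; `parse_subheader_pointer` and `SubheaderPointer` are hypothetical names used only for illustration.

```python
import struct
from typing import NamedTuple

# Values reported in the comment above: 0x00 on the odd pointer,
# 0x04 on every other pointer in the same block.
COMPRESSION_NONE = 0x00
COMPRESSION_COMPRESSED = 0x04


class SubheaderPointer(NamedTuple):
    offset: int       # where the subheader starts within the page
    length: int       # length of the subheader in bytes
    compression: int  # the byte discussed in this issue
    kind: int         # subheader type byte


def parse_subheader_pointer(buf: bytes, pos: int, u64: bool = True,
                            little_endian: bool = True) -> SubheaderPointer:
    """Decode one pointer at `pos`.

    ASSUMED layout: two 8-byte integers (64-bit files) or two 4-byte
    integers (32-bit files), followed by a compression byte and a type byte.
    Any trailing padding is ignored.
    """
    order = "<" if little_endian else ">"
    fmt = order + ("QQBB" if u64 else "IIBB")
    offset, length, compression, kind = struct.unpack_from(fmt, buf, pos)
    return SubheaderPointer(offset, length, compression, kind)
```

With a parser like this, dumping the `compression` field of every pointer on a page would be one way to spot a pointer whose flag disagrees with its neighbours.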
If it helps, I can produce a genuinely char-compressed dataset, not just one labelled as such. I see problems with char-compressed datasets so often that I do not believe they are restricted to "wrong" labelling.
The data you provided is indeed compressed. The "mislabeling" occurs for a short segment of data deep in the file. The trouble occurs because a compressed file can contain both compressed and uncompressed chunks, and these individual chunks are sometimes mislabeled (apparently).
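One possible workaround for mislabeled chunks can be sketched in Python as below. This is a guess at a heuristic, not the fix that was actually committed: it assumes that an uncompressed row always occupies exactly `row_length` bytes, so a chunk that is shorter than a row inside a file declared as compressed is treated as compressed even when its flag says otherwise. The function name and parameters are hypothetical.

```python
COMPRESSION_COMPRESSED = 0x04  # flag value seen on correctly labelled chunks


def should_decompress(file_is_compressed: bool, compression_flag: int,
                      chunk_length: int, row_length: int) -> bool:
    """Decide whether a data chunk should go through the decompressor.

    ASSUMPTION: a raw (uncompressed) row is exactly `row_length` bytes,
    so a shorter chunk in a compressed file cannot be a raw row.
    """
    if compression_flag == COMPRESSION_COMPRESSED:
        # The pointer is labelled correctly; trust it.
        return True
    # Flag says "not compressed": fall back to the length check.
    return file_is_compressed and chunk_length < row_length
```

A mislabeled chunk whose compressed size happens to equal the row length would still slip through this check, so it is only a partial safeguard.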
Most char-compressed SAS datasets can now be read by development version 0.2.0.9000 of haven. But for the simple example ietest2.sas7bdat in [https://github.com/reikoch/testfiles/blob/master/ietest2.sas7bdat], read_sas("ietest2.sas7bdat") returns:
ReadStat: Error parsing page 0, bytes 8192-16383
Error: Failed to parse /opt/BIOSTAT/home/kochr4/sas7bdat/ietest2.sas7bdat: Invalid file, or file has unsupported features.
It looks like the Python module sas7bdat from [https://bitbucket.org/jaredhobbs/sas7bdat] has solved the decoding of SAS compressed datasets...
Fixed in c92f697
Please re-open the issue if you are still experiencing problems.