Comments (10)
I'll put this in the XMLDocumentWriter.end method to mitigate the error. It's odd that I never encountered it during the three years before psims was formally published, a period in which I wrote some very big files.
I'll likely make another release within the next day, but the change is in 8e7ad24 if you can work with the development version.
Hmm. This never seems to happen on the CI server, but I have seen it on one of the Windows machines I use, with a different long integer as the error code. Unfortunately, I've only seen it intermittently there too. Does it cause the XML file to be incomplete or malformed when it does happen?
The error code changes each time, which makes me think it might be dereferencing a null pointer somewhere? I will check the mzML when I'm at my work machine.
Yes, the mzML file is complete and readable.
Looks like others have seen this too: https://pastebin.com/xJcEME60 and https://bitbucket.org/openpyxl/openpyxl/issues/1046/serialisationerror-unknown-error
To clarify on the above, the error code is constant for a given file, but it changes for different files. I will keep investigating.
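For comparing codes across files and runs, a small helper along these lines could pull the integer out of the message. This is only a sketch: the name `unknown_error_code` is hypothetical, and it assumes the message has the form `unknown error <integer>`, as in the warning text quoted later in this thread.

```python
import re
from typing import Optional

# Assumed message shape (from the warning quoted in this thread):
#   "unknown error -1476144936"
_UNKNOWN_ERROR = re.compile(r"unknown error (-?\d+)")


def unknown_error_code(message: str) -> Optional[int]:
    """Return the integer following 'unknown error', or None if it is absent."""
    match = _UNKNOWN_ERROR.search(message)
    return int(match.group(1)) if match else None
```

Logging the returned value next to the file name would make it easy to check whether the code really is stable per file.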
This problem bubbles up out of libxml2, and it is hidden at the Python level because the attribute where the error is stored is only visible from C, and the C-extension class is declared with @cython.internal, so it can't be accessed without rebuilding lxml from scratch.
It seems to happen pretty frequently once the mzML file reaches about 2 or 3 GB when running in our Docker container. I'm thinking the easiest way forward might be this:
try:
    with MzMLWriter(open(self._mzml_fname, 'wb')) as writer:
        # do work...
        pass
except etree.SerialisationError as e:
    self.logger.warning(f'Error closing file\n\n{e}\n')
Looks like it might be a known problem for lxml for large files: https://bugs.launchpad.net/lxml/+bug/1570388
Slightly safer try/except:
try:
    with IndexedMzMLWriter(open(self.mzml_fname, 'wb'), close=True) as writer:
        pass
except etree.SerialisationError as e:
    if str(e).startswith('unknown error'):
        # There seems to be a bug when closing the mzML file, usually when it gets large
        # see here: https://bugs.launchpad.net/lxml/+bug/1570388
        self.logger.debug(f'Error closing file: {e}')
    else:
        raise
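If the same guard is needed in several places, it could be wrapped in a context manager. The following is a minimal sketch, not part of psims or lxml: the name `tolerate_unknown_close_error` is hypothetical, and the exception type is passed in as a parameter so the snippet runs with the standard library alone.

```python
import logging
from contextlib import contextmanager


@contextmanager
def tolerate_unknown_close_error(exc_type, logger=None):
    """Swallow exc_type only when its message starts with 'unknown error'.

    Hypothetical helper: any other message is re-raised unchanged.
    """
    log = logger or logging.getLogger(__name__)
    try:
        yield
    except exc_type as err:
        if str(err).startswith("unknown error"):
            # Same handling as the try/except above: log it and carry on.
            log.debug("Error closing file: %s", err)
        else:
            raise
```

In the case above it would be entered as `with tolerate_unknown_close_error(etree.SerialisationError):` around the writer's own `with` block. Like the plain try/except, it also swallows a matching error raised anywhere inside the body, not only on close.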
Fix is live in https://pypi.org/project/psims/0.1.27/
I have tested it, and it warns as expected on a file that used to crash:
/Users/dev/anaconda3/envs/dev/lib/python3.7/site-packages/psims/xml.py:882: UserWarning: Error closing file: unknown error -1476144936
  warnings.warn("Error closing file: {}".format(err))
Thanks for the fix
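Since the failure now surfaces as a UserWarning rather than an exception, a caller who wants to notice it programmatically could record warnings around the close. A rough sketch using only the standard library follows; `close_with_warning` is a hypothetical stand-in for the real writer close, and `close_warnings` is likewise an illustrative name.

```python
import warnings


def close_with_warning():
    """Hypothetical stand-in for a writer close that warns instead of raising."""
    warnings.warn("Error closing file: unknown error -1476144936")


def close_warnings(func):
    """Run func and return the 'Error closing file' warning messages it emitted."""
    with warnings.catch_warnings(record=True) as caught:
        # Record every warning, even ones already shown once in this process.
        warnings.simplefilter("always")
        func()
    return [
        str(w.message)
        for w in caught
        if str(w.message).startswith("Error closing file")
    ]
```

An empty list means the close finished without the warning; a non-empty one can be logged or used to trigger a re-read of the file as a sanity check.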