Comments (11)
Please explain what you mean by a SAX push parser. I believe the current Ox SAX parser is what is sometimes referred to as a push parser.
from ox.
I mean that Ox currently has no analog of Nokogiri's push parser: http://nokogiri.org/Nokogiri/XML/SAX/PushParser.html
from ox.
It looks like that is just a standard SAX parser. I don't see the difference between that and the Ox::Sax parser, other than that the handler is created inline. Am I missing something?
from ox.
The main feature of a push parser is that we push incoming data to the parser ourselves, instead of letting the parser read the data from an IO stream directly. This is useful in event-driven code (the EventMachine library, for example).
Code example:
p = PushParser.new(sax_handler)
p.push '<chi'
p.push 'ld>some con'
p.push 'tent'
where sax_handler is an object with methods such as 'start_element', 'end_element', 'text', and so on.
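To make the push model concrete, here is a toy sketch in plain Ruby. It is not Ox or Nokogiri code: `ToyPushParser` and `RecordingHandler` are hypothetical names, and the tokenizer only understands plain start tags, end tags, and text (no attributes, comments, CDATA, entities, or self-closing tags). The point it illustrates is that unconsumed input is kept in the parser object between `push` calls, so a tag or text run split across chunks is only reported once it is complete.

```ruby
# Toy push parser: illustrates the push model only, not a real XML parser.
class ToyPushParser
  def initialize(handler)
    @handler = handler
    @buffer = String.new # unconsumed input carried over between push calls
  end

  # Feed the next chunk and fire callbacks for every token that is now complete.
  def push(chunk)
    @buffer << chunk
    loop do
      if @buffer.start_with?('<')
        close = @buffer.index('>')
        break unless close # tag is still incomplete, wait for more data

        tag = @buffer.slice!(0..close)[1..-2] # strip the angle brackets
        if tag.start_with?('/')
          @handler.end_element(tag[1..-1])
        else
          @handler.start_element(tag)
        end
      else
        open = @buffer.index('<')
        break unless open # text may continue in the next chunk

        @handler.text(@buffer.slice!(0...open))
      end
    end
    self
  end
end

# Hypothetical handler that just records the callbacks it receives.
class RecordingHandler
  attr_reader :events

  def initialize
    @events = []
  end

  def start_element(name)
    @events << [:start_element, name]
  end

  def end_element(name)
    @events << [:end_element, name]
  end

  def text(value)
    @events << [:text, value]
  end
end

handler = RecordingHandler.new
parser = ToyPushParser.new(handler)
parser.push('<chi')         # nothing fires yet: the tag is incomplete
parser.push('ld>some con')  # start_element fires; the text is held back
parser.push('tent</child>') # text and end_element fire
p handler.events
```

A real implementation would also need a way to signal end of input so that trailing text can be flushed; that is left out here to keep the sketch short.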
from ox.
I understand. Basically it parses a stream until the stream closes.
from ox.
Look for a continuous SAX parser in some future release. Now that I understand what you were asking for, it makes a lot of sense and would be very useful.
from ox.
I just found myself in need of this feature as well.
from ox.
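Not very efficient, but it does the job:
require 'ox'

buffer, buffer_in = IO.pipe
handler = MySAXHandler.new
# Parse in a background thread; the parser reads from the pipe until the
# write end is closed.
parser = Thread.new do
  Ox.sax_parse(handler, buffer, sax_options)
end
buffer_in << '<parent><chi'
buffer_in << 'ld>some con'
buffer_in << 'tent</child></parent>'
buffer_in.close
parser.join # wait for the parser to reach EOF before using the handler
# Do stuff with handler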
from ox.
I prefer that over cooking something into Ox that does the same thing as a stream or pipe. You might be surprised how efficient it is.
from ox.
Creating threads isn't efficient for this. In my case, for example, this can be called hundreds of times in series, leaving me with that many short-lived threads, which has a substantial impact on resource utilization and performance.
Pushing directly to Ox seems like the cleaner way, and push parsers are an established paradigm.
from ox.
Another approach would be to keep a longer-running thread that services parse requests. This feature will take some time to get done. Right now the state of the parser is on the stack and not carried around in a Ruby object. That would have to change for the parser to be reentrant. The solution may be to keep a thread in C to maintain the parser state, but I have no idea what problems that would cause for Ruby.
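A rough sketch of the longer-running thread idea, under some assumptions: `ParseWorker`, `open_stream`, and `shutdown` are made-up names, and the parser is passed in as a callable so the sketch runs without Ox. With Ox, that callable would presumably be something like `->(handler, io) { Ox.sax_parse(handler, io) }`, which has not been tried or benchmarked here. One thread services every parse; each parse still gets its own pipe.

```ruby
# Hypothetical worker: a single long-lived thread services parse requests,
# so callers do not spawn one thread per document.
class ParseWorker
  # parse_fn is any callable taking (handler, io) that reads io until EOF.
  def initialize(parse_fn)
    @parse_fn = parse_fn
    @requests = Queue.new
    @thread = Thread.new { run }
  end

  # Start a new parse. Returns [writer, done]: push chunks into writer,
  # close it, then call done.pop to wait for the handler (or an exception).
  def open_stream(handler)
    reader, writer = IO.pipe
    done = Queue.new
    @requests << [handler, reader, done]
    [writer, done]
  end

  # Stop accepting requests and wait for the worker thread to exit.
  def shutdown
    @requests.close
    @thread.join
  end

  private

  def run
    # Queue#pop returns nil once the queue is closed and empty.
    while (job = @requests.pop)
      handler, reader, done = job
      begin
        @parse_fn.call(handler, reader)
        done << handler
      rescue StandardError => e
        done << e
      ensure
        reader.close
      end
    end
  end
end

# Stand-in for a real parser (assumption): it only collects the raw input.
stub_parse = ->(handler, io) { handler << io.read }

worker = ParseWorker.new(stub_parse)
collected = []
writer, done = worker.open_stream(collected)
writer << '<parent><chi'
writer << 'ld>some content</child></parent>'
writer.close
done.pop # blocks until the worker has finished this stream
worker.shutdown
p collected
```

Streams are served one at a time, in submission order, so this suits the "hundreds of times in series" case. Interleaving writes to several open streams from one thread could stall once a pipe buffer fills up, because the worker only reads from the stream it is currently parsing.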
from ox.