Comments (10)

mischov avatar mischov commented on July 21, 2024

I would appreciate that. I will let you know when I get stuff pushed.

from html5ever_elixir.

hansihe avatar hansihe commented on July 21, 2024

I just tested it on my machine, I'm getting these results:

$ env MIX_ENV=prod mix run -e MeeseeksHtml5everParse.run
Running tests...
Parsed with Html5ever async in 20553.7 us
Parsed with Html5ever sync in 21025.3 us

Created Meeseeks Document from tuples in 9082.3 us

Parsed with Meeseeks async in 49555.2 us
Parsed with Meeseeks sync in 48633.9 us

Do you have any idea what might cause this difference between our machines?

mischov avatar mischov commented on July 21, 2024

I have no idea why that behavior would be different on our two machines, but version differences maybe? I doubt it'd be Rust differences but I guess we could have different Erlangs? Or maybe some kind of OS differences?

Any idea why the two operations take you 20ms and 10ms respectively, but combined they take 50ms? Is it somehow down to scheduling or GC?
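
For reference, the gap in question can be worked out from the figures posted above. This is only a sketch of the arithmetic; the function name `unexplained_us` is made up for illustration and is not part of either library.

```rust
/// Time left over after subtracting the two separately measured stages
/// from the end-to-end measurement. All values are in microseconds.
fn unexplained_us(parse_us: f64, build_us: f64, total_us: f64) -> f64 {
    total_us - (parse_us + build_us)
}

fn main() {
    // Figures copied from the benchmark output above.
    // Html5ever parse, Document-from-tuples, and the full Meeseeks parse.
    let async_gap = unexplained_us(20553.7, 9082.3, 49555.2);
    let sync_gap = unexplained_us(21025.3, 9082.3, 48633.9);
    println!("async: {:.1} us unaccounted for", async_gap);
    println!("sync:  {:.1} us unaccounted for", sync_gap);
}
```

On those numbers, roughly 19.9 ms (async) and 18.5 ms (sync) are not accounted for by either stage measured on its own.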

mischov avatar mischov commented on July 21, 2024

Closing this issue because it seems like whatever is causing the differences between sync and async for me probably isn't related to your code.

I really want to get rid of that extra 20ms (20ms + 10ms = 50ms???), but that's not what this issue was about.

Thanks.

hansihe avatar hansihe commented on July 21, 2024

Sorry I haven't been able to look at this more; I have had a really busy week. I should have a chance to look at it again next week.

mischov avatar mischov commented on July 21, 2024

I'm fairly certain the extra time is due to gc related to receiving the tuple-tree of html from html5ever_elixir and then converting it into a flat-map.

I spent much more of my weekend than I should have writing a FlatDom TreeSink (inspired by yours and the RcDom) that converts to a Meeseeks.Document directly, and it looks like it shaves a pretty consistent 30%-40% (conservatively) off of my Meeseeks.parse time.
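
To make the tuple-tree versus flat-map distinction concrete, here is a minimal sketch of flattening a nested tree into an id-keyed map in Rust. Everything in it (`Tree`, `FlatNode`, `flatten`, `example`) is a hypothetical illustration: it is not the html5ever `TreeSink` API, not the actual layout of `Meeseeks.Document`, and it does not measure the GC cost discussed above.

```rust
use std::collections::HashMap;

/// Nested input, loosely modelled on a tuple tree: an element with a tag
/// and children, or a text leaf. Hypothetical type for illustration only.
enum Tree {
    Element { tag: String, children: Vec<Tree> },
    Text(String),
}

/// Flat output node: relationships are stored as ids, not nested values.
#[derive(Debug, PartialEq)]
enum FlatNode {
    Element { tag: String, parent: Option<usize>, children: Vec<usize> },
    Text { content: String, parent: Option<usize> },
}

/// Walk the tree depth-first, assigning ids in document order, and record
/// each node in `out`. Returns the id given to `tree` itself.
fn flatten(
    tree: &Tree,
    parent: Option<usize>,
    next_id: &mut usize,
    out: &mut HashMap<usize, FlatNode>,
) -> usize {
    let id = *next_id;
    *next_id += 1;
    match tree {
        Tree::Text(content) => {
            out.insert(id, FlatNode::Text { content: content.clone(), parent });
        }
        Tree::Element { tag, children } => {
            let mut child_ids = Vec::with_capacity(children.len());
            for child in children {
                child_ids.push(flatten(child, Some(id), next_id, out));
            }
            out.insert(
                id,
                FlatNode::Element { tag: tag.clone(), parent, children: child_ids },
            );
        }
    }
    id
}

/// Builds the flat form of <div><p>hi</p>bye</div>.
fn example() -> HashMap<usize, FlatNode> {
    let tree = Tree::Element {
        tag: "div".to_string(),
        children: vec![
            Tree::Element {
                tag: "p".to_string(),
                children: vec![Tree::Text("hi".to_string())],
            },
            Tree::Text("bye".to_string()),
        ],
    };
    let mut out = HashMap::new();
    let mut next_id = 0;
    flatten(&tree, None, &mut next_id, &mut out);
    out
}

fn main() {
    let doc = example();
    println!("{} nodes in the flat map", doc.len());
}
```

Building the flat form directly on the Rust side, as the FlatDom TreeSink described above does, would skip the intermediate nested structure that this sketch takes as input.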

hansihe avatar hansihe commented on July 21, 2024

If you would be willing to make a pull request, I would be happy to accept that upstream.

mischov avatar mischov commented on July 21, 2024

I'll get the code public in a few days at most, but it's tied fairly intimately to Meeseeks and so probably not appropriate to be pulled into html5ever_elixir (though maybe it will be useful for adaptation or inspiration).

I'm also not very good with Rust, so I'm sure there are numerous problems and inefficiencies.

hansihe avatar hansihe commented on July 21, 2024

If you are interested, I would be happy to do code review/suggest improvements.

mischov avatar mischov commented on July 21, 2024

Here you go. https://github.com/mischov/meeseeks_html5ever
