Comments (16)
The main library will only expose the DOM. However, there could be a V8 integration library, which is why I decorate every DOM element (type, property, and method) with a DOMAttribute.
If you want to write a V8 integration library, that would be great; otherwise it will probably take some time until I start on it.
from anglesharp.
V8 integration would be great. Good thinking on decorating DOM elements with a DOMAttribute.
from anglesharp.
Alright so I think I will probably include a V8 integration sample in the samples application soon.
from anglesharp.
I will update the samples within the next week. There I will showcase a sample implementation using Jurassic, which is something like V8 (but fully managed).
from anglesharp.
Will it be just an example or a full-featured implementation?
from anglesharp.
It will be an example AND it will use another library called (name still to be found) AngleSharp.Jurassic, which will also be distributed over NuGet. This library will depend on both, Jurassic and AngleSharp, and do all the necessary binding. Is that sufficient?
Initially I planned V8, but back then I did not know about Jurassic, and it seems very good. I also started writing my own JS engine, but in fact it is now useless, since it would do the same as Jurassic, which is already established. If you require V8, then I hope that the Jurassic implementation will give you enough to get started.
from anglesharp.
Sounds really good!
Please clarify: will the DOM be exposed to Jurassic, or COULD it be exposed?
I mean: if I add AngleSharp.Jurassic, can I expect the DOM to be available in the JS environment?
from anglesharp.
Will this give AngleSharp the ability to behave like a headless browser, so we can use it to improve SEO?
from anglesharp.
Exactly. But please keep in mind that this ability comes with 2 things (and I don't know if those 2 things will be implemented right away, however, they are on the roadmap and will be implemented soon otherwise):
1.) the ability to extend AngleSharp by registering elements e.g. for HTTP requests
2.) the scripting sections of AngleSharp
Some things in 2.) are already implemented. You can check for yourself that it makes a difference whether you call e.g. the DocumentBuilder.Html() method (which has scripting disabled at the moment) or use the HtmlParser with a fresh HtmlDocument() that has scripting enabled: the noscript section, for instance, is treated differently. The rest is performing scripting actions and more on the DOM, i.e. running the code given in <script> tags.
tl;dr: Yes in the long run, maybe no in the short run (but quite close).
from anglesharp.
Thanks a lot for answering. Do you mean that, right now, the HtmlParser with a fresh HtmlDocument() that has scripting enabled will execute the scripts in the page like a browser does?
Regarding the implementation you mentioned above, this would be awesome. A lot of us are facing many problems, like:
- the ability to extract styles and embed them in an HTML document to send it as an email
- the ability to execute scripts for AJAX applications, as mentioned above, to improve SEO
Currently we tend to use Html Agility Pack and PreMailer to solve this, but if AngleSharp enabled us to do this natively or through extension points, it would be one of the best things to happen in this domain for .NET.
Another thing: I don't see any documentation section :)
from anglesharp.
Hi, documentation is currently given in the form of the samples. However, I would welcome it if somebody started something like a documentation section (that would be very helpful!). So if you are willing to contribute, I would really find that amazing!
Regarding the list of problems:
1.) Yes, that will be easily possible, but I have to integrate "getComputedStyle" first. This will happen in the next couple of weeks, so stay tuned!
2.) That will be possible as well (as discussed in this thread).
Right now nothing executes scripts, since no scripting engine is integrated. However, there is a flag that indicates whether scripting is enabled. This flag determines the behavior for some tags, e.g. noscript. If scripting is disabled, the content of noscript is parsed as usual; otherwise everything inside it is parsed in rawtext mode, which is a big difference.
In the future the DocumentBuilder will automatically switch on scripting if a scripting engine has been registered. Right now this has to be done manually by creating a fresh document, setting the flag, instantiating the parser etc.
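To illustrate the noscript difference described above, here is a minimal sketch in Python. It is a toy tokenizer, not AngleSharp's parser: the function name `tokenize` and its `scripting` parameter are hypothetical, and the regular expression only handles simple, well-formed tags. It shows only the idea that the same markup yields element tokens with scripting disabled and a single raw text token with scripting enabled.

```python
import re

# Matches simple start and end tags such as <p>, </p>, <noscript>.
TAG_RE = re.compile(r"<(/?)([a-zA-Z][a-zA-Z0-9]*)[^>]*>")


def tokenize(html, scripting):
    """Toy tokenizer (hypothetical helper, not an AngleSharp API).

    With scripting disabled, the content of <noscript> is tokenized like
    any other markup. With scripting enabled, everything up to the closing
    </noscript> tag is emitted as one raw text token.
    """
    tokens = []
    pos = 0
    while pos < len(html):
        match = TAG_RE.search(html, pos)
        if match is None:
            tokens.append(("text", html[pos:]))
            break
        if match.start() > pos:
            tokens.append(("text", html[pos:match.start()]))
        closing, name = match.group(1), match.group(2).lower()
        tokens.append(("end" if closing else "start", name))
        pos = match.end()
        if scripting and not closing and name == "noscript":
            # Rawtext mode: skip ahead to the closing tag without parsing.
            end = html.lower().find("</noscript", pos)
            if end == -1:
                end = len(html)
            if end > pos:
                tokens.append(("text", html[pos:end]))
            pos = end
    return tokens


sample = "<noscript><p>Enable JS</p></noscript>"
parsed_as_markup = tokenize(sample, scripting=False)
parsed_as_rawtext = tokenize(sample, scripting=True)
```

With scripting disabled, `parsed_as_markup` contains separate tokens for the `p` element; with scripting enabled, `parsed_as_rawtext` holds the inner markup as one text token between the noscript start and end tags.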
from anglesharp.
I can help with the documentation. Please let me know.
Being able to register a script engine would be more helpful than being tied to a particular engine.
from anglesharp.
Thanks a lot for the reply, this sounds so promising. I have downloaded the source and I am overwhelmed by the amount of work you have done. This library is great, it is obviously way beyond Html Agility Pack, and it needs more attention from the community. Thanks a lot for putting out this gem :)
That aside, I can't access HtmlDocument. Visual Studio says it is internal, so something like this can't work:
var htmlDoc = new HtmlDocument();
from anglesharp.
The scripting flag is also internal - so this was just to illustrate (AngleSharp aims to follow the W3C spec, hence I usually do not expose internal stuff, which also does not follow the spec -- helper methods to do certain things are exceptions). As I wrote at the moment this is more or less useless - in the future when you can register a scripting engine AngleSharp will auto-detect this and handle requests accordingly (with scripting enabled / disabled depending on the scripting engine state).
Like in a browser once a scripting engine is registered one can also de-activate it just in a certain context. Hence one could load a webpage with scripting and without.
By the way: the DOM gives you the possibility of creating a new HTML document, but it is tricky (I think it is not known by many, since it is quite useless in a real-world context). You need the implementation property of a Document (or derived) instance and call the createDocument method (if you want an HTMLDocument and not a Document, then you also have to pass the HTML namespace URI).
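The createDocument route described above is a standard DOM call, so it can be sketched outside AngleSharp. The following uses Python's standard-library `xml.dom.minidom`, which implements the same DOMImplementation interface; this is an illustration of the DOM mechanism only, not of AngleSharp's API. Note that minidom always returns a generic Document, so the step where the HTML namespace URI yields an HTMLDocument is specific to the implementation discussed in this thread and is not reproduced here.

```python
from xml.dom.minidom import getDOMImplementation

# The (X)HTML namespace URI defined by the W3C.
XHTML_NS = "http://www.w3.org/1999/xhtml"

# Start from any existing Document instance.
existing = getDOMImplementation().createDocument(None, "root", None)

# Take its `implementation` property and call createDocument with the
# HTML namespace URI, a qualified root element name, and no doctype.
new_doc = existing.implementation.createDocument(XHTML_NS, "html", None)

root = new_doc.documentElement
```

Here `root` is an `html` element in the HTML namespace, and `new_doc` is a document independent of `existing`.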
from anglesharp.
This is really great to hear. All the best to you with your great work on this library :)
from anglesharp.
Thanks!
from anglesharp.