Parse an HTML string (embed issue, closed)

oscarotero commented on May 18, 2024
Parse an HTML string

from embed.

Comments (6)

younes0 commented on May 18, 2024

With the new Guzzle5 resolver:

use Embed\Embed;
use GuzzleHttp\Client;
use GuzzleHttp\Event\BeforeEvent;
use GuzzleHttp\Message\Response;
use GuzzleHttp\Stream\Stream;

$html = file_get_contents('http://whatever'); // HTML string

$client = new Client();

$client->getEmitter()->on('before', function(BeforeEvent $e) use ($html) {
    $body = isset($html) ? Stream::factory($html) : null;
    $e->intercept(new Response(200, [], $body));
});

$embed = Embed::create($url = 'dummy', [ // $url must not be empty
    'resolver' => [ 
        'class' => \Embed\RequestResolvers\Guzzle5::class,
        'config' => [ 'client' => $client ],
    ],
]);

oscarotero commented on May 18, 2024

Hi.
The Url class has a Url->resolve method that performs the request and collects all available data (https://github.com/oscarotero/Embed/blob/master/Embed/Url.php#L58).
You can provide your own resolver class by editing the Embed\Url::$resolver variable. Note that the Url class is used not only to get the content of the main URL but also to fetch secondary URLs (APIs, redirects, oEmbed, etc.). I guess you only want to change how the main URL is resolved, not all these secondary requests, so changing the resolver of the Url class is not the best approach.
A possible solution would be to provide a new method to set the content and headers of the URL manually.

oscarotero commented on May 18, 2024

Hi again, @nazieb
I've been working on a new feature to provide custom URL resolvers in a more flexible way. There is a new branch called "custom_url_resolvers" with some changes:

You can create your own URL resolver and use it when creating a new Url instance:

$resolver = new GuzzleResolver($guzzleData);
$url = new Embed\Url($resolver); // You can pass the resolver directly instead of the URL string
$info = Embed\Embed::create($url);

You can set your URL resolver as the default so that it is used for every request, not only for the main URL:

Embed\Url::setDefaultResolver('GuzzleResolver');

Please let me know if this is what you need.

nazieb commented on May 18, 2024

Hello Oscar,

It's really nice of you to build the custom URL resolver. That might come
in handy, but what I really meant is the ability to parse an HTML string
without needing to do an HTTP request.

For example, in my app the HTML is already stored in the database
(after being fetched by Guzzle in another process), and I want to parse the
OpenGraph, Twitter Card, etc. data from that stored HTML.

Without Wax,

Ainun Nazieb
http://nazie.bz/
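
For illustration, the use case described above (reading OpenGraph and Twitter Card tags from HTML that is already in memory, with no HTTP request) can be sketched with PHP's built-in DOMDocument. This is a minimal sketch and not part of Embed: the function name extractCardMeta and the sample HTML are assumptions made for the example, and it only reads <meta> tags, so it does not cover oEmbed or other API-backed data.

```php
<?php
// Hypothetical helper (not part of Embed): collect OpenGraph and Twitter Card
// <meta> tags from an HTML string that is already in memory.
function extractCardMeta(string $html): array
{
    $doc = new DOMDocument();

    // Real-world HTML is rarely valid; buffer libxml warnings instead of emitting them.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $meta = [];
    foreach ($doc->getElementsByTagName('meta') as $tag) {
        // OpenGraph uses the "property" attribute; Twitter Cards usually use "name".
        $key = $tag->getAttribute('property') ?: $tag->getAttribute('name');
        if (strpos($key, 'og:') === 0 || strpos($key, 'twitter:') === 0) {
            $meta[$key] = $tag->getAttribute('content');
        }
    }

    return $meta;
}

// Sample HTML standing in for a page loaded from the database.
$html = '<html><head>'
    . '<meta property="og:title" content="Example page">'
    . '<meta name="twitter:card" content="summary">'
    . '<meta name="description" content="ignored">'
    . '</head><body></body></html>';

$meta = extractCardMeta($html);
```

Here $meta holds only the keys that start with "og:" or "twitter:"; the plain description tag is skipped. The sketch assumes the DOM extension is enabled and that the stored HTML is non-empty.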


oscarotero commented on May 18, 2024

Hi.
Some data requires HTTP requests: data provided by oEmbed, by the Facebook Graph API, or by other APIs.
For example, YouTube has an oEmbed service that provides information about its videos (example: http://www.youtube.com/oembed?format=xml&url=http%3A%2F%2Fwww.youtube.com%2Fwatch%3Fv%3DeiHXASgRTcA), so this request is required to get the title, author, description, embed code, etc.
If you have the HTML content of the page stored, you can avoid the first HTTP request (the one that gets the page content), but not the requests that connect to other APIs.
With the URL resolver, you can do something like this:

// Create your own resolver class that implements Embed\UrlResolvers\UrlResolverInterface and instantiate it:
$resolver = new MyOwnResolver();

// Now set the information you have stored in your database (URL, content, etc.) to avoid the request
$resolver->setUrl($url);
$resolver->setContent($content);

// These values can be set as defaults in your class:
$resolver->setMimetype('text/html');
$resolver->setHttpCode(200);

// Now you can get all the information about this URL
$info = Embed\Embed::create($resolver);
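
As a rough sketch of what such a MyOwnResolver class might look like: the four setters below are the ones used in the snippet above, while the getters and the default values are assumptions, because the method list of Embed\UrlResolvers\UrlResolverInterface is not shown in this thread. The class is written standalone so that it runs on its own; in real use it would declare that it implements the interface and provide whatever methods the interface actually requires.

```php
<?php
// Sketch only. In real use this class would implement
// Embed\UrlResolvers\UrlResolverInterface; the getters below are assumed,
// since the interface's method list is not shown in this thread.
class MyOwnResolver
{
    private $url = '';
    private $content = '';
    private $mimetype = 'text/html'; // default, as suggested above
    private $httpCode = 200;         // default, as suggested above

    public function setUrl(string $url): void
    {
        $this->url = $url;
    }

    public function setContent(string $content): void
    {
        $this->content = $content;
    }

    public function setMimetype(string $mimetype): void
    {
        $this->mimetype = $mimetype;
    }

    public function setHttpCode(int $httpCode): void
    {
        $this->httpCode = $httpCode;
    }

    // Hypothetical getters: they return the stored values, so no HTTP request is made.
    public function getUrl(): string
    {
        return $this->url;
    }

    public function getContent(): string
    {
        return $this->content;
    }

    public function getMimetype(): string
    {
        return $this->mimetype;
    }

    public function getHttpCode(): int
    {
        return $this->httpCode;
    }
}

// Example values standing in for a row loaded from the database.
$resolver = new MyOwnResolver();
$resolver->setUrl('http://example.com/article');
$resolver->setContent('<html><head><title>Stored page</title></head></html>');
```

Because the mimetype and HTTP code default to 'text/html' and 200, only the URL and content have to be set for each stored page.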

nazieb commented on May 18, 2024

Wow, that's a nice solution. I'm fine with the HTTP requests to other API
providers.

I'll test your example right away. Thanks for the support and this great
library!
