Comments (9)
Hey @robertgarrigos
I can replicate the problem. The error appears to come from the DOM crawler rather than PHPScraper itself. The XPath needs a small tweak: it is missing a node test such as * after the //:
$myClassElements = $web->filter("//*[@class='prose']");
With ->text() you should get the text of the sub-nodes:
$myClassElements = $web->filter("//*[@class='prose']")->text();
I've also tried other built-in PHPScraper selectors and they worked. For example, $web->lists returns the lists as expected.
I hope this helps,
Peter
from phpscraper.
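Editor's note: to see the corrected expression in isolation, here is a minimal sketch that uses PHP's built-in DOMDocument and DOMXPath instead of PHPScraper. The HTML is a made-up stand-in, not the page from the thread. It also shows a token-based class match, which is an addition not discussed above: @class='prose' only matches when the attribute is exactly "prose", so elements with several classes need the contains() form.

```php
<?php
// Made-up stand-in markup; the real page from the thread is not fetched here.
$html = <<<HTML
<html><body>
  <div class="prose"><p>First paragraph.</p><p>Second paragraph.</p></div>
  <div class="prose wide"><p>Third paragraph.</p></div>
</body></html>
HTML;

$doc = new DOMDocument();
$doc->loadHTML($html);
$xpath = new DOMXPath($doc);

// Exact attribute match: only elements whose class attribute is exactly "prose".
$exact = $xpath->query("//*[@class='prose']");

// Token match: also selects elements such as class="prose wide".
$token = $xpath->query(
    "//*[contains(concat(' ', normalize-space(@class), ' '), ' prose ')]"
);

// textContent concatenates the text of all sub-nodes of each match.
foreach ($token as $node) {
    echo trim($node->textContent), PHP_EOL;
}
```

With PHPScraper itself, the same expressions would be passed to $web->filter(...) as shown in the comment above.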
Hey everyone,
I've added a page to document the way custom selectors can be used: https://phpscraper.de/examples/custom-selectors.html
There are also some new tests for this: https://github.com/spekulatius/PHPScraper/blob/master/tests/CustomSelectorTest.php
Please let me know if you think anything is missing.
Cheers,
Peter
from phpscraper.
I have the same question :(
from phpscraper.
Hello @Kkiomen and @gcijuentes,
sorry for the late reply.
Have you tried the filterXPath method? It should allow you to filter by any class name using an XPath like:
$myClassElements = $web->filterXPath("//[@class='my-class']");
Cheers,
Peter
from phpscraper.
While trying it, I'm getting this error:
Call to undefined method spekulatius\core::filterXPath()
I just installed PHPScraper (0.6.2) with Composer, and the first example of getting a website's title worked fine.
What am I missing?
from phpscraper.
Hey @robertgarrigos
Oh sorry, I mixed up the naming with the underlying package. It's filter instead of filterXPath. filterXPath is used in the DOM crawler package: https://github.com/symfony/dom-crawler/blob/8cb4c6e6c8d30c26f70529ed5e50d79a09576c0c/Crawler.php#L686
Please try again with filter. CC @Kkiomen and @gcijuentes
Cheers,
Peter
from phpscraper.
Still not working:
Warning: DOMXPath::query(): Invalid expression in /app/vendor/symfony/dom-crawler/Crawler.php on line 1013
Fatal error: Uncaught InvalidArgumentException: Expecting a DOMNodeList or DOMNode instance, an array, a string, or null, but got "bool". in /app/vendor/symfony/dom-crawler/Crawler.php:145
Stack trace:
#0 /app/vendor/symfony/dom-crawler/Crawler.php(1013): Symfony\Component\DomCrawler\Crawler->add(false)
#1 /app/vendor/symfony/dom-crawler/Crawler.php(771): Symfony\Component\DomCrawler\Crawler->filterRelativeXPath('descendant-or-s...')
#2 /app/vendor/spekulatius/phpscraper/src/phpscraper.php(165): Symfony\Component\DomCrawler\Crawler->filterXPath('descendant-or-s...')
#3 /app/vendor/spekulatius/phpscraper/src/phpscraper.php(60): spekulatius\core->filter('//[@class='pros...')
#4 /app/phpscraper.php(11): spekulatius\phpscraper->__call('filter', Array)
#5 {main}
thrown in /app/vendor/symfony/dom-crawler/Crawler.php on line 145
from phpscraper.
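Editor's note: the trace suggests the sequence of events. DOMXPath::query() rejects the expression, returns false instead of a node list, and the crawler then fails when it tries to add that boolean. A rough way to check an expression before handing it to the scraper is sketched below, using only PHP's built-in DOM extension. The helper name isValidXPath is hypothetical and not part of PHPScraper or Symfony.

```php
<?php
/**
 * Hypothetical helper: reports whether libxml accepts an XPath expression.
 * It runs the query against a throwaway document, so it checks syntax only.
 */
function isValidXPath(string $expression): bool
{
    $doc = new DOMDocument();
    $doc->loadXML('<root/>');
    $xpath = new DOMXPath($doc);

    // DOMXPath::query() emits a warning and returns false for an invalid
    // expression; the @ operator silences the warning for this check.
    $result = @$xpath->query($expression);

    return $result !== false;
}

// The expression from the failing call: no node test after the //.
var_dump(isValidXPath("//[@class='prose']"));

// The same expression with a * node test.
var_dump(isValidXPath("//*[@class='prose']"));
```

The first call should report the expression as invalid and the second as valid, which matches the "Invalid expression" warning at the top of the trace.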
require __DIR__ . '/vendor/autoload.php';
$web = new \spekulatius\phpscraper;
$web->go('https://www.lieder.net/lieder/get_settings.html?ComposerId=2520');
// print_r($web->title);
$myClassElements = $web->filter("//[@class='prose']");
print_r($myClassElements);
from phpscraper.