Unhandled exception about linkcheck CLOSED

LucasGorgal avatar LucasGorgal commented on August 29, 2024
Unhandled exception

from linkcheck.

Comments (9)

LucasGorgal avatar LucasGorgal commented on August 29, 2024

I am using macOS High Sierra 10.13.6.

filiph avatar filiph commented on August 29, 2024

Sorry I didn't see this earlier. Could you please share the HTML file this failed on? If you're not sure, please share the result of linkcheck --debug <url>.

zepalmer avatar zepalmer commented on August 29, 2024

I'm having a similar problem, I think. Here's a page I've put on my server under the name sample.php:

<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">
<html>
<head>
<title>Example Title</title>
<style type="text/css">
</style>
</head>

<body>
</body>
</html>

(Disclaimer: this is a file from an old site. I promise I don't use PHP for anything modern.)

Here's a log from my terminal illustrating the failure:

$ linkcheck --debug http://localhost/sample.php
Reading URLs:
http://localhost/sample.php
Crawl will start on the following URLs: [http://localhost/sample.php]
Crawl will check pages only on URLs satisfying: {http://localhost/sample.php**}
Crawl will skip links that match patterns: UrlSkipper<>
Crawl will check the following servers (and their robots.txt) first: {localhost}
Using 4 threads.
Checking robots.txt and availability of server: localhost
Added: http://localhost/sample.php to Worker<1> with 0ms delay
Server check of localhost complete.
Server check for localhost complete: connected, no robots.txt.
Unhandled exception:
NoSuchMethodError: The getter 'primaryType' was called on null.
Receiver: null
Tried calling: primaryType
#0      Object.noSuchMethod (dart:core-patch/object_patch.dart:50:5)
#1      DestinationResult.updateFromResponse (package:linkcheck/src/destination.dart:326:48)
#2      checkPage (package:linkcheck/src/worker/worker.dart:127:11)
<asynchronous suspension>
#3      worker.<anonymous closure> (package:linkcheck/src/worker/worker.dart:192:29)
<asynchronous suspension>
#4      _RootZone.runUnaryGuarded (dart:async/zone.dart:1314:10)
#5      _BufferingStreamSubscription._sendData (dart:async/stream_impl.dart:336:11)
#6      _BufferingStreamSubscription._add (dart:async/stream_impl.dart:263:7)
#7      _SyncStreamController._sendData (dart:async/stream_controller.dart:764:19)
#8      _StreamController._add (dart:async/stream_controller.dart:640:7)
#9      _StreamController.add (dart:async/stream_controller.dart:586:5)
#10     _RootZone.runUnaryGuarded (dart:async/zone.dart:1314:10)
#11     _BufferingStreamSubscription._sendData (dart:async/stream_impl.dart:336:11)
#12     _BufferingStreamSubscription._add (dart:async/stream_impl.dart:263:7)
#13     _SyncStreamController._sendData (dart:async/stream_controller.dart:764:19)
#14     _StreamController._add (dart:async/stream_controller.dart:640:7)
#15     _StreamController.add (dart:async/stream_controller.dart:586:5)
#16     _StreamSinkWrapper.add (dart:async/stream_controller.dart:858:13)
#17     _RootZone.runUnaryGuarded (dart:async/zone.dart:1314:10)
#18     CastStreamSubscription._onData (dart:_internal/async_cast.dart:81:11)
#19     _RootZone.runUnaryGuarded (dart:async/zone.dart:1314:10)
#20     _BufferingStreamSubscription._sendData (dart:async/stream_impl.dart:336:11)
#21     _BufferingStreamSubscription._add (dart:async/stream_impl.dart:263:7)
#22     _SyncStreamController._sendData (dart:async/stream_controller.dart:764:19)
#23     _StreamController._add (dart:async/stream_controller.dart:640:7)
#24     _StreamController.add (dart:async/stream_controller.dart:586:5)
#25     _RawReceivePortImpl._handleMessage (dart:isolate-patch/isolate_patch.dart:172:12)
Killing unresponsive Worker<1>
Done checking: http://localhost/sample.php (connection failed) => 0 links
- BROKEN
All jobs are done or user pressed Ctrl-C
Deduping destinations
Closing the isolate pool
Broken links
Done crawling.                   

Provided URLs failing:
http://localhost/sample.php (connection failed)

Error. Couldn't connect or find any links. Have you started the server?

Is it possible that this is specifically a PHP-related thing? Despite the PHP URL, the result of an HTTP request should be a valid HTML document, so I'm not sure why it would fail.

Using linkcheck 2.0.9 with Dart 2.4.1 on Debian 10. Thanks for the excellent tool!

filiph avatar filiph commented on August 29, 2024

Hi, thanks for the detailed report!

It looks like the local server you're using isn't reporting the Content-Type (MIME type). I fixed the bug that crashes linkcheck in such instances, but beyond that there's not much I can do, unfortunately. I tentatively decided that in such cases linkcheck will try to parse the resource as if it were HTML, and assign a warning. That means you'll still get your site crawled, but you'll get a warning on every page that is served without a Content-Type.

I said "looks like" above because I'm not 100% sure. It's possible there's some other reason why linkcheck doesn't see any content type. If so, please feel free to reopen this issue.

The fix will land shortly as 2.0.10. Run pub global activate linkcheck to upgrade.
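
The fallback described above (treat a response with no Content-Type as HTML and attach a warning) can be sketched roughly as follows. This is an illustration in Python, not linkcheck's actual Dart code; the function name resolve_media_type and the warning text are hypothetical.

```python
from email.message import Message


def resolve_media_type(content_type_header):
    """Return (media_type, warnings) for a raw Content-Type header value.

    Hypothetical helper: when the server sends no Content-Type at all,
    fall back to text/html and record a warning instead of crashing.
    """
    warnings = []
    if content_type_header is None or not content_type_header.strip():
        # Assumed policy, mirroring the behavior described in the thread.
        warnings.append("missing Content-Type; parsing as text/html")
        return "text/html", warnings

    # Let the standard library split "type/subtype" from its parameters.
    msg = Message()
    msg["Content-Type"] = content_type_header
    return msg.get_content_type(), warnings


if __name__ == "__main__":
    print(resolve_media_type(None))
    print(resolve_media_type("text/html;charset=UTF-8"))
```

The point of returning the warning alongside the media type is that the crawl can continue while the missing header is still reported to the user.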

zepalmer avatar zepalmer commented on August 29, 2024

Thanks for pushing this update! I've just checked the server I'm using. When accessing a page which doesn't crash linkcheck, I get response headers like

  HTTP/1.1 301 Moved Permanently
  Date: Mon, 02 Sep 2019 11:42:57 GMT
  Server: Apache/2.4.38 (Debian)
  Location: http://localhost/~zpalmer/cs21/
  Content-Length: 314
  Keep-Alive: timeout=5, max=100
  Connection: Keep-Alive
  Content-Type: text/html; charset=iso-8859-1
  HTTP/1.1 200 OK
  Date: Mon, 02 Sep 2019 11:42:57 GMT
  Server: Apache/2.4.38 (Debian)
  Vary: Accept-Encoding
  Content-Length: 949
  Keep-Alive: timeout=5, max=99
  Connection: Keep-Alive
  Content-Type: text/html;charset=UTF-8

The page that does crash linkcheck produces these response headers:

  HTTP/1.1 200 OK
  Date: Mon, 02 Sep 2019 11:43:26 GMT
  Server: Apache/2.4.38 (Debian)
  Last-Modified: Thu, 29 Aug 2019 20:44:47 GMT
  ETag: "1f83-5914793818dc0"
  Accept-Ranges: bytes
  Content-Length: 8067
  Keep-Alive: timeout=5, max=100
  Connection: Keep-Alive

With 2.0.10, I'm now getting a different exception:

Unhandled exception:
NoSuchMethodError: The getter 'charset' was called on null.
Receiver: null
Tried calling: charset
#0      Object.noSuchMethod (dart:core-patch/object_patch.dart:50:5)
#1      checkPage (package:linkcheck/src/worker/worker.dart:148:29)
<asynchronous suspension>
#2      worker.<anonymous closure> (package:linkcheck/src/worker/worker.dart:192:29)
<asynchronous suspension>
#3      _RootZone.runUnaryGuarded (dart:async/zone.dart:1314:10)
#4      _BufferingStreamSubscription._sendData (dart:async/stream_impl.dart:336:11)
#5      _BufferingStreamSubscription._add (dart:async/stream_impl.dart:263:7)
#6      _SyncStreamController._sendData (dart:async/stream_controller.dart:764:19)
#7      _StreamController._add (dart:async/stream_controller.dart:640:7)
#8      _StreamController.add (dart:async/stream_controller.dart:586:5)
#9      _RootZone.runUnaryGuarded (dart:async/zone.dart:1314:10)
#10     _BufferingStreamSubscription._sendData (dart:async/stream_impl.dart:336:11)
#11     _BufferingStreamSubscription._add (dart:async/stream_impl.dart:263:7)
#12     _SyncStreamController._sendData (dart:async/stream_controller.dart:764:19)
#13     _StreamController._add (dart:async/stream_controller.dart:640:7)
#14     _StreamController.add (dart:async/stream_controller.dart:586:5)
#15     _StreamSinkWrapper.add (dart:async/stream_controller.dart:858:13)
#16     _RootZone.runUnaryGuarded (dart:async/zone.dart:1314:10)
#17     CastStreamSubscription._onData (dart:_internal/async_cast.dart:81:11)
#18     _RootZone.runUnaryGuarded (dart:async/zone.dart:1314:10)
#19     _BufferingStreamSubscription._sendData (dart:async/stream_impl.dart:336:11)
#20     _BufferingStreamSubscription._add (dart:async/stream_impl.dart:263:7)
#21     _SyncStreamController._sendData (dart:async/stream_controller.dart:764:19)
#22     _StreamController._add (dart:async/stream_controller.dart:640:7)
#23     _StreamController.add (dart:async/stream_controller.dart:586:5)
#24     _RawReceivePortImpl._handleMessage (dart:isolate-patch/isolate_patch.dart:172:12)

It appears that this occurs when the charset is missing from the Content-Type header. I'm guessing that this is a consequence of the default being applied when my page lacks a Content-Type entirely, but it also reveals a more general issue if a server produces a Content-Type with no associated charset value.

Thanks for the help on this!
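
For illustration, here is a rough Python sketch of a null-safe charset lookup covering both cases mentioned above: no Content-Type header at all, and a Content-Type with no charset parameter. It is not linkcheck's Dart implementation; the name resolve_charset and the utf-8 default are assumptions.

```python
from email.message import Message

# Assumed default; the thread does not say which charset linkcheck falls back to.
DEFAULT_CHARSET = "utf-8"


def resolve_charset(content_type_header, default=DEFAULT_CHARSET):
    """Return the charset named in a Content-Type header, or a default.

    Hypothetical helper: never dereferences a missing header or a
    missing charset parameter.
    """
    if not content_type_header:
        # Header absent entirely (the response that crashed 2.0.9).
        return default

    msg = Message()
    msg["Content-Type"] = content_type_header
    # get_content_charset lowercases the value and returns failobj
    # when there is no charset parameter (the 2.0.10 crash case).
    return msg.get_content_charset(failobj=default)


if __name__ == "__main__":
    print(resolve_charset(None))
    print(resolve_charset("text/html"))
    print(resolve_charset("text/html; charset=iso-8859-1"))
```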

zepalmer avatar zepalmer commented on August 29, 2024

As a note, I was not a maintainer closing this issue, so I'm not permitted to re-open it. I just learned that about GitHub. :)

filiph avatar filiph commented on August 29, 2024

This is excellent info, @zepalmer! I'll look into this. No promises on speed, though. :/

zepalmer avatar zepalmer commented on August 29, 2024

No problem! Thanks again for the excellent tool. This is part of my workflow for updating my course website and speeds things up a lot. If it takes a while, that's fine; if it bugs me, I'll go learn Dart and send you a PR. :)

filiph avatar filiph commented on August 29, 2024

Ooof, this took way longer than I anticipated, but it's finally fixed in version 2.0.15. If things don't work as expected, please run linkcheck with --verbose and paste the output here. Thanks for the patience!
