Comments (1)
As discussed on Slack, I think these limitations are fine and similar to what we had last year.
The problem with the script is that it's detecting every importScripts(), even inside libraries, like Workbox. The good thing is that since importScripts() is a worker method, it doesn't match client-side libraries, so we might be able to filter this better.
The intent of the importScripts query is: “How are developers using PWAs? Are they writing it from scratch or using libraries and, if so, which ones?” So if Workbox is using some other library, then the page is (indirectly) using it, so that's good to track. So I don't think this is an issue.
Yes, there is a concern if this pattern is used:
if (some test) {
  importScripts(...)
}
If the page doesn't actually pass that test, it never loads that library. But I'd imagine that's rare, and arguably the page is using that library even though it's not actually using it, if that contradictory message makes sense 😁
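To make the conditional case concrete, here is a minimal sketch of static text matching for importScripts() calls. The helper name `findImportScripts` and the regular expression are assumptions for illustration, not the patterns pwa.js actually uses. It shows that a purely textual match reports a call inside an `if` block whether or not the branch ever runs.

```javascript
// Hypothetical helper (not taken from pwa.js): list the arguments of every
// importScripts() call found in a script body using static text matching.
function findImportScripts(scriptText) {
  // Captures everything between the parentheses of each call.
  // Assumption: the arguments contain no ")" or "," inside the URLs.
  const pattern = /importScripts\s*\(([^)]*)\)/g;
  const urls = [];
  for (const match of scriptText.matchAll(pattern)) {
    const args = match[1]
      .split(',')
      .map((arg) => arg.trim().replace(/^['"`]|['"`]$/g, ''))
      .filter((arg) => arg.length > 0);
    urls.push(...args);
  }
  return urls;
}

// Illustrative script text; the URLs and the feature flag are made up.
const sample = `
if (self.someFeatureFlag) {
  importScripts('https://example.com/optional-lib.js');
}
importScripts("a.js", "b.js");
`;

const found = findImportScripts(sample);
console.log(found);
```

The conditional import is listed alongside the unconditional ones, which is the over-counting described above: the match says the code is present, not that it was executed.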
Something similar happens with the service worker events and properties (output at $.data.runs[1].firstView.pwa.swEventListenersInfo and $.data.runs[1].firstView.pwa.wPropertiesInfo). This one is matching all events including those written inside libraries (e.g. Workbox's install) and, in many cases, the onmessage listener, which can be used in client-side libraries as well.
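A rough sketch of why onmessage shows up outside service workers: if the matching is done on script text alone, a client-side `window.onmessage` handler looks the same as a worker one. The helper `findSwEventListeners`, its regular expressions, and the event list are assumptions for illustration and are not copied from the custom metric.

```javascript
// Hypothetical helper (not taken from pwa.js): report which of the given
// event names a script registers, based only on its text.
function findSwEventListeners(scriptText, eventNames) {
  return eventNames.filter((name) => {
    // addEventListener('install', ...) or addEventListener("install", ...)
    const viaAddListener = new RegExp(
      `addEventListener\\s*\\(\\s*['"]${name}['"]`
    );
    // Property-style handlers such as self.oninstall = ... or onmessage = ...
    const viaProperty = new RegExp(`\\bon${name}\\s*=[^=]`);
    return viaAddListener.test(scriptText) || viaProperty.test(scriptText);
  });
}

// Assumed subset of service worker events, for illustration only.
const EVENTS = ['install', 'activate', 'fetch', 'message', 'push'];

const swScript = `self.addEventListener('install', () => self.skipWaiting());
self.addEventListener("fetch", (e) => {});`;
const clientScript = `window.onmessage = (e) => console.log(e.data);`;

const swEvents = findSwEventListeners(swScript, EVENTS);
const clientEvents = findSwEventListeners(clientScript, EVENTS);
console.log(swEvents, clientEvents);
```

Here the client-side script is reported as using `message` even though no service worker is involved, while `install` and `fetch` only appear for the worker script.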
Again, I think that's fine. Workbox in particular breaks its libraries into small files, so we'll only find the ones for the Workbox functionality we use, even if we don't use them all. And again, arguably by importing that code we are "using" those events, or at least have access to use them. If we removed them from the web platform we'd likely need to rewrite this code, so that's a "use" in my convoluted, contrived mind here to answer this concern 😁
We can see how many message events are logged for non-service worker pages after the run and decide whether this is a concern or not. Suspect it won't be.
Matching only importScripts() and service worker events inside the service worker file.
As Rick left in one comment in the script: "We should use serviceWorkerInitiatedURLs here SW detection but it has some false negatives".
I'm a bit concerned that these false negatives might actually be quite frequent (minification might be one cause, as Rick mentioned here).
Here are some example sites that have service workers, where the test returns an empty serviceWorkers field but has values for swEventListenersInfo:
Those are weird. Had a look at them myself and can't see how the service worker is registered! So not surprised the custom metric can't figure this out. I think we can accept the limitation of this for really obfuscated code.
As suggested on Slack, we can also look at sites that register service worker event listeners (e.g. install, fetch, etc.) and see how many pages with those don't have the serviceWorkers object defined to see the scale of the problem. We can then decide if we need to include those pages or, if it's small enough, to just ignore them.
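The check described above could be sketched roughly as follows, run over the parsed pwa objects after the crawl. The shapes of `serviceWorkers` and `swEventListenersInfo` (objects keyed by script URL) and the sample pages are assumptions for illustration; the real output may differ, so `isEmpty` is written to tolerate objects, arrays, strings, and missing values.

```javascript
// Treats null/undefined, empty strings, empty arrays and empty objects as empty.
function isEmpty(value) {
  if (value === null || value === undefined) return true;
  if (typeof value === 'string' || Array.isArray(value)) return value.length === 0;
  if (typeof value === 'object') return Object.keys(value).length === 0;
  return false;
}

// Hypothetical helper: of the pages with service worker event listeners,
// how many have no serviceWorkers entry?
function summarizeMissingServiceWorkers(pages) {
  const withListeners = pages.filter((p) => !isEmpty(p.pwa.swEventListenersInfo));
  const missing = withListeners.filter((p) => isEmpty(p.pwa.serviceWorkers));
  return {
    pagesWithListeners: withListeners.length,
    missingServiceWorkers: missing.length,
    share: withListeners.length === 0 ? 0 : missing.length / withListeners.length,
  };
}

// Made-up sample data, not real crawl results.
const samplePages = [
  {
    url: 'https://a.example/',
    pwa: {
      serviceWorkers: { 'https://a.example/sw.js': 'https://a.example/' },
      swEventListenersInfo: { 'https://a.example/sw.js': ['install', 'fetch'] },
    },
  },
  {
    url: 'https://b.example/',
    pwa: {
      serviceWorkers: {},
      swEventListenersInfo: { 'https://b.example/app.min.js': ['fetch'] },
    },
  },
  { url: 'https://c.example/', pwa: { serviceWorkers: {}, swEventListenersInfo: {} } },
];

const summary = summarizeMissingServiceWorkers(samplePages);
console.log(summary);
```

If the resulting share is small on real data, ignoring those pages is probably fine; if it is large, that argues for including them.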
WPT also detected they were service worker calls as they are in blue. So could also ask @pmeenan how it does that and use that potentially?
So, on one hand, if we end up taking into account only the service worker URLs, we might be able to reduce the noise in the results, but, given that there are potentially so many false negatives, it might not be a good idea.
I think that because we are mostly looking for service worker specific methods and text, we should look at everything like we did last year, otherwise we exclude too much from importScripts. I think the likelihood of false positives is reasonably small, and the risk of false negatives if we limited this to just the main service worker.js is much larger.
I'm pretty new to all this, and I wanted to see if we could get these things done by Monday, when the crawl takes place. But based on these early results, I think we might need to discuss some of the limitations of the pwa.js script a little bit more.
As discussed, I think your changes are good to propose in a PR now. Not sure when @rviscomi or @OBTo will get a chance to look at this since it's a weekend (and a holiday weekend at that!), but even if it doesn't make the start of the June crawl, we can still hopefully get it in for some of that crawl to give us enough to have a look at and make any amendments before the main July crawl we will use for the Web Almanac.
from legacy.httparchive.org.