Comments (5)
I have added Twitter card functionality in pull request #196. Please let me know if it works.
from extruct.
Hi @platelminto, that could be a great addition. Do you have some examples of pages with this markup, along with example outputs? Do you have an estimate of how popular this markup is?
Example pages:
https://shop-eu.kurzgesagt.org/
with markup:
<meta name="twitter:site" content="@kurz_gesagt">
<meta name="twitter:card" content="summary">
<meta name="twitter:title" content="kurzgesagt shop">
<meta name="twitter:description" content="The official kurzgesagt online shop. Merch created with love. Posters, notebooks, clothes, plushies and more from the kurzgesagt universe.">
<meta name="twitter:image" content="https://cdn.shopify.com/s/files/1/0252/6822/4088/t/64/assets/logo_twitter.png?v=14636856715189202634" />
https://store.taylorswift.com/products/i-would-die-for-you-in-secret-hoodie
with markup:
<meta name="twitter:card" content="summary"><meta name="twitter:title" content="“i would die for you in secret” hoodie">
<meta name="twitter:description" content="FOLKLORE ALBUM COLLECTION
*please note we are doing our best to deliver your order as fast as possible, however, we may experience delays somewhere along the way as we try to keep everyone safe.Black hooded sweatshirt featuring "folklore album" printed in copper glitter on front and photo of Taylor Swift printed on back along with "All these people think love's for show, but I would die for you in secret" lyrics in copper glitter.
100% cottondepiction of this product is a digital rendering and for illustrative purposes only. actual product detailing may vary.Taylor Swift®©2021 TAS Rights Management, LLCUsed By Permission. All Rights Reserved.
">
<meta name="twitter:image" content="https://cdn.shopify.com/s/files/1/0011/4651/9637/products/dieforyouhoodiefront_600x600_crop_center.png?v=1627046674">
https://github.com/
with markup:
<meta property="twitter:site" content="github">
<meta property="twitter:site:id" content="13334762">
<meta property="twitter:creator" content="github">
<meta property="twitter:creator:id" content="13334762">
<meta property="twitter:card" content="summary_large_image">
<meta property="twitter:title" content="GitHub">
<meta property="twitter:description" content="GitHub is where people build software. More than 65 million people use GitHub to discover, fork, and contribute to over 200 million projects.">
<meta property="twitter:image:src" content="https://github.githubassets.com/images/modules/open_graph/github-logo.png">
<meta property="twitter:image:width" content="1200">
<meta property="twitter:image:height" content="1200">
https://www.billboard.com/
with markup:
<meta data-rh="true" name="twitter:site" content="@billboard" />
<meta data-rh="true" property="og:site_name" content="Billboard" />
<meta data-rh="true" property="og:url" content="https://www.billboard.com/" />
<meta data-rh="true" name="og:image" property="og:image" content="https://static.billboard.com/files/2019/07/billboard-logo-b-20-billboard-1548-1092x722-1598619661-compressed.jpg" />
<meta data-rh="true" name="og:image:width" property="og:image:width" content="1092" />
<meta data-rh="true" name="og:image:height" property="og:image:height" content="722" />
<meta data-rh="true" name="og:title" property="og:title" content="Billboard - Music Charts, News, Photos & Video" />
<meta data-rh="true" name="twitter:title" property="twitter:title" content="Billboard - Music Charts, News, Photos & Video" />
These include a couple I found just now at random. The markup looks extremely popular (about as common as Open Graph and JSON-LD).
Thanks for the examples. It does look quite popular, so +1 that this is useful.
It also seems that this is already partly supported. For example, this works:
>>> extruct.extract('<!doctype html><html><head><meta property="twitter:card" content="summary">')
{'microdata': [],
 'json-ld': [],
 'opengraph': [],
 'microformat': [],
 'rdfa': [{'@id': '',
           'https://dev.twitter.com/cards#card': [{'@value': 'summary'}]}]}
But this does not (note `name` instead of `property`):
>>> extruct.extract('<!doctype html><html><head><meta name="twitter:card" content="summary">')
{'microdata': [],
 'json-ld': [],
 'opengraph': [],
 'microformat': [],
 'rdfa': []}
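For anyone who needs both forms handled in the meantime, here is a minimal sketch of a standalone extractor that accepts either `name` or `property` on the `<meta>` tag. It uses only the standard library's `html.parser`; it is not extruct's API and not necessarily how pull request #196 does it. The names `TwitterCardParser` and `extract_twitter_card` are hypothetical, and keeping the first value seen for a repeated key is an assumption.

```python
from html.parser import HTMLParser

PREFIX = "twitter:"


class TwitterCardParser(HTMLParser):
    """Collect <meta> tags whose name or property starts with 'twitter:'.

    Hypothetical helper for illustration; not part of extruct.
    """

    def __init__(self):
        super().__init__()
        self.card = {}

    def handle_starttag(self, tag, attrs):
        # Also called for self-closing tags such as <meta ... />.
        if tag != "meta":
            return
        attr = dict(attrs)
        content = attr.get("content")
        if content is None:
            return
        # Sites use either attribute (sometimes both), so check each one.
        for candidate in (attr.get("name"), attr.get("property")):
            if candidate and candidate.startswith(PREFIX):
                # Assumption: keep the first value seen for a given key.
                self.card.setdefault(candidate[len(PREFIX):], content)
                break


def extract_twitter_card(html):
    """Return a dict of Twitter card fields found in the HTML string."""
    parser = TwitterCardParser()
    parser.feed(html)
    return parser.card


if __name__ == "__main__":
    sample = (
        '<!doctype html><html><head>'
        '<meta name="twitter:card" content="summary">'
        '<meta property="twitter:site" content="github">'
    )
    print(extract_twitter_card(sample))
```

Nested keys such as `twitter:image:width` come out flat here (as `image:width`); grouping them into a nested structure is left out of this sketch.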
👍 @platelminto anything I can do to help this along? Otherwise, I'll be building my own extraction for twitter cards. Unless anyone knows of another package that handles twitter cards already?