Comments (5)
@kalravparsana which versions of the libraries are you using? I'm not getting a failure with rdflib==6.0.0, extruct==0.13.0, and pyrdfa3==3.5.3.
@lopuhin thanks for checking.
This is the output of pip freeze:
awslambdaric==1.2.2
beautifulsoup4==4.10.0
boto3==1.18.38
botocore==1.21.38
certifi==2019.11.28
chardet==3.0.4
click==8.0.1
cssselect==1.1.0
dbus-python==1.2.16
distro-info===0.23ubuntu1
extruct==0.3.0
feedfinder2==0.0.4
feedparser==6.0.8
filelock==3.0.12
html5lib==1.1
idna==2.8
isodate==0.6.0
jieba3k==0.35.1
jmespath==0.10.0
joblib==1.0.1
langdetect==1.0.9
loggers==0.1.4
lxml==4.6.3
newspaper3k==0.2.8
nltk==3.6.2
Pillow==8.3.2
PyGObject==3.36.0
pyparsing==2.4.7
pyRdfa3==3.5.3
python-apt==2.0.0+ubuntu0.20.4.6
python-dateutil==2.8.2
PyYAML==5.4.1
rdflib==6.0.0
regex==2021.8.28
requests==2.22.0
requests-file==1.5.1
requests-unixsocket==0.2.0
s3transfer==0.5.0
sgmllib3k==1.0.0
simplejson==3.17.2
six==1.14.0
soupsieve==2.2.1
textdistance==4.2.1
tinysegmenter==0.3
tldextract==3.1.2
tqdm==4.62.2
unattended-upgrades==0.1
urllib3==1.25.8
webencodings==0.5.1
And this is my requirements.txt file:
boto3
requests
loggers
extruct
langdetect
textdistance
newspaper3k
feedparser
python-dateutil
rdflib
pyrdfa3
You can see that extruct resolves to 0.3.0 by default, which is far older than the current release. If I pin it to 0.13.0, I get this error:
#19 7.597 error in rdflib-jsonld setup command: use_2to3 is invalid.
#19 7.597 ----------------------------------------
#19 7.597 WARNING: Discarding https://files.pythonhosted.org/packages/a7/60/267b54976f779d0c5b22448525495524c069285586dc22f21bfb29c25cf6/rdflib-jsonld-0.2.tar.gz#sha256=aed044b9c9eb7b136446e169e88c9626b53991066696a533482051c0ccf84375 (from https://pypi.org/simple/rdflib-jsonld/). Command errored out with exit status 1: python setup.py egg_info Check the logs for full command output.
#19 7.598 ERROR: Could not find a version that satisfies the requirement rdflib-jsonld (from extruct) (from versions: 0.2, 0.3, 0.4.0, 0.5.0)
#19 7.598 ERROR: No matching distribution found for rdflib-jsonld
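The version drift described above, where the unpinned extruct entry resolved to 0.3.0, is easy to miss in a long freeze listing. Here is a minimal sketch of a check that could flag it. The helper names (parse_freeze, is_older, older_than) and the minimum versions passed in are hypothetical, chosen only for illustration, and the comparison looks only at leading numeric components, so a real project would more likely rely on a proper version parser.

```python
import re


def parse_freeze(text):
    """Map lower-cased package name -> version string from `pip freeze` output."""
    pins = {}
    for line in text.splitlines():
        # Accept both `==` and the arbitrary-equality `===` form.
        match = re.match(r"^([A-Za-z0-9_.\-]+)={2,3}(.+)$", line.strip())
        if match:
            pins[match.group(1).lower()] = match.group(2)
    return pins


def numeric_key(version):
    """Leading numeric components only, e.g. '0.13.0' -> (0, 13, 0)."""
    parts = []
    for piece in version.split("."):
        digits = re.match(r"\d+", piece)
        if not digits:
            break
        parts.append(int(digits.group()))
    return tuple(parts)


def is_older(version, floor):
    """True if `version` sorts below `floor` (zero-padded numeric comparison)."""
    a, b = numeric_key(version), numeric_key(floor)
    width = max(len(a), len(b))
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a < b


def older_than(pins, minimums):
    """Return the installed pins that fall below the given minimum versions."""
    return {
        name: pins[name]
        for name, floor in minimums.items()
        if name in pins and is_older(pins[name], floor)
    }


# A few lines taken from the freeze output in this thread.
sample = """extruct==0.3.0
rdflib==6.0.0
pyRdfa3==3.5.3
distro-info===0.23ubuntu1"""

# The minimums here are assumptions for the example, not project requirements.
stale = older_than(parse_freeze(sample), {"extruct": "0.13.0", "rdflib": "6.0.0"})
print(stale)  # {'extruct': '0.3.0'}
```

Running something like this as a build step after pip install would surface a silent fallback to an old release before it reaches production.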
For extra context, we are installing inside Docker, with ubuntu:latest as the base image.
Here is the Dockerfile:
FROM ubuntu:latest
ENV DEBIAN_FRONTEND noninteractive
# install basic packages
RUN apt-get update
RUN apt-get install software-properties-common curl unzip gcc git -y
# install python
RUN add-apt-repository ppa:deadsnakes/ppa
RUN apt-get update
RUN apt install python3.9 python3.9-distutils python3.9-dev -y
# install aws
RUN curl "https://awscli.amazonaws.com/awscli-exe-linux-x86_64.zip" -o "awscliv2.zip"
RUN unzip awscliv2.zip
RUN ./aws/install
# install pip
RUN curl https://bootstrap.pypa.io/get-pip.py -o get-pip.py
RUN python3.9 get-pip.py
RUN pip3.9 install boto3
RUN pip3.9 install awslambdaric
COPY dir/requirements.txt /home/requirements.txt
RUN pip3.9 install -r /home/requirements.txt
Aha, I see. It must be an issue with the resolver in newer pip versions, which does not allow 0.13.0 to be installed. Downgrading pip to a version from mid-2020 (or disabling the new resolver) may help, but we should reproduce and fix it on our side as well.
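The two workarounds suggested here can be sketched as pip invocations. This is a hedged sketch only: the function names are hypothetical, the "20.3" bound assumes the new resolver became the default in pip 20.3, and `--use-deprecated=legacy-resolver` is a deprecated switch that may not be available in every pip release. Nothing in this thread confirms that either option resolves the error.

```python
import sys


def legacy_resolver_install(requirements_path, python=sys.executable):
    """Build the argv for installing a requirements file with pip's legacy resolver."""
    return [
        python, "-m", "pip", "install",
        "--use-deprecated=legacy-resolver",  # deprecated flag; availability varies by pip version
        "-r", requirements_path,
    ]


def pin_pip(upper_bound="20.3", python=sys.executable):
    """Build the argv for downgrading pip below the assumed new-resolver release."""
    return [python, "-m", "pip", "install", f"pip<{upper_bound}"]


if __name__ == "__main__":
    # Print the commands rather than running them, so they can be reviewed first.
    print(" ".join(legacy_resolver_install("/home/requirements.txt")))
    print(" ".join(pin_pip()))
```

In the Dockerfile above, either printed command would go in a RUN step before the final requirements install, with python3.9 as the interpreter.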
@lopuhin This seems to have been solved after recent changes on https://github.com/RDFLib/rdflib-jsonld.
Thanks for your support anyway.
Nice, thank you @kalravparsana 👍