
dgnsrekt / nitter_scraper


Scrape Twitter API without authentication using Nitter.

Home Page: https://nitter-scraper.readthedocs.io/

License: MIT License

Python 24.35% HTML 75.09% Makefile 0.56%
twitter javascript python client tweets no-authentication nitter-scraper scrape-tweets profile docker

nitter_scraper's People

Contributors

dgnsrekt


nitter_scraper's Issues

Problem with Installing

Hi, I just got this error when trying to install the library via pip. What can I do?

ERROR: Cannot install nitter_scraper because these package versions have conflicting dependencies.

The conflict is caused by:
docker 4.4.4 depends on pywin32==227; sys_platform == "win32"
docker 4.4.3 depends on pywin32==227; sys_platform == "win32"
docker 4.4.2 depends on pywin32==227; sys_platform == "win32"
docker 4.4.1 depends on pywin32==227; sys_platform == "win32"
docker 4.4.0 depends on pywin32==227; sys_platform == "win32"
docker 4.3.1 depends on pywin32==227; sys_platform == "win32"

markupsafe error when using nitter_scraper as library

Traceback (most recent call last):
 File "/home/r3r/.local/bin/gallery-dl-hydrus", line 8, in <module>
   sys.exit(app())
 File "/home/r3r/.local/pipx/venvs/gallery-dl/lib/python3.9/site-packages/typer/main.py", line 214, in __call__
   return get_command(self)(*args, **kwargs)
 File "/home/r3r/.local/pipx/venvs/gallery-dl/lib/python3.9/site-packages/click/core.py", line 1130, in __call__
   return self.main(*args, **kwargs)
 File "/home/r3r/.local/pipx/venvs/gallery-dl/lib/python3.9/site-packages/click/core.py", line 1055, in main
   rv = self.invoke(ctx)
 File "/home/r3r/.local/pipx/venvs/gallery-dl/lib/python3.9/site-packages/click/core.py", line 1657, in invoke
   return _process_result(sub_ctx.command.invoke(sub_ctx))
 File "/home/r3r/.local/pipx/venvs/gallery-dl/lib/python3.9/site-packages/click/core.py", line 1404, in invoke
   return ctx.invoke(self.callback, **ctx.params)
 File "/home/r3r/.local/pipx/venvs/gallery-dl/lib/python3.9/site-packages/click/core.py", line 760, in invoke
   return __callback(*args, **kwargs)
 File "/home/r3r/.local/pipx/venvs/gallery-dl/lib/python3.9/site-packages/typer/main.py", line 532, in wrapper
   return callback(**use_params)  # type: ignore
 File "/mnt/ac54dceb-73a5-4f94-b52c-cb7a426c0f29/Documents/gallery-dl/gallery_dl/hydrus.py", line 466, in send_url
   jq.put(DataJob(url))
 File "/mnt/ac54dceb-73a5-4f94-b52c-cb7a426c0f29/Documents/gallery-dl/gallery_dl/job.py", line 694, in __init__
   Job.__init__(self, url, parent)
 File "/mnt/ac54dceb-73a5-4f94-b52c-cb7a426c0f29/Documents/gallery-dl/gallery_dl/job.py", line 27, in __init__
   extr = extractor.find(extr)
 File "/mnt/ac54dceb-73a5-4f94-b52c-cb7a426c0f29/Documents/gallery-dl/gallery_dl/extractor/__init__.py", line 197, in find
   for cls in _list_classes():
 File "/mnt/ac54dceb-73a5-4f94-b52c-cb7a426c0f29/Documents/gallery-dl/gallery_dl/extractor/__init__.py", line 241, in _list_classes
   module = __import__(module_name, globals_, None, (), 1)
 File "/mnt/ac54dceb-73a5-4f94-b52c-cb7a426c0f29/Documents/gallery-dl/gallery_dl/extractor/nitter.py", line 9, in <module>
   from nitter_scraper import schema, tweets
 File "/home/r3r/.local/pipx/venvs/gallery-dl/lib/python3.9/site-packages/nitter_scraper/__init__.py", line 1, in <module>
   from nitter_scraper.nitter import NitterScraper
 File "/home/r3r/.local/pipx/venvs/gallery-dl/lib/python3.9/site-packages/nitter_scraper/nitter.py", line 11, in <module>
   from jinja2 import Environment, FileSystemLoader
 File "/home/r3r/.local/pipx/venvs/gallery-dl/lib/python3.9/site-packages/jinja2/__init__.py", line 12, in <module>
   from .environment import Environment
 File "/home/r3r/.local/pipx/venvs/gallery-dl/lib/python3.9/site-packages/jinja2/environment.py", line 25, in <module>
   from .defaults import BLOCK_END_STRING
 File "/home/r3r/.local/pipx/venvs/gallery-dl/lib/python3.9/site-packages/jinja2/defaults.py", line 3, in <module>
   from .filters import FILTERS as DEFAULT_FILTERS  # noqa: F401
 File "/home/r3r/.local/pipx/venvs/gallery-dl/lib/python3.9/site-packages/jinja2/filters.py", line 13, in <module>
   from markupsafe import soft_unicode
ImportError: cannot import name 'soft_unicode' from 'markupsafe' (/home/r3r/.local/pipx/venvs/gallery-dl/lib/python3.9/site-packages/markupsafe/__init__.py)

Based on https://stackoverflow.com/a/72747002/1766261, you have to downgrade markupsafe.

For example:

from setuptools import setup
# ...
setup(
    # ...
    install_requires=[
        # ...
        "nitter-scraper",
        "markupsafe==2.0.1",
        # ...
    ],
    # ...
)
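To confirm whether the installed MarkupSafe still provides `soft_unicode`, one option is to compare its version against 2.1.0, the release that removed that name. The following is a minimal sketch, not part of nitter_scraper or MarkupSafe; `has_soft_unicode` and `installed_markupsafe_version` are hypothetical helper names, and the version parsing assumes a plain `major.minor.patch` string.

```python
from importlib import metadata


def has_soft_unicode(version: str) -> bool:
    """Return True if this MarkupSafe version predates 2.1.0."""
    # Only the major and minor components matter for this check.
    major, minor = (int(part) for part in version.split(".")[:2])
    return (major, minor) < (2, 1)


def installed_markupsafe_version():
    """Return the installed MarkupSafe version string, or None if absent."""
    try:
        return metadata.version("markupsafe")
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    version = installed_markupsafe_version()
    if version is None:
        print("MarkupSafe is not installed in this environment")
    else:
        print(version, "provides soft_unicode:", has_soft_unicode(version))
```

Run it with the same interpreter that raises the ImportError (here, the one inside the pipx venv), otherwise it reports on a different environment.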

Cannot install module: conflict with dependencies

Hi everyone, I have this issue when installing the module with pip3 on my Windows 10 machine:

pip3 install nitter-scraper
WARNING: Ignoring invalid distribution -ip (c:\python310\lib\site-packages)
WARNING: Ignoring invalid distribution -ip (c:\python310\lib\site-packages)
Collecting nitter-scraper
  Using cached nitter_scraper-0.5.0-py3-none-any.whl (12 kB)
Collecting loguru<0.6.0,>=0.5.1
  Using cached loguru-0.5.3-py3-none-any.whl (57 kB)
Collecting requests-html<0.11.0,>=0.10.0
  Using cached requests_html-0.10.0-py3-none-any.whl (13 kB)
Collecting jinja2<3.0.0,>=2.11.2
  Using cached Jinja2-2.11.3-py2.py3-none-any.whl (125 kB)
Collecting pendulum<3.0.0,>=2.1.2
  Using cached pendulum-2.1.2.tar.gz (81 kB)
  Installing build dependencies ... done
  Getting requirements to build wheel ... done
    Preparing wheel metadata ... done
Collecting docker<5.0.0,>=4.3.1
  Using cached docker-4.4.4-py2.py3-none-any.whl (147 kB)
Collecting pydantic<2.0.0,>=1.6.1
  Using cached pydantic-1.10.2-cp310-cp310-win_amd64.whl (2.1 MB)
Collecting websocket-client>=0.32.0
  Using cached websocket_client-1.4.1-py3-none-any.whl (55 kB)
Collecting requests!=2.18.0,>=2.14.2
  Using cached requests-2.28.1-py3-none-any.whl (62 kB)
Collecting six>=1.4.0
  Using cached six-1.16.0-py2.py3-none-any.whl (11 kB)
Collecting docker<5.0.0,>=4.3.1
  Using cached docker-4.4.3-py2.py3-none-any.whl (146 kB)
  Using cached docker-4.4.2-py2.py3-none-any.whl (146 kB)
  Using cached docker-4.4.1-py2.py3-none-any.whl (146 kB)
  Using cached docker-4.4.0-py2.py3-none-any.whl (146 kB)
  Using cached docker-4.3.1-py2.py3-none-any.whl (145 kB)
INFO: pip is looking at multiple versions of <Python from Requires-Python> to determine which version is compatible with other requirements. This could take a while.
INFO: pip is looking at multiple versions of nitter-scraper to determine which version is compatible with other requirements. This could take a while.
Collecting nitter-scraper
  Using cached nitter_scraper-0.4.2-py3-none-any.whl (10 kB)
  Using cached nitter_scraper-0.3.4-py3-none-any.whl (9.9 kB)
  Using cached nitter_scraper-0.3.3-py3-none-any.whl (9.6 kB)
  Using cached nitter_scraper-0.3.2-py3-none-any.whl (7.9 kB)
ERROR: Cannot install nitter-scraper because these package versions have conflicting dependencies.

The conflict is caused by:
    docker 4.4.4 depends on pywin32==227; sys_platform == "win32"
    docker 4.4.3 depends on pywin32==227; sys_platform == "win32"
    docker 4.4.2 depends on pywin32==227; sys_platform == "win32"
    docker 4.4.1 depends on pywin32==227; sys_platform == "win32"
    docker 4.4.0 depends on pywin32==227; sys_platform == "win32"
    docker 4.3.1 depends on pywin32==227; sys_platform == "win32"

To fix this you could try to:
1. loosen the range of package versions you've specified
2. remove package versions to allow pip attempt to solve the dependency conflict

ERROR: ResolutionImpossible: for help visit https://pip.pypa.io/en/latest/user_guide/#fixing-conflicting-dependencies
WARNING: Ignoring invalid distribution -ip (c:\python310\lib\site-packages)
WARNING: Ignoring invalid distribution -ip (c:\python310\lib\site-packages)
WARNING: Ignoring invalid distribution -ip (c:\python310\lib\site-packages)

How can I fix this?
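One way to read the conflict report is to check whether every candidate release carries the same pin. The sketch below parses lines of the form shown in the log and collects the distinct pins; `parse_conflicts` and `distinct_pins` are hypothetical helpers written for this illustration, and the regular expression only assumes the line layout visible above.

```python
import re

# Matches lines such as:
#   docker 4.4.4 depends on pywin32==227; sys_platform == "win32"
CONFLICT_LINE = re.compile(r"^\s*(\S+) (\S+) depends on (\S+?)==(\S+?); (.+)$")


def parse_conflicts(text):
    """Return (package, version, dependency, pin, marker) tuples."""
    matches = map(CONFLICT_LINE.match, text.splitlines())
    return [m.groups() for m in matches if m]


def distinct_pins(text):
    """Return the set of (dependency, pinned version) pairs in the report."""
    return {(dep, pin) for _, _, dep, pin, _ in parse_conflicts(text)}


# Two lines copied from the pip output above.
SAMPLE = '''
    docker 4.4.4 depends on pywin32==227; sys_platform == "win32"
    docker 4.3.1 depends on pywin32==227; sys_platform == "win32"
'''

if __name__ == "__main__":
    print(distinct_pins(SAMPLE))
```

A single distinct pin means that choosing a different docker release inside `<5.0.0,>=4.3.1` cannot resolve the conflict on its own. Another issue in this list reports a working install on Python 3.8 with pywin32 227, which hints that the interpreter version matters here, though that is an inference rather than something the log states.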

How to set this up?

I ran the code

from pprint import pprint

from nitter_scraper import NitterScraper

with NitterScraper(host="0.0.0.0", port=8008) as nitter:
    profile = nitter.get_profile("dgnsrekt")
    print("serialize to json\n")
    print(profile.json(indent=4))
    print("serialize to a dictionary\n")
    pprint(profile.dict())

The errors

python3 index.py 
Traceback (most recent call last):
  File "/usr/lib/python3/dist-packages/urllib3/connectionpool.py", line 665, in urlopen
    httplib_response = self._make_request(
  File "/usr/lib/python3/dist-packages/urllib3/connectionpool.py", line 387, in _make_request
    conn.request(method, url, **httplib_request_kw)
  File "/usr/lib/python3.8/http/client.py", line 1255, in request
    self._send_request(method, url, body, headers, encode_chunked)
  File "/usr/lib/python3.8/http/client.py", line 1301, in _send_request
    self.endheaders(body, encode_chunked=encode_chunked)
  File "/usr/lib/python3.8/http/client.py", line 1250, in endheaders
    self._send_output(message_body, encode_chunked=encode_chunked)
  File "/usr/lib/python3.8/http/client.py", line 1010, in _send_output
    self.send(msg)
  File "/usr/lib/python3.8/http/client.py", line 950, in send
    self.connect()
  File "/home/imran/.local/lib/python3.8/site-packages/docker/transport/unixconn.py", line 43, in connect
    sock.connect(self.unix_socket)
PermissionError: [Errno 13] Permission denied

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/lib/python3/dist-packages/requests/adapters.py", line 439, in send
    resp = conn.urlopen(
  File "/usr/lib/python3/dist-packages/urllib3/connectionpool.py", line 719, in urlopen
    retries = retries.increment(
  File "/usr/lib/python3/dist-packages/urllib3/util/retry.py", line 400, in increment
    raise six.reraise(type(error), error, _stacktrace)
  File "/usr/lib/python3/dist-packages/six.py", line 702, in reraise
    raise value.with_traceback(tb)
  File "/usr/lib/python3/dist-packages/urllib3/connectionpool.py", line 665, in urlopen
    httplib_response = self._make_request(
  File "/usr/lib/python3/dist-packages/urllib3/connectionpool.py", line 387, in _make_request
    conn.request(method, url, **httplib_request_kw)
  File "/usr/lib/python3.8/http/client.py", line 1255, in request
    self._send_request(method, url, body, headers, encode_chunked)
  File "/usr/lib/python3.8/http/client.py", line 1301, in _send_request
    self.endheaders(body, encode_chunked=encode_chunked)
  File "/usr/lib/python3.8/http/client.py", line 1250, in endheaders
    self._send_output(message_body, encode_chunked=encode_chunked)
  File "/usr/lib/python3.8/http/client.py", line 1010, in _send_output
    self.send(msg)
  File "/usr/lib/python3.8/http/client.py", line 950, in send
    self.connect()
  File "/home/imran/.local/lib/python3.8/site-packages/docker/transport/unixconn.py", line 43, in connect
    sock.connect(self.unix_socket)
urllib3.exceptions.ProtocolError: ('Connection aborted.', PermissionError(13, 'Permission denied'))

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/imran/.local/lib/python3.8/site-packages/docker/api/client.py", line 214, in _retrieve_server_version
    return self.version(api_version=False)["ApiVersion"]
  File "/home/imran/.local/lib/python3.8/site-packages/docker/api/daemon.py", line 181, in version
    return self._result(self._get(url), json=True)
  File "/home/imran/.local/lib/python3.8/site-packages/docker/utils/decorators.py", line 46, in inner
    return f(self, *args, **kwargs)
  File "/home/imran/.local/lib/python3.8/site-packages/docker/api/client.py", line 237, in _get
    return self.get(url, **self._set_request_timeout(kwargs))
  File "/usr/lib/python3/dist-packages/requests/sessions.py", line 546, in get
    return self.request('GET', url, **kwargs)
  File "/usr/lib/python3/dist-packages/requests/sessions.py", line 533, in request
    resp = self.send(prep, **send_kwargs)
  File "/usr/lib/python3/dist-packages/requests/sessions.py", line 646, in send
    r = adapter.send(request, **kwargs)
  File "/usr/lib/python3/dist-packages/requests/adapters.py", line 498, in send
    raise ConnectionError(err, request=request)
requests.exceptions.ConnectionError: ('Connection aborted.', PermissionError(13, 'Permission denied'))

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "index.py", line 5, in <module>
    with NitterScraper(host="0.0.0.0", port=8008) as nitter:
  File "/usr/lib/python3.8/contextlib.py", line 113, in __enter__
    return next(self.gen)
  File "/home/imran/.local/lib/python3.8/site-packages/nitter_scraper/nitter.py", line 179, in NitterScraper
    nitter.start()
  File "/home/imran/.local/lib/python3.8/site-packages/nitter_scraper/nitter.py", line 145, in start
    client = self._get_client()
  File "/home/imran/.local/lib/python3.8/site-packages/nitter_scraper/nitter.py", line 28, in _get_client
    cls.client = docker.from_env()
  File "/home/imran/.local/lib/python3.8/site-packages/docker/client.py", line 96, in from_env
    return cls(
  File "/home/imran/.local/lib/python3.8/site-packages/docker/client.py", line 45, in __init__
    self.api = APIClient(*args, **kwargs)
  File "/home/imran/.local/lib/python3.8/site-packages/docker/api/client.py", line 197, in __init__
    self._version = self._retrieve_server_version()
  File "/home/imran/.local/lib/python3.8/site-packages/docker/api/client.py", line 221, in _retrieve_server_version
    raise DockerException(
docker.errors.DockerException: Error while fetching server API version: ('Connection aborted.', PermissionError(13, 'Permission denied'))

I have Docker installed.
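The innermost traceback ends in `sock.connect(self.unix_socket)` raising `PermissionError`, which suggests the Python process found the Docker socket but is not allowed to open it. A minimal sketch of a check follows. It assumes the default socket path `/var/run/docker.sock`, which the traceback does not show, and `can_use_socket` is a hypothetical helper rather than part of the docker package.

```python
import os

# Assumption: the Docker daemon listens on the default Unix socket path.
DEFAULT_SOCKET = "/var/run/docker.sock"


def can_use_socket(path: str = DEFAULT_SOCKET) -> bool:
    """Return True if the path exists and the current user may read and write it."""
    return os.path.exists(path) and os.access(path, os.R_OK | os.W_OK)


if __name__ == "__main__":
    print("socket:", DEFAULT_SOCKET)
    print("exists:", os.path.exists(DEFAULT_SOCKET))
    print("usable by current user:", can_use_socket())
```

If the socket exists but is not usable, the usual causes are that the user is not in the group that owns the socket (commonly `docker`) or that the script needs elevated privileges. Group changes typically take effect only after logging in again.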

Docker installed, but "no module named docker"

Hi
Tried to set this up today on Win11. So far everything seems to have installed. I already have Docker Desktop.

However I'm hitting a wall trying to run a simple scrape (the example from github on the main page).
Running the .py, I get

Traceback (most recent call last):
  File "c:\Scraping\nitter_scraper\test.py", line 3, in <module>
    from nitter_scraper import NitterScraper
  File "c:\Scraping\nitter_scraper\nitter_scraper\__init__.py", line 1, in <module>
    from nitter_scraper.nitter import NitterScraper
  File "c:\Scraping\nitter_scraper\nitter_scraper\nitter.py", line 8, in <module>
    import docker
ModuleNotFoundError: No module named 'docker'

Which is weird because I've literally just run pip install docker, as well as already having Docker Desktop.

Any tips here appreciated. Version mismatch? I ran into an earlier issue where I couldn't pip install nitter_scraper because it didn't like python=3.11. Am now running 3.8. But perhaps it also wants an older docker?

pip list shows:

appdirs             1.4.4
beautifulsoup4      4.12.3
bs4                 0.0.2
certifi             2023.11.17
charset-normalizer  3.3.2
colorama            0.4.6
cssselect           1.2.0
docker              4.4.4
fake-useragent      1.4.0
idna                3.6
importlib-metadata  7.0.1
importlib-resources 6.1.1
Jinja2              2.11.3
loguru              0.5.3
lxml                5.1.0
MarkupSafe          2.1.3
nitter-scraper      0.5.0
parse               1.20.0
pendulum            2.1.2
pip                 23.3.1
pydantic            1.10.13
pyee                8.2.2
pyppeteer           1.0.2
pyquery             2.0.0
python-dateutil     2.8.2
pytzdata            2020.1
pywin32             227
requests            2.31.0
requests-html       0.10.0
setuptools          68.2.2
six                 1.16.0
soupsieve           2.5
tqdm                4.66.1
typing_extensions   4.9.0
urllib3             1.26.18
w3lib               2.1.2
websocket-client    1.7.0
websockets          10.4
wheel               0.41.2
win32-setctime      1.1.0
zipp                3.17.0
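Since `pip list` shows docker 4.4.4 while the import still fails, one possibility is that the script runs under a different interpreter than the one pip installed into. The sketch below prints the interpreter path and reports whether `docker` is importable from it; `module_available` is a hypothetical helper name.

```python
import importlib.util
import sys


def module_available(name: str) -> bool:
    """Return True if the running interpreter can locate the named module."""
    return importlib.util.find_spec(name) is not None


if __name__ == "__main__":
    # Compare this path with the interpreter that `pip` belongs to.
    print("interpreter:", sys.executable)
    print("docker importable:", module_available("docker"))
```

Running `python -m pip list` with the same `python` that runs the script shows the packages for that exact interpreter, which makes a mismatch easy to spot.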
