gingeleski / odds-portal-scraper

Sports odds and results scraping for Odds Portal (oddsportal.com).

License: The Unlicense

Python 100.00%
scraper sports-stats sports-data

odds-portal-scraper's Introduction

Odds Portal scraping

This repository contains multiple scraping projects for Odds Portal.

Each one has its own README.md, so look in the directories detailed below.

full_scraper/ is the most comprehensive and will cover most use cases.

Note that all projects here were developed with Python 3.x, so you should run and develop them with Python 3 or later.

Directory name    Description
full_scraper      Will scrape nearly any sport and output as JSON. Most comprehensive and flexible.
soccer_to_sql     Scrapes soccer odds and scores then puts them in a SQLite database.
predictions       Scrapes predictions of users you follow - public or private - and saves them off.

odds-portal-scraper's People

Contributors

agmezr, dependabot[bot], gingeleski

odds-portal-scraper's Issues

Scraping now fails - has oddsportal site changed?

I've re-run some commands using FinalScraper.py which now fail, despite these same commands completing successfully in December.

This is the error I get. The script appears to cycle through pages and gets this error every time.

Page not found
This page not exist on Oddsportal.com!

Sample commands run:

  • scrape_oddsportal_current_season(sport = 'soccer', country = 'england', league = 'premier-league', season = '2022', max_page = 20)
  • scrape_oddsportal_historical(sport = 'soccer', country = 'england', league = 'premier-league', start_season = '2022-2023', nseasons = 1, current_season = 'yes', max_page = 25)
  • scrape_oddsportal_historical(sport = 'soccer', country = 'world', league = 'world-cup', start_season = '2018', nseasons = 1, current_season = 'no', max_page = 2)

I note that the oddsportal site looks different visually and has the word "beta" in the logo.

Has the structure of the oddsportal site changed, causing these scraping failures?
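One way to make this failure visible early is to check each fetched page for the error text quoted above and stop instead of cycling through every page. This is a minimal sketch, not part of the project: the function names are hypothetical, and it assumes the marker strings in the report are what the site returns for a missing page.

```python
# Marker strings copied from the error text quoted in this issue.
NOT_FOUND_MARKERS = (
    "Page not found",
    "This page not exist on Oddsportal.com!",
)


def looks_like_not_found(html):
    """Return True if the page body contains one of the not-found markers."""
    return any(marker in html for marker in NOT_FOUND_MARKERS)


def first_missing_page(pages):
    """Return the 1-based number of the first not-found page, or None."""
    for number, html in enumerate(pages, start=1):
        if looks_like_not_found(html):
            return number
    return None
```

If the very first page is already reported as missing, the URL pattern itself has probably changed, rather than the season simply having fewer pages than max_page.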

No JSON object could be decoded

Hi, thanks for a great resource. However, I'm having some difficulties with running the scraper. I get the following error. I've tried searching for what could be the problem, but nothing has worked so far.

(venv) 5072AB1C:odds-portal-scraper-master admin$ python /Users/admin/Desktop/odds-portal-scraper-master/run.py
Traceback (most recent call last):
  File "/Users/admin/Desktop/odds-portal-scraper-master/run.py", line 19, in <module>
    match_scraper = Scraper(json_str, initialize_db)
  File "/Users/admin/Desktop/odds-portal-scraper-master/Scraper.py", line 28, in __init__
    self.league = self.parse_json(league_json)
  File "/Users/admin/Desktop/odds-portal-scraper-master/Scraper.py", line 42, in parse_json
    return json.loads(json_str)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/json/__init__.py", line 338, in loads
    return _default_decoder.decode(s)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/json/decoder.py", line 366, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/json/decoder.py", line 384, in raw_decode
    raise ValueError("No JSON object could be decoded")
ValueError: No JSON object could be decoded
(venv) 5072AB1C:odds-portal-scraper-master admin$ 
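Two things stand out in this traceback. The interpreter paths point at Python 2.7, while the README states the projects were developed with Python 3.x. The message itself means json.loads received a string that is not valid JSON, and an empty string is one common cause. The sketch below is an assumption about how the parsing step could report that more clearly; the function name and messages are hypothetical and not taken from Scraper.py.

```python
import json


def parse_league_json(json_str):
    """Parse league JSON, raising a descriptive error on bad input."""
    if not json_str or not json_str.strip():
        raise ValueError(
            "League JSON is empty; check that the input file exists and was read"
        )
    try:
        return json.loads(json_str)
    except ValueError as exc:
        # On Python 3, json.JSONDecodeError is a subclass of ValueError.
        raise ValueError("League JSON is not valid JSON: {}".format(exc))
```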

Breaking change in dependency joblib

In the requirements.txt files, joblib==0.13.2 needs to change to joblib==1.1.0, and Cython==0.29.30 needs to be added.

Before making the change, I got this error:

(venv) ➜  full_scraper git:(master) ✗ python op.py --help
Traceback (most recent call last):
  File "/Users/brentbrewington/project-files-github/odds-portal-scraper/full_scraper/op.py", line 8, in <module>
    from joblib import delayed
  File "/Users/brentbrewington/project-files-github/odds-portal-scraper/full_scraper/venv/lib/python3.9/site-packages/joblib/__init__.py", line 119, in <module>
    from .parallel import Parallel
  File "/Users/brentbrewington/project-files-github/odds-portal-scraper/full_scraper/venv/lib/python3.9/site-packages/joblib/parallel.py", line 28, in <module>
    from ._parallel_backends import (FallbackToBackend, MultiprocessingBackend,
  File "/Users/brentbrewington/project-files-github/odds-portal-scraper/full_scraper/venv/lib/python3.9/site-packages/joblib/_parallel_backends.py", line 22, in <module>
    from .executor import get_memmapping_executor
  File "/Users/brentbrewington/project-files-github/odds-portal-scraper/full_scraper/venv/lib/python3.9/site-packages/joblib/executor.py", line 14, in <module>
    from .externals.loky.reusable_executor import get_reusable_executor
  File "/Users/brentbrewington/project-files-github/odds-portal-scraper/full_scraper/venv/lib/python3.9/site-packages/joblib/externals/loky/__init__.py", line 12, in <module>
    from .backend.reduction import set_loky_pickler
  File "/Users/brentbrewington/project-files-github/odds-portal-scraper/full_scraper/venv/lib/python3.9/site-packages/joblib/externals/loky/backend/reduction.py", line 125, in <module>
    from joblib.externals import cloudpickle  # noqa: F401
  File "/Users/brentbrewington/project-files-github/odds-portal-scraper/full_scraper/venv/lib/python3.9/site-packages/joblib/externals/cloudpickle/__init__.py", line 3, in <module>
    from .cloudpickle import *
  File "/Users/brentbrewington/project-files-github/odds-portal-scraper/full_scraper/venv/lib/python3.9/site-packages/joblib/externals/cloudpickle/cloudpickle.py", line 152, in <module>
    _cell_set_template_code = _make_cell_set_template_code()
  File "/Users/brentbrewington/project-files-github/odds-portal-scraper/full_scraper/venv/lib/python3.9/site-packages/joblib/externals/cloudpickle/cloudpickle.py", line 133, in _make_cell_set_template_code
    return types.CodeType(
TypeError: an integer is required (got type bytes)

After making the change the error went away:

(venv) ➜  full_scraper git:(master) ✗ python op.py --help            
usage: op.py [-h] [--number-of-cpus [NUMBER_OF_CPUS]]
             [--wait-time-on-page-load [WAIT_TIME_ON_PAGE_LOAD]]

oddsporter v1.0

optional arguments:
  -h, --help            show this help message and exit
  --number-of-cpus [NUMBER_OF_CPUS]
                        Number parallel CPUs for processing (default -1 for max available)
  --wait-time-on-page-load [WAIT_TIME_ON_PAGE_LOAD]
                        How many seconds to wait on page load (default 3)
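The pin change described in this issue can be expressed as a small helper that rewrites `name==version` lines. This is only an illustrative sketch: the function name is hypothetical, and the `requests==2.0` line in the usage example is a made-up placeholder, not a line from the project's requirements.txt.

```python
def update_pins(lines, new_pins):
    """Return requirement lines with `name==version` pins replaced or appended."""
    remaining = dict(new_pins)
    updated = []
    for line in lines:
        name = line.split("==", 1)[0].strip()
        if "==" in line and name in remaining:
            # Replace the existing pin with the requested version.
            updated.append("{}=={}".format(name, remaining.pop(name)))
        else:
            updated.append(line)
    # Append any requested pins that were not already present.
    updated.extend("{}=={}".format(n, v) for n, v in remaining.items())
    return updated


# Usage with the versions named in this issue (the second input line is a placeholder).
new_lines = update_pins(
    ["joblib==0.13.2", "requests==2.0"],
    {"joblib": "1.1.0", "Cython": "0.29.30"},
)
```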

Missing config file?

The config directory and config file sports.json appear to be missing. Is this the case?

Could not read environment variable ODDS_PORTAL_USERNAME

I got this error:

(venv) c:\Helper\odds-portal-scraper\predictions>python scraper.py
Traceback (most recent call last):
  File "scraper.py", line 52, in main
    username = os.environ['ODDS_PORTAL_USERNAME']
  File "C:\Users\1\AppData\Local\Programs\Python\Python38\lib\os.py", line 673, in __getitem__
    raise KeyError(key) from None
KeyError: 'ODDS_PORTAL_USERNAME'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "scraper.py", line 135, in <module>
    asyncio.get_event_loop().run_until_complete(main())
  File "C:\Users\1\AppData\Local\Programs\Python\Python38\lib\asyncio\base_events.py", line 612, in run_until_complete
    return future.result()
  File "scraper.py", line 54, in main
    raise RuntimeError('Could not read environment variable ODDS_PORTAL_USERNAME')
RuntimeError: Could not read environment variable ODDS_PORTAL_USERNAME

What should I fix?
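The traceback shows that scraper.py reads ODDS_PORTAL_USERNAME from os.environ and the variable is not set in the shell that launched it. A minimal sketch of such a lookup with a more actionable message follows. The helper name `require_env` is hypothetical and not part of the project, and `your_name` is a placeholder value.

```python
import os


def require_env(name, env=None):
    """Return the value of an environment variable or raise a descriptive error."""
    env = os.environ if env is None else env
    value = env.get(name)
    if not value:
        raise RuntimeError(
            "Could not read environment variable {0}; set it before running, "
            "e.g. `set {0}=your_name` in cmd.exe or "
            "`export {0}=your_name` in a POSIX shell".format(name)
        )
    return value
```

In the Windows command prompt shown above, the variable has to be set in the same session before running `python scraper.py`.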

Write new proof-of-concept scraper for user predictions

This issue is prompted by an email I received from a fan of this project named Dmitry.

He'd written some scraping code to grab data of a user he follows on Odds Portal who makes private predictions. The results from his code seemed unstable.

I'm tackling this use case out of curiosity for how my approach might be different to his. I haven't looked at this project in several years, really, and write scrapers differently now than I used to. 🥇

The way I see it, whether a user making predictions is public or private doesn't really affect the scraping approach. My proof-of-concept logic should apply either way.

Pseudo-code works out to...

Sign in
Go to your user profile
Go to "following"
Collect list of handicappers you're following
For each handicapper in the list...
	Get that handicapper's next predictions - https://www.oddsportal.com/profile/OldTwinTowersFutbol/my-predictions/next/
	For each page of predictions...
		ACTION: Save screenshot
		For each prediction...
			ACTION: Get sport
			ACTION: Get region
			ACTION: Get league
			ACTION: Get start time
			ACTION: Get game name
			ACTION: Get game specifier
			ACTION: Get link to the game on Odds Portal
			ACTION: Get outcome odds
			ACTION: Get picked outcome
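The pseudo-code above could be sketched in Python roughly as follows. This is an assumption-laden outline, not the project's implementation: the `Prediction` fields mirror the ACTION lines, `fetch_pages` and `parse_page` are hypothetical callables standing in for the browser automation and HTML parsing, and sign-in and screenshot capture are left out. Only the URL pattern comes from the example link above.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Prediction:
    """One scraped prediction; fields follow the ACTION list above."""
    sport: str
    region: str
    league: str
    start_time: str
    game_name: str
    game_specifier: str
    game_url: str
    outcome_odds: List[float]
    picked_outcome: str


def next_predictions_url(handicapper):
    """Build the 'next predictions' URL for a followed handicapper."""
    return "https://www.oddsportal.com/profile/{}/my-predictions/next/".format(
        handicapper
    )


def collect_predictions(handicappers, fetch_pages, parse_page):
    """Map each handicapper to the predictions found across all their pages.

    fetch_pages(url) yields page contents; parse_page(page) returns a list
    of Prediction objects. Both are supplied by the caller (hypothetical).
    """
    results = {}
    for handicapper in handicappers:
        predictions = []
        for page in fetch_pages(next_predictions_url(handicapper)):
            predictions.extend(parse_page(page))
        results[handicapper] = predictions
    return results
```

Passing the fetching and parsing steps in as callables keeps the loop structure testable without a browser or a signed-in session.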

JSON output to include DRAW when the game has finished after OT/pen. (hockey)

Hi,

Firstly, many thanks for your project.

Currently, for games that finish after OT/pen., the output is either HOME or AWAY, depending on who wins after OT/pen. Would it be possible to add "DRAW" as the outcome (e.g. for hockey) when a game is tied after regulation time? The output would still include the final score, but the match outcome would be "DRAW".

Many thanks for your time and consideration!
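The requested behaviour could look something like the sketch below. It is a hypothetical illustration only: the function name, the `score` and `outcome` keys, and the assumption that the regulation-time score is available separately from the final score are not taken from the project's JSON format.

```python
def build_result(final_score, regulation_score):
    """Return the final score plus an outcome based on regulation time.

    Both scores are (home, away) tuples. A game tied after regulation is
    reported as DRAW even if it was later decided in OT or on penalties.
    """
    home, away = regulation_score
    if home == away:
        outcome = "DRAW"
    elif home > away:
        outcome = "HOME"
    else:
        outcome = "AWAY"
    return {"score": final_score, "outcome": outcome}
```

For a hockey game that ends 3-2 after overtime, the regulation score of 2-2 yields "DRAW" while the reported score stays 3-2.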
