jonchar / ma-scraper

Scraper for the Metal Archives.

Python 0.20% Jupyter Notebook 99.80%

ma-scraper's Issues

ReviewContent is blank

Hello,

First of all, thank you so much for this. I was trying to make an MA scraper myself, particularly for lyrics and reviews, but couldn't get it to work. Maybe I should have picked a less tricky (and less interesting) site for my first attempt at web scraping!

Anyway, after running MA_review_scraper.py, everything was copied except for the ReviewContent. When I tried a small subset of the code on a sample review, it printed the review successfully:

import requests
from bs4 import BeautifulSoup

url = "https://www.metal-archives.com/reviews/Death/Scream_Bloody_Gore/598/CactusSlaughter/400395"
r = requests.get(url)
html = r.text

# Create a BeautifulSoup object from the HTML
review_soup = BeautifulSoup(html, 'html.parser')
# Title is the first <h3>, with its last 6 characters trimmed off
review_title = review_soup.find_all('h3')[0].text.strip()[:-6]
# Review body is the first <div class="reviewContent">
review = review_soup.find_all('div', {'class': 'reviewContent'})[0].text
print(review)

KeyError: "Passing list-likes to .loc or [] with any missing labels is no longer supported.

Hi Jon, thank you for your brilliant data analyzer for MA :-) However, I seem to have hit a small problem that I'm not sure how to get around:

KeyError: "Passing list-likes to .loc or [] with any missing labels is no longer supported.

The error occurs at:

We have 11 clusters which we can put on a 3x4 grid of plots, and disable the last plot...

code

I'm just running the notebook as it is - no changes, other than an updated .csv file from MA. Everything works like a charm, until the routine gets to clustering.

I've tried for days to figure out what this means and tried numerous things, but to no avail. Please bear with me - I haven't programmed for 18+ years, so I'm a bit rusty.

Hope you can help, since your little program seems really awesome 👍

Best regards
Kim
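
For readers hitting the same message: recent pandas versions raise this KeyError when a list passed to .loc or [] contains a label that is not in the index. The notebook cell in question is not reproduced in this issue, so the following is only a minimal sketch of the two usual workarounds, using a made-up DataFrame and made-up labels rather than the notebook's actual data.

```python
import pandas as pd

# Made-up data for illustration; not taken from the notebook.
df = pd.DataFrame({"count": [10, 20, 30]}, index=["black", "death", "doom"])
wanted = ["black", "doom", "thrash"]  # "thrash" is not in the index

# df.loc[wanted] would raise the KeyError because "thrash" is missing.

# Option 1: keep only the requested labels that actually exist.
present = df.loc[df.index.intersection(wanted)]

# Option 2: keep every requested label; missing rows are filled with NaN.
padded = df.reindex(wanted)

print(present)
print(padded)
```

Which option fits depends on whether the missing labels should be dropped or shown as empty. With an updated .csv file, a label that existed in the older data may simply no longer be present.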

Returning JSON is the same

Hello! I tried to use MA_band_scraper.py to get the full list of bands, but the code crashed with the error json.decoder.JSONDecodeError: Expecting value: line 4 column 11 (char 66). After some checking, I found that the problem is in the response: the value after "sEcho": is empty, which makes the JSON invalid. This is how it looks:

{ 
    "iTotalRecords": 11835,
    "iTotalDisplayRecords": 11835,
    "sEcho": ,
    "aaData": [
        [ 
        "<a href='https://www.metal-archives.com/bands/A_--_Solution/3540442600'>A // Solution</a>", 
        "United States",
... a lot of strings

To fix it, I manually added a value to each such response, since the payload in lines 36-38 does not help. What's worse, the payload does not seem to work at all: every chunk of band data contains the same first 500 bands. The data changes only when the letter changes; for a given letter, every chunk is identical. Here is the code after my slight changes: https://gist.github.com/ramskyi/8d831e561d835ef0659bcfb8788ca4e0
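
The manual repair described above can be sketched roughly as follows. This is not the code from the gist; the function name parse_band_response is hypothetical, the fill-in value 0 is an arbitrary choice, and the sample body is a shortened version of the response shown earlier.

```python
import json
import re

# Matches a "sEcho" key whose value is missing, e.g. `"sEcho": ,`
_EMPTY_SECHO = re.compile(r'("sEcho"\s*:\s*)(?=,)')


def parse_band_response(text):
    """Parse a response body, filling a missing "sEcho" value with 0.

    Hypothetical helper: responses that already carry a value are
    left untouched because the pattern only matches an empty value.
    """
    repaired = _EMPTY_SECHO.sub(r"\g<1>0", text)
    return json.loads(repaired)


# Shortened sample of the malformed response (one band row only).
sample = '''{
    "iTotalRecords": 11835,
    "iTotalDisplayRecords": 11835,
    "sEcho": ,
    "aaData": [["<a href='https://www.metal-archives.com/bands/A_--_Solution/3540442600'>A // Solution</a>", "United States"]]
}'''

data = parse_band_response(sample)
print(data["iTotalRecords"], len(data["aaData"]))
```

This only addresses the invalid JSON. It does nothing about every chunk returning the same 500 bands, which is a separate problem with how the paging parameters are handled.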

Response 403 from request

Hi, I am just getting into web scraping and am experimenting with metal-archives.com. I just found your ma-scraper here and it looks super helpful for learning!

However, the code does not seem to be able to get data from MA right now. It was working just a few days ago, so maybe this is because MA moved their data to a new server? From their homepage:
Maintenance / 2018-10-19 14:50
The site will be migrating to a new server tonight at midnight EDT / 4 am UTC. There will be some downtime. We'll try to make the process as quick and smooth as possible.

Specifically, the problem seems to be with
r = requests.get(BASEURL + RELURL + letter, params=payload)
I am unable to get data through requests; it just returns
<Response [403]>
I have absolutely no idea how to make the scraper work at this point, so I was wondering whether someone more knowledgeable than me, like you, has an idea of what's going on.
Thank you very much for your work!
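
One common cause of a 403 is a server rejecting the default User-Agent that HTTP libraries send. Whether that is what happened here is an assumption; the issue does not confirm it. The sketch below uses the standard library's urllib instead of requests so that it runs without extra packages. RELURL, the payload key, and the User-Agent string are placeholders, not the scraper's real values.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Placeholder values for illustration; the scraper defines its own
# BASEURL, RELURL and payload, which are not shown in this issue.
BASEURL = "https://www.metal-archives.com/"
RELURL = "hypothetical/ajax/path/"


def build_request(letter, payload, user_agent="Mozilla/5.0 (compatible; example)"):
    """Build a GET request that carries an explicit User-Agent header."""
    url = BASEURL + RELURL + letter + "?" + urlencode(payload)
    return Request(url, headers={"User-Agent": user_agent})


req = build_request("A", {"start": 0})  # "start" is a made-up parameter
print(req.full_url)
print(req.get_header("User-agent"))

# To actually send it (not done here, to avoid hitting the site):
# from urllib.request import urlopen
# body = urlopen(req).read()
```

With requests, the same idea is to pass a headers dictionary alongside params, as in requests.get(url, params=payload, headers=headers). If the 403 persists with a browser-like User-Agent, the cause is likely something else, such as rate limiting or other bot protection.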
