instapy / instagram-profilecrawl

Quickly crawl the information (e.g. followers, tags, etc.) of an Instagram profile.

License: MIT License

Python 95.69% Shell 2.60% PowerShell 1.71%
instagram crawler python instapy selenium simple information python-script automation

instagram-profilecrawl's Introduction

InstaPy

Tooling that automates your social media interactions to “farm” Likes, Comments, and Followers on Instagram. Implemented in Python using the Selenium module.

Twitter of InstaPy | Discord Channel | How it works (FreeCodingCamp) |
Talk about automating your Instagram | Talk about doing Open-Source work | Listen to the "Talk Python to me" episode

Newsletter: Sign Up for the Newsletter here!
Guide to Bot Creation: Learn to Build your own Bots


Find the full documentation in Docs

Credits

Community

An active and supportive community is what every open-source project needs to sustain itself. Together we reached every continent and most of the countries in the world!
Thank you all for being part of the InstaPy community ✌️

InstaPy reach

Contributors

This project exists thanks to all the people who contribute. [Contribute].

Backers

Thank you to all our backers! 🙏 [Become a backer]


Disclaimer: Please note that this is a research project. I am by no means responsible for any usage of this tool. Use it at your own risk. I'm also not responsible if your accounts get banned due to extensive use of this tool.

instagram-profilecrawl's People

Contributors

0rc0, alexroan, calpt, dependabot[bot], estebancortero, hyomin14, imansh77, jayphen, justdvl, kusw3, mschrader15, noiob, omarrr, prafulfillment, psh0502, raviriley, rtpharry, tcvieira, timgrossmann, timmoh, tranvansang, valentin0h


instagram-profilecrawl's Issues

Unable to locate element: class name "_mesn5"

Trying to scrape a public Instagram profile, I got this error:


Extracting information from ???
Traceback (most recent call last):
  File "crawl_profile.py", line 28, in <module>
    information = extract_information(browser, username)
  File "C:\Users\Stefan\git-repos\instagram-profilecrawl\util\extractor.py", line 89, in extract_information
    = get_user_info(browser)
  File "C:\Users\Stefan\git-repos\instagram-profilecrawl\util\extractor.py", line 11, in get_user_info
    container = browser.find_element_by_class_name('_mesn5')
  File "C:\Program Files\Python36\lib\site-packages\selenium\webdriver\remote\webdriver.py", line 485, in find_element_by_class_name
    return self.find_element(by=By.CLASS_NAME, value=name)
  File "C:\Program Files\Python36\lib\site-packages\selenium\webdriver\remote\webdriver.py", line 855, in find_element
    'value': value})['value']
  File "C:\Program Files\Python36\lib\site-packages\selenium\webdriver\remote\webdriver.py", line 308, in execute
    self.error_handler.check_response(response)
  File "C:\Program Files\Python36\lib\site-packages\selenium\webdriver\remote\errorhandler.py", line 194, in check_response
    raise exception_class(message, screen, stacktrace)
selenium.common.exceptions.NoSuchElementException: Message: no such element: Unable to locate element: {"method":"class name","selector":"_mesn5"}
  (Session info: headless chrome=63.0.3239.132)
  (Driver info: chromedriver=2.35.528161 (5b82f2d2aae0ca24b877009200ced9065a772e73),platform=Windows NT 10.0.16299 x86_64)

Am I missing something? Why is it using headless chrome? I actually appreciate headless, but I read in another issue that it's apparently not yet supported?
Or is this an issue with Python 3.6?

IndexError: list index out of range - Video post

Hi @timgrossmann

Thanks for your amazing work!
For some reason your app stops working when a profile contains video posts.
The error is:
    img = imgs[1].get_attribute('src')
    IndexError: list index out of range

Do you have any idea about this issue?
Regards

Laurent
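
One possible guard for the failing line, sketched under the assumption that video posts expose fewer than two image elements. `pick_image_src` is a hypothetical helper, not a function from the repository, and it works on a plain list of `src` strings rather than on Selenium elements:

```python
from typing import Optional, Sequence


def pick_image_src(srcs: Sequence[str], index: int = 1) -> Optional[str]:
    """Return srcs[index] if it exists, otherwise None.

    `srcs` stands in for the src attributes collected from the post's
    image elements (hypothetical input; the original code reads imgs[1]).
    """
    if 0 <= index < len(srcs):
        return srcs[index]
    # Fewer entries than expected, e.g. a video post: signal "no image".
    return None
```

The caller can then skip the image field when the result is None instead of crashing on the whole profile.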

Caption and location information

I have modified my local repo to extract caption, location name, and location url for each post. I would love to contribute if this can be considered as an enhancement.

extractor.py

please help!

File "crawl_profile.py", line 27, in <module>
    information = extract_information(browser, username)
  File "/home/kurozone/instagram-profilecrawl/util/extractor.py", line 127, in extract_information
    img, tags, likes, comments = extract_post_info(browser)
  File "/home/kurozone/instagram-profilecrawl/util/extractor.py", line 80, in extract_post_info
    return img, tags, int(likes), int(len(comments) - 1)
TypeError: object of type 'int' has no len()

error code 127, selenium and chromedriver issues on ubuntu 17.04

Hi, this is what I'm getting. I tested on my DigitalOcean droplet and on my local Ubuntu desktop, and both give the same message:

root@ubuntu:/home/aria/instagram-profilecrawl# python3.5 crawl_profile.py ashishegaran
Traceback (most recent call last):
  File "crawl_profile.py", line 17, in <module>
    browser = webdriver.Chrome('./assets/chromedriver', chrome_options=chrome_options)
  File "/usr/local/lib/python3.5/dist-packages/selenium/webdriver/chrome/webdriver.py", line 62, in __init__
    self.service.start()
  File "/usr/local/lib/python3.5/dist-packages/selenium/webdriver/common/service.py", line 96, in start
    self.assert_process_still_running()
  File "/usr/local/lib/python3.5/dist-packages/selenium/webdriver/common/service.py", line 109, in assert_process_still_running
    % (self.path, return_code)
selenium.common.exceptions.WebDriverException: Message: Service ./assets/chromedriver unexpectedly exited. Status code was: 127

At first it said this:

root@ubuntu:/home/aria/instagram-profilecrawl# python3.5 crawl_profile.py ashishegaran
Traceback (most recent call last):
  File "/usr/local/lib/python3.5/dist-packages/selenium/webdriver/common/service.py", line 74, in start
    stdout=self.log_file, stderr=self.log_file)
  File "/usr/lib/python3.5/subprocess.py", line 676, in __init__
    restore_signals, start_new_session)
  File "/usr/lib/python3.5/subprocess.py", line 1282, in _execute_child
    raise child_exception_type(errno_num, err_msg)
FileNotFoundError: [Errno 2] No such file or directory: './assets/chromedriver'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "crawl_profile.py", line 17, in <module>
    browser = webdriver.Chrome('./assets/chromedriver', chrome_options=chrome_options)
  File "/usr/local/lib/python3.5/dist-packages/selenium/webdriver/chrome/webdriver.py", line 62, in __init__
    self.service.start()
  File "/usr/local/lib/python3.5/dist-packages/selenium/webdriver/common/service.py", line 81, in start
    os.path.basename(self.path), self.start_error_message)
selenium.common.exceptions.WebDriverException: Message: 'chromedriver' executable needs to be in PATH. Please see https://sites.google.com/a/chromium.org/chromedriver/home


It seemed to need the folder "assets", which was not there. I created it and then got the first error shown above. I don't know what is happening or why it shows this. I installed selenium with both pip and pip3, and Google Chrome is installed as well as the Linux x64 chromedriver.
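
A small pre-flight check can separate the two failures reported here. This is only a sketch: `find_chromedriver` is a hypothetical helper, and the default path `./assets/chromedriver` is taken from the traceback. It covers the "No such file or directory" case and a missing executable bit. A status code of 127 is commonly reported when the binary exists but cannot be run (for example because of missing shared libraries or a build for the wrong architecture), which this check does not detect.

```python
import os
import shutil
from typing import Iterable, Optional


def find_chromedriver(candidates: Iterable[str] = ("./assets/chromedriver",)) -> Optional[str]:
    """Return the first usable chromedriver path, or None if nothing is found."""
    for path in candidates:
        # The file must exist and carry the executable bit.
        if os.path.isfile(path) and os.access(path, os.X_OK):
            return os.path.abspath(path)
    # Fall back to whatever `chromedriver` is on PATH, if any.
    return shutil.which("chromedriver")
```

If this returns None, neither the assets folder nor PATH contains a runnable driver, which matches the second traceback.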

Followers list?

I used your tool, but it didn't output the followers list. I checked the settings file and found nothing about followers there either. However, the repository description says that the followers list is supported?

the script doesn't get past posts and shows error

root@ubuntu:/home/aria/instagram-profilecrawl# python3.5 crawl_profile.py sabaasafari
Extracting information from sabaasafari
BEFORE IMG
- Could not get information from post: https://www.instagram.com/p/BWc05WaHn8u/?taken-by=sabaasafari
BEFORE IMG
	- Could not get information from post: https://www.instagram.com/p/BWYS2inHYsN/?taken-by=sabaasafari
BEFORE IMG
- Could not get information from post: https://www.instagram.com/p/BWUkEu5HAEe/?taken-by=sabaasafari
BEFORE IMG
- Could not get information from post: https://www.instagram.com/p/BWTDPuxnG79/?taken-by=sabaasafari

Can you test this and see whether it happens on your side as well? If so, can you provide a fix?

We can't use pyvirtualdisplay on Windows

Hi,
On Windows we have a problem with the Display function (from pyvirtualdisplay import Display):
pyvirtualdisplay cannot be used on Windows.
It is just a wrapper that calls Xvfb. Xvfb is a headless display server for the X Window System, and Windows does not use the X Window System.

How can I use this on Windows?
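
One way around this is to start the virtual display only where Xvfb can exist and to rely on Chrome's own headless mode elsewhere. The sketch below assumes "Linux only" as the rule, and `needs_virtual_display` and `optional_display` are hypothetical names, not part of the project:

```python
import platform
from contextlib import contextmanager


def needs_virtual_display(system: str = "") -> bool:
    """True only on Linux, where Xvfb is assumed to be available."""
    return (system or platform.system()) == "Linux"


@contextmanager
def optional_display(system: str = ""):
    """Start a pyvirtualdisplay Display on Linux; do nothing elsewhere."""
    if not needs_virtual_display(system):
        yield None
        return
    # Imported lazily so the module still loads where pyvirtualdisplay is absent.
    from pyvirtualdisplay import Display
    display = Display(visible=0, size=(1280, 800))
    display.start()
    try:
        yield display
    finally:
        display.stop()
```

Wrapping the browser session in `with optional_display():` keeps the same script usable on Windows and Linux.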

WebDriverException: DevToolsActivePort file doesn't exist

I've installed Chrome on EC2 running the Amazon Linux AMI.
When I run crawl_profile.py, a WebDriverException is raised and the script stops.
What is the problem and how can I fix it?
Thank you in advance.
Here's the error message:

Traceback (most recent call last):
  File "crawl_profile.py", line 21, in <module>
    browser = webdriver.Chrome('./assets/chromedriver', chrome_options=chrome_options)
  File "/usr/local/lib/python2.7/site-packages/selenium/webdriver/chrome/webdriver.py", line 75, in __init__
    desired_capabilities=desired_capabilities)
  File "/usr/local/lib/python2.7/site-packages/selenium/webdriver/remote/webdriver.py", line 156, in __init__
    self.start_session(capabilities, browser_profile)
  File "/usr/local/lib/python2.7/site-packages/selenium/webdriver/remote/webdriver.py", line 251, in start_session
    response = self.execute(Command.NEW_SESSION, parameters)
  File "/usr/local/lib/python2.7/site-packages/selenium/webdriver/remote/webdriver.py", line 320, in execute
    self.error_handler.check_response(response)
  File "/usr/local/lib/python2.7/site-packages/selenium/webdriver/remote/errorhandler.py", line 242, in check_response
    raise exception_class(message, screen, stacktrace)
selenium.common.exceptions.WebDriverException: Message: unknown error: DevToolsActivePort file doesn't exist
  (Driver info: chromedriver=2.40.565383 (76257d1ab79276b2d53ee976b2c3e3b9f335cde7),platform=Linux 4.1.7-15.23.amzn1.x86_64 x86_64)

Unable to locate element.

Here's my error. The script ran fine a few times but is now unable to locate this element.

[selenium.webdriver.remote.remote_connection] DEBUG: Finished Request {'sessionId': '202cae480ffc0f2e7f76665155ad5760', 'status': 7, 'value': {'message': 'no such element: Unable to locate element: {"method":"xpath","selector":"//a[contains(@class, "_1cr2e _epyes")]"}\n (Session info: chrome=64.0.3282.140)\n (Driver info: chromedriver=2.35.528157 (4429ca2590d6988c0745c24c8858745aaaec01ef),platform=Mac OS X 10.12.6 x86_64)'}}

absolute import instead of a relative one?

  File "crawl_profile.py", line 5, in <module>
    from .settings import Settings
SystemError: Parent module '' not loaded, cannot perform relative import

I ran into this issue, googled a bit, and changed line 5 to an absolute import:

#!/usr/bin/env python3.5
"""Goes through all usernames and collects their information"""
import json
from util.settings import Settings

works for me

TypeError: 'NoneType' object is not iterable

Getting the following error after scrolling the profile and scraping the first link:

Traceback (most recent call last):
  File "crawl_profile.py", line 33, in <module>
    information, user_commented_list = extract_information(browser, username, limit_amount)
  File "/Users/kevinleahey/Git/instagram-profilecrawl/util/extractor.py", line 225, in extract_information
    caption, location_url, location_name, location_id, lat, lng, img, tags, likes, comments, date, user_commented_list = extract_post_info(browser)
TypeError: 'NoneType' object is not iterable

I looked at the extract_post_info method, but nothing stuck out to me. Any thoughts?
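
This error typically appears when a function falls through an `except` branch without a `return`, so it implicitly returns None and the caller then tries to unpack it. The sketch below reproduces that pattern with a hypothetical two-value stub (the real `extract_post_info` returns twelve values) and shows a guard on the caller's side:

```python
def extract_post_info_stub(fail):
    """Hypothetical stand-in for extract_post_info with a silent except branch."""
    try:
        if fail:
            raise ValueError("element not found")
        return "caption text", 10
    except ValueError:
        print("- Could not get information from post")
        # No return here, so the function implicitly returns None.


def collect(fail):
    """Unpack the result only when the extractor actually produced one."""
    result = extract_post_info_stub(fail)
    if result is None:
        return None  # skip this post instead of raising TypeError
    caption, likes = result
    return {"caption": caption, "likes": likes}
```

Checking for None before unpacking lets the crawl continue with the next post.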

WebDriver Issue

Excuse me, I've updated the webdriver to the newest version for Chrome,
but the same error still happens. How can I fix it?

Traceback (most recent call last):
  File "/Users/edward/instagram-profilecrawl/crawl_profile.py", line 11, in <module>
    browser = webdriver.Chrome('./assets/chromedriver')
  File "/Library/Python/2.7/site-packages/selenium/webdriver/chrome/webdriver.py", line 62, in __init__
    self.service.start()
  File "/Library/Python/2.7/site-packages/selenium/webdriver/common/service.py", line 81, in start
    os.path.basename(self.path), self.start_error_message)
selenium.common.exceptions.WebDriverException: Message: 'chromedriver' executable needs to be in PATH. Please see https://sites.google.com/a/chromium.org/chromedriver/home

Broken after Instagram updated profiles.

The error message is suppressed so I can't post the error beyond:

$python crawl_profile.py john
Waiting 10 sec
Extracting information from john

Error: Couldn't get user profile.
Terminating

Instagram Update - Class Name Issue

Hello,

I keep getting an error when I run extractor.py particularly in line 14. I think Instagram updated their class name because this is the error I get after changing the error code in line 301-304. I'm not sure how to find the correct class name. All help is greatly appreciated!

Message: no such element: Unable to locate element: {"method":"class name","selector":"v9tJq"}

Error-handling code I put in:

    except Exception as e:
        print(e)
        print("\nError: Couldn't get user profile.\nTerminating")
        quit()

Run using firefox on RPi 3

I'd like to run the crawler headless on my RPi3, ideally just using Firefox/geckodriver. Installing chrome on RPi is always kind of a mess. Is there a simple workaround? It should be possible with selenium, right?

unknown error: call function result missing 'value'

Traceback (most recent call last):
  File "C:/Users/kk703.DESKTOP-J939SLP/PycharmProjects/POM/TestScripts/Login_Test.py", line 7, in <module>
    loginPage.login("admin","manager")
  File "C:\Users\kk703.DESKTOP-J939SLP\PycharmProjects\POM\PageClasses\LoginPage.py", line 11, in login
    self.__username.send_keys (user_name)
  File "C:\Users\kk703.DESKTOP-J939SLP\AppData\Local\Programs\Python\Python36-32\lib\site-packages\selenium\webdriver\remote\webelement.py", line 479, in send_keys
    'value': keys_to_typing(value)})
  File "C:\Users\kk703.DESKTOP-J939SLP\AppData\Local\Programs\Python\Python36-32\lib\site-packages\selenium\webdriver\remote\webelement.py", line 628, in _execute
    return self._parent.execute(command, params)
  File "C:\Users\kk703.DESKTOP-J939SLP\AppData\Local\Programs\Python\Python36-32\lib\site-packages\selenium\webdriver\remote\webdriver.py", line 312, in execute
    self.error_handler.check_response(response)
  File "C:\Users\kk703.DESKTOP-J939SLP\AppData\Local\Programs\Python\Python36-32\lib\site-packages\selenium\webdriver\remote\errorhandler.py", line 242, in check_response
    raise exception_class(message, screen, stacktrace)
selenium.common.exceptions.WebDriverException: Message: unknown error: call function result missing 'value'
  (Session info: chrome=65.0.3325.181)
  (Driver info: chromedriver=2.33.506120 (e3e53437346286c0bc2d2dc9aa4915ba81d9023f),platform=Windows NT 10.0.16299 x86_64)

Process finished with exit code 1

Couldn't get user profile.

Hello,

Now I am running straight into errors.

When I execute this script, it returns:

Couldn't get user profile.

Does anyone have an idea?

Bug report

I got this error. I must be the most annoying user of your script, but I can't figure out what I did wrong.

Traceback (most recent call last):
  File "crawl_profile.py", line 27, in <module>
    information = extract_information(browser, username)
  File "/Users/Desktop/instagram-profilecrawl/util/extractor.py", line 117, in extract_information
    img, tags, likes, comments = extract_post_info(browser)
  File "/Users/Desktop/instagram-profilecrawl/util/extractor.py", line 34, in extract_post_info
    img = imgs[1].get_attribute('src')
IndexError: list index out of range

Thanks for the support.

Can't start running

I ran it using nohup and got this error:
/Users/phongyewtong/Desktop/InstaPy-master/chainingExample.py: line 3: syntax error near unexpected token `username='test','
/Users/phongyewtong/Desktop/InstaPy-master/chainingExample.py: line 3: `InstaPy(username='test', password='test')'

Terminate

I'm encountering
"bio:
Error: Couldn't get user profile.
Terminating"
How can I solve this?
I will be grateful for any help you can provide.

Having problem crawling a complete profile

root@ubuntu:/home/aria/instagram-profilecrawl# ./crawl_profile.py behzadshishegaran
Extracting information from behzadshishegaran
BEFORE IMG
BEFORE IMG
BEFORE IMG
BEFORE IMG
BEFORE IMG
BEFORE IMG
BEFORE IMG
BEFORE IMG
BEFORE IMG
BEFORE IMG
BEFORE IMG
BEFORE IMG
BEFORE IMG
BEFORE IMG
BEFORE IMG
BEFORE IMG
BEFORE IMG
Traceback (most recent call last):
  File "./crawl_profile.py", line 27, in <module>
    information = extract_information(browser, username)
  File "/home/aria/instagram-profilecrawl/util/extractor.py", line 128, in extract_information
    img, tags, likes, comments = extract_post_info(browser)
  File "/home/aria/instagram-profilecrawl/util/extractor.py", line 68, in extract_post_info
    while (comments[1].text == 'load more comments'):
IndexError: list index out of range

The script quits without any further information about what happened. What is the problem here?

Logging In?

Sorry if this is spelled out in the documentation (or InstaPy), but I was wondering if there was a way to login during the session. I'd like to get the likes of a friend's pics, but his profile's private - I could access it if I could figure a way to log in during the Headless session. Thanks!

'Service' object has no attribute 'process'

Hey guys, it's me again...

I'm trying to run this on a DigitalOcean Ubuntu droplet, but I get this error when I try to run the script. If someone has any tips, that would be helpful.

image

The page says to run it with Python 3.5. Is that mandatory?

If that's the problem, I'm sorry, but I want to try it with version 2.7.

Thanks guys.

Cannot find Chrome Binary

Hello, I'm trying to run this with Python 3.5 and chromedriver 2.40, but I'm getting the following error:

Traceback (most recent call last):
  File "crawl_profile.py", line 21, in <module>
    browser = webdriver.Chrome('./assets/chromedriver', chrome_options=chrome_options)
  File "/home/psyco/.local/lib/python3.5/site-packages/selenium/webdriver/chrome/webdriver.py", line 75, in __init__
    desired_capabilities=desired_capabilities)
  File "/home/psyco/.local/lib/python3.5/site-packages/selenium/webdriver/remote/webdriver.py", line 156, in __init__
    self.start_session(capabilities, browser_profile)
  File "/home/psyco/.local/lib/python3.5/site-packages/selenium/webdriver/remote/webdriver.py", line 245, in start_session
    response = self.execute(Command.NEW_SESSION, parameters)
  File "/home/psyco/.local/lib/python3.5/site-packages/selenium/webdriver/remote/webdriver.py", line 314, in execute
    self.error_handler.check_response(response)
  File "/home/psyco/.local/lib/python3.5/site-packages/selenium/webdriver/remote/errorhandler.py", line 242, in check_response
    raise exception_class(message, screen, stacktrace)
selenium.common.exceptions.WebDriverException: Message: unknown error: cannot find Chrome binary
  (Driver info: chromedriver=2.40.565383 (76257d1ab79276b2d53ee976b2c3e3b9f335cde7),platform=Linux 4.13.0-45-generic x86_64)

Does anyone know something I can try to fix it?

Minor error: replacing k,m with '000'/'000000'

Inside the code block in utils/extractor.py,
k is replaced with '00', but should be replaced with '000'.
Similarly, m is replaced with '00000', but should be replaced with '000000'.

  followers = int(followers.replace('k', '00').replace('m', '00000'))
  following = infos[2].text.split(' ')[0].replace(',', '').replace('.', '')
  following = int(following.replace('k', '00'))

fixed issue - comments problem

Sorry, I'm not great with GitHub, but there was a problem where comments = 0 is set on line 61 of extractor.py and the code later tries to take its len, which caused:

 return img, tags, int(likes), int(len(comments) - 1)
TypeError: object of type 'int' has no len()

If you just change comments = 0 to comments = [], it works fine.
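
Note that with an empty list the original expression `len(comments) - 1` evaluates to -1. A minimal sketch of a counter that mirrors that expression but clamps at zero follows; `count_comments` is a hypothetical helper, and the clamp is an addition that is not in the original code:

```python
def count_comments(comments):
    """Mirror `len(comments) - 1` from the original line, but never go below 0."""
    return max(len(comments) - 1, 0)
```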

Could not get information from post...

Everything works great. After the script runs it creates the json file and outputs the profile information. However, when it starts crawling individual posts, although I can see the driver scanning the correct post, it fails to grab any information relevant to any post.

To troubleshoot, I added some print statements in the extract_post_info function to see if it was grabbing info. It's able to perform well up to here:

  if len(imgs) >= 2:
    img = imgs[1].get_attribute('src')
    print(img)  # added print statement

After that, I tried adding some print statements here:

  likes = likes.split(' ')

  print("likes is: ", likes)  # my addition

  # count the names if there is no number displayed
  if len(likes) > 2:
    likes = len([word for word in likes if word not in ['and', 'like', 'this']])
    print(likes)  # my addition
  else:
    likes = likes[0]
    likes = likes.replace(',', '').replace('.', '')
    likes = likes.replace('k', '00')

But nothing prints out. I assume, this must be related to the issue of the function not returning anything and thus leading to the except NoSuchElementException: print('- Could not get information from post: ' + link)

What do you think could be wrong? I don't want to alter the code too much since I'm not particularly familiar with selenium.

Any help would be awesome!

Thanks!

Empty Caption

Hello.
Lines 50 to 57 in extractor.py are where the post's caption is read,
but this no longer works properly with the new version of Instagram's HTML.
I fixed this issue. Just replace all of the code in the try block with this line:
caption = post.find_element_by_class_name('gElp9').find_element_by_tag_name('span').text
Good luck :-)

Scrolling Profile no stop

Good morning, people.

I started using instagram-profilecrawl, but I have a problem.

It does not load all the posts.

The message at the prompt is:

...
Scrolling profile 324/380
Scrolling profile 336/380
Scrolling profile 348/380
Scrolling profile 360/380
Scrolling profile 372/380
Scrolling profile 379/380
Scrolling profile 379/380
Scrolling profile 379/380
Scrolling profile 379/380
Scrolling profile 379/380
Scrolling profile 379/380
Scrolling profile 379/380
Scrolling profile 379/380
Scrolling profile 379/380
Scrolling profile 379/380
Scrolling profile 379/380
Scrolling profile 379/380

I already tried increasing the sleep in extractor.py, but it did not help.
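
When the loaded count never reaches the reported total (379/380 above), a loop that waits for equality will not end. A minimal sketch of a stall-aware loop follows. `scroll_until_loaded` and the `scroll_once` callback are hypothetical: the callback is assumed to scroll the page and return the number of posts currently loaded.

```python
from typing import Callable


def scroll_until_loaded(scroll_once: Callable[[], int], total: int, max_stalls: int = 5) -> int:
    """Call scroll_once() until `total` posts are loaded or progress stalls."""
    loaded = 0
    stalls = 0
    while loaded < total and stalls < max_stalls:
        current = scroll_once()
        if current > loaded:
            loaded = current
            stalls = 0  # progress was made, reset the stall counter
        else:
            stalls += 1  # same count as before: one more stalled round
    return loaded
```

With this approach the crawl continues with the posts that did load once the count has stopped changing for `max_stalls` rounds.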

TypeError on save_profile_json()

Traceback (most recent call last):
  File "crawl_profile.py", line 33, in <module>
    Datasaver.save_profile_json(username,information)
TypeError: unbound method save_profile_json() must be called with Datasaver instance as first argument (got str instance instead)

I haven't changed the code so far.
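
This message comes from Python 2, which refuses to call a method defined without `self` directly on the class. Running the script with Python 3 or marking the method as a static method avoids it. The class below is only a sketch: the method body is a hypothetical stand-in, since the real one is not shown in this issue.

```python
import json


class Datasaver:
    @staticmethod
    def save_profile_json(username, information):
        """Serialize the profile to a JSON string (hypothetical body)."""
        return json.dumps({"username": username, "information": information}, sort_keys=True)
```

With `@staticmethod` in place, `Datasaver.save_profile_json(username, information)` works without creating an instance.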

selenium.common.exceptions.WebDriverException: Message: unknown error: Chrome failed to start: exited abnormally

InstaPy is running perfectly, but I'm having trouble starting profilecrawl. Here's the output after trying to start it:

Traceback (most recent call last):
  File "crawl_profile.py", line 18, in <module>
    browser = webdriver.Chrome('./assets/chromedriver', chrome_options=chrome_options)
  File "/home/jwkoch/.local/lib/python2.7/site-packages/selenium/webdriver/chrome/webdriver.py", line 69, in __init__
    desired_capabilities=desired_capabilities)
  File "/home/jwkoch/.local/lib/python2.7/site-packages/selenium/webdriver/remote/webdriver.py", line 98, in __init__
    self.start_session(desired_capabilities, browser_profile)
  File "/home/jwkoch/.local/lib/python2.7/site-packages/selenium/webdriver/remote/webdriver.py", line 188, in start_session
    response = self.execute(Command.NEW_SESSION, parameters)
  File "/home/jwkoch/.local/lib/python2.7/site-packages/selenium/webdriver/remote/webdriver.py", line 256, in execute
    self.error_handler.check_response(response)
  File "/home/jwkoch/.local/lib/python2.7/site-packages/selenium/webdriver/remote/errorhandler.py", line 194, in check_response
    raise exception_class(message, screen, stacktrace)
selenium.common.exceptions.WebDriverException: Message: unknown error: Chrome failed to start: exited abnormally
  (Driver info: chromedriver=2.29.461571 (8a88bbe0775e2a23afda0ceaf2ef7ee74e822cc5),platform=Linux 4.4.0-83-generic x86_64)

Do you have any hints or ideas?

Comments not exceeding 25?

The number of comments is stuck at 25 for me,
like so:
"likes": 1029020,
"comments": 25

Can anyone help me to fix this?

Not working since yesterday

The script has been unable to get information since yesterday. It seems that Instagram has changed the tag and class names on the HTML page?
Error message:
selenium.common.exceptions.NoSuchElementException: Message: no such element: Unable to locate element: {"method":"class name","selector":"_de9bg"}

Incorrect followers/following value

The followers/following value is incorrect for IG accounts that have no decimal in their followers/following count.
(e.g. 630k followers becomes 63000 instead of 630000)

followers = infos[1].text.split(' ')[0].replace(',', '').replace('.', '')
followers = int(followers.replace('k', '00').replace('m', '00000'))

This can be corrected with the following lines of code:

followers = str(infos[1].text.split(' ')[0].replace(',', ''))
if followers.find('.') != -1:
  followers = followers.replace('.', '')
  followers = int(followers.replace('k', '00').replace('m', '00000'))
else:
  followers = int(followers.replace('k', '000').replace('m', '000000'))

following = str(infos[2].text.split(' ')[0].replace(',', ''))
if following.find('.') != -1:
  following = following.replace('.', '')
  following = int(following.replace('k', '00').replace('m', '00000'))
else:
  following = int(following.replace('k', '000').replace('m', '000000'))
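
Appending zeros as text only works when the displayed number has exactly one decimal digit or none. An alternative is to scale the numeric part by its suffix. This is a sketch, not the project's code: `parse_count` is a hypothetical helper, and it assumes English-style formatting (comma as thousands separator, dot as decimal point).

```python
def parse_count(text: str) -> int:
    """Convert a displayed count such as '630k', '1.2m' or '1,234' to an int."""
    cleaned = text.strip().lower().replace(",", "")
    multipliers = {"k": 1000, "m": 1000000}
    if cleaned and cleaned[-1] in multipliers:
        # Scale the numeric part; round() absorbs float noise such as 10.3 * 1000.
        return round(float(cleaned[:-1]) * multipliers[cleaned[-1]])
    return int(cleaned)
```

The same function can be applied to both the followers and the following text, which removes the duplicated branches.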

Read user's bio?

Is there a way to extract a user's bio text and write it into a file?

.... is not clickable at point (943, 933)

Hi folks,

I just wanted to report a problem I ran into and fixed.

I got this error:

Traceback (most recent call last):
  File "./crawl_profile.py", line 30, in <module>
    information = extract_information(browser, username)
  File "/home/pi/instagram/instagram-profilecrawl/util/extractor.py", line 102, in extract_information
    load_button.click()
  File "/usr/local/lib/python2.7/dist-packages/selenium/webdriver/remote/webelement.py", line 78, in click
    self._execute(Command.CLICK_ELEMENT)
  File "/usr/local/lib/python2.7/dist-packages/selenium/webdriver/remote/webelement.py", line 499, in _execute
    return self._parent.execute(command, params)
  File "/usr/local/lib/python2.7/dist-packages/selenium/webdriver/remote/webdriver.py", line 297, in execute
    self.error_handler.check_response(response)
  File "/usr/local/lib/python2.7/dist-packages/selenium/webdriver/remote/errorhandler.py", line 194, in check_response
    raise exception_class(message, screen, stacktrace)
selenium.common.exceptions.WebDriverException: Message: unknown error: Element <a href="/freddyfrog/?max_id=1594149044930567909" class="_1cr2e _epyes">...</a> is not clickable at point (943, 933). Other element would receive the click: <div class="_8c4cy">...</div>
  (Session info: headless chrome=60.0.3112.113)
  (Driver info: chromedriver=2.29 (8e8216e581c512667203931f81c1a1ead47222e5),platform=Linux 4.9.50-v7+ armv7l)

Solution:
The problem is that the page does not seem to be fully loaded yet, so increase the sleep before this statement in utils/extractor.py:

      :
      load_button = body_elem.find_element_by_xpath\
        ('//a[contains(@class, "_1cr2e _epyes")]')
      body_elem.send_keys(Keys.END)
>>>sleep(3)
      load_button.click()
      :
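
A fixed sleep can be too short on slow machines and wasteful on fast ones. One alternative is to retry the click a few times with a short pause in between. The helper below is a generic sketch and not part of the project: `retry` is a hypothetical name and it works with any callable.

```python
import time
from typing import Callable, Tuple, Type


def retry(action: Callable[[], object],
          attempts: int = 5,
          delay: float = 1.0,
          exceptions: Tuple[Type[BaseException], ...] = (Exception,)):
    """Run action(), retrying after `delay` seconds when it raises one of `exceptions`."""
    for attempt in range(1, attempts + 1):
        try:
            return action()
        except exceptions:
            if attempt == attempts:
                raise  # out of attempts: surface the last error
            time.sleep(delay)
```

In the snippet above it would be used as `retry(load_button.click, exceptions=(WebDriverException,))`, with `WebDriverException` imported from `selenium.common.exceptions`.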
