tryolabs / requestium

Integration layer between Requests and Selenium for automation of web actions.

License: BSD 3-Clause "New" or "Revised" License

Python 100.00%
web-automation requests selenium python interface

requestium's Introduction

Requestium


Requestium is a Python library that merges the power of Requests, Selenium, and Parsel into a single integrated tool for automating web actions.

The library was created for writing web automation scripts that mostly use Requests but can seamlessly switch to Selenium for the JavaScript-heavy parts of a website, while maintaining the session.

Requestium adds independent improvements to both Requests and Selenium, and every new feature is lazily evaluated, so it's useful even when writing scripts that use only Requests or only Selenium.

Read more about the motivation behind creating this library in this blog post.

Features

  • Enables switching between a Requests' Session and a Selenium webdriver while maintaining the current web session.
  • Integrates Parsel's parser into the library, making xpath, css, and regex much cleaner to write.
  • Improves Selenium's handling of dynamically loading elements.
  • Makes cookie handling more flexible in Selenium.
  • Makes clicking elements in Selenium more reliable.
  • Supports Chromedriver natively, and lets you plug in a custom webdriver.

Installation

pip install requestium

You should then download your preferred Selenium webdriver if you plan to use the Selenium part of Requestium, such as Chromedriver.

Usage

First create a session as you would do on Requests, and optionally add arguments for the web-driver if you plan to use one.

from requestium import Session, Keys

options = {'arguments': ['headless']}
s = Session(webdriver_path='./chromedriver', default_timeout=15, webdriver_options=options)

Since headless mode is common, there's a shortcut for it: specify headless=True.

from requestium import Session, Keys

s = Session(webdriver_path='./chromedriver', headless=True)

You can also create a Selenium webdriver outside Requestium and have it use that instead:

from selenium import webdriver
from requestium import Session, Keys

firefox_driver = webdriver.Firefox()

s = Session(driver=firefox_driver)

You can also specify a third-party Chrome webdriver class and use it by specifying the browser argument as well. This allows you, for example, to use Selenium-Wire to get the XHR requests of a web page:

from seleniumwire import webdriver
from requestium import Session, Keys

seleniumwire_driver = webdriver.Chrome()

s = Session(webdriver_path='./chromedriver', driver=seleniumwire_driver)

You don't need to parse the response; it is done automatically when calling xpath, css or re.

title = s.get('http://samplesite.com').xpath('//title/text()').extract_first(default='Default Title')

Regexes require less boilerplate than they do with Python's standard re module.

response = s.get('http://samplesite.com/sample_path')

# Extracts the first match
identifier = response.re_first(r'ID_\d\w\d', default='ID_1A1')

# Extracts all matches as a list
users = response.re(r'user_\d\d\d')
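For comparison, here is a rough standard-library equivalent of the "first match or default" pattern. The helper name re_first_stdlib and the sample HTML string are hypothetical and not part of Requestium or Parsel; the sketch only mirrors the idea of returning a default when nothing matches.

```python
import re


def re_first_stdlib(pattern, text, default=None):
    """Hypothetical helper: first regex match in text, or default."""
    match = re.search(pattern, text)
    if match is None:
        return default
    # Return the first capture group when the pattern defines one,
    # otherwise the whole matched string.
    return match.group(1) if match.groups() else match.group(0)


# Made-up page content, used only for illustration.
html = "<p>user_123 and user_456 logged in with ID_2B3</p>"

identifier = re_first_stdlib(r"ID_\d\w\d", html, default="ID_1A1")
missing = re_first_stdlib(r"TOKEN_\d+", html, default="none")
users = re.findall(r"user_\d\d\d", html)
```

With plain re, the None check and the group handling have to be repeated at every call site, which is the boilerplate the response methods above avoid.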

The Session object is just a regular Requests session object, so you can use all of its methods.

s.post('http://www.samplesite.com/sample', data={'field1': 'data1'})
s.proxies.update({'http': 'http://10.11.4.254:3128', 'https': 'https://10.11.4.252:3128'})

And you can switch to using the Selenium webdriver to run any js code.

s.transfer_session_cookies_to_driver()  # You can maintain the session if needed
s.driver.get('http://www.samplesite.com/sample/process')

The driver object is a Selenium webdriver object, so you can use any of the normal selenium methods plus new methods added by Requestium.

s.driver.find_element("xpath", "//input[@class='user_name']").send_keys('James Bond', Keys.ENTER)

# New methods which wait for element to load instead of failing, useful for single page web apps
s.driver.ensure_element("xpath", "//div[@attribute='button']").click()
s.driver.ensure_element_by_xpath("//div[@attribute='button']").click()

Requestium also adds xpath, css, and re methods to the Selenium driver object.

if s.driver.re(r'ID_\d\w\d some_pattern'):
    print('Found it!')

And finally you can switch back to using Requests.

s.transfer_driver_cookies_to_session()
s.post('http://www.samplesite.com/sample2', data={'key1': 'value1'})

Selenium workarounds

Requestium adds several 'ensure' methods to the driver object, as Selenium is known to be very finicky about selecting elements and cookie handling.

Wait for element

The ensure_element and ensure_element_by_ methods wait for the element to be loaded in the browser and return it as soon as it loads. They're named after Selenium's find_element and find_element_by_ methods (which immediately raise an exception if they can't find the element).

Requestium can wait for an element to be in any of the following states:

  • present (default)
  • clickable
  • visible
  • invisible (useful for things like waiting for loading... gifs to disappear)

These methods are very useful for single page web apps where the site is dynamically changing its elements. We usually end up completely replacing our find_element and find_element_by_ calls with ensure_element and ensure_element_by_ calls as they are more flexible.
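Conceptually, waiting for an element comes down to polling a condition until it holds or a timeout expires. The following is a minimal standard-library sketch of that idea, not Requestium's implementation: wait_until, find_button, and the simulated lookup counter are all hypothetical names, and with a real driver Selenium's WebDriverWait plays this role.

```python
import time


def wait_until(predicate, timeout=5.0, poll_interval=0.1):
    """Hypothetical helper: poll predicate until it returns a truthy value.

    Returns that value, or raises TimeoutError once timeout seconds pass.
    """
    deadline = time.monotonic() + timeout
    while True:
        result = predicate()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError("condition not met within %.1f s" % timeout)
        time.sleep(poll_interval)


# Simulated page: the "element" only shows up on the third lookup.
lookups = {"count": 0}


def find_button():
    lookups["count"] += 1
    return "<button>" if lookups["count"] >= 3 else None


element = wait_until(find_button, timeout=2.0, poll_interval=0.01)
```

A find-style call corresponds to a single lookup that fails immediately, while an ensure-style call keeps retrying until the timeout.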

Elements you get using these methods have the new ensure_click method which makes the click less prone to failure. This helps with getting through a lot of the problems with Selenium clicking.

s.driver.ensure_element("xpath", "//li[@class='b1']", state='clickable', timeout=5).ensure_click()

# === We also added these methods named in accordance to Selenium's api design ===
# ensure_element_by_id
# ensure_element_by_name
# ensure_element_by_xpath
# ensure_element_by_link_text
# ensure_element_by_partial_link_text
# ensure_element_by_tag_name
# ensure_element_by_class_name
# ensure_element_by_css_selector

Add cookie

The ensure_add_cookie method makes adding cookies much more robust. Selenium needs the browser to be at the cookie's domain before it can add the cookie, so this method offers several workarounds. If the browser is not at the cookie's domain, it GETs the domain before adding the cookie. It also allows you to override the domain before adding the cookie, which avoids making this GET. The domain can be overridden to '', which sets the cookie's domain to whatever domain the driver is currently on.

If it can't add the cookie, it tries to add it with a less restrictive domain (e.g. home.site.com -> site.com) before failing.

cookie = {"domain": "www.site.com",
          "secure": False,
          "value": "sd2451dgd13",
          "expiry": 1516824855.759154,
          "path": "/",
          "httpOnly": True,
          "name": "sessionid"}
s.driver.ensure_add_cookie(cookie, override_domain='')
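The fallback to a less restrictive domain can be pictured with a small sketch. This is only an illustration under simplifying assumptions: the helper name less_restrictive_domains is hypothetical, and the naive split on dots does not understand multi-part suffixes such as co.uk, which a real implementation would need to handle.

```python
def less_restrictive_domains(domain):
    """Hypothetical helper: yield domain, then each parent domain.

    Stops before the bare top-level label. This naive split does not know
    about multi-part public suffixes such as "co.uk".
    """
    labels = domain.lstrip(".").split(".")
    # Always yield at least the domain itself (e.g. "localhost").
    for start in range(max(len(labels) - 1, 1)):
        yield ".".join(labels[start:])


candidates = list(less_restrictive_domains("home.site.com"))
print(candidates)
```

A caller could try adding the cookie with each candidate in turn and only raise once the last one has been rejected.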

Considerations

New features are lazily evaluated, meaning:

  • The Selenium webdriver process is only started if you call the driver object. So if you don't need to use the webdriver, you could use the library with no overhead. Very useful if you just want to use the library for its integration with Parsel.
  • Parsing of the responses is only done if you call the xpath, css, or re methods of the response. So again there is no overhead if you don't need to use this feature.
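The lazy start-up described above is commonly implemented with a property that creates the expensive object on first access. The sketch below shows that general pattern only; LazySession and fake_driver_factory are hypothetical stand-ins, not Requestium's actual classes.

```python
class LazySession:
    """Hypothetical stand-in for a session that owns an expensive driver."""

    def __init__(self, driver_factory):
        self._driver_factory = driver_factory
        self._driver = None

    @property
    def driver(self):
        # The factory runs only on first access; later accesses reuse it.
        if self._driver is None:
            self._driver = self._driver_factory()
        return self._driver


started = []


def fake_driver_factory():
    # Stands in for launching a real browser process.
    started.append("browser process")
    return object()


session = LazySession(fake_driver_factory)
started_before_access = len(started)  # nothing has been launched yet
first = session.driver                # first access triggers the factory
second = session.driver               # second access reuses the same object
```

Code that never touches session.driver never pays for the browser launch.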

A byproduct of this is that the Selenium webdriver can be used simply as a tool to ease the development of regular Requests code: you can start writing your script using just the Requests session and, at the last step of the script (the one you are currently working on), transfer the session to the Chrome webdriver. This way, a Chrome process starts on your machine and acts as a real-time "visor" for the last step of your code. You can see what state your session is currently in, inspect it with Chrome's excellent inspection tools, and decide what next step your session object should take. This is very useful for trying code in an IPython interpreter and seeing how the site reacts in real time.

When transfer_driver_cookies_to_session is called, Requestium automatically updates your Requests session user-agent to match that of the browser used in Selenium. This doesn't happen when running Requests without having switched from a Selenium session first though. So if you just want to run Requests but want it to use your browser's user agent instead of the default one (which sites love to block), just run:

s.copy_user_agent_from_driver()

Take into account that doing this will launch a browser process.

Note: The Selenium Chrome webdriver doesn't support automatic transfer of proxies from the Session to the Webdriver at the moment.
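One possible workaround is to pass the proxy to Chrome yourself through the webdriver arguments. The sketch below assumes Chrome's --proxy-server command-line switch and the 'arguments' key of webdriver_options shown earlier; the helper chrome_proxy_arguments is hypothetical, and proxies that need credentials are not handled.

```python
from urllib.parse import urlsplit


def chrome_proxy_arguments(proxies):
    """Hypothetical helper: build Chrome arguments from a Requests proxies dict.

    Uses Chrome's --proxy-server switch. Credentials embedded in the proxy
    URL are ignored here.
    """
    rules = []
    for url_scheme in ("http", "https"):
        proxy_url = proxies.get(url_scheme)
        if not proxy_url:
            continue
        parts = urlsplit(proxy_url)
        host = parts.hostname
        if parts.port is not None:
            host = "%s:%d" % (host, parts.port)
        # Rule format: <url-scheme>=<proxy-scheme>://<host>[:<port>]
        rules.append("%s=%s://%s" % (url_scheme, parts.scheme, host))
    return ["--proxy-server=" + ";".join(rules)] if rules else []


proxies = {"http": "http://10.11.4.254:3128", "https": "https://10.11.4.252:3128"}
options = {"arguments": ["headless"] + chrome_proxy_arguments(proxies)}
# options could then be passed as Session(..., webdriver_options=options),
# alongside s.proxies.update(proxies) for the Requests side.
```

Keeping both sides in one proxies dict means the Requests session and the browser are configured from the same source.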

Comparison with Requests + Selenium + lxml

A silly working example of a script that runs on Reddit. We'll then show how it compares to using Requests + Selenium + lxml instead of Requestium.

Using Requestium

from requestium import Session, Keys

# If you want requestium to type your username in the browser for you, write it in here:
reddit_user_name = ''

s = Session('./chromedriver', default_timeout=15)
s.driver.get('http://reddit.com')
s.driver.find_element("xpath", "//a[@href='https://www.reddit.com/login']").click()

print('Waiting for elements to load...')
s.driver.ensure_element("class name", "desktop-onboarding-sign-up__form-toggler",
                        state='visible').click()

if reddit_user_name:
    s.driver.ensure_element('id', 'user_login').send_keys(reddit_user_name)
    s.driver.ensure_element('id', 'passwd_login').send_keys(Keys.BACKSPACE)
print('Please log-in in the chrome browser')

s.driver.ensure_element("class name", "desktop-onboarding__title", timeout=60, state='invisible')
print('Thanks!')

if not reddit_user_name:
    reddit_user_name = s.driver.xpath("//span[@class='user']//text()").extract_first()

if reddit_user_name:
    s.transfer_driver_cookies_to_session()
    response = s.get("https://www.reddit.com/user/{}/".format(reddit_user_name))
    cmnt_karma = response.xpath("//span[@class='karma comment-karma']//text()").extract_first()
    reddit_golds_given = response.re_first(r"(\d+) gildings given out")
    print("Comment karma: {}".format(cmnt_karma))
    print("Reddit golds given: {}".format(reddit_golds_given))
else:
    print("Couldn't get user name")

Using Requests + Selenium + lxml

import re
from lxml import etree
from requests import Session
from selenium import webdriver
from selenium.common.exceptions import TimeoutException
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

# If you want requestium to type your username in the browser for you, write it in here:
reddit_user_name = ''

driver = webdriver.Chrome('./chromedriver')
driver.get('http://reddit.com')
driver.find_element("xpath", "//a[@href='https://www.reddit.com/login']").click()

print('Waiting for elements to load...')
WebDriverWait(driver, 15).until(
    EC.visibility_of_element_located((By.CLASS_NAME, "desktop-onboarding-sign-up__form-toggler"))
).click()

if reddit_user_name:
    WebDriverWait(driver, 15).until(
        EC.presence_of_element_located((By.ID, 'user_login'))
    ).send_keys(reddit_user_name)
    driver.find_element('id', 'passwd_login').send_keys(Keys.BACKSPACE)
print('Please log-in in the chrome browser')

try:
    WebDriverWait(driver, 3).until(
        EC.presence_of_element_located((By.CLASS_NAME, "desktop-onboarding__title"))
    )
except TimeoutException:
    pass
WebDriverWait(driver, 60).until(
    EC.invisibility_of_element_located((By.CLASS_NAME, "desktop-onboarding__title"))
)
print('Thanks!')

if not reddit_user_name:
    tree = etree.HTML(driver.page_source)
    try:
        reddit_user_name = tree.xpath("//span[@class='user']//text()")[0]
    except IndexError:
        reddit_user_name = None

if reddit_user_name:
    s = Session()
    # Reddit will think we are a bot if we have the wrong user agent
    selenium_user_agent = driver.execute_script("return navigator.userAgent;")
    s.headers.update({"user-agent": selenium_user_agent})
    for cookie in driver.get_cookies():
        s.cookies.set(cookie['name'], cookie['value'], domain=cookie['domain'])
    response = s.get("https://www.reddit.com/user/{}/".format(reddit_user_name))
    try:
        cmnt_karma = etree.HTML(response.content).xpath(
            "//span[@class='karma comment-karma']//text()")[0]
    except IndexError:
        cmnt_karma = None
    match = re.search(r"(\d+) gildings given out", str(response.content))
    if match:
        reddit_golds_given = match.group(1)
    else:
        reddit_golds_given = None
    print("Comment karma: {}".format(cmnt_karma))
    print("Reddit golds given: {}".format(reddit_golds_given))
else:
    print("Couldn't get user name")

Similar Projects

This project intends to be a drop-in replacement for Requests' Session object, with added functionality. If your use case is a drop-in replacement for a Selenium webdriver that also has some of Requests' functionality, Selenium-Requests does just that.

License

Copyright © 2018, Tryolabs. Released under the BSD 3-Clause.

requestium's People

Contributors

bmos, calfzhou, delirious-lettuce, dependabot[bot], fabalbertoni, gerhc, joaqo, joeld1, jschnurr, lordjabez, marksmayo, spyridonlaz, wtgg


requestium's Issues

socks5

How can I use a SOCKS proxy in this script?
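A possible approach for the Requests side, given that the Session is a regular Requests session: Requests accepts socks5:// and socks5h:// proxy URLs when installed with its SOCKS extra (pip install requests[socks]). The helper socks5_proxies and the 127.0.0.1:1080 address below are hypothetical placeholders. This does not configure the Selenium browser, which would need its own proxy settings.

```python
def socks5_proxies(host, port, remote_dns=True):
    """Hypothetical helper: build a Requests proxies dict for a SOCKS5 proxy.

    socks5h resolves hostnames on the proxy, socks5 resolves them locally.
    """
    scheme = "socks5h" if remote_dns else "socks5"
    url = "%s://%s:%d" % (scheme, host, port)
    return {"http": url, "https": url}


# Placeholder address; replace with the real proxy host and port.
proxies = socks5_proxies("127.0.0.1", 1080)
# With a session s created as in the README:
# s.proxies.update(proxies)
```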

error with transfer_session_cookies_to_driver()

When I try to use transfer_session_cookies_to_driver() to transfer my session to a driver I get the following error. Seems like this is a bug with the lib?

  File "src\gevent\greenlet.py", line 716, in gevent._greenlet.Greenlet.run
  File "C:\Users\Abdos\AppData\Local\Programs\Python\Python36\lib\site-packages\eel\__init__.py", line 191, in _process_message
    return_val = _exposed_functions[message['name']](*message['args'])
  File "C:\Users\Abdos\Documents\GitHub\Bol.com-AIbot\src\main.py", line 113, in bot_start
    s.transfer_session_cookies_to_driver()
  File "C:\Users\Abdos\AppData\Local\Programs\Python\Python36\lib\site-packages\requestium\requestium.py", line 114, in transfer_session_cookies_to_driver
    'expiry': c.expires, 'domain': c.domain})
  File "C:\Users\Abdos\AppData\Local\Programs\Python\Python36\lib\site-packages\requestium\requestium.py", line 235, in ensure_add_cookie
    self.add_cookie(cookie)
  File "C:\Users\Abdos\AppData\Local\Programs\Python\Python36\lib\site-packages\selenium\webdriver\remote\webdriver.py", line 894, in add_cookie
    self.execute(Command.ADD_COOKIE, {'cookie': cookie_dict})
  File "C:\Users\Abdos\AppData\Local\Programs\Python\Python36\lib\site-packages\selenium\webdriver\remote\webdriver.py", line 321, in execute
    self.error_handler.check_response(response)
  File "C:\Users\Abdos\AppData\Local\Programs\Python\Python36\lib\site-packages\selenium\webdriver\remote\errorhandler.py", line 242, in check_response
    raise exception_class(message, screen, stacktrace)
selenium.common.exceptions.InvalidArgumentException: Message: invalid argument: invalid 'expiry'
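A possible explanation, which is an assumption and not confirmed in this thread: drivers that follow the W3C WebDriver specification expect a cookie's expiry to be an integer, so a None or float value can be rejected as an invalid argument. If that is the cause, cleaning the cookie dict before handing it to the driver may help. The helper name normalize_cookie_expiry is hypothetical.

```python
def normalize_cookie_expiry(cookie):
    """Hypothetical helper: return a copy of cookie with a driver-friendly expiry.

    Drops a missing (None) expiry and truncates a float expiry to whole seconds.
    The input dict is left unchanged.
    """
    cleaned = dict(cookie)
    expiry = cleaned.pop("expiry", None)
    if expiry is not None:
        cleaned["expiry"] = int(expiry)
    return cleaned


float_cookie = {"name": "sessionid", "value": "sd2451dgd13", "expiry": 1516824855.759154}
session_cookie = {"name": "csrftoken", "value": "abc", "expiry": None}

fixed_float = normalize_cookie_expiry(float_cookie)
fixed_session = normalize_cookie_expiry(session_cookie)
```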

How to add chromeoptions like selenium?

how to add chromeoptions like selenium?

from selenium import webdriver

options = webdriver.ChromeOptions()
prefs = {
    'profile.default_content_setting_values': {
        'images': 2
    }
}
options.add_experimental_option('prefs', prefs)
driver = webdriver.Chrome(chrome_options=options)
driver.get("http://www.google.com/")
#driver.quit()

Why can't I use request features also in selenium?

If I start a requestium instance using a proxy applied for the request session like:
s.proxies.update({'http': 'https://10.11.4.254:3128', 'https': 'https://10.11.4.252:3128'})

The proxy does not seem to apply for the selenium instance. Why is that?
I thought the whole point of requestium was to merge requests and selenium together. Am I doing something wrong?

The version dependency of selenium has not been updated

I noticed that the requestium 0.2.0 code has updated its selenium version requirement to >=4.6.0. However, when running the pip install command, the information retrieved still shows a requirement of <4.0.0 and >=3.7.0.

ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
requestium 0.2.0 requires selenium<4.0.0,>=3.7.0, but you have selenium 4.8.3 which is incompatible.

Message: invalid cookie domain: Cookie 'domain' mismatch

Traceback (most recent call last):
File "e:\moi soft\4ej\check3.py", line 82, in
s.driver.ensure_add_cookie(cook, override_domain='')
File "C:\Users\Micha\AppData\Local\Programs\Python\Python310\lib\site-packages\requestium\requestium.py", line 279, in ensure_add_cookie
self.add_cookie(cookie)
File "C:\Users\Micha\AppData\Local\Programs\Python\Python310\lib\site-packages\selenium\webdriver\remote\webdriver.py", line 894, in add_cookie
self.execute(Command.ADD_COOKIE, {'cookie': cookie_dict})
File "C:\Users\Micha\AppData\Local\Programs\Python\Python310\lib\site-packages\selenium\webdriver\remote\webdriver.py", line 321, in execute
self.error_handler.check_response(response)
File "C:\Users\Micha\AppData\Local\Programs\Python\Python310\lib\site-packages\selenium\webdriver\remote\errorhandler.py", line 242, in check_response
raise exception_class(message, screen, stacktrace)
selenium.common.exceptions.InvalidCookieDomainException: Message: invalid cookie domain: Cookie 'domain' mismatch
(Session info: chrome=103.0.5060.114)

webdriver is on the correct domain but can't add cookie :(

Switch cookies from Session to driver not working on Instagram

Hi all,

This is more a question than an issue.
I'm trying to implement requestium to do some operations between the Session object and the driver object on Instagram.com.
As I successfully manage to login inside the website through Session, it seems that it blocks me in switching to the driver in order to have the already logged in page.
This is the error that the console prints me

DevTools listening on ws://127.0.0.1:12779/devtools/browser/73b726a1-520f-4d08-8982-a9ada65ead2b
Traceback (most recent call last):
  File "requestiumTest.py", line 124, in <module>
    s.transfer_session_cookies_to_driver()  # You can maintain the session if needed
  File "C:\Users\gforcell\AppData\Local\Continuum\anaconda3\lib\site-packages\requestium\requestium.py", line 114, in transfer_session_cookies_to_driver
    'expiry': c.expires, 'domain': c.domain})
  File "C:\Users\gforcell\AppData\Local\Continuum\anaconda3\lib\site-packages\requestium\requestium.py", line 235, in ensure_add_cookie
    self.add_cookie(cookie)
  File "C:\Users\gforcell\AppData\Local\Continuum\anaconda3\lib\site-packages\selenium\webdriver\remote\webdriver.py", line 872, in add_cookie
    self.execute(Command.ADD_COOKIE, {'cookie': cookie_dict})
  File "C:\Users\gforcell\AppData\Local\Continuum\anaconda3\lib\site-packages\selenium\webdriver\remote\webdriver.py", line 312, in execute
    self.error_handler.check_response(response)
  File "C:\Users\gforcell\AppData\Local\Continuum\anaconda3\lib\site-packages\selenium\webdriver\remote\errorhandler.py", line 242, in check_response
    raise exception_class(message, screen, stacktrace)
selenium.common.exceptions.WebDriverException: Message: unable to set cookie
  (Session info: chrome=66.0.3359.139)
  (Driver info: chromedriver=2.36.540470 (e522d04694c7ebea4ba8821272dbef4f9b818c91),platform=Windows NT 10.0.15063 x86_64)

And this is all the script to reproduce it, if interested

https://pastebin.com/TwEn9T5B

pip has wrong dependencies, rendering requestium useless

Hi.

When trying to "from requestium" or read the doc the error "problem in requestium - AttributeError: module 'selenium.webdriver' has no attribute 'PhantomJS'" is triggered.

Uncommenting class RequestiumPhantomJS in requestium/requestium.py solves the error.

I can see this was fixed in b87baf4 but pip just installs selenium-4.1.3:

user@smally~ $ pip install --no-cache-dir requestium
Defaulting to user installation because normal site-packages is not writeable
Collecting requestium
  Downloading requestium-0.1.9-py2.py3-none-any.whl (17 kB)
Requirement already satisfied: tldextract>=2.1.0 in ./.local/lib/python3.10/site-packages (from requestium) (3.2.0)
Requirement already satisfied: requests>=2.18.1 in /usr/lib/python3.10/site-packages (from requestium) (2.27.1)
Requirement already satisfied: parsel>=1.2.0 in ./.local/lib/python3.10/site-packages (from requestium) (1.6.0)
Collecting selenium>=3.7.0
  Downloading selenium-4.1.3-py3-none-any.whl (968 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 968.8/968.8 KB 3.3 MB/s eta 0:00:00

This error occurs on both Ubuntu 20.04.4 LTS and Clear Linux, both with the latest upgrades as of 12 March 2022.

To avoid the issue, this can be used:

user@smally~ $ pip install requestium 'selenium>=3.7.0,<4.0.0'

Else this is the default behavior a user will experience .. using pip

user@smally~ $ pip install requestium
Defaulting to user installation because normal site-packages is not writeable
Collecting requestium
  Using cached requestium-0.1.9-py2.py3-none-any.whl (17 kB)
Requirement already satisfied: parsel>=1.2.0 in ./.local/lib/python3.10/site-packages (from requestium) (1.6.0)
Requirement already satisfied: requests>=2.18.1 in /usr/lib/python3.10/site-packages (from requestium) (2.27.1)
Requirement already satisfied: tldextract>=2.1.0 in ./.local/lib/python3.10/site-packages (from requestium) (3.2.0)
Requirement already satisfied: selenium>=3.7.0 in ./.local/lib/python3.10/site-packages (from requestium) (4.1.3)
Requirement already satisfied: w3lib>=1.19.0 in ./.local/lib/python3.10/site-packages (from parsel>=1.2.0->requestium) (1.22.0)
Requirement already satisfied: lxml in /usr/lib/python3.10/site-packages (from parsel>=1.2.0->requestium) (4.8.0)
Requirement already satisfied: cssselect>=0.9 in ./.local/lib/python3.10/site-packages (from parsel>=1.2.0->requestium) (1.1.0)
Requirement already satisfied: six>=1.6.0 in /usr/lib/python3.10/site-packages (from parsel>=1.2.0->requestium) (1.16.0)
Requirement already satisfied: urllib3<1.27,>=1.21.1 in /usr/lib/python3.10/site-packages (from requests>=2.18.1->requestium) (1.26.8)
Requirement already satisfied: certifi>=2017.4.17 in /usr/lib/python3.10/site-packages (from requests>=2.18.1->requestium) (2021.10.8)
Requirement already satisfied: charset_normalizer~=2.0.0 in /usr/lib/python3.10/site-packages (from requests>=2.18.1->requestium) (2.0.12)
Requirement already satisfied: idna<4,>=2.5 in /usr/lib/python3.10/site-packages (from requests>=2.18.1->requestium) (3.3)
Requirement already satisfied: trio-websocket~=0.9 in ./.local/lib/python3.10/site-packages (from selenium>=3.7.0->requestium) (0.9.2)
Requirement already satisfied: trio~=0.17 in ./.local/lib/python3.10/site-packages (from selenium>=3.7.0->requestium) (0.20.0)
Requirement already satisfied: requests-file>=1.4 in /usr/lib/python3.10/site-packages (from tldextract>=2.1.0->requestium) (1.5.1)
Requirement already satisfied: filelock>=3.0.8 in /usr/lib/python3.10/site-packages (from tldextract>=2.1.0->requestium) (3.6.0)
Requirement already satisfied: sniffio in ./.local/lib/python3.10/site-packages (from trio~=0.17->selenium>=3.7.0->requestium) (1.2.0)
Requirement already satisfied: attrs>=19.2.0 in /usr/lib/python3.10/site-packages (from trio~=0.17->selenium>=3.7.0->requestium) (21.4.0)
Requirement already satisfied: async-generator>=1.9 in ./.local/lib/python3.10/site-packages (from trio~=0.17->selenium>=3.7.0->requestium) (1.10)
Requirement already satisfied: sortedcontainers in /usr/lib/python3.10/site-packages (from trio~=0.17->selenium>=3.7.0->requestium) (2.4.0)
Requirement already satisfied: outcome in ./.local/lib/python3.10/site-packages (from trio~=0.17->selenium>=3.7.0->requestium) (1.1.0)
Requirement already satisfied: wsproto>=0.14 in ./.local/lib/python3.10/site-packages (from trio-websocket~=0.9->selenium>=3.7.0->requestium) (1.1.0)
Requirement already satisfied: pyOpenSSL>=0.14 in ./.local/lib/python3.10/site-packages (from urllib3<1.27,>=1.21.1->requests>=2.18.1->requestium) (22.0.0)
Requirement already satisfied: cryptography>=1.3.4 in ./.local/lib/python3.10/site-packages (from urllib3<1.27,>=1.21.1->requests>=2.18.1->requestium) (36.0.1)
Requirement already satisfied: PySocks!=1.5.7,<2.0,>=1.5.6 in ./.local/lib/python3.10/site-packages (from urllib3<1.27,>=1.21.1->requests>=2.18.1->requestium) (1.7.1)
Requirement already satisfied: cffi>=1.12 in /usr/lib/python3.10/site-packages (from cryptography>=1.3.4->urllib3<1.27,>=1.21.1->requests>=2.18.1->requestium) (1.15.0)
Requirement already satisfied: h11<1,>=0.9.0 in ./.local/lib/python3.10/site-packages (from wsproto>=0.14->trio-websocket~=0.9->selenium>=3.7.0->requestium) (0.13.0)
Requirement already satisfied: pycparser in /usr/lib/python3.10/site-packages (from cffi>=1.12->cryptography>=1.3.4->urllib3<1.27,>=1.21.1->requests>=2.18.1->requestium) (2.21)
Installing collected packages: requestium
Successfully installed requestium-0.1.9

user@smally~ $ pydoc3 requestium
problem in requestium - AttributeError: module 'selenium.webdriver' has no attribute 'PhantomJS'

user@smally~ $ python3.10
Python 3.10.2 (main, Feb  9 2022, 15:58:08) [GCC 11.2.1 20220209 releases/gcc-11.2.0-744-gec01f11091] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from requestium import Session, Keys
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/chel/.local/lib/python3.10/site-packages/requestium/__init__.py", line 1, in <module>
    from .requestium import Session
  File "/home/chel/.local/lib/python3.10/site-packages/requestium/requestium.py", line 404, in <module>
    class RequestiumPhantomJS(DriverMixin, webdriver.PhantomJS):
AttributeError: module 'selenium.webdriver' has no attribute 'PhantomJS'
>>> 

How to use with Firefox?

I really need firefox. This is the library of my DREAMS but I need firefox... any help? please? pretty please?

I write software for business production and chromedriver is NEVER up to date and NEVER supports all versions... org updates Chrome and I have no say... gecko always works...

👋 From the Selenium project!

At the Selenium Project we want to collaborate with you and work together to improve the WebDriver ecosystem. We would like to meet you, understand your pain points, and discuss ideas around Selenium and/or WebDriver.

If you are interested, please fill out the form below and we will reach out to you.
https://forms.gle/Z72BmP4FTsM1GKgE6

We are looking forward to hearing from you!

PS: Feel free to close this issue, it was just meant as a way to reach out to you 😄

Allow the Requestium session to replace its Selenium driver with an external driver instance

In codebases that already have lots of Selenium code, but that need some of Requestium's features, it could be useful to plug an already running Selenium driver instance into a Requestium Session.

Things to consider:

  • We need to add some methods to this external object for it to have all of the features Requestium adds. Probably the best way to do this would be to iterate over the methods in the DriverMixin class and add them to the external object. Probably something like this should work: http://zderadicka.eu/dynamically-mix-in-methods-into-an-instance-in-python/
  • We need to make sure not to overwrite any methods that have the same name in the external object, except when the external object is a RequestiumPhantomJS or RequestiumChrome object. This way we can switch drivers between different Requestium Session objects. I don't think this is a particularly useful thing to do, but we should still allow it to happen.
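The two considerations above could be sketched roughly as follows. This is an illustration of the idea rather than the code that ended up in the library: the DriverMixin and ExternalDriver classes here are simplified stand-ins with made-up method bodies, and mix_in_methods is a hypothetical helper.

```python
import types


class DriverMixin:
    """Stand-in for Requestium's mixin; the method bodies are illustrative."""

    def ensure_element(self, locator):
        return "ensured %s" % locator

    def get(self, url):
        return "mixin get"


class ExternalDriver:
    """Stand-in for an already running Selenium driver."""

    def get(self, url):
        return "external get"


def mix_in_methods(instance, mixin, overwrite=False):
    """Bind the mixin's public methods to one instance, not to its class."""
    for name, func in vars(mixin).items():
        if name.startswith("_") or not isinstance(func, types.FunctionType):
            continue
        if hasattr(instance, name) and not overwrite:
            continue  # keep the external object's own method
        setattr(instance, name, types.MethodType(func, instance))
    return instance


driver = mix_in_methods(ExternalDriver(), DriverMixin)
```

Here driver.ensure_element is added, while driver.get still resolves to the external object's own method; passing overwrite=True would cover the case of handing over a driver that already carries Requestium's methods.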

The need for this use case was brought up by @limpair in #29 so thanks a lot for your contributions!

avoid GET when using transfer_session_cookies_to_driver()

"It also allows you to override the domain before adding it, and avoid making this GET"
How do I avoid making GET to the domain when I transfer session cookies to the driver? or is it possible?

The problem I'm facing is that after I obtain access to domain A through a requests session, I want to pass the session cookies from requests to the webdriver, but it raises the error "selenium.common.exceptions.InvalidCookieDomainException: Message: invalid cookie domain: Cookie 'domain' mismatch" because I couldn't access domain A with the driver before passing the session cookies to it.

Thanks!

Release tags

The repo has tags at the commits of versions through 0.2.1 but is lacking 0.3.0 and 0.4.0.
It looks like the missing tags are:

Commit Hash    Tag
0b9945f        v0.3.0
3f535ed        v0.4.0

requestiumResponse.xpath doesn't respect requests.Response's encoding setting

s = Session(webdriver_path='./chromedriver',
            browser='chrome',
            default_timeout=15,
            webdriver_options={
                'arguments': ['headless']
            })
a = s.get('https://www.baidu.com/')
print(a.encoding)
# After setting the encoding to utf-8, a.text is OK, but a.xpath is still unreadable.
a.encoding = 'utf-8'
print(a.text)  # here is OK
print(a.xpath('//text()[normalize-space() and not(ancestor::script | ancestor::style)]')
      .extract())  # unrecognizable characters here.

External webdriver outside Requestium and Selenium Wire

In recent commits, an interesting feature has appeared - the use of a Selenium webdriver outside Requestium.
But to do this, I have to do some cumbersome acrobatics in my scripts, completely replacing the _start_chrome_browser() and _start_chrome_headless_browser() methods, if I want to use some of my webdriver_options.

It seems to me that it would be sufficient to simply replace the webdriver.Chrome dependency in the RequestiumChrome class definition.
That is, in my script I have now written:

import requestium
import seleniumwire.webdriver  # I want to use this webdriver

# Here I replaced the common `webdriver` with `seleniumwire.webdriver`
class RequestiumChrome(requestium.requestium.DriverMixin, seleniumwire.webdriver.Chrome):
    pass

requestium.requestium.RequestiumChrome = RequestiumChrome

# After that, I create a regular Requestium session
self.s = requestium.Session(
    webdriver_path='chromedriver', browser='chrome', default_timeout=60,
    webdriver_options={
        'arguments': [
            '--start-maximized',
            '--window-size=1200,1000',
        ],
        # 'binary_location': "/usr/bin/google-chrome"
    })

But this is also a cumbersome construct in my code.

So I think the external webdriver can be added to requestium.py with something like:

webdriver = driver_class   # could be assigned via an initial argument
class RequestiumChrome(requestium.requestium.DriverMixin, webdriver):
    pass
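
The suggestion above amounts to building the driver class dynamically from a caller-supplied base class. Here is a rough sketch of that idea using stand-in classes so that it runs on its own. `DriverMixin` is a dummy here, and `make_driver_class` and `FakeChrome` are hypothetical names, not part of Requestium.

```python
class DriverMixin:
    """Stand-in for Requestium's mixin; only here so the sketch runs alone."""

    def ensure_ready(self):
        return f"{type(self).__name__} ready"


class FakeChrome:
    """Stand-in for webdriver.Chrome or seleniumwire.webdriver.Chrome."""

    def __init__(self, label="chrome"):
        self.label = label


def make_driver_class(base_class, name="RequestiumChrome"):
    # Equivalent to writing: class RequestiumChrome(DriverMixin, base_class): pass
    return type(name, (DriverMixin, base_class), {})


RequestiumChrome = make_driver_class(FakeChrome)
driver = RequestiumChrome(label="wire")
print(driver.ensure_ready(), driver.label)
```

With this shape, the base class becomes an ordinary argument, so swapping in another webdriver implementation would not require monkeypatching the module.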

Out of Date Requirements

Can this be refactored to work with updated versions of each package found in the requirements.txt?

I'm unfortunately unable to use this package anymore since its requirements are out of date :(

Thank you!

DeprecationWarning: use options instead of chrome_options

Selenium has deprecated the use of chrome_options in favor of options when creating a WebDriver (see the source code here). This results in the following warnings in Requestium:

/requestium/requestium/requestium.py:194: DeprecationWarning: use options instead of chrome_options
  super(DriverMixin, self).__init__(*args, **kwargs)
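
The warning comes from passing `chrome_options=` to the Selenium driver constructor. A minimal sketch of a compatibility shim that renames the keyword before the call is shown below. `normalize_chrome_kwargs` is a hypothetical helper, not something Requestium or Selenium ships.

```python
def normalize_chrome_kwargs(kwargs):
    """Return a copy of kwargs with the deprecated `chrome_options` key renamed to `options`."""
    fixed = dict(kwargs)
    if "chrome_options" in fixed:
        if "options" in fixed:
            raise TypeError("pass either 'options' or 'chrome_options', not both")
        fixed["options"] = fixed.pop("chrome_options")
    return fixed


# A plain object stands in for a real ChromeOptions instance.
opts = object()
print(sorted(normalize_chrome_kwargs({"chrome_options": opts, "executable_path": "./chromedriver"})))
```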

Thank you for making this tool

When I got to know Requestium, I found it a really good tool to use instead of Requests, but sometimes it does not work very well on Chinese websites. I hope it will be updated in 2019. Thanks very much.

How to set the Chrome path

I set 'binary_location': os.path.abspath('D:\ProgramFile\Chrome\chrome.exe'), but it does not work.
The exception is selenium.common.exceptions.WebDriverException: Message: unknown error: cannot find Chrome binary
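
Two things are worth ruling out with Windows paths like this one: backslash escapes in Python string literals, and a path that simply does not point at a file. The sketch below shows both checks. The paths are made up for illustration, and the path in the report happens to contain no recognized escape sequences, so escaping may not be the cause there.

```python
import os

# In a normal string literal, \t and \n are turned into a tab and a newline.
plain = 'C:\tools\new'
# A raw string keeps every backslash as written.
raw = r'C:\tools\new'

print('\t' in plain, '\n' in plain)
print(raw.count('\\'))


def binary_exists(path):
    """Check that the path points at an existing file before handing it to the driver."""
    return os.path.isfile(path)


print(binary_exists(r'Z:\no\such\chrome.exe'))
```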

Empty cookies <RequestsCookieJar[]>

Hi there,
Just testing requestium
from requestium import Session

url = "https://{sign_in_endpoint}"
login = "xxx"
password = "xxx"

s = Session("./chromedriver", browser="chrome", default_timeout=15)
response = s.post(url, data={"login": login, "password": password})
print(response.status_code)  # Ensure that the POST was successful - OK
print(response.text)  # Ensure that the POST body was correct (JSON with user data in my case) - OK
print(s.cookies)  # <- Here I'm trying to see what is in the session - <RequestsCookieJar[]>
s.transfer_session_cookies_to_driver()
s.driver.get({resource homepage})  # Here I expect a logged-in user on the respective page

No errors while running

It looks like I'm missing something obvious related to cookie management/transfer here?
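
If the sign-in endpoint returns its credentials only in the JSON body (for example as a token) and sends no Set-Cookie header, the cookie jar stays empty and there is nothing to transfer to the driver. Below is a small standard-library sketch for inspecting a header value such as the one returned by `response.headers.get('Set-Cookie')`. Whether this is what happens with the endpoint in the report is an assumption; `cookie_names` and the sample header are illustrative.

```python
from http.cookies import SimpleCookie


def cookie_names(set_cookie_header):
    """Return the cookie names found in a single Set-Cookie header value."""
    if not set_cookie_header:
        return []
    jar = SimpleCookie()
    jar.load(set_cookie_header)
    return list(jar.keys())


# A response that sets a session cookie.
print(cookie_names("sessionid=abc123; Path=/; HttpOnly"))
# A response with no Set-Cookie header at all.
print(cookie_names(None))
```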

Why not synchronize the request headers between Selenium and Requests?

I found that the request headers are not synchronized. When I use 's.headers.update' to update a header, the s.get() method works well, but the s.driver.get() method does not work satisfactorily, because the updated header is not applied there. Maybe the two could be kept in sync one day!
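
Selenium's WebDriver API has no general call for setting request headers, so only some headers can be mirrored in the browser. As a sketch, the User-Agent can be passed to Chrome as a command-line argument built from the session's headers. `chrome_args_from_headers` is a hypothetical helper, and the resulting list is meant for `webdriver_options={'arguments': [...]}` as used in other reports on this page.

```python
def chrome_args_from_headers(headers):
    """Build Chrome arguments for the headers Chrome can take as flags (User-Agent only)."""
    args = []
    for name, value in headers.items():
        if name.lower() == "user-agent":
            args.append(f"--user-agent={value}")
    return args


headers = {"User-Agent": "my-script/1.0", "X-Custom": "not transferable this way"}
print(chrome_args_from_headers(headers))
```

Other headers would need a different mechanism, such as a proxy or a browser extension, which is outside what this sketch covers.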

How to get the changed tag name?

When I open http://product.dangdang.com/20528119.html, the tag name is "加入购物车" (Add to cart); 5 seconds later, it changes to "不再销售" (No longer for sale). I wrote the code below, but it returns None. How can I get the changed tag name? Thank you.

from requestium import Session, Keys
import time
s = Session(webdriver_path='chromedriver/chromedriver.exe',
            browser='chrome',
            default_timeout=15,
            webdriver_options={'arguments': ['headless']})

scode = s.get('http://product.dangdang.com/20528119.html')
tagname = s.driver.ensure_element_by_xpath("//div[@class='buy_box_btn']",state = 'invisible',timeout = 5).ensure_element_by_class_name()
print(tagname)
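
Note that `s.get` fetches the page with Requests; the browser only sees the page after `s.driver.get(...)`. Beyond that, waiting for text to change is essentially polling until a value differs from its first reading. The following is a library-independent sketch: `wait_for_change` and the fake reader are hypothetical, and in a real script `read_value` might wrap reading an element's text through `s.driver`.

```python
import time


def wait_for_change(read_value, initial, timeout=10.0, interval=0.1):
    """Poll read_value() until it differs from `initial`; return the new value, or None on timeout."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        current = read_value()
        if current != initial:
            return current
        time.sleep(interval)
    return None


# Fake reader whose value changes after three calls.
readings = iter(["Add to cart"] * 3 + ["No longer for sale"] * 100)
print(wait_for_change(lambda: next(readings), "Add to cart", timeout=2, interval=0.01))
```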

About Requestium with two-step authentication

Dear Requestium team,

Thank you for your project Requestium, I think this is the thing I need.
I am using the Requests library to call an API with this simple code:

 import requests
 s = requests.Session()
 s.post(api_Auth, headers=token)

But now the website has added a new feature that requires Face ID when signing in.
I tried the following steps:

 import time

 from selenium import webdriver
 from requestium import Session, Keys

 token = {"authorization": 'jb206YW5oZW'}
 chrome_driver = webdriver.Chrome()
 s = Session(driver=chrome_driver)
 response = s.driver.get('https://sampleweb/login')
 # Sleep while I complete the Face ID step manually
 time.sleep(180)

 # Face ID is done.
 # But the code below, which calls the API, gets the message: Incorrect authentication credentials
 s.post(api_Auth, headers=token)
 api_url_need_call = s.get('https://sampleweb/........')

I read your blog post and the page on http://pypi.org/ but still haven't figured out how to do it.
Could you please give me some sample code for this?

Thank you so much !
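
Requestium's README describes `transfer_driver_cookies_to_session()` for copying cookies from the browser into the Requests session, which looks like the step missing after the manual login. The sketch below shows that flow, written against any object with the same interface so it can be tried without a browser. `login_in_browser_then_request`, `FakeSession`, `FakeDriver`, and the URLs are illustrative assumptions, and whether cookies alone are enough for this particular API is not something the report settles.

```python
def login_in_browser_then_request(session, login_url, api_url, wait_for_login):
    """Log in through the browser, copy its cookies to the HTTP session, then call the API."""
    session.driver.get(login_url)                  # open the login page in the browser
    wait_for_login()                               # e.g. a sleep while the user completes the second factor
    session.transfer_driver_cookies_to_session()   # browser cookies -> Requests session
    return session.get(api_url)                    # plain HTTP call that now carries those cookies


class FakeDriver:
    """Stand-in for the Selenium driver; records calls instead of opening pages."""

    def __init__(self, log):
        self.log = log

    def get(self, url):
        self.log.append(("driver.get", url))


class FakeSession:
    """Stand-in exposing the parts of requestium.Session used above."""

    def __init__(self):
        self.log = []
        self.driver = FakeDriver(self.log)

    def transfer_driver_cookies_to_session(self):
        self.log.append(("transfer", None))

    def get(self, url):
        self.log.append(("session.get", url))
        return "response"


fake = FakeSession()
login_in_browser_then_request(fake, "https://sampleweb/login", "https://sampleweb/api", lambda: None)
print([step for step, _ in fake.log])
```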

selenium.common.exceptions.WebDriverException: Message: 'chromedriver' executable needs to be in PATH

Hi, thanks for this project.

I'm using pytest-selenium to set up the browser in conftest.py:

from requestium import Session, Keys


import pytest
@pytest.fixture
def selenium(selenium):
    selenium.implicitly_wait(10)
    selenium.maximize_window()
    return selenium


@pytest.fixture
def requestium():
    s = Session(webdriver_path='./chromedriver',
                browser='chrome',
                default_timeout=15,
                webdriver_options={'arguments': ['headless']})
    return s

https://pytest-selenium.readthedocs.io/en/latest/user_guide.html#specifying-capabilities

I also use the Page Object model to open pages, without setting up the driver or paths:
https://selenium-python.readthedocs.io/page-objects.html

Here is my test.py:

    title = requestium.get('https://httpbin.org').xpath('//title/text()')
    requestium.transfer_session_cookies_to_driver()
    requestium.driver.get('http://www.samplesite.com/sample/process')

I start the tests with:

pytest

with this pytest.ini:

[pytest]
addopts = --driver Remote
          --verbose
          --tb short
          --selenium-port 4444
          --capability browserName chrome

and a remote Selenium hub in Docker:

services:
    hub:
        image: selenium/hub:3.141.59-20200409
        ports:
            - "4444:4444"
    chrome_one:
        image: selenium/node-chrome-debug:latest
        volumes:
            - /dev/shm:/dev/shm
            - logs:/e2e/tests/logs
            - uploads:/e2e/tests/uploads:ro
        depends_on:
            - hub
        environment:
            - HUB_HOST=hub
        ports:
            - 5901:5900

In the end I get an error. Could you please help me resolve it?

№1
E selenium.common.exceptions.WebDriverException: Message: 'chromedriver' executable needs to be in PATH. Please see https://sites.google.com/a/chromium.org/chromedriver/home

If I change the driver call here from requestium.driver.get('http://www.samplesite.com/sample/process') to requestium.selenium.get('http://www.samplesite.com/sample/process'), I get another error:
№2

    requestium.selenium.get('http://www.samplesite.com/sample/process')
E   AttributeError: 'Session' object has no attribute 'selenium'

I also tried putting the remote hub address in webdriver_path and got error №2:

    s = Session(webdriver_path='http://0.0.0.0:4444/wd/hub',
                browser='chrome',
                default_timeout=15,
                webdriver_options={'arguments': ['headless']})
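
In the configuration above, the `requestium` fixture starts its own local Chrome through `webdriver_path`, independently of the Remote driver that pytest-selenium creates, and `webdriver_path` is used elsewhere on this page as the path to a chromedriver executable rather than a hub URL. Since `Session` also accepts an existing driver through `driver=` (as other reports on this page do), one option might be to build a Remote driver and hand it over. This is an untested sketch: the hub address, `hub_url`, and `make_remote_session` are assumptions, and it presumes that the `driver=` argument works with `webdriver.Remote`.

```python
def hub_url(host="localhost", port=4444):
    """URL of a Selenium 3 style hub endpoint."""
    return f"http://{host}:{port}/wd/hub"


def make_remote_session(host="localhost", port=4444):
    """Create a Requestium session backed by a browser running on a remote hub."""
    # Imports are local so this module can be loaded without Selenium installed.
    from selenium import webdriver
    from requestium import Session

    options = webdriver.ChromeOptions()
    options.add_argument("--headless")
    driver = webdriver.Remote(command_executor=hub_url(host, port), options=options)
    return Session(driver=driver)


print(hub_url("hub", 4444))
```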

Browser opens, but does not go to desired page

I have a problem. I wanted to run on Chromium, which is much lighter for my potato PC, but when I go to the link of the page I want, the browser stops at a "data;" page and the process dies. The code is just the one below.

from requestium import Session, Keys
from selenium import webdriver


def browser_configs():
    # Raw strings keep the Windows backslashes from being read as escape sequences.
    browserPath = r"Droptator\chrome-win\chrome.exe"

    option = webdriver.ChromeOptions()
    option.binary_location = browserPath
    driverPath = r"Droptator\drivers\chromedriver.exe"
    browser = webdriver.Chrome(executable_path=driverPath, chrome_options=option)

    browser.maximize_window()

    return browser

drive = browser_configs()

session = Session(
    driver=drive,
)

session.get("https://www.exemple.com/")
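
A likely explanation for the behaviour above: on a Requestium session, `get()` performs a plain HTTP request through Requests, while `driver.get()` is what navigates the browser. A browser that has been started but never navigated stays on its initial blank page, and the script then ends. The stand-in classes below only illustrate that split; `RecordingDriver` and `RecordingSession` are hypothetical and do not open a real browser.

```python
class RecordingDriver:
    """Stand-in for a Selenium driver; records the pages it is asked to open."""

    def __init__(self):
        self.visited = []

    def get(self, url):
        self.visited.append(url)


class RecordingSession:
    """Stand-in for requestium.Session: get() is HTTP only, driver.get() drives the browser."""

    def __init__(self, driver):
        self.driver = driver
        self.http_requests = []

    def get(self, url):
        self.http_requests.append(url)


session = RecordingSession(RecordingDriver())
session.get("https://www.exemple.com/")          # HTTP request only; the browser does not move
print(session.driver.visited)
session.driver.get("https://www.exemple.com/")   # this call navigates the browser
print(session.driver.visited)
```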

PhantomJS critical error with Selenium v4

Selenium WebDriver v4 (released on 13 Oct 2021) removed support for PhantomJS.
https://github.com/SeleniumHQ/selenium/blob/5b5f3f4c841c3a8fe58d40044ed61231c082ff84/py/CHANGES

As a result, importing Requestium raises this error:

  File "C:\Users\username\AppData\Local\Programs\Python\Python38\lib\site-packages\requestium\__init__.py", line 1, in <module>
    from .requestium import Session
  File "C:\Users\username\AppData\Local\Programs\Python\Python38\lib\site-packages\requestium\requestium.py", line 404, in <module>
    class RequestiumPhantomJS(DriverMixin, webdriver.PhantomJS):
AttributeError: module 'selenium.webdriver' has no attribute 'PhantomJS'
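
The traceback shows the class statement failing at import time because `webdriver.PhantomJS` no longer exists. One way to avoid that, sketched here with stand-in objects, is to define a driver class only when the installed Selenium provides its base class. `build_driver_classes` and the dummy `DriverMixin` are hypothetical, and this is not necessarily how the project addressed the problem.

```python
import types


class DriverMixin:
    """Stand-in for Requestium's DriverMixin."""


def build_driver_classes(webdriver_module):
    """Define a Requestium* class only for drivers that the given module provides."""
    classes = {}
    for name in ("Chrome", "PhantomJS"):
        base = getattr(webdriver_module, name, None)
        if base is not None:
            classes[name] = type(f"Requestium{name}", (DriverMixin, base), {})
    return classes


# Simulate Selenium 4, where the webdriver module has no PhantomJS attribute.
selenium4_like = types.SimpleNamespace(Chrome=type("Chrome", (), {}))
print(sorted(build_driver_classes(selenium4_like)))
```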
