Selenium is a browser automation framework with Python bindings that powers tools such as web scrapers and automated testing suites. In this tutorial, we'll look at how to scrape a simple piece of data from a website.
Full Tutorial:
- Open your development environment, whether inside WayScript or on your local machine, and install Selenium by running the following in your terminal:
pip install selenium
- Next, set up your Selenium code. If you're using WayScript, you'll need a few Chrome options to run Selenium within our execution framework:
from selenium import webdriver
from selenium.webdriver.common.by import By
options = webdriver.ChromeOptions()
options.add_argument('--headless')
options.add_argument('--no-sandbox')
options.add_argument('--disable-dev-shm-usage')
options.add_argument('user-agent=Mozilla/5.0 (Macintosh; Intel Mac OS X 10.10; rv:39.0) Gecko/20100101 Firefox/39.0')
driver = webdriver.Chrome(options=options)
driver.set_page_load_timeout(30)
# Take action on browser
driver.get("http://www.python.org")
# Request browser information
title = driver.title
# Establish waiting strategy
driver.implicitly_wait(0.5)
# Find an element
elem = driver.find_element(By.XPATH, '//*[@id="content"]/div/section/div[2]/div[1]/div/ul/li[1]/a')
print(elem.text)
# End driver session
driver.quit()
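One thing to watch in the script above: if `find_element` cannot find the element, it raises an exception and `driver.quit()` is never reached, which can leave a headless Chrome process running. Below is a minimal sketch of one way to guard against that, not the only approach. The helper names `managed_driver`, `make_headless_chrome`, and `scrape_first_match` are hypothetical and made up for this example, and the 10-second timeout is an arbitrary assumption. It also swaps the implicit wait for an explicit `WebDriverWait`.

```python
from contextlib import contextmanager


@contextmanager
def managed_driver(make_driver):
    """Yield a driver built by make_driver() and always quit it afterwards."""
    driver = make_driver()
    try:
        yield driver
    finally:
        # Runs even if the body raises, so no browser process is left behind.
        driver.quit()


def make_headless_chrome():
    """Build a headless Chrome driver using the headless flags from the script above."""
    # Imported here so managed_driver can be reused without Selenium installed.
    from selenium import webdriver

    options = webdriver.ChromeOptions()
    options.add_argument('--headless')
    options.add_argument('--no-sandbox')
    options.add_argument('--disable-dev-shm-usage')
    driver = webdriver.Chrome(options=options)
    driver.set_page_load_timeout(30)
    return driver


def scrape_first_match(url, xpath, timeout=10):
    """Return the text of the first element matching xpath, or None if it never appears."""
    from selenium.common.exceptions import TimeoutException
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    with managed_driver(make_headless_chrome) as driver:
        driver.get(url)
        try:
            # Explicit wait: poll until the element is present or the timeout elapses.
            elem = WebDriverWait(driver, timeout).until(
                EC.presence_of_element_located((By.XPATH, xpath))
            )
        except TimeoutException:
            return None
        return elem.text
```

Calling `scrape_first_match("http://www.python.org", ...)` with the XPath from the script above should behave like the original code, except that the browser is closed on every exit path and a missing element yields `None` rather than an exception.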
- To run this daily, you can create a cron trigger on WayScript. A detailed walkthrough of setting up the cron trigger can be found in the YouTube tutorial.