Category : web-scraping

I want to learn web scraping in Python, but I don’t really know how or where to start. My code runs, but it only returns an empty string: import requests import urllib from urllib.request import urlopen from bs4 import BeautifulSoup #import pandas as pd html = urllib.request.urlopen("https://www.nba.com/games") soup = BeautifulSoup(html, "lxml") games = soup.find_all("li", class_="w-full flex flex-col ..


How do you get the values from the red line in the plot below? Second graph: https://plotly.com/python/line-charts/ My attempt: import pandas as pd import numpy as np import requests from bs4 import BeautifulSoup header = { "User-Agent": "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/50.0.2661.75 Safari/537.36", "X-Requested-With": "XMLHttpRequest" } link = 'https://dash.gallery/python-docs-dash-snippets/_dash-update-component' r ..


I’m trying to get ratings from a website called bookstoscrape and feed them into a database as practice, but an error is raised: InterfaceError: Error binding parameter 1 - probably unsupported type. Here’s my code: def getURLs(url): result = requests.get(url) soup = BeautifulSoup(result.text, 'html.parser') return(soup) def getBooks(url): soup = getURLs(url) # remove the ..
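The excerpt is cut off before the insert statement, so the actual cause is not visible. One common cause of this error is passing a value that sqlite3 cannot store, such as a list, as a query parameter. The sketch below assumes the rating arrives as a list of CSS class names (for example `["star-rating", "Three"]`) and converts it to an integer before inserting. The table name, column names, `RATING_WORDS`, `rating_from_classes` and `save_books` are all hypothetical and not taken from the original code.

```python
import sqlite3

# Hypothetical mapping from a rating word to a number.
RATING_WORDS = {"One": 1, "Two": 2, "Three": 3, "Four": 4, "Five": 5}


def rating_from_classes(classes):
    """Return the numeric rating found in a list of class names, or None."""
    for name in classes:
        if name in RATING_WORDS:
            return RATING_WORDS[name]
    return None


def save_books(conn, books):
    """Insert (title, class-name list) pairs using only str/int/None parameters."""
    conn.execute("CREATE TABLE IF NOT EXISTS books (title TEXT, rating INTEGER)")
    rows = [(str(title), rating_from_classes(classes)) for title, classes in books]
    conn.executemany("INSERT INTO books (title, rating) VALUES (?, ?)", rows)
    conn.commit()


# In-memory database and made-up sample data, for illustration only.
conn = sqlite3.connect(":memory:")
save_books(conn, [("Example Book", ["star-rating", "Three"])])
stored = conn.execute("SELECT title, rating FROM books").fetchall()
print(stored)
```

If the real code passes something else as the second parameter (a parsed HTML element, for instance), the same idea applies: reduce it to a plain `str`, `int`, `float`, `bytes` or `None` before binding it.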


I’m trying to scrape this page and I’m looking for a way to click the "Load more" button using Selenium in Python. I have tried the following: driver.find_element(By.LINK_TEXT, "Load more").click() driver.find_element_by_xpath('//*[@id="root"]/div/div[1]/div[1]/main/div[2]/div[1]/div/button/span').click() driver.find_element_by_xpath('//*[@id="root"]/div/div[1]/div[1]/main/div[2]/div[1]/div/button').click() but none of the above worked in the main code. My alternative solution was to scroll, like this… def infinite(driver): scroll_pause_time = ..


I tried scraping tables by following the question "Python BeautifulSoup scrape tables". Based on the top solution there, I tried the following. HTML code: <div class="table-frame small"> <table id="rfq-display-line-items-list" class="table"> <thead id="rfq-display-line-items-header"> <tr> <th>Mfr. Part/Item #</th> <th>Manufacturer</th> <th>Product/Service Name</th> <th>Qty.</th> <th>Unit</th> <th>Ship Address</th> </tr> </thead> <tbody id="rfq-display-line-item-0"> <tr> <td><span class="small">43933</span></td> <td><span class="small">Anvil International</span></td> <td><span class="small">Cap Steel Black 1-1/2"</span></td> ..


I have text in JSON format: '{"Info":{"Result":"OK","ID":8840,"FamilyName":"book","Title":"A950","Model":"A-A","Name":"A 5","Img":"A950-A.png"}}' How do I capture the "Img" field? I’m trying print(json.loads(response.text['Info']['Img'])) but I get an error: string indices must be integers
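The error most likely comes from indexing `response.text`, which is a plain string, before it has been parsed. A minimal sketch of the fix is to parse first and index the resulting dictionary afterwards. Here `raw` stands in for `response.text` and is an assumption, since the request itself is not shown in the question.

```python
import json

# `raw` stands in for response.text; the payload is copied from the question.
raw = '{"Info":{"Result":"OK","ID":8840,"FamilyName":"book","Title":"A950","Model":"A-A","Name":"A 5","Img":"A950-A.png"}}'

data = json.loads(raw)      # parse the string into a dict first
img = data["Info"]["Img"]   # then index the parsed dict
print(img)  # A950-A.png
```

With the requests library, `response.json()["Info"]["Img"]` does the same parsing step in one call.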


I can input my username, but I cannot input the password for this website. Even when I wait for the field to become clickable, it does not work. from selenium.webdriver.common.by import By from selenium.webdriver.support.ui import WebDriverWait from selenium.webdriver.support import expected_conditions as EC from selenium import webdriver url = 'https://sso.accounts.dowjones.com/login?state=g6Fo2SBPMzZrekJkSlJDUWhfdHRaYmFMQXFzXzFSd2hFV01BMqN0aWTZIEVaQkVtc2FWT0Rkak1ENVl5Q21JeEM1Z3RhWWZZSUY4o2NpZNkgNWhzc0VBZE15MG1KVElDbkpOdkM5VFhFdzNWYTdqZk8&client=5hssEAdMy0mJTICnJNvC9TXEw3Va7jfO&protocol=oauth2&scope=openid%20idp_id%20roles%20email%20given_name%20family_name%20djid%20djUsername%20djStatus%20trackid%20tags%20prts%20suuid%20createTimestamp&response_type=code&redirect_uri=https%3A%2F%2Faccounts.barrons.com%2Fauth%2Fsso%2Flogin&nonce=fd1626cc-b81b-4b33-bb37-6795e64b26d3&ui_locales=en-us-x-barrons-81-2&ns=prod%2Faccounts-barrons#!/signin-password' driver = webdriver.Firefox() driver.get(url) WebDriverWait(driver,10).until(EC.element_to_be_clickable((By.CSS_SELECTOR, ..
