I am trying to pass along a link that I extracted with BeautifulSoup.
    import requests
    from bs4 import BeautifulSoup as bs

    r = requests.get('https://data.ed.gov/dataset/college-scorecard-all-data-files-through-6-2020/resources')
    soup = bs(r.content, 'lxml')
    # collect every absolute href/src found on the page
    links = [item['href'] if item.get('href') is not None else item['src']
             for item in soup.select('[href^="http"], [src^="http"]')]
    print(links)
This is the link I want.
Now I am trying to pass this link along so that I can download its contents.
    import io
    import os
    import zipfile

    # make a folder if it doesn't already exist
    # (folder_name is assumed to be set earlier in the script)
    if not os.path.exists(folder_name):
        os.makedirs(folder_name)

    # pass the url
    url = r'link from beautifulsoup result needs to go here'
    response = requests.get(url, stream=True)

    # extract contents
    with zipfile.ZipFile(io.BytesIO(response.content)) as zf:
        for elem in zf.namelist():
            zf.extract(elem, '../data')
My overall goal is to take the link that I scraped and place it in the url variable, because the link on this website keeps changing. I want this to be dynamic so that I don't have to manually search for the link and update it every time it changes. I hope this makes sense, and I appreciate any help I can get.
If I manually set the URL as follows, I know the code works:
    url = r'https://ed-public-download.app.cloud.gov/downloads/CollegeScorecard_Raw_Data_07202021.zip'
If I can get my code to pass exactly that URL, I know it will work. I'm just stuck on how to accomplish this.
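To make the goal concrete, here is a minimal sketch of the kind of selection step I am after. It assumes the link I want is the only scraped link whose file name starts with CollegeScorecard_Raw_Data and ends in .zip. That pattern is my guess based on the URL above, and pick_zip_link is just a hypothetical helper name, not something from requests or BeautifulSoup. The links list here is a stand-in for the scraped result.

```python
import posixpath
from urllib.parse import urlparse


def pick_zip_link(links, name_prefix="CollegeScorecard_Raw_Data"):
    """Return the first link whose file name starts with name_prefix and ends in .zip.

    The prefix is an assumption based on the one URL known to work.
    Returns None when no link matches.
    """
    for link in links:
        # Compare only the last path segment, ignoring any query string.
        file_name = posixpath.basename(urlparse(link).path)
        if file_name.startswith(name_prefix) and file_name.lower().endswith(".zip"):
            return link
    return None


# Stand-in for the list produced by the scraping code.
links = [
    "https://data.ed.gov/dataset/college-scorecard-all-data-files-through-6-2020/resources",
    "https://ed-public-download.app.cloud.gov/downloads/CollegeScorecard_Raw_Data_07202021.zip",
]

url = pick_zip_link(links)
print(url)
```

If something like this works, url could go straight into requests.get(url) in the download code, and a None result would tell me that the page no longer contains a link matching the assumed pattern.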
Source: Python Questions