WHY Python `requests` DOESN'T Always Get Web Page Text
Link To DagsHub Repo For The Code
Have you ever been web scraping a lot ...
You come to a new page to scrape.
You see the data on the web page.
You run a `page = requests.get(page_url)` command, and you get ...
NOTHING!
Ouch!
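Sometimes, AND MORE AND MORE OFTEN, you can't get to that data unless you actually load the page in a browser. Many pages fill in their content with JavaScript after they load, and `requests` only downloads the initial HTML without running any of it.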
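Here is a minimal sketch of why that happens, assuming the page fills in its data with JavaScript after it loads. The `RAW_HTML` string, the `price-table` id, and the helper names are made up for illustration; they are not taken from the repo. The offline demo parses a stand-in for what `requests.get(page_url).text` might return, and `fetch_visible_text` shows how the same check could be run against a real page.

```python
from html.parser import HTMLParser


class VisibleText(HTMLParser):
    """Collect text a reader would see, skipping <script> and <style> bodies."""

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-blank text that is outside script/style blocks.
        if not self._skip_depth and data.strip():
            self.chunks.append(data.strip())


def visible_text(html):
    """Return the visible text of an HTML string as one space-joined line."""
    parser = VisibleText()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)


# Hypothetical page source (an assumption for this sketch), shaped like the
# raw HTML of a page that fills in its data with JavaScript after loading.
RAW_HTML = """
<html><body>
  <h1>Prices</h1>
  <div id="price-table"></div>
  <script>
    document.getElementById("price-table").textContent = "Widget: 9.99";
  </script>
</body></html>
"""


def fetch_visible_text(page_url):
    """Download a page with requests and return only its visible text."""
    import requests  # third-party; imported here so the offline demo runs without it

    page = requests.get(page_url, timeout=10)
    page.raise_for_status()
    return visible_text(page.text)


# The price only exists inside the script, which never ran, so the visible
# text holds the heading and nothing else.
print(visible_text(RAW_HTML))
```

The value you saw in the browser is sitting inside a script that never executed, so the container your scraper looks at is empty.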
What can we do? AUTOMATE!
Automate the web page operations with a Python program that uses Selenium, and then collect the data.
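As a rough sketch of that idea (not the starter code from the repo), the function below opens a page in headless Chrome, waits until a chosen element actually contains text, and returns it. It assumes Selenium 4, Chrome, and a matching driver are available; the function names, the example URL, and the `#price-table` selector are hypothetical.

```python
def scrape_rendered_text(page_url, css_selector, timeout=15):
    """Load page_url in headless Chrome and return the text of css_selector."""
    # Third-party imports live inside the function so the small helper below
    # still works where Selenium is not installed (pip install selenium).
    from selenium import webdriver
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support.ui import WebDriverWait

    options = webdriver.ChromeOptions()
    options.add_argument("--headless=new")  # run Chrome without a visible window
    driver = webdriver.Chrome(options=options)
    try:
        driver.get(page_url)
        # Poll until the element exists AND has non-empty text. The wait
        # returns the first truthy value, which here is the stripped text.
        return WebDriverWait(driver, timeout).until(
            lambda d: d.find_element(By.CSS_SELECTOR, css_selector).text.strip()
        )
    finally:
        driver.quit()  # always close the browser, even if the wait times out


def rendered_lines(rendered_text):
    """Split rendered text into clean, non-empty lines for later processing."""
    return [line.strip() for line in rendered_text.splitlines() if line.strip()]


# Hypothetical usage (URL and selector are placeholders, not a real target):
# text = scrape_rendered_text("https://example.com/prices", "#price-table")
# for line in rendered_lines(text):
#     print(line)
```

Waiting for non-empty text, rather than just for the element to exist, matters because the empty container is often already in the initial HTML before the script fills it in.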
Does this sound super cool?
I assure you that it is. As you build up a set of tools and approaches, it becomes more and more fun.
"Wait, Thom. This sounds like making bots with Python!"
That is correct. You are essentially making a bot this way, and you can grow into better bot making just by scraping data from pages like this.
It's great fun to automate a web browser and watch your code navigate a website, operate its pages, and collect data.
The document contains a link to my DagsHub repo that contains all the starter code and setup instructions.
The attached PDF is just a conversion from my ReadMe.md in that repo. This PDF and an HTML version of this are also in the repo.
I hope you will follow the repo, so that you can see updates and examples as I add them.
I should be getting back to adding more soon. I had to study some other things first before returning to this work.
Until next time,
Thom