Web Scraping: Python
- Ririn Andriyani
- Jun 16, 2024
- 2 min read
Web scraping with Python takes a few steps, which we follow below:
Requirements:
Stable internet connection
Python (version 3.5 or higher)
Before writing the code, we need to install some packages:
Beautiful Soup (a library we can use to parse HTML code and find the elements inside)

Requests (installed with "pip install requests"; it sends requests from our local PC to the Amazon website)

pandas (when we scrape data online we need to convert it into CSV format; we will use pandas to turn the text data into a proper data frame and store it as CSV)
We will send a request from our local PC to the Amazon website and get the HTML code back, so we can get all the information.
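To make the pandas step more concrete, here is a minimal sketch. The titles and prices are made-up placeholders standing in for scraped text, and the column names are our own choice, not something the site dictates.

```python
import pandas as pd

# Hypothetical scraped values; real values would come from the parsed pages.
scraped = {
    "title": ["Example Console A", "Example Console B"],
    "price": ["499.99", "449.00"],
}

# Turn the plain text into a proper data frame.
df = pd.DataFrame(scraped)

# Scraped prices arrive as text, so convert them to numbers.
df["price"] = pd.to_numeric(df["price"], errors="coerce")

# With no path, to_csv returns the CSV text; pass a file name to write a file.
csv_text = df.to_csv(index=False)
print(csv_text)
```

To store the result on disk instead, a call such as df.to_csv("amazon_data.csv", index=False) would do it (the file name is only an example).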

Check The Amazon URL We Want To Scrape

Then we need to define the HTTP header. The HTTP request we send contains a lot of things, and one of them is the "Header". One important field in it is the "user agent". It tells the website that you are a genuine user by identifying your browser and some other information about your system.

Go to "whatismybrowser.com" to see your user agent.
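As a rough sketch, the URL and the header could be defined like this. The search URL and the user agent string below are placeholders we made up for illustration, and "Accept-Language" is an optional extra field that the text above does not require.

```python
# Hypothetical search URL; replace it with the Amazon page you want to scrape.
URL = "https://www.amazon.com/s?k=playstation+5"

# Placeholder user agent; paste in the string that whatismybrowser.com reports.
HEADERS = {
    "User-Agent": (
        "Mozilla/5.0 (Windows NT 10.0; Win64; x64) "
        "AppleWebKit/537.36 (KHTML, like Gecko) "
        "Chrome/120.0.0.0 Safari/537.36"
    ),
    # Optional assumption: ask for English-language pages.
    "Accept-Language": "en-US,en;q=0.5",
}

print(HEADERS["User-Agent"])
```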

Then we use the requests package (.get) and pass it the URL together with the header.

We need to check the "webpage" first. If the response is [200], the request was correct and successful.
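The request and the status check might look like the sketch below. The helper names fetch_page and is_success are our own, and the timeout value is an assumption. So that the check can be tried without touching the network, the example builds a stand-in response object instead of calling Amazon.

```python
import requests


def fetch_page(url, headers):
    """Send a GET request and return the response object."""
    return requests.get(url, headers=headers, timeout=10)


def is_success(response):
    """Return True when the server answered with HTTP status 200 (OK)."""
    return response.status_code == 200


# Offline stand-in for a real response, used only to demonstrate the check.
fake = requests.Response()
fake.status_code = 200

print(fake)              # a response object shows its status code in brackets
print(is_success(fake))
```

With a real URL and header, the call would be webpage = fetch_page(URL, HEADERS), followed by the same is_success(webpage) check.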


Looking at the response, we want to know the "type" of its content, because we will convert it into proper HTML format.

Pass the web content to Beautiful Soup so it is parsed as HTML.
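A small sketch of these two steps follows. The content variable is a hand-written stand-in for webpage.content, and "html.parser" is Python's built-in parser, which we assume here because it needs no extra installation.

```python
from bs4 import BeautifulSoup

# Stand-in for webpage.content; requests returns the page body as bytes.
content = (
    b"<html><head><title>Search results</title></head>"
    b"<body><p>Hello</p></body></html>"
)
print(type(content))

# Parse the raw bytes into a navigable HTML tree.
soup = BeautifulSoup(content, "html.parser")
print(type(soup))
print(soup.title.string)
```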

Print the "soup".
Then use find all to collect every matching tag inside the page we just extracted.

From that, we can see "<a class= "


Now we can see the list of links, and we use "link[0]" to get the first one.
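Here is a hedged sketch of finding the links and taking the first one. The HTML and the class name "product-link" are invented for the example, because the real class value depends on the page you inspect. The list is called links in this sketch, so the first entry is links[0].

```python
from bs4 import BeautifulSoup

# Invented HTML; the class name "product-link" is a placeholder.
html = """
<div>
  <a class="product-link" href="/example-product-1/dp/AAA111">First product</a>
  <a class="product-link" href="/example-product-2/dp/BBB222">Second product</a>
  <a class="nav-link" href="/help">Help</a>
</div>
"""
soup = BeautifulSoup(html, "html.parser")

# Collect every <a> tag that carries the class we are interested in.
links = soup.find_all("a", attrs={"class": "product-link"})

# Take the first match and read its href attribute.
link = links[0].get("href")

print(len(links))
print(link)
```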

Here we make a request to that product page, using its "URL".
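Links found on a page are often relative, so the site address has to be put in front before we can request the page. A minimal sketch, assuming a made-up relative link:

```python
from urllib.parse import urljoin

BASE_URL = "https://www.amazon.com"
link = "/example-product-1/dp/AAA111"  # hypothetical relative link

# Join the site address and the relative link into a full URL.
product_url = urljoin(BASE_URL, link)
print(product_url)
```

The product page could then be requested in the same way as the first page, for example with requests.get(product_url, headers=HEADERS), and the response content parsed with Beautiful Soup again.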

Here we want to extract the title.
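Extracting the title could look like this sketch. The HTML is hand-written, and the id "productTitle" is an assumption about how the page marks its title, so check it against the page source in your browser.

```python
from bs4 import BeautifulSoup

# Hand-written stand-in for a product page; the id is an assumption.
product_html = """
<div id="centerCol">
  <span id="productTitle">   Example Console, Disc Edition   </span>
</div>
"""
new_soup = BeautifulSoup(product_html, "html.parser")

# Find the tag that holds the title, then strip the surrounding whitespace.
title_tag = new_soup.find("span", attrs={"id": "productTitle"})
title = title_tag.get_text(strip=True)
print(title)
```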

We can extract other fields, such as the price, in the same way. We just need to change the "html_snippet".
That is how it works if we extract the fields one by one.
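Since every field is looked up the same way, one small helper can cover them all. This is only a sketch: the helper name get_text_or_empty, the HTML, and the class names are ours, and the helper returns an empty string when a tag is missing so that one absent field does not stop the script.

```python
from bs4 import BeautifulSoup


def get_text_or_empty(soup, tag, attrs):
    """Return the stripped text of the first matching tag, or "" if absent."""
    found = soup.find(tag, attrs=attrs)
    return found.get_text(strip=True) if found is not None else ""


# Invented snippet; the id and class names are placeholders.
html_snippet = """
<div>
  <span id="productTitle"> Example Console </span>
  <span class="example-price">$499.99</span>
</div>
"""
soup = BeautifulSoup(html_snippet, "html.parser")

title = get_text_or_empty(soup, "span", {"id": "productTitle"})
price = get_text_or_empty(soup, "span", {"class": "example-price"})
rating = get_text_or_empty(soup, "span", {"class": "example-rating"})  # not in the snippet

print(title, price, repr(rating))
```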


Now we can run our code and print the results of scraping "PlayStation 5" from Amazon.