Web-Scraping: Identifying HTML Elements, Attributes, and Extracting Data

Project Type

Web-Scraping & Data Extraction

Skills & Tools Used:

● Python
● Splinter
● Beautiful Soup
● Pandas
● Matplotlib

Web-Scraping & Data Analysis Project - Mars Exploration:

Deliverable 1: Scrape Titles and Preview Text from Mars News:
● Used Splinter to automate a browser session and visit the Mars News website.
● Employed Beautiful Soup for HTML parsing to identify and extract relevant information.
● Extracted titles and preview text of Mars news articles.
● Stored the scraping results in Python data structures (dictionaries and lists).
● Optionally exported the scraped data to a JSON file for easy sharing.
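
The parsing and storage steps above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the inline HTML stands in for the page source that Splinter would return through `browser.html`, and the class names (`list_text`, `content_title`, `article_teaser_body`), the sample headlines, and the helper `extract_articles` are all assumptions made for the example.

```python
import json

from bs4 import BeautifulSoup

# Stand-in for the page source Splinter would provide via `browser.html`.
# The class names and the text below are hypothetical placeholders.
SAMPLE_HTML = """
<div class="list_text">
  <div class="content_title">First headline</div>
  <div class="article_teaser_body">First preview text.</div>
</div>
<div class="list_text">
  <div class="content_title">Second headline</div>
  <div class="article_teaser_body">Second preview text.</div>
</div>
"""


def extract_articles(html):
    """Return a list of {"title", "preview"} dictionaries, one per article."""
    soup = BeautifulSoup(html, "html.parser")
    articles = []
    for item in soup.find_all("div", class_="list_text"):
        title = item.find("div", class_="content_title")
        preview = item.find("div", class_="article_teaser_body")
        # Skip containers that lack either element instead of failing.
        if title is None or preview is None:
            continue
        articles.append(
            {
                "title": title.get_text(strip=True),
                "preview": preview.get_text(strip=True),
            }
        )
    return articles


articles = extract_articles(SAMPLE_HTML)

# Serialize the list of dictionaries to JSON text; writing it to a file
# would be one extra step with open(...) and json.dump.
articles_json = json.dumps(articles, indent=2)
print(articles_json)
```

Keeping each article as a dictionary inside a list makes the JSON export a direct serialization of the same structure.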

Deliverable 2: Scrape and Analyze Mars Weather Data:
● Used automated browsing to open the Mars Temperature Data Site and retrieve the weather data.
● Employed Beautiful Soup to scrape data from the HTML table.
● Assembled the scraped data into a Pandas DataFrame with appropriate column headings.
● Analyzed the dataset using Pandas functions to answer specific questions.
● Examined and converted data types as needed.
● Answered questions about the dataset: the number of months on Mars, the number of Martian days' worth of data, the coldest and warmest months, the months with the lowest and highest atmospheric pressure, and the number of terrestrial days in a Martian year.
● Visually represented analysis results with bar charts.
● Exported the DataFrame to a CSV file for future reference.
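
A rough sketch of the table-to-DataFrame workflow above, under stated assumptions: the inline HTML table replaces the scraped page, the column names (`terrestrial_date`, `sol`, `month`, `min_temp`, `pressure`) are hypothetical, and the five rows are made-up values used only to show the mechanics, not real Mars measurements.

```python
import pandas as pd
from bs4 import BeautifulSoup

# Made-up table standing in for the scraped weather page.
# Column names and values are illustrative assumptions only.
SAMPLE_HTML = """
<table>
  <tr><th>terrestrial_date</th><th>sol</th><th>month</th><th>min_temp</th><th>pressure</th></tr>
  <tr><td>2020-01-01</td><td>10</td><td>1</td><td>-77.0</td><td>860.0</td></tr>
  <tr><td>2020-01-02</td><td>11</td><td>1</td><td>-79.0</td><td>864.0</td></tr>
  <tr><td>2020-03-01</td><td>68</td><td>2</td><td>-80.0</td><td>890.0</td></tr>
  <tr><td>2020-05-01</td><td>127</td><td>3</td><td>-83.0</td><td>870.0</td></tr>
  <tr><td>2020-05-02</td><td>128</td><td>3</td><td>-85.0</td><td>880.0</td></tr>
</table>
"""


def table_to_dataframe(html):
    """Parse the first HTML table into a typed DataFrame."""
    soup = BeautifulSoup(html, "html.parser")
    table = soup.find("table")
    headers = [th.get_text(strip=True) for th in table.find_all("th")]
    rows = [
        [td.get_text(strip=True) for td in tr.find_all("td")]
        for tr in table.find_all("tr")
        if tr.find("td") is not None  # skip the header row
    ]
    df = pd.DataFrame(rows, columns=headers)
    # Scraped cells arrive as strings, so convert each column explicitly.
    df["terrestrial_date"] = pd.to_datetime(df["terrestrial_date"])
    return df.astype(
        {"sol": "int64", "month": "int64", "min_temp": "float64", "pressure": "float64"}
    )


df = table_to_dataframe(SAMPLE_HTML)

month_count = df["month"].nunique()   # number of distinct months
sol_count = df["sol"].nunique()       # Martian days' worth of data
avg_min_temp = df.groupby("month")["min_temp"].mean()
avg_pressure = df.groupby("month")["pressure"].mean()

coldest_month = avg_min_temp.idxmin()
warmest_month = avg_min_temp.idxmax()
lowest_pressure_month = avg_pressure.idxmin()
highest_pressure_month = avg_pressure.idxmax()

# With no path argument, to_csv returns the CSV text instead of writing a file.
csv_text = df.to_csv(index=False)
print(df.dtypes)
```

With Matplotlib installed, a call such as `avg_min_temp.plot(kind="bar")` would draw the corresponding bar chart from the grouped result.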

Conclusion:
● Successfully completed a web-scraping and data analysis project on Mars exploration.
● Demonstrated proficiency in automated browsing, HTML parsing, and data manipulation.
● Showcased skills in extracting, organizing, and analyzing data to derive meaningful insights.
● Communicated findings through visualizations such as bar charts.
● Strengthened core data analyst skills in web scraping, data manipulation, and exploratory data analysis.
