Bird Strike: Design & Implement a Relational Database
Overview
I implemented a relational workflow on published bird-strike incident data. The resulting MySQL layout uses four related tables for incidents, airports, airlines, and sky conditions: primary keys throughout, foreign keys from the fact table into the lookups, and checks that keep flight phase inside a coarse set (takeoff, landing, inflight, unknown) while requiring non-negative altitude.
The pipeline relies on R to clean dates and missing labels, map raw flight phases into those buckets, convert pilot-warned flags, escape awkward quotation marks, and bulk-load rows through RMySQL and DBI. SQL ranks states by strike count, lists airlines above the fleet-wide average, and aggregates incidents by calendar month and flight phase; R pulls that grid and plots total strikes per month. A stored procedure mirrors the same normalization rules, adds airports, airlines, or sky-condition rows when they are missing, and registers a new incident in one step.
Deliverables
- DDL and integrity: Table definitions, foreign keys from the incident fact table to airports, airlines, and conditions, plus checks on flight phase and altitude.
- ETL in R: Column selection, invalid-date drops, batch inserts for dimensions and facts, and synthetic key alignment before incidents load.
- Analytical SQL: Top states by incident volume; airlines with above-average strike counts (subquery and HAVING); counts grouped by month and flight phase for plotting.
- Visualization: Scatter plot of total bird strikes by month with labeled points.
- Stored procedure: Encapsulates the same normalization logic, extends dimension rows when needed, and records a new strike in a single database round-trip (exercised from the notebook).
- Documentation: Written report (PDF) and linked R Markdown source.
Technology stack
- MySQL: Schema, constraints, analytical queries, and stored procedures.
- R: Cleaning, key matching, batch SQL execution, and plotting.
- RMySQL / DBI: Connection and execution from the notebook.