Awards Processing & Tracking System

A web-based system for bulk uploading, deduplicating, and tracking institutional awards from CSV data. It replaced a messy manual process with a clean, auditable workflow.

Batch CSV Processing
0 Duplicate Awards
Auto Detection & Flagging

What was broken.

At the institution, awards data was scattered across dozens of CSV files delivered at irregular intervals. Each file came from a different source, with different column orders, different date formats, and sometimes different column names entirely. Administrators had to open each file, visually scan for structure, and manually copy records into a central spreadsheet. One misaligned column could mean hundreds of awards assigned to the wrong people.

Duplicates were the bigger headache. The same award would appear in multiple files, sometimes with slightly different formatting, sometimes identical. The team tracked "already processed" records in a separate spreadsheet, but it was never fully up to date. When auditors asked for a clean list of all awards issued, nobody could produce one with confidence.

The volume made manual processing unsustainable. What started as a manageable trickle of award files had grown into regular bulk deliveries containing thousands of records. The team needed a system that could ingest messy data and detect problems on its own, then produce audit-ready output without requiring anyone to become a spreadsheet expert.

Inconsistent CSV Formats

Every data source delivered files with different column orders, naming conventions, and date formats. Automated processing was impossible without manual cleanup first.

Untracked Duplicates

The same awards appeared in multiple files. Tracking what had already been processed lived in a spreadsheet that was perpetually out of date.

Error-Prone Manual Entry

Copy-pasting thousands of records by hand led to misaligned columns, dropped rows, and data that couldn't be trusted when auditors came calling.

No Audit Trail

There was no centralized database or processing log. When asked "how many awards were issued last quarter?", the answer required hours of manual reconciliation.

How we solved it.

01

ZIP Upload & Automatic Extraction

Built a file upload pipeline that accepts ZIP archives containing multiple CSV files. The system extracts every CSV, validates file integrity, and queues each one for processing. Administrators can upload an entire delivery in one action instead of handling files individually.

The upload handler checks file size limits and validates archive structure. Corrupt or non-CSV entries are rejected before any data touches the database.
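
As a rough illustration of the checks described above, here is a minimal sketch. The production system was written in PHP; Python is used here only for illustration, and the function name, size limits, and rejection messages are all assumptions rather than the real handler's values.

```python
import io
import zipfile

MAX_ARCHIVE_BYTES = 50 * 1024 * 1024  # assumed upload limit, not the real setting
MAX_ENTRY_BYTES = 20 * 1024 * 1024    # assumed per-file limit, not the real setting


def extract_csv_entries(archive_bytes: bytes) -> tuple[dict[str, bytes], list[str]]:
    """Return (accepted CSV contents keyed by entry name, rejection messages)."""
    if len(archive_bytes) > MAX_ARCHIVE_BYTES:
        raise ValueError("archive exceeds the size limit")

    buffer = io.BytesIO(archive_bytes)
    if not zipfile.is_zipfile(buffer):
        raise ValueError("upload is not a valid ZIP archive")

    accepted: dict[str, bytes] = {}
    rejected: list[str] = []
    with zipfile.ZipFile(buffer) as archive:
        # testzip() returns the name of the first entry that fails its CRC check.
        corrupt = archive.testzip()
        if corrupt is not None:
            raise ValueError(f"corrupt entry in archive: {corrupt}")

        for info in archive.infolist():
            if info.is_dir():
                continue
            if not info.filename.lower().endswith(".csv"):
                rejected.append(f"{info.filename}: not a CSV file")
            elif info.file_size > MAX_ENTRY_BYTES:
                rejected.append(f"{info.filename}: exceeds the per-file size limit")
            else:
                accepted[info.filename] = archive.read(info)
    return accepted, rejected
```

Nothing in this sketch writes to a database: the accepted entries would be queued for processing only after the whole archive has passed these checks.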

02

Intelligent Column Detection

Implemented automatic column mapping that identifies AwardId, UserName, and IssueDate regardless of column order or naming variations. The system reads header rows, matches against known patterns, and flags files where it can't confidently identify required fields. No more manual column alignment.
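
The header-matching idea can be sketched roughly as follows. This is a Python illustration, not the PHP code that shipped; the alias table and the function names are hypothetical, and the real pattern list was certainly longer.

```python
import re

# Hypothetical alias table: normalized header spellings for each required field.
KNOWN_HEADERS = {
    "AwardId": {"awardid", "awardnumber", "id"},
    "UserName": {"username", "user", "recipient"},
    "IssueDate": {"issuedate", "dateissued", "date"},
}


def normalize(header: str) -> str:
    """Lowercase a header and drop spaces, underscores, and punctuation."""
    return re.sub(r"[^a-z0-9]", "", header.lower())


def detect_columns(header_row: list[str]) -> tuple[dict[str, int], list[str]]:
    """Map each required field to a column index; list the fields that need review."""
    normalized = [normalize(h) for h in header_row]
    mapping: dict[str, int] = {}
    unresolved: list[str] = []
    for field, aliases in KNOWN_HEADERS.items():
        matches = [i for i, name in enumerate(normalized) if name in aliases]
        if len(matches) == 1:
            mapping[field] = matches[0]
        else:
            # Zero matches (missing) or several (ambiguous): flag, do not guess.
            unresolved.append(field)
    return mapping, unresolved
```

A file whose second list comes back non-empty would be held for manual review instead of being processed on a guess.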

03

Duplicate Detection & Flagging

Every incoming record is checked against the existing database before insertion. Duplicates are flagged but never silently dropped: administrators see exactly which records were skipped and why, with enough context to verify each decision.

The deduplication logic handles edge cases like whitespace differences, date format mismatches, and case-insensitive username comparisons to catch near-duplicates that exact matching would miss.
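
One way to picture the near-duplicate check is to compare records by a normalized key. The sketch below is a Python illustration under stated assumptions: the accepted date formats, the function names, and the flag message are hypothetical, and the real system checked against MySQL rather than an in-memory set.

```python
from datetime import date, datetime

# Assumed input formats; the real sources may have used others.
DATE_FORMATS = ("%Y-%m-%d", "%m/%d/%Y")


def parse_date(raw: str) -> date:
    """Try each assumed format in turn and return a calendar date."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(raw.strip(), fmt).date()
        except ValueError:
            continue
    raise ValueError(f"unrecognized date: {raw!r}")


def dedup_key(award_id: str, user_name: str, issue_date: str) -> tuple[str, str, date]:
    """Collapse whitespace, ignore username case, and parse the date."""
    return (
        " ".join(award_id.split()),
        " ".join(user_name.split()).casefold(),
        parse_date(issue_date),
    )


def split_duplicates(rows, existing_keys):
    """Return (new rows, flagged duplicates with a reason); nothing is dropped silently."""
    seen = set(existing_keys)
    new_rows, flagged = [], []
    for row in rows:
        key = dedup_key(*row)
        if key in seen:
            flagged.append((row, "matches an existing record after normalization"))
        else:
            seen.add(key)
            new_rows.append(row)
    return new_rows, flagged
```

Every flagged row keeps its original, un-normalized values, so an administrator can see exactly what was uploaded and why it was skipped.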

04

Summary Reports & Session Management

After every batch upload, the system generates a detailed summary report: total records processed, new records inserted, duplicates flagged, errors encountered. Session management with cookie-based "Remember Me" authentication keeps administrators logged in across sessions without re-entering credentials.
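
The summary report amounts to a running tally kept while the batch is processed. A minimal Python sketch of that idea follows; the class name, outcome labels, and report wording are assumptions for illustration, not the format the real system produced.

```python
from dataclasses import dataclass, field


@dataclass
class BatchSummary:
    """Running tally for one upload; all names here are illustrative."""
    total: int = 0
    inserted: int = 0
    duplicates: list[str] = field(default_factory=list)
    errors: list[str] = field(default_factory=list)

    def record(self, outcome: str, detail: str = "") -> None:
        """Count one processed record under 'inserted', 'duplicate', or 'error'."""
        if outcome == "inserted":
            self.inserted += 1
        elif outcome == "duplicate":
            self.duplicates.append(detail)
        elif outcome == "error":
            self.errors.append(detail)
        else:
            raise ValueError(f"unknown outcome: {outcome!r}")
        self.total += 1

    def render(self) -> str:
        """Produce the plain-text report, listing each flagged record and error."""
        lines = [
            f"Total records processed: {self.total}",
            f"New records inserted: {self.inserted}",
            f"Duplicates flagged: {len(self.duplicates)}",
            f"Errors encountered: {len(self.errors)}",
        ]
        lines += [f"  duplicate: {d}" for d in self.duplicates]
        lines += [f"  error: {e}" for e in self.errors]
        return "\n".join(lines)
```

Keeping the detail strings alongside the counts is what lets the report show which records were skipped, not just how many.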

Technologies Used

PHP, MySQL, CSV Processing, ZIP Archive Handling, Session Management, Cookie-based Auth

Drowning in messy data files?

Let's talk about how automated processing could clean up your data workflow.


What it actually does.

ZIP Archive Upload

Upload entire ZIP archives containing multiple CSV files. The system extracts, validates, and queues every file automatically. No more processing files one at a time.

Auto Column Detection

Identifies AwardId, UserName, and IssueDate columns regardless of naming or order. Files with unrecognizable structures are flagged for manual review.

Duplicate Detection

Every record is cross-checked against existing data before insertion. Near-duplicates with whitespace or formatting differences are caught and flagged transparently.

Upload Summary Reports

After every batch, a detailed report breaks down total records, new inserts, duplicates skipped, and errors encountered. Every decision the system made is visible and verifiable.

Session & Auth Management

Secure login with session management and cookie-based "Remember Me" functionality. Administrators stay authenticated across sessions without repeated logins.
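
A common way to build "Remember Me" is to put a random token in the cookie and store only its hash on the server. The Python sketch below shows that general pattern; it is an assumption about the approach, not a description of the PHP code in this system, and the function names are hypothetical.

```python
import hashlib
import hmac
import secrets


def issue_token() -> tuple[str, str]:
    """Return (value for the cookie, hash to store server-side)."""
    token = secrets.token_urlsafe(32)
    return token, hashlib.sha256(token.encode()).hexdigest()


def verify_token(cookie_value: str, stored_hash: str) -> bool:
    """Hash the presented cookie value and compare it in constant time."""
    candidate = hashlib.sha256(cookie_value.encode()).hexdigest()
    return hmac.compare_digest(candidate, stored_hash)
```

With this pattern the database never holds a value that could be replayed as a cookie, and a token can be revoked by deleting its stored hash.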

Data Validation & Error Reporting

Every row is validated against expected formats before processing. Malformed dates, missing required fields, and structural anomalies are caught and reported, not silently ignored.
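
Row validation of this kind can be sketched as a function that returns a list of problems instead of raising on the first one. This is a Python illustration; the single ISO date format and the message wording are assumptions.

```python
from datetime import datetime

REQUIRED_FIELDS = ("AwardId", "UserName", "IssueDate")


def validate_row(row: dict[str, str], line_number: int) -> list[str]:
    """Return every problem found in one row; an empty list means the row is valid."""
    problems = []
    for name in REQUIRED_FIELDS:
        if not (row.get(name) or "").strip():
            problems.append(f"line {line_number}: missing {name}")

    raw_date = (row.get("IssueDate") or "").strip()
    if raw_date:
        try:
            # Assumed format; strptime also rejects impossible dates such as Feb 30.
            datetime.strptime(raw_date, "%Y-%m-%d")
        except ValueError:
            problems.append(f"line {line_number}: malformed IssueDate {raw_date!r}")
    return problems
```

Collecting all problems per row means the error report can show an administrator everything wrong with a file in one pass.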

The numbers speak.

Faster Processing
Award uploads that took hours now complete in minutes
0
Duplicate Awards
Automated detection catches every repeat before insertion
Audit-Ready
Data Output
Clean, verified records with full processing history
Batch
Multi-File Processing
Upload once, process every CSV in the archive
We used to dread award season. Every delivery meant hours of sorting through CSVs, checking for duplicates by hand, and praying we didn't miss anything. Now we upload the ZIP, review the summary, and move on. The system catches things we never would have found manually.
Data Administrator, Awards Processing, The Institution

What I learned.

01

Messy data needs transparent handling, not silent fixes

The first instinct was to auto-correct formatting issues and silently skip duplicates. Administrators pushed back hard. They needed to see every decision the system made: which records were inserted, which were flagged, and why. Trust in automated processing comes from transparency, not from hiding the messy parts. The summary reports became the most valued feature.

02

Column detection is harder than parsing

Parsing a well-formed CSV is straightforward. Figuring out which column is which when every source uses different headers ("award_id" vs "AwardID" vs "Award Number" vs just "ID") required building a fuzzy matching system that could handle real-world naming chaos. Edge cases in column detection consumed more development time than the entire upload pipeline.
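
To give a feel for the fuzzy part, here is a small Python sketch using the standard library's difflib. The similarity cutoff and the function name are assumptions, and the matching logic in the real PHP system was different and more involved.

```python
import difflib
import re
from typing import Optional

# Normalized canonical names mapped back to the field names used in this write-up.
CANONICAL = {"awardid": "AwardId", "username": "UserName", "issuedate": "IssueDate"}


def fuzzy_field(header: str, cutoff: float = 0.75) -> Optional[str]:
    """Return the closest required field for a header, or None if nothing is close."""
    cleaned = re.sub(r"[^a-z0-9]", "", header.lower())
    close = difflib.get_close_matches(cleaned, list(CANONICAL), n=1, cutoff=cutoff)
    return CANONICAL[close[0]] if close else None
```

Similarity scoring handles spelling and punctuation variants such as "award_id" or a mistyped "Awrd ID", but it returns None for "Award Number" and for a bare "ID". Headers like those share too little text with the canonical name, which is why string similarity alone is not enough and an explicit alias list is still needed alongside it.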

03

Batch processing unlocks workflow changes

Before the system, administrators processed files as they arrived, one at a time, interrupting other work. Once batch processing was reliable, the team shifted to a scheduled workflow: collect deliveries throughout the week, upload everything Friday afternoon, review the summary Monday morning. The tool didn't just speed up a task; it changed how the team organized their time.

Want this for your institution?

Every project starts with a conversation. Tell us about your data processing challenges and let's figure out what an automated awards workflow could look like for you.

No pitch. No pressure. Just a conversation about what might work.