Computer vision safety monitoring system for PPE detection, violations, and compliance reports.

Context
Workplace safety checks often depend on supervisors walking through areas, reviewing camera footage manually, or relying on incident reports after a problem has already happened. In environments where helmets, vests, masks, gloves, goggles, or other protective equipment matter, inconsistent monitoring can make safety compliance harder to document.
Safety officers and supervisors need a practical way to review whether required equipment appears present in images or video frames. They also need records that can support follow-up, training, or incident review, especially when violations repeat across teams, shifts, or work areas.
Vostaro was built around that review workflow. The MVP uses computer vision as a detection aid, but the product is framed as safety review software. Its purpose is to surface possible PPE issues and make them inspectable, not to replace supervisor judgment.
Problem
Manual safety checks can miss issues because supervisors cannot watch every area continuously. Video footage is also time-consuming to review, especially when the important moment lasts only a few frames in which a person is missing required equipment.
Even when a violation is noticed, documentation can be incomplete. A supervisor may need the image, timestamp, detected issue, confidence level, location, and review outcome to support follow-up. Without a system to capture them, those details can remain scattered or never be recorded at all.
The product problem was to turn visual detection into a reviewable workflow. Vostaro needed to detect people and PPE, show visual evidence, flag possible violations, and preserve the result as a safety record that a human can review or correct.
Solution
Vostaro analyzes uploaded images or sampled video frames to identify people and required PPE items. The interface can draw bounding boxes, show confidence scores, display compliance status, and flag possible missing equipment for review.
The system is designed around human review. A detection result is not treated as a final safety verdict; it becomes a record that a supervisor can inspect, confirm, dismiss, or use for follow-up. This is important because lighting, occlusion, camera angle, and PPE variation can affect model behavior.
The MVP also supports incident history so recurring issues can be monitored over time. That makes the product more useful than a raw model demo because it connects visual detection with records, status, and operational safety review.
My role
I built Vostaro as a solo full-stack MVP, covering the safety workflow, image and frame analysis flow, computer vision integration, detection-result display, incident records, and dashboard interface.
The implementation scope included media upload, inference results, bounding-box visualization, confidence display, compliance checks, violation records, incident history, and a review-friendly UI for supervisors or safety officers.
The key product decision was to avoid overclaiming automation. PPE detection is useful when it helps people review evidence faster, but a safety workflow still needs human confirmation, correction, and context before acting on a violation.
Product workflow
The workflow starts when a user uploads an image or when a video is sampled into frames for analysis. The system processes the media through a detection model and returns people, PPE objects, bounding boxes, confidence values, and possible compliance states.
The frontend overlays detection results on the original media so the reviewer can inspect the evidence directly. Instead of only listing a violation, the product shows where the system detected a person, which equipment appears present, and which required item may be missing.
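As a minimal sketch of how a per-person result like this could be derived from raw detections, the example below attaches a PPE item to a person when the center of the item's box falls inside the person's box. The `Detection` shape, the `check_people` helper, the center-in-box rule, and the 0.5 default threshold are all assumptions made for illustration; they are not taken from the Vostaro codebase, and a real system would likely need a more careful association rule.

```python
from dataclasses import dataclass
from typing import Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2) in pixels


@dataclass(frozen=True)
class Detection:
    label: str         # hypothetical labels such as "person", "helmet", "vest"
    confidence: float  # model score in [0, 1]
    box: Box


def center(box: Box) -> Tuple[float, float]:
    x1, y1, x2, y2 = box
    return ((x1 + x2) / 2, (y1 + y2) / 2)


def contains(outer: Box, point: Tuple[float, float]) -> bool:
    x1, y1, x2, y2 = outer
    px, py = point
    return x1 <= px <= x2 and y1 <= py <= y2


def check_people(detections, required, min_confidence=0.5):
    """Return one dict per detected person listing present and possibly missing PPE."""
    kept = [d for d in detections if d.confidence >= min_confidence]
    people = [d for d in kept if d.label == "person"]
    gear = [d for d in kept if d.label in required]
    results = []
    for person in people:
        # Assumed rule: an item belongs to a person if its center lies in the person's box.
        present = {g.label for g in gear if contains(person.box, center(g.box))}
        missing = set(required) - present
        results.append({
            "person_box": person.box,
            "present": sorted(present),
            "possibly_missing": sorted(missing),
            # A gap is only flagged for review, never treated as a final verdict.
            "status": "needs_review" if missing else "compliant",
        })
    return results


sample = [
    Detection("person", 0.91, (100, 50, 300, 600)),
    Detection("helmet", 0.84, (160, 55, 240, 120)),
    Detection("person", 0.88, (400, 60, 580, 610)),
    Detection("vest", 0.79, (420, 200, 560, 400)),
]
for result in check_people(sample, required={"helmet", "vest"}):
    print(result["status"], result["possibly_missing"])
```

With this sample input, the first person is flagged for a possibly missing vest and the second for a possibly missing helmet, which is the kind of per-person detail an overlay could present next to each bounding box.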
If the result needs attention, it can become an incident or violation record with status, timestamp, media reference, detected issue, and review outcome. That record gives supervisors a practical basis for follow-up, correction, or training.
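The record described above can be sketched as a small data class with an explicit review step. The field names, the three status values, and the sample media path are hypothetical and chosen only to mirror the fields listed in the text; the actual Vostaro record model may differ.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from enum import Enum
from typing import Optional


class ReviewStatus(str, Enum):
    PENDING = "pending"
    CONFIRMED = "confirmed"
    DISMISSED = "dismissed"


@dataclass
class Incident:
    media_ref: str        # reference to the stored image or frame
    detected_issue: str   # e.g. "helmet possibly missing"
    confidence: float     # model score attached to the detection
    status: ReviewStatus = ReviewStatus.PENDING
    created_at: datetime = field(default_factory=lambda: datetime.now(timezone.utc))
    reviewed_by: Optional[str] = None
    review_note: str = ""

    def review(self, reviewer: str, outcome: ReviewStatus, note: str = "") -> None:
        """Record a human decision; calling it again lets a reviewer correct it."""
        if outcome is ReviewStatus.PENDING:
            raise ValueError("a review must confirm or dismiss the incident")
        self.status = outcome
        self.reviewed_by = reviewer
        self.review_note = note


# Hypothetical usage: a flagged frame starts as pending until a supervisor decides.
incident = Incident("uploads/frame_0042.jpg", "helmet possibly missing", 0.62)
incident.review("supervisor_a", ReviewStatus.DISMISSED, "helmet hidden behind shelving")
print(incident.status.value, incident.reviewed_by)
```

Keeping `PENDING` as the default means a detection never becomes a confirmed violation without an explicit reviewer action.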
System architecture
Vostaro is structured around a Next.js and React frontend styled with Tailwind CSS, a FastAPI backend, and PostgreSQL for records. Detection uses a YOLO-style object detection model with PyTorch inference, and OpenCV handles the media. Together these support image uploads, detection results, and incident history.
The data model separates uploaded media, detected objects, confidence scores, PPE requirements, compliance states, violation records, review status, and incident history. This keeps the visual output connected to operational records instead of leaving detections as temporary screen overlays.
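One way to picture that separation is a small relational sketch. The example below uses Python's built-in `sqlite3` as an in-memory stand-in for PostgreSQL; the table names, column names, the `area` field, and the sample rows are assumptions made up for illustration and do not reflect the actual Vostaro schema.

```python
import sqlite3

# Hypothetical schema: media, detections, and incidents live in separate tables.
SCHEMA = """
CREATE TABLE media (
    id INTEGER PRIMARY KEY,
    path TEXT NOT NULL,
    area TEXT NOT NULL,
    captured_at TEXT NOT NULL
);
CREATE TABLE detections (
    id INTEGER PRIMARY KEY,
    media_id INTEGER NOT NULL REFERENCES media(id),
    label TEXT NOT NULL,
    confidence REAL NOT NULL,
    x1 REAL, y1 REAL, x2 REAL, y2 REAL
);
CREATE TABLE incidents (
    id INTEGER PRIMARY KEY,
    media_id INTEGER NOT NULL REFERENCES media(id),
    missing_item TEXT NOT NULL,
    review_status TEXT NOT NULL DEFAULT 'pending'
);
"""


def recurring_issues(conn, min_count=2):
    """Count non-dismissed incidents per (area, missing item), keeping only repeats."""
    return conn.execute(
        """
        SELECT m.area, i.missing_item, COUNT(*) AS n
        FROM incidents i JOIN media m ON m.id = i.media_id
        WHERE i.review_status != 'dismissed'
        GROUP BY m.area, i.missing_item
        HAVING COUNT(*) >= ?
        ORDER BY n DESC, m.area, i.missing_item
        """,
        (min_count,),
    ).fetchall()


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
# Made-up sample rows, for illustration only.
conn.executemany(
    "INSERT INTO media (id, path, area, captured_at) VALUES (?, ?, ?, ?)",
    [
        (1, "a.jpg", "loading_dock", "2024-01-01T08:00:00Z"),
        (2, "b.jpg", "loading_dock", "2024-01-02T08:00:00Z"),
        (3, "c.jpg", "workshop", "2024-01-02T09:00:00Z"),
    ],
)
conn.executemany(
    "INSERT INTO incidents (media_id, missing_item, review_status) VALUES (?, ?, ?)",
    [(1, "helmet", "confirmed"), (2, "helmet", "pending"), (3, "gloves", "dismissed")],
)
print(recurring_issues(conn))  # [('loading_dock', 'helmet', 2)]
```

Because incidents reference media rows rather than embedding overlay data, a query like `recurring_issues` can summarize history without re-running detection.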
For video workflows, frames can be sampled at intervals to reduce processing load while still surfacing possible safety issues. That suits an early review system, where analyzing every frame of continuous video would be expensive and often unnecessary.
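Interval sampling comes down to simple index arithmetic. The sketch below picks which frame indices to analyze from a clip's frame count and frame rate; the function name and the two-second interval are illustrative assumptions, and the actual frame reading (for example with OpenCV) is left out.

```python
from typing import List


def sample_frame_indices(total_frames: int, fps: float, every_seconds: float) -> List[int]:
    """Return the indices of frames to analyze, one roughly every `every_seconds`."""
    if total_frames < 0 or fps <= 0 or every_seconds <= 0:
        raise ValueError("total_frames must be >= 0; fps and every_seconds must be > 0")
    # Convert the time interval to a whole number of frames, never less than one.
    step = max(1, round(fps * every_seconds))
    return list(range(0, total_frames, step))


# A 10-second clip at 30 fps, sampled every 2 seconds: 5 frames instead of 300.
print(sample_frame_indices(300, 30.0, 2.0))  # [0, 60, 120, 180, 240]
```

A shorter interval catches briefer events at the cost of more inference calls, so the interval is a natural setting to tune per camera or work area.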
A production version would need model testing across varied lighting, camera angles, PPE styles, environments, and false-positive conditions. It would also need reviewer correction feedback, permission controls, and safety policy configuration per workplace.
Current status
Vostaro is a working MVP focused on practical safety review rather than raw model output alone. It demonstrates how images or frames can be processed, visualized, flagged, and stored as reviewable safety records.
The current version is strongest as a computer vision workflow proof of concept. It should not be described as a fully validated compliance enforcement system; the credible framing is PPE detection support with human review.
The next step would be testing detection behavior on varied images, tuning confidence thresholds, adding reviewer correction feedback, and allowing organizations to configure which PPE items are required for each environment.
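A rough sketch of what per-environment configuration with per-item thresholds might look like follows. The environment names, the threshold values, and the three item states are hypothetical placeholders rather than tuned or validated settings.

```python
# Hypothetical policies: required PPE per environment, each with its own
# minimum confidence. The numbers are placeholders, not tuned values.
POLICIES = {
    "construction_site": {"helmet": 0.60, "vest": 0.50},
    "laboratory": {"goggles": 0.55, "gloves": 0.50, "mask": 0.50},
}


def evaluate_items(environment, best_scores):
    """Map each required item to 'present', 'low_confidence', or 'not_detected'.

    `best_scores` holds the highest model score seen for each detected item.
    """
    policy = POLICIES[environment]
    states = {}
    for item, threshold in policy.items():
        score = best_scores.get(item)
        if score is None:
            states[item] = "not_detected"
        elif score >= threshold:
            states[item] = "present"
        else:
            # Detected, but below the bar: worth a human look rather than a verdict.
            states[item] = "low_confidence"
    return states


print(evaluate_items("laboratory", {"goggles": 0.80, "gloves": 0.30}))
```

Separating "low_confidence" from "not_detected" gives reviewers a hint about why an item was flagged, which could also feed back into threshold tuning once correction data exists.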
Outcomes
The main outcome of Vostaro is a safety workflow that turns media into inspectable detection results and reviewable incident records. Supervisors can see the evidence, check the confidence, and decide whether a violation should be confirmed.
From an engineering perspective, the project strengthened my work with computer vision integration, media handling, inference outputs, bounding-box visualization, full-stack dashboards, and record models for review workflows.
From a product perspective, Vostaro shows that visual AI becomes more useful when it is connected to operational review. Detection alone is not the product; the product is the evidence, status, and follow-up path around the detection.
Reflection
Vostaro reinforced that visual AI products need visible uncertainty. Confidence scores, bounding boxes, and review states matter because users should be able to inspect what the system saw before trusting the result.
The project also showed that safety software carries a different responsibility than a casual detection demo. If a system flags possible violations, it should support correction and human judgment instead of pretending every detection is final.
The broader lesson is that computer vision becomes credible when the product includes the workflow around the model. Vostaro gave that idea a practical safety-monitoring shape through media input, PPE detection, evidence display, and incident review.