Industrial Visual Inspection with Deep Learning
Visual quality control is one of the hardest things to automate on a factory floor. Vision systems with built-in AI have been around for years, but they cost thousands of euros per station, which prices out most small and medium-sized manufacturers. We wanted to see if it could be done differently, so we started building an alternative: a desktop app that uses deep learning and an ordinary webcam to inspect parts in real time. It's still in alpha, but the early results have pleasantly surprised us.
Why we needed an alternative
Anyone trying to automate a visual inspection tends to hit the same walls. Proprietary hardware with built-in AI has to be paid for at every inspection point, and it usually comes wrapped in a closed ecosystem: constraints on the hardware, dedicated configuration software, maintenance contracts. Often, simply changing what you inspect means reconfiguring or replacing the machine.
The software side is no easier. Classic machine learning approaches need datasets of thousands of images and skills that are rarely available on the shop floor. And many recent solutions run in the cloud, which adds latency, raises questions about the confidentiality of production data, and becomes a point of failure whenever the connection drops.
How it works
Everything stays local
The core decision was to keep everything on the machine. Training, inference, and the models themselves stay local, without a single byte leaving the PC: no dependency on the network, full control over production data, and no cloud or API bills. If a compatible GPU is present the system uses it to speed up training; if not, it runs perfectly well on the CPU alone.
Transfer learning: a few shots, one model
The engine is transfer learning. We start from neural networks already trained on millions of images, which bring a basic visual grammar of edges, textures, and shapes, and we specialize them with a few dozen photos of the actual part. In practice, the operator photographs around twenty good pieces and as many defective ones, presses a button, and has a working model in minutes. They can pick from several network profiles, from the lightest and fastest to the most accurate, depending on whether the line needs speed or accuracy more.
A tool for every check
Each kind of inspection lives in its own tool: checking a weld, reading a label, examining a surface finish. Every tool has its own model, its own region of interest on the video, and its own classification parameters. That lets a single workstation watch several quality checkpoints, with configurations you can export and move from one station to another.
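As a rough sketch of what a per-tool configuration could look like, the example below models one tool as a small data class and exports a list of them to JSON. The class name `InspectionTool`, its fields, and the JSON layout are hypothetical; they only illustrate the idea of a model, a region of interest, and classification parameters bundled per check and moved between stations.

```python
from __future__ import annotations

import json
from dataclasses import asdict, dataclass


@dataclass
class InspectionTool:
    """One quality check: its model, its region of interest, its parameters."""
    name: str
    model_path: str                   # where the trained weights are stored
    roi: tuple[int, int, int, int]    # x, y, width, height in video pixels
    threshold: float = 0.5            # defect score at or above this is a reject
    smoothing_window: int = 5         # frames averaged before deciding


def export_tools(tools: list[InspectionTool]) -> str:
    """Serialize tool configurations to a JSON string."""
    return json.dumps([asdict(tool) for tool in tools], indent=2)


def import_tools(payload: str) -> list[InspectionTool]:
    """Rebuild tool configurations from a JSON string."""
    tools = []
    for item in json.loads(payload):
        item["roi"] = tuple(item["roi"])  # JSON has no tuples, only lists
        tools.append(InspectionTool(**item))
    return tools
```

A station running a weld check and a label check would simply hold two `InspectionTool` entries, each cropping its own region from the same video frame.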
No black boxes
A system that just says "pass" or "fail" without explaining why convinces no one, and rightly so. That's why we built in Grad-CAM, which draws a real-time heatmap over the image: when a part is rejected, the operator sees which areas weighed most heavily on the decision. It helps make sense of false positives and, above all, it builds trust in the tool.
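For readers who want to see the mechanics, here is a compact sketch of Grad-CAM in plain PyTorch. It is not the app's implementation: the function name `grad_cam` and the way the target layer is passed in are assumptions, and the example works on a single image at a time.

```python
import torch
import torch.nn.functional as F
from torch import nn


def grad_cam(model: nn.Module, layer: nn.Module,
             image: torch.Tensor, class_idx: int) -> torch.Tensor:
    """Return an (H, W) heatmap in [0, 1] for one image of shape (1, C, H, W)."""
    # Track gradients from the input so a graph exists even if the weights are frozen.
    image = image.detach().clone().requires_grad_(True)

    # Capture the feature maps produced by the chosen convolutional layer.
    activations = []
    handle = layer.register_forward_hook(
        lambda module, inputs, output: activations.append(output)
    )
    try:
        model.eval()
        logits = model(image)
    finally:
        handle.remove()

    acts = activations[0]                                   # (1, K, h, w)
    # Gradient of the chosen class score with respect to the feature maps.
    grads = torch.autograd.grad(logits[0, class_idx], acts)[0]

    # Weight each feature map by its average gradient, sum, and keep positives.
    weights = grads.mean(dim=(2, 3), keepdim=True)          # (1, K, 1, 1)
    cam = F.relu((weights * acts).sum(dim=1, keepdim=True)) # (1, 1, h, w)

    # Upsample to the input resolution and normalize to [0, 1].
    cam = F.interpolate(cam, size=image.shape[-2:], mode="bilinear",
                        align_corners=False)
    cam = cam[0, 0].detach()
    return cam / cam.max().clamp(min=1e-8)
```

The resulting map can then be colorized and blended over the video frame, for example with OpenCV's `cv2.applyColorMap` and `cv2.addWeighted`. The layer passed in is usually the last convolutional block, since it still has spatial resolution while carrying high-level features.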
Real-time inference
The analysis runs continuously on the webcam feed without stopping production. The operator draws the area to check straight onto the video, and every frame is classified in milliseconds. The decision threshold can be adjusted on the fly, with no retraining, and a moving average smooths out the swings from one frame to the next. Where more robustness is needed, test-time augmentation kicks in, evaluating several variants of the same image to reduce sensitivity to lighting and positioning.
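The smoothing, threshold, and test-time augmentation steps can be sketched in a few lines of standard-library Python. The names `SmoothedDecision` and `predict_with_tta` are hypothetical, and `predict` stands for any function that returns a defect score between 0 and 1 for one image; the real app may combine these steps differently.

```python
from collections import deque
from statistics import fmean
from typing import Any, Callable, Sequence


class SmoothedDecision:
    """Moving average over the last few frame scores, compared to a threshold."""

    def __init__(self, window: int = 5, threshold: float = 0.5) -> None:
        self.scores: deque = deque(maxlen=window)  # old scores drop out automatically
        self.threshold = threshold                 # can be changed at any time

    def update(self, defect_score: float) -> bool:
        """Add the newest score and return True if the part should be rejected."""
        self.scores.append(defect_score)
        return fmean(self.scores) >= self.threshold


def predict_with_tta(predict: Callable[[Any], float], image: Any,
                     transforms: Sequence[Callable[[Any], Any]]) -> float:
    """Average the score over the original image and its transformed variants."""
    variants = [image] + [transform(image) for transform in transforms]
    return fmean(predict(variant) for variant in variants)
```

Because the threshold is only applied to scores the model has already produced, changing `threshold` takes effect on the next frame and never touches the model. Typical transforms for the augmentation step would be small brightness shifts or flips, at the cost of one extra forward pass per variant.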
Where it stands
We're in alpha, so a long way from unsupervised use in production. That said, the numbers we're seeing in development and internal testing are better than we'd hoped. A new check goes from the first photos to a working model in under ten minutes. Twenty to thirty images per class are enough where traditional approaches ask for thousands. And the whole thing runs on a PC with a USB webcam that costs a few dozen euros, with real-time classification that keeps up with line speeds and no recurring costs.
One thing the project has confirmed for us: transfer learning has sharply lowered the barrier to entry for industrial AI. What only a few years ago meant huge datasets and a dedicated data science team is now within reach of anyone with a laptop and a handful of samples. And the desktop-first approach has proven to be the right call for the factory, where reliability, latency, and data control genuinely matter.
There's plenty left to do: holding up on edge cases, validating against real production volumes, multi-camera support, optimization for embedded hardware. We're not where we want to be yet, but for an alpha it's promising.
Actively in development. Next up: unsupervised anomaly detection, integration with SCADA systems, and porting to embedded hardware, bringing intelligent inspection steadily closer to the production line.
Stack: Python · PyTorch · OpenCV · CustomTkinter