AI in Manufacturing and Operations at NVIDIA

NVIDIA leverages data science and machine learning to optimize chip manufacturing and operations workflows—from wafer fabrication and circuit probing to packaged chip testing. These stages generate terabytes of data, and turning that data into actionable insights at speed and scale is critical to ensuring quality, throughput, and cost efficiency. Over the years, we’ve developed robust ML pipelines that tackle problems like defect detection and test optimization. 

This is the first in a series of blogs that will share our key learnings from deploying such pipelines using CUDA-X libraries like NVIDIA cuDF and NVIDIA cuML. While these lessons come from semiconductor manufacturing, the challenges and solutions are widely applicable across logistics, quality engineering, and supply chain optimization.

Let’s start with a real-world classification task: predicting whether a chip will pass or fail a specific test. In more advanced scenarios, the objective extends to predicting a chip’s performance bin—from L1 to L5—framing the task as a multi-class classification problem. 

In both cases, the model consumes rich measurement signals from multiple sources:

  • Sparse features: wafer-level metrology
  • Dense signals: die-level data from circuit probe (CP) tests, and high-fidelity functional test (FT) results from packaged units 

These datasets often span hundreds of thousands of rows and several hundred features, quickly overwhelming traditional CPU-bound data processing workflows.
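To make this concrete, here is a minimal sketch of how such sources can be combined on the GPU with cuDF. The file paths and column names (wafer_id, die_id, and so on) are illustrative placeholders, not our actual schema.

# Combine wafer-, die-, and unit-level test data on the GPU with cuDF
import cudf

cp_df = cudf.read_parquet("cp_results.parquet")              # die-level circuit probe (CP) results
ft_df = cudf.read_parquet("ft_results.parquet")              # functional test (FT) results per packaged unit
metrology_df = cudf.read_parquet("wafer_metrology.parquet")  # sparse wafer-level metrology

# Join dense die-level signals with unit-level outcomes, then attach wafer-level features
features = (
    ft_df.merge(cp_df, on=["wafer_id", "die_id"], how="left")
         .merge(metrology_df, on="wafer_id", how="left")
)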

Tackling imbalanced datasets

A key challenge in operations-focused ML (Ops ML) is extreme class imbalance. In some chip families, more than 99% of units pass a test, leaving only a fraction that fail. This skew makes training robust models difficult, as standard learners tend to bias toward the majority class.

To address this, we leverage CUDA-X data science libraries, including cuDF and cuML, for rapid data transformations and scalable model experimentation. When dealing with such highly imbalanced datasets, we apply targeted sampling methods—including Synthetic Minority Over-Sampling Technique (SMOTE) and undersampling—to balance the classes, enabling more stable and effective model training. With CUDA-X libraries, these strategies run directly in GPU memory, drastically reducing iteration times and allowing experiments to scale seamlessly.

Synthetic minority oversampling technique (SMOTE)

# Minority oversampling (SMOTE) using cuML
from imblearn.over_sampling import SMOTE
from cuml.neighbors import NearestNeighbors

# X, y are dataframes with the features and target, respectively
# GPU-accelerated neighbor search for SMOTE (5 neighbors plus the sample itself)
nn = NearestNeighbors(n_neighbors=6)
X_resampled, y_resampled = SMOTE(k_neighbors=nn).fit_resample(X, y)

In our experiments, pairing SMOTE with cuML’s NearestNeighbors delivers 2x to 8x speedups over native scikit-learn, demonstrating the value of CUDA-X libraries.

Stratified undersampling

# Quick example of stratified sampling using cuDF
# x_features is the list of feature column names; target is the label column name
df['label'] = df['label'].astype('category')

# draw 10,000 rows per class (with replacement, so minority classes can be upsampled)
sampled_df = df.groupby('label').sample(n=10000, replace=True)
X_resampled, y_resampled = sampled_df[x_features], sampled_df[target]

In practice, rapid iteration is critical. We’ve often gone from raw CP and FT data to model-ready features in hours instead of days—thanks to cuDF’s GPU-accelerated joins, filters, and groupbys. And when training models like random forests or XGBoost, cuML consistently offers 5x to 30x speedups over CPU-based versions, allowing us to test more hypotheses and tune models faster. This agility has made it feasible to prototype solutions with engineers on the fab floor in near real time.
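As a small illustration of the groupby-driven feature engineering this enables, the sketch below continues the placeholder schema from the earlier example and rolls die-level CP measurements up to per-wafer statistics; cp_leakage and cp_vmin are hypothetical column names.

# Per-wafer aggregates of die-level CP measurements, computed entirely on the GPU
wafer_stats = (
    cp_df.groupby("wafer_id")
         .agg({"cp_leakage": "mean", "cp_vmin": "max"})
         .rename(columns={"cp_leakage": "cp_leakage_mean", "cp_vmin": "cp_vmin_max"})
         .reset_index()
)

# Attach the wafer-level aggregates back onto the per-unit feature table
features = features.merge(wafer_stats, on="wafer_id", how="left")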

Metrics that matter for evaluation

Highly imbalanced datasets not only complicate training—they also distort standard metrics for evaluation. For example, accuracy becomes a meaningless metric when a model can predict “pass” for every chip and still be 99% correct.

To better reflect true performance, we rely on metrics like weighted accuracy and area under the precision-recall curve.

Weighted accuracy

We evaluate accuracy for each category separately and take a weighted average, so each category contributes according to its weight rather than its frequency, removing the effect of the imbalance.

Pass accuracy: p_acc = Correctly Predicted Pass / Total Actual Pass

Fail accuracy: f_acc = Correctly Predicted Fail / Total Actual Fail

The weighted form is: 

acc = w_f * f_acc + (1 - w_f) * p_acc

A default and convenient choice for most settings is w_f = 0.5, giving

acc = (f_acc + p_acc) / 2
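As a concrete illustration, here is a minimal sketch of computing this metric with cuML, assuming y_true and y_pred are placeholders holding the actual and predicted labels (0 = pass, 1 = fail):

# Weighted accuracy from a confusion matrix (y_true / y_pred are placeholders)
from cuml.metrics import confusion_matrix

cm = confusion_matrix(y_true, y_pred)   # rows = actual class, columns = predicted class
tn, fp, fn, tp = [int(v) for v in cm.ravel()]

p_acc = tn / (tn + fp)                  # fraction of actual passes predicted correctly
f_acc = tp / (tp + fn)                  # fraction of actual fails predicted correctly

w_f = 0.5                               # equal weight for both classes
acc = w_f * f_acc + (1 - w_f) * p_acc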

Precision-recall curve

In severely imbalanced datasets, the traditional receiver operating characteristic (ROC) curve and the area under the ROC curve (AUC) metric can be misleading. For example, consider a heavily imbalanced dataset with 98% yield, i.e., only 2% of parts “fail.” 

Let us compare the ROC and precision-recall curves. The marker on the ROC curve highlights a point with a true positive rate of 0.61 and a false positive rate of 0.13. At first glance, this might seem like a reasonable operating point. However, for imbalanced datasets (like the one shown below, where most outcomes are “pass”), ROC curves can be misleading: because true negatives dominate, the false positive rate stays small even when false alarms far outnumber correct detections, hiding the real cost of false positives.

Look at the same point on the precision-recall (PR) curve. Here, the precision is just 0.09, and the recall is 0.61. This tells us that for every 100 units the model predicts as “fail,” only 9 are truly failing—which isn’t acceptable for most real-world manufacturing scenarios where the cost of acting on a false alarm is high.

Figure 1. Evaluation metric: AUC for ROC curve vs. precision-recall curve. For an imbalanced binary classification dataset (98% pass), the two charts show a receiver operating characteristic (ROC) curve and a precision-recall (PR) curve for a trained random forest classifier. The AUC of the PR curve is much lower than that of the ROC curve, reflecting the classifier's poorer performance at higher recalls.

The confusion matrix below, which corresponds to the marker on the plots above, reinforces this. At the selected operating point, 1,274 units were incorrectly flagged as “fail,” compared to just 122 correctly flagged ones. That’s a false-alarm rate of over 90% (1,274 / (1,274 + 122) ≈ 0.91), consistent with the precision of 0.09 on the PR curve. The precision-recall curve makes this poor performance much more visible.

Figure 2. Confusion matrix of the classifier at the operating point marked in Figure 1. That point corresponds to (TPR, FPR) = (0.61, 0.13) on the ROC curve, suggesting acceptable performance, but to (precision, recall) = (0.09, 0.61) on the precision-recall curve, indicating a significantly poorer precision tradeoff at this level of recall.

Here is the code for the discussed scenario.

import cupy as cp
import matplotlib.pyplot as plt
from sklearn.datasets import make_classification
from sklearn.metrics import roc_curve, auc
from cuml.model_selection import train_test_split
from cuml.ensemble import RandomForestClassifier as cuRF
from cuml.metrics import roc_auc_score, precision_recall_curve

# synthetic dataset dimensions
n_samples = 10000
n_features = 20
n_classes = 2

# random forest depth and size
n_estimators = 25
max_depth = 10

# generate synthetic data [ binary classification task, 98% pass / 2% fail ]
X, y = make_classification(n_classes=n_classes,
                           n_features=n_features,
                           n_samples=n_samples,
                           weights=[0.98, 0.02],
                           random_state=13)

# move the data to the GPU with the dtypes cuML's random forest expects
X = cp.asarray(X, dtype=cp.float32)
y = cp.asarray(y, dtype=cp.int32)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, random_state=0, test_size=0.3)

# train the random forest classifier
model = cuRF(max_depth=max_depth,
             n_estimators=n_estimators,
             random_state=0)
model.fit(X_train, y_train)

# predicted probability of the minority ("fail") class
ypred = model.predict_proba(X_test)[:, 1]

# calculate the ROC curve and its AUC
fpr, tpr, _ = roc_curve(cp.asnumpy(y_test), cp.asnumpy(ypred))
roc_auc = roc_auc_score(y_test, ypred)
plt.figure()
plt.plot(fpr, tpr, color='darkorange', label='ROC Curve')

# calculate the precision-recall curve and its AUC
precision, recall, _ = precision_recall_curve(y_test, ypred)
pr_auc = auc(cp.asnumpy(recall), cp.asnumpy(precision))
plt.figure()
plt.plot(cp.asnumpy(recall), cp.asnumpy(precision),
         color='darkorange', label='Precision-Recall Curve')

Interpretability

Model performance is only part of the story. In operational settings, interpretability and actionability are equally important. Predicting test outcomes has real value only when domain experts can understand and trust the model’s reasoning.

That’s why we rely on cuML’s native support for feature importance in ensemble models. This capability helps us surface high-impact features for engineering review. In some cases, these insights have led to the identification—and elimination—of redundant or low-value test steps, directly translating into cost savings and streamlined processes.

Another way domain experts can reason through model outputs is with cuML’s GPU-accelerated SHAP (SHapley Additive exPlanations) implementations, such as the Kernel SHAP and Permutation SHAP explainers. These tools enable us to understand the contribution of each feature to the model’s predictions, providing both global and local interpretability. In practice, these SHAP-based insights have helped us identify test steps that were redundant or less predictive, leading to cost savings through test elimination.
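As an illustration, a minimal Kernel SHAP sketch with cuML might look like the following, reusing the random forest and data splits from the earlier example; the background and evaluation sample sizes are arbitrary choices for the sketch.

# GPU-accelerated Kernel SHAP explanations with cuML
from cuml.explainer import KernelExplainer

explainer = KernelExplainer(model=model.predict,    # any callable mapping features to outputs
                            data=X_train[:100])     # background sample used to integrate out features
shap_values = explainer.shap_values(X_test[:10])    # per-feature contributions for 10 units

The resulting shap_values array has one row per explained unit and one column per feature; averaging absolute values across rows gives a simple global importance ranking for engineering review.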

Conclusion

For those interested in exploring further, the cuDF documentation provides comprehensive guides on GPU-accelerated data manipulation, while the cuML documentation details the suite of machine-learning algorithms optimized for GPUs. These resources are invaluable for practitioners aiming to harness the full potential of CUDA-X Data Science libraries in their workflows. 

If you already have workflows built out using libraries like pandas, Polars, or scikit-learn, CUDA-X also offers drop-in, zero-code-change acceleration through cuDF’s pandas accelerator and cuML’s estimators for scikit-learn. This makes it easier than ever to scale up your existing Python pipelines without rewriting them from scratch. To dive deeper into accelerated data science, you can enroll in hands-on courses from our DLI Learning Path.
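For example, in a notebook the accelerators can be enabled before the usual imports. This is a minimal sketch based on the cuDF and cuML documentation, with NearestNeighbors standing in for whichever supported estimators your pipeline already uses.

# Enable zero-code-change GPU acceleration before importing pandas / scikit-learn
%load_ext cudf.pandas
%load_ext cuml.accel

import pandas as pd                                 # familiar pandas API, backed by cuDF where possible
from sklearn.neighbors import NearestNeighbors      # dispatched to cuML when supported

For standalone scripts, the same effect is available from the command line with python -m cudf.pandas your_script.py and python -m cuml.accel your_script.py.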

In our upcoming blogs, we’ll explore feature engineering techniques like coalescing test signals and calculating spatial Z-yield, model evaluation with business-aware metrics like the Cost of Quality (CoQ) curve, and best practices for using ML to augment and empower operations engineering. These lessons are grounded in CUDA-X-based pipelines and NVIDIA’s manufacturing domain, but the patterns are repeatable in any data-rich, decision-driven operations environment.

Stay tuned as we walk through the next stages of building and deploying ML for Ops—accelerating data processing workflows is only the beginning. Making it actionable is where the real value lies.