Machine Breakdown: How to Read It in Downtime Data
- Matt Ulepic
- Mar 17
- 10 min read

In many CNC shops, “machine breakdown” becomes a convenient label for anything that made a job miss its date: a true failure, a tooling fight, a program issue, or simply waiting because no one was available to respond. The problem isn’t the word—it’s what happens when that vague label gets converted into downtime minutes and treated like truth.
If your ERP says you were “down for breakdown” but the shop knows it was really a shift handoff, an operator reset loop, or waiting on approval, you don’t have a maintenance problem—you have a measurement problem. Fixing that measurement closes the gap between planned capacity and actual machine behavior, and it speeds up week-to-week decisions without turning into a reliability theory exercise.
TL;DR — Machine breakdown in downtime data
“Breakdown” should be an unplanned stop with a clear start/stop, not a catch-all for any delay.
Total breakdown minutes is incomplete without event counts and the distribution (few long vs many short).
Open events and late closeouts (shift change/weekends) routinely create artificial “6+ hour breakdowns.”
Separate waiting time from repair time; response delays are often the real constraint.
Frequent 8–20 minute “breakdowns” may be tooling/offset adjustments, not maintenance work.
Repeated resets that never get logged can show “high utilization” while throughput and quality slip.
Actionable cuts: Pareto by minutes and by frequency, breakdowns by shift, and repeat-event patterns.
Key takeaway: A “machine breakdown” isn’t a single number—it’s a pattern in your downtime dataset. When you capture start/stop cleanly and split waiting from repair (especially across shifts), you expose utilization leakage that looks like “maintenance” in the ERP but is often a response, staffing, or logging issue that can be fixed before you spend on more machines.
What “machine breakdown” looks like in downtime data (not on the floor)
In a usable downtime dataset, a breakdown is an unplanned stop event with a start time, an end time, and a reason code that means “the asset could not run as intended.” That’s different from “we chose to stop,” and different from “we were waiting on something upstream.” The goal is operational visibility: you want to see which machines are constraining capacity and why, with enough detail to act this week.
Most CNC job shops can get value from a small set of consistent fields (a minimal record sketch follows the list):
Machine ID (and sometimes control type for legacy vs modern differences)
Shift (or crew) and operator (or cell)
Start timestamp, stop timestamp, and duration
Reason category (e.g., Breakdown) and optional secondary tag (Electrical, Control, Hydraulic)
Free-text note or fault code/symptom (short but specific)
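As a rough illustration only, here is what one event record could look like as a Python dataclass. The class and field names are hypothetical placeholders, not a required schema—map them to whatever your downtime log or monitoring export actually uses.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional

@dataclass
class DowntimeEvent:
    """One stop event. Field names are illustrative, not a required schema."""
    machine_id: str                  # e.g. "HMC-04"; add control type if legacy vs modern matters
    shift: str                       # "1st", "2nd", "3rd" (or crew)
    operator: Optional[str]          # operator or cell, if captured
    start: datetime                  # stop start timestamp
    stop: datetime                   # stop end timestamp
    reason: str                      # top-level category, e.g. "Breakdown"
    secondary: Optional[str] = None  # e.g. "Electrical", "Waiting for parts"
    note: Optional[str] = None       # short free-text note or fault code

    @property
    def duration_minutes(self) -> float:
        return (self.stop - self.start).total_seconds() / 60.0
```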
One physical breakdown can become multiple records—and that’s not automatically “bad data.” A machine may stop, get restarted, fault again, then run for 20 minutes and stop again. If you only report total downtime minutes, you miss the most important distinction: a few long events usually demand response coverage, spares, or scheduling buffers, while many short events often point to nuisance faults, setup interactions, or operator workarounds that leak capacity all day.
If you want the broader framework for capturing all stop types (planned and unplanned) and turning them into an operational cadence, tie this back to machine downtime tracking—then treat breakdowns as one specific stop class that needs extra hygiene.
The most common ways breakdown data gets distorted
Breakdown data causes bad decisions when it’s distorted into a story the dataset can’t support. In 10–50 machine, multi-shift shops, these are the patterns that most often inflate or hide breakdown impact.
1) Open-ended events and late closeouts. Shift change and weekends are the classic culprits. A breakdown happens near the end of second shift, the stop stays open, and someone closes it the next morning as a single long event. The dataset now implies “the machine was being repaired for hours,” when much of that time was waiting for maintenance approval, waiting for a part, or simply no one touching the issue.
2) Overuse of “Breakdown” and “Other” as catch-alls. Tooling/offset adjustments, program prove-out, material issues, and fixture problems get logged as breakdown because it’s easy. That pushes you toward the wrong fixes (maintenance firefighting) and away from process fixes (standard offsets, tool life rules, program validation, staging).
3) Duplicate attribution across machines. One upstream constraint (bad material batch, missing inspection, no operator available) can cause multiple machines to stop. If each stop becomes “breakdown,” your reports imply multiple assets are unreliable when the constraint is organizational or upstream.
4) Missing micro-breakdowns that never get logged. Intermittent faults that operators reset repeatedly can disappear from the dataset. Utilization looks fine, but throughput suffers and quality risk rises because the machine is constantly in a fragile state. This is a quiet form of utilization leakage: time is lost in small chunks, and the schedule absorbs it until it can’t.
If you’re using a monitoring approach to capture stops consistently across a mixed fleet (new controls and legacy machines), keep the focus on measurement mechanics, not “dashboarding.” A neutral overview of what matters operationally is covered in machine monitoring systems.
How to separate a breakdown event into actionable components
To make breakdown data actionable, decompose the duration into phases that reflect how work actually happens in a CNC shop. This isn’t a maintenance system; it’s a simple way to tell whether you’re losing time to fixing the machine or to waiting on the organization.
A practical breakdown timeline looks like:
Detection → the stop begins (fault, alarm, crash, failure)
Response → someone acknowledges and takes ownership
Diagnosis → determine likely cause and needed resources
Repair → hands-on fix work
Verification → test cut, warm-up, restart, first-good part
The most important operational split is usually waiting time vs repair time. “Waiting” can include waiting for maintenance, waiting for an electrician, waiting for a supervisor approval, waiting for parts, or waiting for a vendor call-back. You can capture this with a secondary tag like:
Breakdown → Waiting for maintenance
Breakdown → Waiting for parts
Breakdown → Repair in progress
Breakdown → Control/Electrical (optional when known)
Track response time separately from fix time. When response time balloons on second shift or weekends, the “maintenance minutes” story is misleading. You don’t necessarily need better technicians—you may need clearer escalation, an on-call rule, pre-approved parts spend, or a handoff protocol at shift end.
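If your export carries a secondary tag like the ones above, the waiting-vs-repair split is a short script rather than a project. This is a minimal pandas sketch with assumed column names (machine_id, reason, secondary, start, stop) and an assumed file name; adjust both to your own log.

```python
import pandas as pd

# Assumed columns in the export: machine_id, shift, reason, secondary, note, start, stop
events = pd.read_csv("downtime_export.csv", parse_dates=["start", "stop"])
events["minutes"] = (events["stop"] - events["start"]).dt.total_seconds() / 60

WAITING_TAGS = {"Waiting for maintenance", "Waiting for parts", "Waiting for approval/parts"}
REPAIR_TAGS = {"Repair in progress", "Verification"}

bd = events[events["reason"] == "Breakdown"].copy()
bd["phase"] = bd["secondary"].map(
    lambda tag: "waiting" if tag in WAITING_TAGS
    else ("repair" if tag in REPAIR_TAGS else "unclassified")
)

# Waiting vs repair minutes per machine -- usually the split that changes the action list.
split = (bd.pivot_table(index="machine_id", columns="phase",
                        values="minutes", aggfunc="sum", fill_value=0)
           .reindex(columns=["waiting", "repair", "unclassified"], fill_value=0))
print(split.sort_values("waiting", ascending=False))
```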
Finally, use Unknown on purpose. Unknown is allowed, but it needs governance: for example, “Unknown breakdown must be resolved (re-coded with a better reason/secondary tag) within 24–48 hours.” That keeps the dataset from degrading into “Other,” while staying realistic about production pressure.
Dataset analyses that reveal utilization leakage from breakdowns
Once breakdown records are reasonably clean, the next step is analysis that changes decisions quickly. Avoid generic KPI lists; you want a few cuts that reveal constraints, shift-level differences, and repeated failure patterns that quietly chew up capacity.
Pareto by total minutes and by frequency. These lead to different actions. Minutes-heavy breakdowns often require spares, escalation rules, or schedule buffers. Frequency-heavy breakdowns often require standard work, parameter tweaks, tooling practices, or targeted troubleshooting windows.
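Reusing the bd frame from the earlier sketch (same assumed columns), the two Pareto cuts are only a few lines each:

```python
# "bd": Breakdown rows with a computed "minutes" column (see the earlier sketch).
pareto_minutes = bd.groupby("machine_id")["minutes"].sum().sort_values(ascending=False)
pareto_counts = bd.groupby("machine_id")["minutes"].count().sort_values(ascending=False)

print(pareto_minutes.head(5))  # minutes-heavy: spares, escalation rules, schedule buffers
print(pareto_counts.head(5))   # frequency-heavy: standard work, tooling practices, nuisance faults
```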
Breakdowns by shift. This is where the ERP-vs-reality gap shows up. If second shift has “long breakdowns” but first shift has “short breakdowns,” it may not be asset reliability—it may be response coverage, approval friction, or closeout discipline. This is especially common when a stop begins near shift change and ownership gets blurry.
Repeat-event detection. Look for same machine + similar short duration + similar note pattern (e.g., “Servo alarm reset,” “lube low,” “door interlock”) showing up again and again. Individually, each event looks small; collectively, it becomes utilization leakage and schedule churn.
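A crude but workable repeat-event pass is to key short events on the first few words of the note. The sketch below reuses the bd frame from earlier; the 20-minute cutoff, the three-word key, and the three-event threshold are starting points to tune, not standards.

```python
# "bd": Breakdown rows with "minutes" and "note" columns (see the earlier sketch).
short = bd[bd["minutes"] <= 20].copy()
short["note_key"] = (short["note"].fillna("").str.lower()
                     .str.replace(r"[^a-z ]", "", regex=True)
                     .str.split().str[:3].str.join(" "))  # crude key: first three words

repeats = (short.groupby(["machine_id", "note_key"])
                .agg(events=("minutes", "count"), total_minutes=("minutes", "sum"))
                .query("events >= 3")
                .sort_values("total_minutes", ascending=False))
print(repeats.head(10))
```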
Calendar view by job/material/program family. If breakdowns cluster around a certain job type, material, or program family, the constraint might be process interaction (chip control, coolant filtration, fixturing rigidity, program aggressiveness) rather than the machine “randomly failing.” You don’t need a maintenance treatise—just enough tagging to see the association.
Capacity impact framing. Translate breakdown time into lost scheduled hours and the rescheduling churn it creates. The question isn’t “how many breakdown minutes did we have?” It’s “which breakdown pattern is stealing the most reliable capacity from the schedule?” That’s the path to recovery before you consider additional capital equipment. When you’re ready to connect stop patterns to capacity and loading decisions, machine utilization tracking software provides the adjacent context.
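One way to sketch that framing: join breakdown hours to scheduled hours per machine and rank by percentage of scheduled capacity lost. The scheduled-hours numbers below are made up, and in practice they would come from your scheduler or ERP.

```python
import pandas as pd

# "bd": Breakdown rows with machine_id and minutes (see the earlier sketch).
# Scheduled hours per machine are an assumed input; these numbers are illustrative only.
scheduled = pd.DataFrame({"machine_id": ["HMC-04", "LAT-02"],
                          "scheduled_hours": [110.0, 95.0]})

lost = (bd.groupby("machine_id")["minutes"].sum().div(60)
          .rename("breakdown_hours").reset_index())
impact = lost.merge(scheduled, on="machine_id", how="left")
impact["pct_of_scheduled"] = 100 * impact["breakdown_hours"] / impact["scheduled_hours"]
print(impact.sort_values("pct_of_scheduled", ascending=False))
```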
Mid-article diagnostic: pick one pacer machine. If you can’t answer (1) “Are we losing time to waiting or repair?” and (2) “Is this shift-specific?” from the last two weeks of events, your first win is data hygiene—not a new maintenance initiative.
Two mini examples: what the same breakdown story looks like in good vs bad data
Below are two simplified, anonymized event-log snapshots (the kind you can export from a downtime log). The point isn’t perfect timestamps—it’s showing how the same shop-floor reality turns into either actionable visibility or reporting noise.
Example A (good capture): shift-change breakdown split into waiting vs repair
Scenario: Second shift runs a horizontal mill. A breakdown happens near shift change. If you leave the stop open, it looks like “a 6-hour breakdown.” If you split phases, you see what really constrained capacity: waiting and response.
| Machine | Shift | Start | End | Duration | Reason | Secondary | Note |
|---|---|---|---|---|---|---|---|
| HMC-04 | 2nd | 21:40 | 22:05 | 20–30 min | Breakdown | Waiting for maintenance | Axis alarm; operator notified |
| HMC-04 | 2nd | 22:05 | 22:55 | 40–60 min | Breakdown | Waiting for approval/parts | Need spare proximity switch |
| HMC-04 | 3rd | 22:55 | 23:35 | 30–50 min | Breakdown | Repair in progress | Replaced switch; checked wiring |
| HMC-04 | 3rd | 23:35 | 23:55 | 10–30 min | Breakdown | Verification | Dry cycle + first part check |
| HMC-04 | 3rd | 0:10 | 0:20 | 5–15 min | Breakdown | Repeat fault | Alarm returned once; reset |
Operational decision enabled: the “6-hour breakdown” is mostly waiting/response, not wrench time. That points to actions like tightening on-call coverage, pre-staging common spares for that HMC, and setting a simple response expectation across shifts. It also flags the repeat fault line as a nuisance issue to schedule into a controlled troubleshooting window, rather than letting it leak time in small chunks.
Example B (bad capture): one long breakdown event masks the real constraint
Same underlying event, but it’s left open across shift change and closed later as “Breakdown.” This is how bad data triggers the wrong narrative (the machine is unreliable; we need major repair; maybe we need capex).
| Machine | Shift | Start | End | Duration | Reason | Note |
|---|---|---|---|---|---|---|
| HMC-04 | 2nd | 21:40 | 3:45 | ~6 hours | Breakdown | Axis alarm |
| HMC-04 | 3rd | 3:45 | 4:05 | 10–30 min | Run | Back up |
| HMC-04 | 3rd | 4:20 | 4:30 | 5–15 min | Breakdown | Alarm again |
| LAT-02 | 2nd | 22:10 | 22:25 | 10–20 min | Breakdown | “Cutting issue” |
| LAT-02 | 2nd | 1:15 | 1:30 | 10–20 min | Breakdown | Offsets adjusted |
Decision risk: HMC-04 becomes the “problem machine” because it shows a huge breakdown duration, even though much of that time may have been waiting. Meanwhile, LAT-02 shows frequent 8–20 minute “breakdowns” that look like maintenance work but are often tooling/offset issues handled by operators. If you treat both as the same category, you misdirect resources and can even justify the wrong capital spend instead of recovering hidden time loss through better response and classification.
A practical recoding exercise (even if it’s only for your top events): if you reclassify a portion of “Breakdown” into Tooling, Adjustment, Program, or Material based on the note patterns, your top constraint list changes. Maintenance gets focused on true failures; operations gets focused on repeatable process friction. The exact percentage will vary—treat it as a shop-specific clean-up, not a benchmark.
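A simple way to run that recoding exercise is keyword matching on the note text. The keyword lists below are illustrative and will need tuning to your shop’s vocabulary; anything unmatched stays in the Breakdown bucket.

```python
# Keyword-based recode of "Breakdown" notes; the keyword lists are examples to tune per shop.
RECODE_RULES = [
    ("Tooling/Adjustment", ["offset", "insert", "tool change", "wear"]),
    ("Program",            ["prove-out", "prove out", "program", "post"]),
    ("Material",           ["material", "stock", "bar feed"]),
]

def recode(note: str) -> str:
    text = note.lower()
    for new_reason, keywords in RECODE_RULES:
        if any(k in text for k in keywords):
            return new_reason
    return "Breakdown"  # anything unmatched stays a true breakdown candidate

# "bd": Breakdown rows with "note" and "minutes" columns (see the earlier sketch).
bd["recoded_reason"] = bd["note"].fillna("").map(recode)
print(bd.groupby("recoded_reason")["minutes"].sum().sort_values(ascending=False))
```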
Minimum viable dataset hygiene rules to prevent recurrence: (1) no open downtime events at shift end, (2) “Breakdown” requires a short note or fault tag, (3) “Unknown breakdown” must be resolved within 24–48 hours, and (4) frequent short breakdowns trigger a quick review: is this really maintenance, or an operator-handled adjustment?
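Those four rules can also run as automated checks against the export. This sketch reuses the events/bd frames from the earlier snippets, and the thresholds (48 hours, 20 minutes, three events, seven days) are assumptions to adjust.

```python
import pandas as pd

# "events"/"bd": frames from the earlier sketch; thresholds here are assumptions to adjust.
now = pd.Timestamp.now()

# (1) events still open (no stop timestamp) -- should be zero at shift end
open_events = events[events["stop"].isna()]

# (2) Breakdown rows with no note or fault tag
missing_notes = bd[bd["note"].fillna("").str.strip() == ""]

# (3) "Unknown" breakdowns older than 48 hours that were never re-coded
stale_unknown = bd[(bd["secondary"] == "Unknown") &
                   (bd["start"] < now - pd.Timedelta(hours=48))]

# (4) machines with three or more short (<20 min) breakdowns in the last 7 days
recent_short = bd[(bd["minutes"] < 20) & (bd["start"] > now - pd.Timedelta(days=7))]
frequent_short = recent_short.groupby("machine_id").size().loc[lambda s: s >= 3]

for name, frame in [("open events", open_events), ("missing notes", missing_notes),
                    ("stale unknowns", stale_unknown)]:
    print(name, len(frame))
print("frequent short breakdowns:\n", frequent_short)
```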
Practical rules for logging breakdowns without slowing production
Manual methods (clipboard logs, end-of-shift notes, spreadsheet “downtime minutes”) can work for a small shop, but they break down at 20–50 machines across multiple shifts. The failure mode is predictable: delayed entry, inconsistent codes, and long open events that turn into stories nobody trusts. The scalable evolution is to keep operator effort low and let the dataset enforce discipline.
1) Use a reason-code hierarchy that protects “Breakdown.” Keep three top-level buckets clear (a small config sketch follows the list):
Breakdown = true equipment failure or alarm condition that prevents running
Adjustment/Tooling = offsets, inserts, tool changes due to wear, minor fixes handled by the operator
Waiting = resource constraint (maintenance not available, waiting for parts, waiting for approval)
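One way to keep the hierarchy from drifting is to store it as a small config and validate entries against it. The bucket and tag names below are illustrative, not a standard.

```python
# Illustrative reason-code hierarchy; names are examples, not a standard.
REASON_CODES = {
    "Breakdown": [            # true failure or alarm condition that prevents running
        "Electrical", "Control", "Hydraulic", "Mechanical", "Unknown",
    ],
    "Adjustment/Tooling": [   # operator-handled work, not maintenance
        "Offsets", "Insert change", "Tool wear", "Minor fix",
    ],
    "Waiting": [              # resource constraint, not repair
        "Waiting for maintenance", "Waiting for parts", "Waiting for approval",
    ],
}

def is_valid_code(reason: str, secondary: str) -> bool:
    """Reject anything outside the hierarchy so 'Other' can't creep back in."""
    return secondary in REASON_CODES.get(reason, [])
```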
This directly addresses the lathe scenario: frequent 8–20 minute “breakdown” entries that are really tooling/offset work. When you separate those, prioritization improves: maintenance focuses on genuine failures; supervisors focus on training, tool life rules, and standard offset practices.
2) Prompt operators for “what happened” in 5–10 seconds. Don’t ask for paragraphs. Ask for one of: fault code, symptom, last operation, or “what did you try?” That’s enough to reduce Unknown/Other and to find repeat-event patterns. The requirement should be lightweight so it works on second and third shift without constant supervision.
3) Closeout discipline at shift end. The rule is not “fix it by shift end.” The rule is “no open events.” If a machine is still down, the closeout is a handoff state: Waiting for maintenance, Waiting for parts, or Repair in progress. This prevents the classic 6-hour event that was really 2 hours of action and 4 hours of waiting.
4) Capture intermittent faults without overburdening operators. For the scenario where operators repeatedly reset a machine and don’t log it: you need a way to record short, repeated stops as events (even if they auto-capture) and then prompt for a simple tag when the pattern repeats. Otherwise, the dataset reports “high utilization,” but your throughput, scrap risk, and schedule reliability tell a different story. The goal is not to create paperwork; it’s to make micro-breakdowns visible enough to stop the bleeding.
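If stops are auto-captured, a cluster pass over short events can surface those reset loops without asking operators for anything extra. This sketch reuses the events frame from earlier; the 5-minute stop cutoff, 2-hour gap, and three-stop threshold are assumptions.

```python
# "events": all stop events with machine_id, start, minutes (see the earlier sketch).
# Thresholds (5-minute stops, 2-hour gap, 3+ stops) are assumptions, not standards.
short = events[events["minutes"] <= 5].sort_values(["machine_id", "start"]).copy()
short["gap_min"] = (short.groupby("machine_id")["start"].diff()
                         .dt.total_seconds().div(60))
short["new_cluster"] = short["gap_min"].isna() | (short["gap_min"] > 120)
short["cluster_id"] = short.groupby("machine_id")["new_cluster"].cumsum()

reset_loops = (short.groupby(["machine_id", "cluster_id"])
                    .agg(stops=("minutes", "count"), lost_min=("minutes", "sum"))
                    .query("stops >= 3"))
print(reset_loops)
```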
5) Run a weekly audit loop. Review the top 10 breakdown events (by minutes and by frequency). Look for miscoding, training opportunities, and “waiting vs repair” splits that would change the action list. Keep it short and operational. If you want help interpreting patterns quickly (e.g., repeat faults, shift handoff issues, chronic nuisance stops) without turning it into a theory project, tools like an AI Production Assistant can help summarize what the dataset is implying so you can decide what to fix next.
Implementation note for pragmatic shops: prioritize coverage across your mixed fleet and keep IT friction low. Costs should be framed around how quickly you can make the data trustworthy and usable, not around flashy features. If you need to sanity-check rollout scope and what’s included without hunting for numbers in a sales conversation, start with the pricing page for implementation expectations and packaging context.
If your team wants to see what your breakdowns look like when captured cleanly (especially around shift handoffs, waiting vs repair, and nuisance reset patterns), the fastest next step is to walk through one pacer machine’s last 2–4 weeks of stops and apply the splits described above. When you’re ready to validate that on your own equipment and workflows, you can schedule a demo and bring a real downtime export or a list of your most argued-over “breakdowns.”









