Manufacturing Dashboard: Live Status + Downtime Pareto

Matt Ulepic
4 hours ago

A manufacturing dashboard for CNC shops: live run/idle/down visibility plus a downtime Pareto by reason code, so you can recover hidden capacity without misleading KPIs.

If your ERP says you’re on schedule but the floor still feels “mysteriously tight,” you don’t have a planning problem—you have a visibility problem. In most 10–50 machine CNC shops, the gap isn’t effort. It’s that manual updates, whiteboards, and after-the-fact reports can’t show what’s happening now, and they rarely explain what’s stealing capacity across shifts.

A useful manufacturing dashboard is not a prettier KPI page. It’s an operational control surface: it shows live run/idle/down status so leads can respond in minutes, and it converts downtime into a short list of reasons so supervisors can run a daily improvement loop instead of debating anecdotes.

TL;DR — Manufacturing dashboard

Limit scope to two views: a live run/idle/down board and a downtime Pareto by reason code.
Design the status board for immediate response: machine, state, and time-in-state are the minimum.
Lock definitions (run vs idle vs down) and set an idle threshold; otherwise “downtime” becomes a moving target.
Use a small, standardized reason-code list (start 6–12) with a simple hierarchy for speed + analysis.
Treat “Unknown/No reason” as a process defect; review it every shift.
Segment losses by shift and by “idle with work available” vs “idle no work” to expose dispatch and approval bottlenecks.
Pilot in one cell, train classification as standard work, and assign daily reason-code hygiene ownership.

Key takeaway: A manufacturing dashboard only improves throughput when it closes the ERP-vs-reality gap: show live run/idle/down so the team reacts fast, then use a reason-code Pareto to explain losses by shift and assign owners. Without consistent definitions, shift boundary rules, and reason-code discipline, the “dashboard” becomes another report that looks official but hides utilization leakage.

The two views your manufacturing dashboard must nail: live status + loss explanation

For a CNC job shop, a shop-floor dashboard should answer two questions quickly: “What needs attention right now?” and “What keeps taking time away from production?” That’s why the most useful setup is not a stack of generic KPIs—it’s two connected views built for day-to-day accountability.

View 1 is a live run/idle/down board. It gives leads and supervisors immediate situational awareness at the machine and cell level—so response happens in minutes, not at the end of the shift. View 2 is a downtime Pareto by reason code. It converts “we lost time” into “we lost time for these specific reasons,” which is what enables assignment, escalation, and targeted fixes.

Combining the two prevents the most common failure modes of manual methods. A whiteboard or spreadsheet can list downtime after the fact, but it can’t reliably show the moment a pacer machine slips into a non-producing state. And if the shop only tracks minutes (or relies on memory), teams end up arguing about “maintenance” versus “material” versus “setup” with no consistent record. A dashboard that pairs live status with reason-coded loss explanation closes that loop.

The cadence should match how CNC shops actually run: minute-by-minute response for current stoppages, and a daily/shift review to decide what to fix next. For broader context on capturing and managing downtime events end-to-end, see machine downtime tracking.

Design the real-time run/idle/down board (what to show and how to group it)

Start with a simple, readable grid organized the way the shop is managed. In most job shops, that means by cell/department (mills, lathes, Swiss, grinding, EDM), or by a bottleneck family (the few machines that govern flow). The design goal is to help a lead spot exceptions fast—especially across multiple shifts when the owner or plant manager can’t “walk the whole floor” and instantly know what’s normal.

Recommended layout (practical, not fancy)

Each tile should show: machine name, current state (Run/Idle/Down) using a clear color, and time in state (duration). If you can include active job/operation, do it—but treat it as “optional but valuable” so the board doesn’t become dependent on perfect scheduling data.

Machine	State	Time in state	Job/Op (optional)
VMC-01	RUN	18–40 min	Job 24710 / Op 30
LATHE-03	IDLE	6–25 min	Job 24802 / Op 10
VMC-06	DOWN	12–90 min	Job 24698 / Op 20
SWISS-02	RUN	3–15 min	Job 24755 / Op 10

At the top of the screen, show simple counts: number running, idle, and down, plus a “% currently producing” indicator (a quick ratio of running machines to total monitored). This is not the place for 12 metrics. If the display can’t be read from across the cell or in a quick glance during a shift walk, it’s too busy.
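As a rough illustration, the header counts and the “% currently producing” ratio can be derived from nothing more than the current state of each machine. The sketch below is a minimal example; the `board` dictionary and the `summarize` function are hypothetical names, and the machine states simply mirror the sample table above.

```python
from collections import Counter

# Hypothetical snapshot of the board: machine name -> current state.
board = {
    "VMC-01": "RUN",
    "LATHE-03": "IDLE",
    "VMC-06": "DOWN",
    "SWISS-02": "RUN",
}

def summarize(states):
    """Return counts per state and the share of monitored machines running."""
    counts = Counter(states.values())
    total = len(states)
    # Ratio of running machines to total monitored machines.
    pct_producing = 100.0 * counts["RUN"] / total if total else 0.0
    return {
        "run": counts["RUN"],
        "idle": counts["IDLE"],
        "down": counts["DOWN"],
        "pct_producing": round(pct_producing, 1),
    }

summary = summarize(board)
print(summary)
```

With the four sample machines, two are running, so the indicator reads 50%. Keeping the header to these four numbers is what makes it readable at a glance.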

If you’re evaluating tooling for this layer, focus on whether it supports mixed fleets and simple shop-floor adoption rather than executive BI polish. A deeper overview of what to expect from machine monitoring systems can help you frame scope correctly.

Make statuses trustworthy: operational definitions, thresholds, and edge cases

Dashboards fail when “run,” “idle,” and “down” are implied instead of defined. In a job shop, a machine can be non-cutting for legitimate reasons (setup, first-article, probing) or avoidable reasons (waiting on material, waiting on approval). If you don’t document rules, you’ll get misleading downtime totals and a Pareto chart that changes based on who’s working the shift.

Define run / idle / down in shop terms

Keep the definitions operational: “Run” means the machine is in-cycle (or otherwise producing by your chosen signal). “Idle” means available but not producing (brief gaps, waiting, or transitions). “Down” means a downtime event that requires attention or explanation. The exact signals vary by control and equipment, but the important part is consistency across your mixed fleet.

Set an idle threshold, and be explicit about the tradeoff

Most shops need a rule like: “If the machine is not running for more than X minutes, create a downtime event.” A shorter threshold captures micro-stops but can inflate event counts and burden operators with frequent classification. A longer threshold reduces noise but can hide repeated small losses that add up. Pick a starting range (for example, 2–10 minutes depending on cycle times), then adjust based on whether the dashboard drives better decisions or just more debating.
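To make the threshold rule concrete, here is a minimal sketch that turns a list of state intervals into downtime events. The `Interval` record, the `"NOT_RUN"` label, and the 5-minute threshold are assumptions for illustration only; your monitoring system will have its own event format.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

@dataclass
class Interval:
    machine: str
    state: str        # "RUN" or "NOT_RUN" (hypothetical labels)
    start: datetime
    end: datetime

def downtime_events(intervals, threshold_minutes):
    """Keep only non-running intervals that last longer than the threshold."""
    threshold = timedelta(minutes=threshold_minutes)
    return [
        iv for iv in intervals
        if iv.state != "RUN" and (iv.end - iv.start) > threshold
    ]

t0 = datetime(2024, 1, 8, 6, 0)
intervals = [
    Interval("VMC-01", "RUN", t0, t0 + timedelta(minutes=20)),
    # 3-minute gap: below a 5-minute threshold, so no event is created.
    Interval("VMC-01", "NOT_RUN", t0 + timedelta(minutes=20), t0 + timedelta(minutes=23)),
    Interval("VMC-01", "RUN", t0 + timedelta(minutes=23), t0 + timedelta(minutes=50)),
    # 15-minute gap: becomes a downtime event that needs a reason code.
    Interval("VMC-01", "NOT_RUN", t0 + timedelta(minutes=50), t0 + timedelta(minutes=65)),
]

events = downtime_events(intervals, threshold_minutes=5)
print(len(events))
```

Lowering `threshold_minutes` to 2 would also capture the 3-minute gap, which is exactly the tradeoff between catching micro-stops and burdening operators with classification.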

Edge cases you must decide up front

Warm-up and daily checks: classify as planned non-production (and make it visible so it’s not “mysterious”).
Probing and in-process gauging: decide whether it counts as run (producing) or idle (non-cutting) for your operation.
Door-open time during setups: don’t let it fall into “unknown”; give it a clear reason-code path.
First-article and prove-out: treat as expected in high-mix environments, but still track the time so approvals don’t become invisible.

Multi-shift boundary rules (avoid double-counting)

A common multi-shift confusion looks like this: second shift reports “the machine was down all night,” but first shift walks in and sees parts produced. What’s usually happening is long idle gaps, short micro-stops, or waiting states that were labeled as “down” in conversation. A shift-aware dashboard helps: events should roll over cleanly at shift change, preserving one continuous event with a clear start time, while reporting allocates minutes to the correct shift window. That keeps you from arguing about whose shift “owned” the stoppage.

In that scenario, the Pareto often shows the top loss as “waiting for material” (or “waiting for traveler/program”) rather than “maintenance.” The point is not to defend a shift—it’s to stop misclassifying idle time as equipment failure and to recover capacity without jumping to capital spend.
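The shift-boundary rule described above (one continuous event, minutes allocated to the correct shift window) can be sketched as a simple overlap calculation. The shift times and the `allocate_minutes` function are assumptions for illustration, not a description of any particular product.

```python
from datetime import datetime

def allocate_minutes(event_start, event_end, shift_windows):
    """Split one continuous event's minutes across shift windows by overlap."""
    allocation = {}
    for name, (win_start, win_end) in shift_windows.items():
        overlap_start = max(event_start, win_start)
        overlap_end = min(event_end, win_end)
        if overlap_end > overlap_start:
            allocation[name] = (overlap_end - overlap_start).total_seconds() / 60
    return allocation

# Hypothetical shift windows for one day.
shifts = {
    "1st": (datetime(2024, 1, 8, 6, 0), datetime(2024, 1, 8, 14, 0)),
    "2nd": (datetime(2024, 1, 8, 14, 0), datetime(2024, 1, 8, 22, 0)),
}

# One stoppage that starts on 1st shift and ends on 2nd shift.
allocation = allocate_minutes(
    datetime(2024, 1, 8, 13, 40), datetime(2024, 1, 8, 14, 25), shifts
)
print(allocation)  # {'1st': 20.0, '2nd': 25.0}
```

The event keeps its single start time for triage, while reporting charges 20 minutes to first shift and 25 to second, so the 45 minutes are never counted twice.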

Finally, set operator input rules. Decide when a reason code is required (typically when an event crosses the idle threshold or when a state changes to Down), and decide what happens when it’s missing. If “no reason selected” is allowed to accumulate, your dashboard becomes a chart of uncertainty.

Build a downtime reason-code Pareto that actually drives action

A downtime Pareto should do one job: prioritize the few loss reasons that deserve attention next. Plot downtime minutes (or event counts) by reason code, sorted from largest to smallest, and include cumulative percentage so it’s clear where the “big rocks” are. But the Pareto only works if the reason codes are designed for quick entry and consistent meaning.

Use a 2-level hierarchy: category + specific

A two-level structure keeps operators fast while giving managers analysis resolution. For example: Category = “Setup,” Specific = “First-article,” “Program prove-out,” or “Waiting for setup approval.” Category = “Material,” Specific = “Material not staged” or “Wrong material.” This avoids the trap of either (a) too generic (“Setup” explains nothing) or (b) too granular (40 codes no one uses correctly).
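One way to picture the two-level hierarchy is a small lookup of categories to specifics, with a check that rejects combinations outside the list. This is only a sketch: `REASON_CODES`, `validate`, and `rollup_by_category` are hypothetical names, and the codes are the examples from the paragraph above.

```python
# Category -> allowed specific reasons (examples taken from the text above).
REASON_CODES = {
    "Setup": ["First-article", "Program prove-out", "Waiting for setup approval"],
    "Material": ["Material not staged", "Wrong material"],
}

def validate(category, specific):
    """Return a 'Category: Specific' label, or raise if the pair is not allowed."""
    if specific not in REASON_CODES.get(category, []):
        raise ValueError(f"Unknown reason code: {category} / {specific}")
    return f"{category}: {specific}"

def rollup_by_category(events):
    """Sum minutes per category from (category, specific, minutes) tuples."""
    totals = {}
    for category, _specific, minutes in events:
        totals[category] = totals.get(category, 0) + minutes
    return totals

# Illustrative events, not measured data.
events = [
    ("Setup", "First-article", 25),
    ("Setup", "Waiting for setup approval", 40),
    ("Material", "Material not staged", 30),
]
labels = [validate(c, s) for c, s, _ in events]
category_totals = rollup_by_category(events)
print(labels[1], category_totals)
```

Operators pick from a short list of specifics, while managers can still roll the same events up to the category level when they want the broader view.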

Reason code (example)	Downtime minutes (example)	Cumulative % (example)
Material: waiting for material	220–320	28–35%
Setup: waiting for setup approval	140–220	45–55%
Programming: waiting for program	90–160	58–70%
Tooling: tool not available	60–120	68–80%
Quality: inspection hold	40–90	75–88%
Maintenance: machine fault	30–80	82–95%
Unknown / no reason selected	20–70	90–100%
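For reference, the cumulative percentage column is simple arithmetic: sort reasons by minutes from largest to smallest, keep a running total, and divide by the overall total. The sketch below assumes a hypothetical `pareto` function and round illustrative minutes, not measured data.

```python
def pareto(minutes_by_reason):
    """Sort reasons by minutes (descending) and attach cumulative percent."""
    total = sum(minutes_by_reason.values())
    if total == 0:
        return []
    rows, running = [], 0
    ordered = sorted(minutes_by_reason.items(), key=lambda kv: kv[1], reverse=True)
    for reason, minutes in ordered:
        running += minutes
        rows.append((reason, minutes, round(100 * running / total, 1)))
    return rows

# Illustrative values only.
sample = {
    "Programming: waiting for program": 100,
    "Material: waiting for material": 250,
    "Setup: waiting for setup approval": 150,
}
rows = pareto(sample)
for row in rows:
    print(row)
```

In this sample the first two reasons account for 80% of the downtime minutes, which is the kind of signal that tells a supervisor where to assign an owner first.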

Start small—often 6–12 codes is enough for the first pass. You can evolve the list once the team proves it will enter reasons consistently. If “Unknown/No reason selected” shows up as a meaningful bar, treat it as a workflow defect: review it daily, coach the standard, and adjust the list if the right code doesn’t exist.

Filters matter. The same “top reason” can flip by shift, by cell, or by machine family. Make it easy to switch between: today vs last 7 days, 1st vs 2nd shift, and bottleneck cell vs whole shop. If the tallest bar shows up on only one shift, you’ve found a management and process problem, not a machine problem.

When you connect this Pareto to utilization, keep the focus on recoverable time loss and operational control rather than scorecards. For context on utilization reporting tools, see machine utilization tracking software.

Connect live status to the Pareto: the ‘downtime loop’ for each shift

The dashboard becomes valuable when it drives a repeatable loop: respond fast, classify consistently, then review and assign. That’s how you reclaim capacity before considering more machines, overtime, or a bigger building.

Real-time response (first 5–10 minutes)

When a machine flips to Down, the lead’s job in the first 5–10 minutes is triage: confirm whether it’s a true stoppage versus a short transition, capture the correct reason code, and decide whether to fix now or escalate (maintenance, programming, material, quality). Without the live board, this often turns into “we’ll check it later,” which quietly compounds into lost hours across a week.

Daily/shift review (Pareto-driven accountability)

In a 10–15 minute shift huddle, pull up the downtime Pareto for that shift (or the last 24 hours). Pick the top 1–2 reasons and assign an owner plus a next check time. The intent is not to “review everything.” It’s to stop recurring losses from blending into background noise.

This is where three common scenarios show up in real operations:

High-mix CNC job shop gray area: frequent setups create ambiguous time buckets. A clear set of codes like “setup/first-article,” “program prove-out,” and “tooling” prevents arguing. In many shops, the Pareto reveals the biggest loss isn’t “machine fault,” it’s “waiting for setup approval” (a people/process queue).
Dispatch/priority changes: a machine shows Idle while jobs are queued. If you distinguish “idle-with-work-available” from “idle-no-work,” you expose whether the constraint is scheduling/dispatch rules, programming readiness, or material staging—not “operator performance.”
Multi-shift confusion: when the narrative is “down all night” but parts exist, the combination of live states and the shift-filtered Pareto separates long idle gaps and micro-stops from true Down time, often pointing to “waiting for material” rather than maintenance.
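The dispatch scenario above depends on splitting Idle into two buckets. A minimal sketch, assuming you can get a count of queued jobs per machine from your scheduling data (the `queued_jobs` input and the bucket labels are hypothetical):

```python
def classify_idle(state, queued_jobs):
    """Split IDLE into 'with work available' vs 'no work'; pass other states through."""
    if state != "IDLE":
        return state
    return "IDLE_WITH_WORK" if queued_jobs > 0 else "IDLE_NO_WORK"

# Hypothetical snapshot: (machine, state, jobs queued at that machine).
snapshot = [
    ("LATHE-03", "IDLE", 2),
    ("VMC-06", "DOWN", 1),
    ("SWISS-02", "IDLE", 0),
]
buckets = {machine: classify_idle(state, queued) for machine, state, queued in snapshot}
print(buckets)
```

A machine that sits in `IDLE_WITH_WORK` points toward dispatch, programming readiness, or material staging, while `IDLE_NO_WORK` points toward scheduling or load.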

Use short annotations to prevent tomorrow’s guesswork

Reason codes tell you the bucket; notes tell you what actually happened. Encourage brief annotations on the top events (one sentence is enough): “waiting for 4140 bar stock from saw,” “QA hold on first-article,” “program revised for chatter.” This makes the next shift handoff factual instead of interpretive.

If you need help turning raw events and notes into a consistent narrative for leads and supervisors, an interpretation layer can reduce time spent digging. The AI Production Assistant is one example of how shops can summarize patterns without turning the dashboard into an “AI feature list.”

A “classification” change example (why governance matters)

A simple rule change can completely alter what your Pareto says. Suppose a shop initially treats any non-running time over 2 minutes as a downtime event. On machines with short cycles, that captures frequent brief waits (door open, part swaps, chip clearing) and can push “operator waiting” to the top. If the shop later moves the threshold to 8–10 minutes, many of those micro-stops drop out of “downtime,” and the Pareto might shift toward fewer but longer issues like “waiting for material” or “programming.” Neither is “right” universally—the point is that definitions must be stable and documented so trends reflect operations, not moving measurement rules.
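The effect is easy to reproduce with a handful of made-up gaps. In this sketch the `gaps` list and the `top_reason` function are hypothetical; the durations are chosen only to show how the top bar can flip when the threshold moves from 2 to 8 minutes.

```python
def top_reason(gaps, threshold_minutes):
    """Return the reason with the most minutes among gaps longer than the threshold."""
    totals = {}
    for reason, minutes in gaps:
        if minutes > threshold_minutes:
            totals[reason] = totals.get(reason, 0) + minutes
    return max(totals, key=totals.get) if totals else None

# Illustrative data: ten 3-minute waits plus two longer material waits.
gaps = [("Operator waiting", 3)] * 10 + [
    ("Waiting for material", 12),
    ("Waiting for material", 14),
]

print(top_reason(gaps, threshold_minutes=2))  # Operator waiting
print(top_reason(gaps, threshold_minutes=8))  # Waiting for material
```

Same floor, same day, different “top loss.” That is why the threshold belongs in a documented definition rather than in someone’s head.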

Implementation reality in a 10–50 machine shop: rollout steps and adoption traps

Manual methods (clipboard notes, end-of-shift spreadsheets, ERP labor tickets) can work when the owner can physically see every pacer machine. They break when the shop grows, adds shifts, or runs a mixed fleet where “what happened” becomes a debate. Automation is the scalable evolution—not because it’s trendy, but because it captures state changes consistently and makes the same truth available to operators, leads, and managers at the moment decisions are made.

Rollout steps that work

Pilot a cell first: pick 5–10 machines where flow matters, run on one shift, and validate the definitions and the reason-code list.
Train classification as standard work: a 10-minute “what code to pick when” guide beats a long policy no one reads.
Place the screen where decisions happen: near the cell lead desk, not in an office or a conference room.

Adoption traps to plan for

Trap 1: too many reason codes. If you dump a 40-code list into the shop, people will pick whatever is closest or default to “Other.” Start with a short list and add codes only when a recurring “Other” has a clear owner and fix path.

Trap 2: no one owns reason-code hygiene. Assign a daily reviewer (lead or supervisor) to check: are reasons missing, are codes being misused, and are “Unknown” minutes creeping up? This is lightweight governance, but it’s what keeps your Pareto actionable.
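A daily hygiene check can be as simple as the share of downtime minutes with no reason, by shift. The sketch below assumes events arrive as `(shift, reason, minutes)` tuples and that missing reasons are stored under an `"Unknown"` label; both are assumptions for illustration.

```python
def unknown_share(events, unknown_label="Unknown"):
    """Percent of downtime minutes per shift that have no reason code."""
    totals, unknown = {}, {}
    for shift, reason, minutes in events:
        totals[shift] = totals.get(shift, 0) + minutes
        if reason == unknown_label:
            unknown[shift] = unknown.get(shift, 0) + minutes
    return {
        shift: round(100 * unknown.get(shift, 0) / total, 1)
        for shift, total in totals.items()
        if total
    }

# Illustrative events, not measured data.
events = [
    ("1st", "Material: waiting for material", 80),
    ("1st", "Unknown", 20),
    ("2nd", "Setup: waiting for setup approval", 30),
    ("2nd", "Unknown", 30),
]
shares = unknown_share(events)
print(shares)  # {'1st': 20.0, '2nd': 50.0}
```

If one shift’s share keeps climbing, that is the cue for the reviewer to coach the standard or check whether the right code is missing from the list.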

Trap 3: the dashboard becomes passive. If the board isn’t tied to a shift-level routine, it turns into wall art. Build the expectation: “We look at it every shift; we pick one fix; we check it again.” That’s how utilization leakage becomes manageable.

Cost framing (without guessing your numbers)

When you’re evaluating a dashboard approach, cost is less about a line item and more about friction: how quickly you can connect a mixed fleet, how much operator input is required, and how much time supervisors spend cleaning data. Before buying another machine to “solve capacity,” it’s worth making hidden time loss visible and repeatable to fix. If you want to understand packaging and what typically drives cost, review pricing with the lens of rollout scope (pilot cell vs full shop, number of machines, and how you’ll manage reason-code discipline).

If you’re solution-aware and deciding whether your shop needs a dashboard that’s built for real-time run/idle/down visibility plus a reason-code Pareto (instead of ERP spreadsheets or a generic BI board), the fastest way to gain clarity is to walk through your machines, shifts, and definitions against a real layout. You can schedule a demo to map your pilot cell, idle threshold, and starter reason-code list to a dashboard your leads can actually run each shift.