| Revision History | ||
|---|---|---|
| Rev | Date | Description |
| 1.0 | 2026-04-15 | Initial release |
The system is a single-loop thermal controller for an insulated oven, running on an ESP32 microcontroller. A MAX6675 thermocouple provides temperature feedback. A triac with phase-angle control modulates AC power to the heater. The controller combines model-predictive features with classical PI control, tuned via a built-in auto-tuner that identifies plant dynamics and applies SIMC rules.
flowchart LR
subgraph HW["Hardware Layer"]
TC["MAX6675\nThermocouple"]
ZCD["Zero-Cross\nDetector"]
TRIAC["Triac Gate\nDriver"]
HEATER["Heating\nElement"]
end
subgraph SW["Software Layer (ESP32)"]
FILT["Median + EMA\nFilter"]
MPC["MPC Predictor\n& Cooling Model"]
PID["PI Controller\nw/ Conditional\nIntegral"]
LUT["True-RMS\nPower LUT"]
AT["Auto-Tuner\n(SIMC)"]
end
TC -- raw °C --> FILT
FILT -- filtered °C --> MPC
MPC -- predicted °C --> PID
PID -- power 0–1 --> LUT
LUT -- delay µs --> ZCD
ZCD -- trigger --> TRIAC
TRIAC -- AC power --> HEATER
HEATER -. thermal .-> TC
AT -. "Kc, Ti" .-> PID
style HW fill:#e8f4fd,stroke:#4a90d9
style SW fill:#f0f1fa,stroke:#6c63ff
The loop() function uses non-blocking timing to run three concurrent tasks at different rates, plus an event-driven serial command handler. No RTOS is used — all scheduling is cooperative via millis() comparisons.
flowchart TD
LOOP["loop() entry"] --> T1{"250 ms\nelapsed?"}
T1 -- Yes --> READ["Read thermocouple\nApply median + EMA filter"]
READ --> BRANCH{"tune_state > 0 ?"}
BRANCH -- Yes --> TUNE["runAutoTuner()"]
BRANCH -- No --> CPID["computePID()"]
TUNE --> T2
CPID --> T2
T1 -- No --> T2{"1000 ms\nelapsed?"}
T2 -- Yes --> SLOPE["Compute slope\nover 10 s window"]
SLOPE --> COOL["Update self-learning\ncooling rate"]
COOL --> HIST["Record power_history\nand temp_history"]
HIST --> T3
T2 -- No --> T3{"500 ms\nelapsed?"}
T3 -- Yes --> TEL["printTelemetry()"]
T3 -- No --> T4
TEL --> T4{"Serial\navailable?"}
T4 -- Yes --> CMD["handleCommand()"]
T4 -- No --> LOOP
CMD --> LOOP
style READ fill:#d4edda,stroke:#28a745
style TUNE fill:#fff3cd,stroke:#ffc107
style CPID fill:#d1ecf1,stroke:#17a2b8
style SLOPE fill:#e2d5f1,stroke:#7c4dff
| Task | Period | Purpose |
|---|---|---|
| Control Loop | 250 ms | Read sensor, run filter, execute PID or auto-tuner |
| MPC Ledger | 1000 ms | Compute temperature slope, update cooling model, record power history |
| Telemetry | 500 ms | Print system state to serial |
| Serial Parser | Event-driven | Handle SET, TUNE, LAMBDA, PLANT commands |
Raw thermocouple readings are noisy. The system applies a two-stage filter: first a 7-sample median filter to reject spike outliers, then an exponential moving average (EMA) to smooth the result. This produces a clean current_temp value used by all downstream logic.
flowchart LR
RAW["Raw reading\nfrom MAX6675"] --> NAN{"isnan()?"}
NAN -- Yes --> HALT["pid_output = 0\nSENSOR ERROR"]
NAN -- No --> BUF["Insert into\n7-sample\nring buffer"]
BUF --> FULL{"Buffer\nfilled?"}
FULL -- No --> PASS["current_temp\n= raw_temp\n(passthrough)"]
FULL -- Yes --> MED["Median of\n7 samples"]
MED --> EMA["EMA\nα = 0.1"]
EMA --> OUT["current_temp"]
style HALT fill:#f8d7da,stroke:#dc3545
style OUT fill:#d4edda,stroke:#28a745
style MED fill:#e2d5f1,stroke:#7c4dff
style EMA fill:#d1ecf1,stroke:#17a2b8
The median filter uses a simple bubble sort on a copy of the 7-element window, returning the middle value. This is effective at rejecting single-sample spikes common with thermocouple interfaces.
Power is modulated via phase-angle control: the triac fires at a variable delay after each AC zero-crossing. A pre-computed lookup table maps desired power (0–100%) to a firing delay in microseconds. The mapping is based on the true-RMS relationship between firing angle and delivered power, not a simple linear approximation.
flowchart LR
PID_OUT["pid_output\n(0.0 – 1.0)"] --> IDX["Map to\nindex 0–100"]
IDX --> LUT["power_to_delay_lut\n(pre-computed\nat boot)"]
LUT --> DELAY["delay_µs =\n230 + LUT value"]
ZC["AC Zero-Cross\nInterrupt\n(FALLING)"] --> CHECK{"pid_output\n> 0.02?"}
CHECK -- No --> SKIP["No fire\nthis half-cycle"]
CHECK -- Yes --> TIMER["Start hw_timer\nfor delay_µs"]
TIMER --> FIRE["Triac ISR:\nGPIO HIGH 150µs\nthen LOW"]
style ZC fill:#fff3cd,stroke:#ffc107
style FIRE fill:#f8d7da,stroke:#dc3545
style LUT fill:#e2d5f1,stroke:#7c4dff
The 230 µs offset accounts for zero-cross detector latency. The 150 µs gate pulse is sufficient to latch the triac for the remainder of the half-cycle.
The auto-tuner identifies the plant as an integrating process (no self-regulation — the heater has no natural equilibrium). It applies a step test and measures the dead time and maximum heating slope to build a first-order-plus-dead-time (FOPDT) integrating model. The procedure runs in three states.
stateDiagram-v2
[*] --> Idle: System boots\nor tune completes
Idle --> State1_Wait: User sends\n"TUNE" command
State1_Wait: State 1 — Wait for Steady State
State1_Wait --> State1_Wait: |ΔT| > 0.15°C → reset timer
State1_Wait --> State2_DeadTime: 60 s stable\n(ΔT ≤ 0.15°C)
State2_DeadTime: State 2 — Find Dead Time (θ)
State2_DeadTime --> State3_Slope: T rises ≥ 0.5°C\nabove start
State3_Slope: State 3 — Measure Max Slope
State3_Slope --> Idle: After 60 s observation\n→ Compute K', θ → SIMC
note right of State1_Wait
pid_output = 0
Heater OFF
Recording baseline
end note
note right of State2_DeadTime
pid_output = tune_step_power (30%)
Step applied, waiting for
first measurable response
end note
note right of State3_Slope
Step power continues
ΔT over 60 s → S_max
K' = S_max / step_power
end note
| State | Entry Condition | Action | Exit Condition |
|---|---|---|---|
| 1 — Stabilize | "TUNE" command received | Heater OFF. Monitor temperature drift. Reset 60 s timer any time |ΔT| > 0.15°C | 60 s pass with temp within ±0.15°C |
| 2 — Dead Time | Steady state confirmed | Apply step power (30%). Record start temp and time. Watch for first response. | Temperature rises 0.5°C above start |
| 3 — Slope | Dead time measured | Continue step power. Measure total ΔT over next 60 s. Compute Smax = ΔT / 60 | 60 s elapsed → compute plant model → SIMC → return to Idle |
Once the plant model is identified, the Skogestad IMC (SIMC) rules compute PI controller gains. The system supports both integrating-process and FOPDT models, though the auto-tuner always identifies an integrating process (τ = 0). A user-adjustable λ (lambda) parameter sets the desired closed-loop speed.
flowchart TD
PLANT["Plant Parameters\nK', θ, τ"] --> LAMBDA["Compute actual λ\nλ_actual = θ × (λ_setting × 0.33)"]
LAMBDA --> CHECK{"τ == 0 ?"}
CHECK -- "Yes (Integrating)" --> INT_KC["Kc = 1 / (K' × (λ_actual + θ))"]
INT_KC --> INT_TI["Ti = 4 × (λ_actual + θ)"]
CHECK -- "No (FOPDT)" --> FOPDT_KC["Kc = (1/K) × (τ / (λ_actual + θ))"]
FOPDT_KC --> FOPDT_TI["Ti = min(τ, 4×(λ_actual + θ))"]
INT_TI --> SAVE["Save Kc, Ti, K', θ, τ\nto NVS (Preferences)"]
FOPDT_TI --> SAVE
style INT_KC fill:#d4edda,stroke:#28a745
style INT_TI fill:#d4edda,stroke:#28a745
style FOPDT_KC fill:#d1ecf1,stroke:#17a2b8
style FOPDT_TI fill:#d1ecf1,stroke:#17a2b8
style SAVE fill:#fff3cd,stroke:#ffc107
The controller uses a model-predictive approach to estimate where the temperature will be after the dead time θ, accounting for power already injected but not yet visible in the sensor reading. This predicted temperature drives the proportional term and (conditionally) the integral term.
flowchart TD
subgraph Inputs
CT["current_temp"]
CS["current_slope\n(°C/s over 10 s)"]
PH["power_history[]\n(300 s buffer)"]
PP["plant_K, plant_theta"]
NCR["natural_cooling_rate\n(self-learned)"]
end
subgraph Disturbance["Disturbance Filter"]
CS --> NEGCHECK{"slope < 0 ?"}
NEGCHECK -- No --> PASS_S["momentum_slope\n= current_slope"]
NEGCHECK -- Yes --> NATCHECK{"|slope| ≤ 1.1 ×\ncooling_rate?"}
NATCHECK -- "Yes (natural)" --> ZERO["momentum_slope = 0\n(ignore — it's expected)"]
NATCHECK -- "No (crash!)" --> EXCESS["momentum_slope =\nslope + cooling_rate\n(keep only the excess)"]
end
subgraph Prediction["Forward Projection"]
PH --> UOLD["u_old = power\nθ seconds ago"]
UOLD --> PSUM["power_sum = Σ of\n(power[i] − u_old)\nover θ-second window"]
end
CT --> PRED
PASS_S --> PRED
ZERO --> PRED
EXCESS --> PRED
PSUM --> PRED
PRED["predicted_temp =\ncurrent_temp\n+ momentum_slope × θ\n+ K' × power_sum"]
PRED --> PID_USE["→ Used by\nProportional Term\n& Conditional Integral"]
style PRED fill:#d4edda,stroke:#28a745
style ZERO fill:#d1ecf1,stroke:#17a2b8
style EXCESS fill:#f8d7da,stroke:#dc3545
style PID_USE fill:#fff3cd,stroke:#ffc107
The power_sum term captures energy that has been injected into the system during the dead time window but has not yet reached the sensor. This forward-looking estimate prevents the controller from over-driving during ramp-up, which is the primary cause of overshoot in pure-PI systems controlling thermal processes with significant transport delay.
The system continuously learns the natural cooling rate of the thermal mass. This is critical for the disturbance rejection logic: the controller needs to distinguish between normal heat loss and genuine disturbances like a door opening or cold fluid injection. When the drop is natural, the momentum projection is zeroed — the proportional term does not react. Instead, the steady correction for normal heat loss is handled entirely by the conditional integral operating in REAL mode. When the drop is anomalous, the excess beyond the learned rate is projected forward, and the proportional term engages.
flowchart TD
CHECK1{"pid_output\n== 0 ?"}
CHECK1 -- No --> SKIP["Don't update\n(heater is on)"]
CHECK1 -- Yes --> CHECK2{"current_slope\n< 0 ?"}
CHECK2 -- No --> SKIP2["Don't update\n(temp not dropping)"]
CHECK2 -- Yes --> CHECK3{"|slope| < K' × 0.5 ?"}
CHECK3 -- No --> SKIP3["Don't update\n(impossibly fast —\nsensor glitch?)"]
CHECK3 -- Yes --> LEARN["Update cooling rate:\nrate = 0.1 × |slope|\n+ 0.9 × rate_prev"]
LEARN --> USE["Used by MPC\nto classify\ndropping temp as\nnatural vs anomalous"]
style LEARN fill:#d4edda,stroke:#28a745
style USE fill:#d1ecf1,stroke:#17a2b8
style SKIP fill:#f0f0f0,stroke:#999
style SKIP2 fill:#f0f0f0,stroke:#999
style SKIP3 fill:#f8d7da,stroke:#dc3545
The 0.5 × K' guard rejects sensor errors or sudden physical disturbances from corrupting the learned rate. The slow EMA (α = 0.1) means the rate adapts over minutes, not seconds, giving a stable baseline.
Before feeding the temperature slope into the predictor, the system classifies any negative slope as either natural cooling or a genuine disturbance. When classified as natural, the momentum slope is zeroed so the proportional term does not project a crash into the future — the gradual heat loss correction is left to the conditional integral in REAL mode, which is the appropriate mechanism for slow, expected drift. When classified as anomalous, only the excess beyond the learned natural rate is projected forward, engaging the proportional term proportionally to the severity of the disturbance.
flowchart TD
S["current_slope"] --> NEG{"slope < 0 ?"}
NEG -- No --> POS["momentum_slope\n= current_slope\n(positive or zero —\nno filtering needed)"]
NEG -- Yes --> MAG{"|slope| ≤ 1.1 ×\nnatural_cooling_rate ?"}
MAG -- Yes --> NAT["momentum_slope = 0\n\nVerdict: NATURAL COOLING\nP-term does not react.\nCorrection left to the\nconditional integral (REAL mode)."]
MAG -- No --> CRASH["momentum_slope =\nslope + cooling_rate\n\nVerdict: ANOMALOUS DROP\nOnly the excess beyond\nnatural cooling is projected\ninto the future."]
POS --> PRED["→ predicted_temp"]
NAT --> PRED
CRASH --> PRED
style NAT fill:#d4edda,stroke:#28a745
style CRASH fill:#f8d7da,stroke:#dc3545
style POS fill:#d1ecf1,stroke:#17a2b8
This is the most nuanced part of the controller. The integral term's error source switches between the real temperature and the predicted temperature depending on the system's operating mode. This prevents integral windup during ramp-up while still eliminating steady-state error once near setpoint.
flowchart TD
START["Evaluate conditions"] --> C1{"|slope| ≤ flat_threshold\nAND error ≤ 10% SP\nAND error > 1.5% SP ?"}
C1 -- Yes --> MODE_A1["MODE A: Near Steady-State\ni_error = SP − T_real\nflatFlag = REAL\n\nSlope is flat, close to target.\nUse real temp to kill\nsteady-state offset."]
C1 -- No --> C2{"error ≤ 1.5% SP\nAND T < SP ?"}
C2 -- Yes --> MODE_A2["MODE A: Very Close\ni_error = SP − T_real\nflatFlag = REAL\n\nWithin 1.5% of target.\nAlways use real temp\nfor fine correction."]
C2 -- No --> C3{"T ≥ SP AND\nT_predicted ≥ 90% SP ?"}
C3 -- Yes --> MODE_A3["MODE A: Above Setpoint\ni_error = SP − T_real\nflatFlag = REAL\n\nTemp at or above target\nand predictor agrees.\nUse real temp."]
C3 -- No --> C4{"T ≥ SP AND\nT_predicted < 90% SP ?"}
C4 -- Yes --> MODE_B1["MODE B: Overshoot Warning\ni_error = SP − T_predicted\nflatFlag = PRED\n\nTemp is above SP but\npredictor sees a crash\ncoming. Use predicted\nto prepare."]
C4 -- No --> MODE_B2["MODE B: Active Ramp\ni_error = SP − T_predicted\nflatFlag = PRED\n\nActively heating up.\nUse predicted temp to\nprevent integral preloading."]
MODE_A1 --> INT["error_integral +=\ni_error × dt"]
MODE_A2 --> INT
MODE_A3 --> INT
MODE_B1 --> INT
MODE_B2 --> INT
INT --> CLAMP["Clamp integral:\nmax = Ti / Kc\nmin = 0"]
style MODE_A1 fill:#d4edda,stroke:#28a745
style MODE_A2 fill:#d4edda,stroke:#28a745
style MODE_A3 fill:#d4edda,stroke:#28a745
style MODE_B1 fill:#fff3cd,stroke:#ffc107
style MODE_B2 fill:#d1ecf1,stroke:#17a2b8
During active ramp-up (Mode B), the real temperature lags far behind the setpoint, which would cause massive integral accumulation if used directly. By feeding the predicted temperature error into the integral, the system accounts for energy already in the pipeline. Once near steady-state (Mode A), the real temperature is used so the integral can eliminate the last fraction of a degree of offset — something the predictor alone cannot resolve due to model inaccuracies.
flatThresholdFlag shown in telemetry as [REAL] or [PRED] indicates which mode is active, giving the operator real-time visibility into the integral's behavior.
The complete control computation combines the MPC prediction, proportional action, conditional integral, output clamping, and anti-windup. There is no derivative term — the predictive model serves that role.
flowchart TD
GUARD{"plant_K == 0 ?"}
GUARD -- Yes --> OFF["pid_output = 0\n(no plant model yet)"]
GUARD -- No --> PREDICT["Compute predicted_temp\n(see §7 MPC Predictor)"]
PREDICT --> PTERM["P-term = Kc × (SP − T_predicted)"]
PREDICT --> ISELECT["Select i_error source\n(see §10 Conditional Integral)"]
ISELECT --> ITERM["I-term = Kc × (1/Ti) × ∫ i_error · dt"]
PTERM --> SUM["u = P-term + I-term"]
ITERM --> SUM
SUM --> UCHECK{"u ≥ 1.0 ?"}
UCHECK -- Yes --> SAT_HI["u = 1.0\nAnti-windup:\nundo last integral\nstep if i_error > 0"]
UCHECK -- No --> LCHECK{"u ≤ 0.0 ?"}
LCHECK -- Yes --> SAT_LO["u = 0.0"]
LCHECK -- No --> OK["u = u (no clamping)"]
SAT_HI --> FLOOR
SAT_LO --> FLOOR
OK --> FLOOR
FLOOR["Floor: integral ≥ 0\n(no negative windup)"]
FLOOR --> OUTPUT["pid_output = u"]
style PREDICT fill:#e2d5f1,stroke:#7c4dff
style PTERM fill:#d1ecf1,stroke:#17a2b8
style ITERM fill:#d4edda,stroke:#28a745
style SAT_HI fill:#f8d7da,stroke:#dc3545
style OUTPUT fill:#fff3cd,stroke:#ffc107
| Parameter | Value | Description |
|---|---|---|
| Control loop period | 250 ms | PID/filter execution rate (dt = 0.25 s) |
| MPC ledger period | 1000 ms | Slope and power history update rate |
| Telemetry period | 500 ms | Serial output rate |
| Median window | 7 samples | Covers 1.75 s of readings |
| Slope window | 10 samples | 10 s rolling window for °C/s computation |
| Power history buffer | 300 samples | 5 minutes of power at 1 s resolution |
| Parameter | Value | Description |
|---|---|---|
| EMA α (temperature) | 0.1 | Post-median smoothing factor (slow, heavy filtering) |
| EMA α (cooling rate) | 0.1 | Learning rate for natural cooling model |
| Cooling rate guard | K' × 0.5 | Max credible cooling slope |
| Disturbance margin | 1.1× | 10% tolerance above learned cooling rate |
| Parameter | Value | Description |
|---|---|---|
| Stabilization time | 60 s | Minimum quiet period before step |
| Stability threshold | ±0.15°C | Max drift during stabilization |
| Step power | 30% | Open-loop excitation level |
| Dead time trigger | +0.5°C | Temperature rise that marks end of dead time |
| Slope observation | 60 s | Duration of slope measurement after θ |
| Parameter | Range | Default | Description |
|---|---|---|---|
| Setpoint | 20–200°C | 50°C | Target temperature |
| Lambda (λ) | 1.0–5.0 | 3.0 | Closed-loop aggressiveness (lower = faster) |
| Kc | Auto-tuned | 1.0 | Proportional gain |
| Ti | Auto-tuned | 100.0 | Integral time (seconds) |
| Parameter | Value | Description |
|---|---|---|
| Triac gate pulse | 150 µs | Duration of gate drive pulse |
| ZCD offset | 230 µs | Compensation for zero-cross detector latency |
| Half-cycle period | ~9500 µs | Assumed 50 Hz mains (10 ms half-cycle minus margins) |
| Minimum output | 2% | Below this, triac is not fired (noise floor) |
| Key | Type | Description |
|---|---|---|
setpoint | Float | Target temperature |
lambda | Float | Closed-loop tuning aggressiveness |
K | Float | Plant integrating gain |
tau | Float | Plant time constant (0 for integrating) |
theta | Float | Plant dead time |
Kc | Float | Controller proportional gain |
Ti | Float | Controller integral time |