Motion Webhook, Service, and Pipeline

Scope

This document describes the motion-triggered automation chain of the Wine Cellar Management & Vision Automation System, covering the following scripts:

  • apps/winecellar/backend/app/routers/surveillance_hook.py
  • apps/winecellar/backend/app/services/surveillance_hook_service.py
  • workers/wine_inventory/src/pince_shelf/integration/motion_runtime.py

It complements the high-level architecture from index.md and focuses on the event-driven automated lane that begins when Synology Surveillance Station calls the FastAPI webhook and ends when the motion runtime finishes or times out.

The three scripts together now implement:

  1. Webhook ingress at the FastAPI router layer
  2. Validation, logging, quiet-window rescheduling, and process spawning in the backend service layer
  3. Debounce-until-quiet orchestration in the worker lane, including quiet waiting, light control, shelf visibility probing, and inventory pipeline execution

Position in the System

These scripts belong to the automated lane and bridge the FastAPI backend with the vision worker runtime.

flowchart LR
    subgraph AutomatedLane[Automated Lane]
        SS[Synology Surveillance Station]
        R[Router surveillance_hook.py]
        S[Service surveillance_hook_service.py]
        MR[motion_runtime.py]
        SR[stock_runtime.run]
        CAM[Tapo RTSP Camera]
        LIGHT[Cellar Light via Tapo]
        API2[FastAPI API]

        SS --> R
        R --> S
        S --> MR
        MR --> LIGHT
        MR --> CAM
        MR --> SR
        SR --> API2
    end

Key Behavioral Change

Old model

  • First motion event started a runtime.
  • Later motion events arriving inside cooldown were discarded.
  • The worker used a one-shot initial sleep and then proceeded.

New model

  • Every motion webhook is accepted if the caller IP is allowed.
  • Every accepted webhook updates the persisted motion state.
  • The state stores the last motion timestamp and the quiet-until deadline.
  • The backend ensures only one motion_runtime.py process is active.
  • If a runtime is already alive, the backend does not restart it; it only pushes the quiet deadline forward.
  • The worker loops until the cellar has been quiet for the full quiet period.
  • Only after the quiet window is reached does the vision workflow start.

This is a global cellar debounce design.


Configured Policy

The requested policy is:

  • quiet_period_seconds = 120
  • policy = keep-one-runtime and reschedule
  • scope = global cellar logic

Meaning:

  • all motion events are treated at cellar level
  • a new motion event does not spawn a second worker
  • a new motion event postpones the workflow deadline
  • vision starts only after 120 seconds of silence since the last motion

Runtime Storage and Log Locations

The motion chain uses the following concrete locations under the shared repository area:

flowchart TB
    SHARED[/home/pi/wine_platform/shared/]

    SHARED --> LOGS[/logs/]
    SHARED --> STATE[/state/]

    LOGS --> HOOKLOG[SS_HOOK_LOG_FILE webhook event log]
    LOGS --> MRLOG[motion_runtime.log runtime session log]

    STATE --> PID[motion_runtime.pid]
    STATE --> MOTIONSTATE[motion_last_trigger.json]
    STATE --> LOCK[motion_session.lock]

Files

  • SS_HOOK_LOG_FILE stores JSONL-style webhook/service decision logs.
  • shared/logs/motion_runtime.log stores runtime execution logs.
  • shared/state/motion_runtime.pid stores the current runtime PID.
  • shared/state/motion_last_trigger.json stores the latest motion timestamp and the quiet deadline.
  • shared/state/motion_session.lock prevents overlapping workers.

Motion State File

The state file is now deadline-oriented.

motion_last_trigger.json

{
  "last_motion_ts_epoch": 1773597161.2,
  "last_motion_ts_utc": "2026-03-15T17:52:41+00:00",
  "quiet_period_seconds": 120,
  "quiet_until_ts_epoch": 1773597281.2,
  "quiet_until_ts_utc": "2026-03-15T17:54:41+00:00",
  "remote_addr": "192.168.1.203",
  "payload": {
    "source": "synology_surveillance_station",
    "event": "motion"
  }
}

Meaning of the fields

  • last_motion_ts_epoch: last accepted motion time in epoch seconds
  • last_motion_ts_utc: same time in UTC ISO format
  • quiet_period_seconds: configured silence window before vision may start
  • quiet_until_ts_epoch: deadline after which vision may start if no new motion arrives
  • quiet_until_ts_utc: same deadline in UTC ISO format
  • remote_addr: caller address
  • payload: last request payload snapshot

End-to-End Sequence

sequenceDiagram
    autonumber
    participant SS as Synology Surveillance Station
    participant R as Router surveillance_hook.py
    participant S as Service surveillance_hook_service.py
    participant FS as shared logs and state
    participant MR as motion_runtime.py
    participant L as Tapo Light
    participant C as Camera and snapshot pipeline
    participant SR as stock_runtime

    SS->>R: POST /integration/surveillance/motion
    R->>R: Check feature flag and parse JSON
    R->>S: record_motion_event(...)
    S->>S: Check remote IP allow-list

    alt IP not allowed
        S->>FS: append rejected webhook log
        S-->>R: ip_allowed false
        R-->>SS: 403 Forbidden
    else accepted
        S->>FS: write motion_last_trigger.json
        S->>S: check current motion_runtime PID
        alt runtime already alive
            S->>FS: append reschedule log
            S-->>R: runtime_already_running true
            R-->>SS: 200 OK
        else spawn new runtime
            S->>MR: subprocess Popen python motion_runtime.py
            S->>FS: write PID and append spawn log
            S-->>R: runtime_started true
            R-->>SS: 200 OK
        end

        MR->>FS: append runtime log
        loop until now >= quiet_until
            MR->>FS: read motion_last_trigger.json
            alt new motion arrived
                MR->>MR: observe extended deadline and keep waiting
            else quiet window reached
                MR->>MR: continue to vision phase
            end
        end

        MR->>L: switch light on
        MR->>MR: wait stabilization delay
        loop until visible or timeout
            MR->>C: take_clean_snapshot.run(...)
        end
        alt shelf visible
            MR->>SR: stock_runtime.run()
        else timeout
            MR->>FS: timeout log
        end
        MR->>L: switch light off
        MR->>FS: remove motion_runtime.pid
    end

Architectural Decomposition

flowchart TB
    subgraph BackendAPI[Backend API Process]
        ROUTER[surveillance_hook.py]
        SERVICE[surveillance_hook_service.py]
        SETTINGS[app.core.config.settings]
        ROUTER --> SERVICE
        SERVICE --> SETTINGS
    end

    subgraph SharedFS[Shared Filesystem]
        MRLOG[motion_runtime.log]
        HOOKLOG[Webhook log file SS_HOOK_LOG_FILE]
        PID[motion_runtime.pid]
        STATE[motion_last_trigger.json]
        LOCK[motion_session.lock]
    end

    subgraph WorkerEnv[Worker Environment]
        MRUN[motion_runtime.py]
        CFG[load_pince_config pince_shelf.ini]
        SNAP[take_clean_snapshot.run]
        STOCK[stock_runtime.run]
        TAPO[switch_light_on / switch_light_off]

        MRUN --> CFG
        MRUN --> SNAP
        MRUN --> STOCK
        MRUN --> TAPO
    end

    SERVICE --> HOOKLOG
    SERVICE --> PID
    SERVICE --> STATE
    SERVICE --> MRUN
    MRUN --> MRLOG
    MRUN --> STATE
    MRUN --> PID
    MRUN --> LOCK

Router Layer: surveillance_hook.py

Purpose

This file defines the public FastAPI webhook endpoint that receives motion notifications from Synology Surveillance Station or a compatible caller. It remains a thin HTTP boundary layer and delegates actual decision-making to the service module.

Main responsibilities

  • expose POST /integration/surveillance/motion
  • enforce feature enablement
  • parse request JSON or optional empty body
  • gather request metadata
  • delegate to record_motion_event(...)
  • convert service result into HTTP response

The router does not contain debounce or process-management logic.
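Because the router is only a thin mapping from service result to HTTP response, its core decision can be sketched without any FastAPI specifics. The status codes follow the sequence diagram above; the function name and body field are assumptions for illustration.

```python
def to_http_response(result: dict) -> tuple[int, dict]:
    """Map the service decision onto (status_code, body).

    Rejected callers get 403 Forbidden; every accepted event gets 200 OK,
    whether it spawned a new runtime or merely extended the deadline.
    """
    if not result.get("motion_accepted", False):
        return 403, {"detail": "caller not allowed"}
    return 200, result

status, body = to_http_response(
    {"motion_accepted": True, "decision": "spawned_new_runtime"}
)
```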


Service Layer: surveillance_hook_service.py

Purpose

The service owns the backend-side decision policy for motion events.

New responsibilities

  • validate caller IP
  • append structured motion webhook logs
  • always accept allowed motion events
  • update the persisted motion state on every accepted event
  • compute the new quiet deadline
  • ensure exactly one worker exists
  • spawn a worker only when none is already running
  • return structured result metadata to the router

Core policy

There is no longer a discard-based cooldown.

Instead, every accepted event performs:

  1. last_motion = now
  2. quiet_until = now + 120 seconds
  3. write state file
  4. keep current runtime alive or spawn one if missing
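Steps 1 and 2 reduce to simple deadline arithmetic; a minimal sketch with illustrative names:

```python
QUIET_PERIOD_SECONDS = 120  # configured silence window

def reschedule(now_epoch: float) -> tuple[float, float]:
    """Record the motion time and push the quiet deadline forward."""
    last_motion = now_epoch
    quiet_until = now_epoch + QUIET_PERIOD_SECONDS
    return last_motion, quiet_until

# Three motions in a row: only the last one decides when vision may start.
_, d1 = reschedule(0.0)   # motion at t=0  -> quiet until t=120
_, d2 = reschedule(40.0)  # motion at t=40 -> quiet until t=160
_, d3 = reschedule(70.0)  # motion at t=70 -> quiet until t=190
```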

Important helper functions

is_remote_addr_allowed(remote_addr)

Checks the backend allow-list from settings.

append_motion_log(entry)

Writes one JSON line per webhook decision.

_runtime_already_running()

Reads motion_runtime.pid, validates that the process still exists, and removes stale PID files.

_write_motion_state(remote_addr, payload)

Persists the latest motion timestamp and quiet deadline.

_start_motion_runtime()

Spawns motion_runtime.py only when no runtime process is currently running.

record_motion_event(...)

Main orchestration entry point called by the router.

Example decisions

New worker spawned

{
  "motion_accepted": true,
  "decision": "spawned_new_runtime",
  "deadline_extended": true,
  "runtime_started": true,
  "runtime_already_running": false
}

Existing worker rescheduled

{
  "motion_accepted": true,
  "decision": "rescheduled_existing_runtime",
  "deadline_extended": true,
  "runtime_started": false,
  "runtime_already_running": true
}

Rejected caller

{
  "motion_accepted": false,
  "decision": "rejected_ip_not_allowed"
}

Worker Layer: motion_runtime.py

Purpose

This worker is now a debounce-until-quiet supervisor.

It no longer behaves as a simple one-shot timer.

Runtime phases

Phase 1: startup and single-instance lock

  • initialize runtime logging via shared logger.py
  • redirect stdout/stderr into the log
  • load worker configuration from pince_shelf.ini
  • acquire non-blocking file lock from motion_session.lock
  • exit immediately if another runtime already owns the lock

Phase 2: wait-for-quiet loop

  • repeatedly read motion_last_trigger.json
  • inspect quiet_until_ts_epoch
  • compute remaining quiet time
  • sleep in small chunks, maximum 5 seconds
  • if a newer motion event extends the deadline, keep waiting
  • continue only when now >= quiet_until_ts_epoch

This is the key behavior that matches the real cellar use case.
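The loop above can be sketched as follows. `read_state` stands in for re-reading motion_last_trigger.json; the function name and return contract are illustrative, with the hard cap taken from the Hardening section below.

```python
import time

WAIT_LOOP_HARD_CAP_SEC = 1800  # safety cap for pathological situations
MAX_SLEEP_CHUNK_SEC = 5.0      # sleep in small chunks, maximum 5 seconds

def wait_until_quiet_window(read_state) -> bool:
    """Sleep in small chunks until the (possibly moving) quiet deadline passes.

    read_state() is called on every iteration, so a deadline extended by new
    motion is observed without restarting the worker. Returns False when the
    hard cap expires before the cellar goes quiet.
    """
    started = time.monotonic()
    while True:
        remaining = read_state()["quiet_until_ts_epoch"] - time.time()
        if remaining <= 0:
            return True   # quiet window reached: continue to vision phase
        if time.monotonic() - started > WAIT_LOOP_HARD_CAP_SEC:
            return False  # hard cap hit: stop waiting forever
        time.sleep(min(remaining, MAX_SLEEP_CHUNK_SEC))
```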

Phase 3: light activation and stabilization

  • switch cellar light on once
  • wait configured stabilization period

Phase 4: visibility phase

  • probe shelf visibility using take_clean_snapshot.run(...)
  • if the shelf is still blocked or unstable, retry until motion.max_wait_sec elapses

Phase 5: inventory pipeline

  • call stock_runtime.run()
  • report success or failure

Phase 6: cleanup

  • switch cellar light off
  • remove motion_runtime.pid
  • release lock automatically via context manager

Important helper functions

_state_file(cfg)

Derives motion_last_trigger.json next to the configured motion session file.

_pid_file(cfg)

Derives motion_runtime.pid from the same state directory.

_wait_until_quiet_window(cfg)

Implements the rescheduling loop.

_cleanup_pid_file(cfg)

Removes the PID file on normal or abnormal worker exit.

Hardening

The runtime includes a hard cap for the wait-for-quiet loop:

  • WAIT_LOOP_HARD_CAP_SEC = 1800

This prevents a worker from waiting forever in pathological situations.


Logging Strategy

Shared logger

The shared logger.py is reused as-is.

It already provides:

  • file logging to shared/logs
  • console logging
  • stdout/stderr capture and mirroring into the log

Service log fields

Recommended and implemented fields include:

  • motion_accepted
  • decision
  • deadline_extended
  • quiet_period_seconds
  • last_motion_ts_utc
  • quiet_until_ts_utc
  • runtime_started
  • runtime_already_running
  • runtime_pid
  • runtime_error

Runtime log messages

New key runtime log messages include:

  • motion_runtime starting (debounce mode)
  • waiting_for_quiet_window
  • new_motion_observed_deadline_extended
  • quiet_window_reached
  • starting_visibility_phase
  • starting_stock_runtime_pipeline

These messages make diagnosis much easier than it was under the previous discard-only cooldown model.


Real-Life Example

A user enters the cellar and moves around:

17:51:00 motion detected -> quiet_until = 17:53:00
17:51:40 motion detected -> quiet_until = 17:53:40
17:52:10 motion detected -> quiet_until = 17:54:10
17:54:10 no more motion -> vision workflow starts

Practical consequence

  • no duplicate workers
  • no discard-based lost triggers
  • no need to kill and restart the worker
  • the final trigger moment is always the last motion, not the first one

Operational Notes

Why this design is safer than kill-and-respawn

A keep-one-runtime design avoids:

  • race conditions around repeated PID replacement
  • light flicker or repeated initialization
  • process storms under motion-heavy conditions
  • accidental overlap of stock runtime calls

Why the previous behavior felt wrong

Under the old model, a later motion event arriving inside the cooldown could be logged and accepted at the HTTP level without moving the execution deadline forward. In the real world this made it look as though the workflow had stopped or been ignored.

Under the new model, later motion always updates the deadline.


Summary

The motion chain is now intentionally designed for the cellar use case:

  • motion webhooks are accepted continuously
  • a single worker waits for silence
  • every new motion reschedules the workflow
  • vision begins only after 120 seconds of quiet
  • logging clearly shows whether a request spawned a worker or merely extended the deadline

This turns the motion-triggered lane into a robust leave-the-cellar-then-run-vision workflow.