crop_with_json.py — Polygon-Based Cell Crop Generator

Overview

crop_with_json.py is the module that converts the rectified shelf image into individual per-cell crop images using a previously prepared polygon JSON file.

It is part of the downstream vision chain and is executed after:

  1. take_clean_snapshot.py creates shelf_rectified.jpg
  2. manual mask workflow creates cells_polygons.json
  3. optional interactive ROI correction refines the polygons

This script then performs the operational crop generation used by the detector stage.

Its role is simple but critical:

  • read the rectified shelf image
  • read the cell polygons
  • crop each shelf cell
  • optionally mask pixels outside the polygon
  • write one image per cell
  • generate basic crop debug artifacts

The output of this script becomes the input batch for:

onnx_batch.py

Position in the Vision Pipeline

flowchart TD

RECT["shelf_rectified.jpg"]
POLY["cells_polygons.json"]
CROP["crop_with_json.py"]
CELLPNG["per-cell crop PNG files"]
DETECT["onnx_batch.py"]

RECT --> CROP
POLY --> CROP
CROP --> CELLPNG
CELLPNG --> DETECT

Role in the Overall Data Chain

This script is the bridge between:

  • geometric shelf definition
  • machine-learning bottle-end detection

Without this step, the detector would have to process the entire shelf image at once, with no fixed mapping between a detection and the shelf cell it belongs to.

Instead, this script isolates each physical shelf cell into a separate crop.

That creates several advantages:

  • simpler per-cell inference
  • stable bin identity
  • easier debugging of wrong detections
  • straightforward database comparison later

Main Inputs

By default, the script reads:

Rectified image

<snapshot_dir>/shelf_rectified.jpg

Polygon JSON

Preferred file:

<snapshot_dir>/cells_polygons.json

If that exact file is not found, the script falls back to the first JSON file inside snapshot_dir whose top-level object contains a non-empty cells list:

{
  "cells": [...]
}

This makes the crop stage tolerant to alternative or versioned JSON filenames.


Main Outputs

Crop images

Written to:

<cfg.paths["crop_out"]>/<cell_id>.png

Example:

01_01.png
01_02.png
...

Debug overlay

Written to:

<cfg.paths["crop_out"]>/crop_cells_debug/00_polygons_overlay.png

Text summary

Written to:

<cfg.paths["crop_out"]>/crop_cells_debug/summary.txt

Architectural Design Goals

The script explicitly follows Wine Platform worker conventions:

  • callable run(cfg) for orchestration
  • simple main() for standalone debugging
  • no argparse
  • settings driven by INI/env/config
  • explicit logging
  • raises on failure so the pipeline stops with a clear error

This means it integrates cleanly into:

stock_runtime.py

Script-Level Constants

INI_FILENAME = "pince_shelf.ini"

Used by main() for manual execution.


INPUT_IMAGE_NAME = "shelf_rectified.jpg"

Defines the required input image name inside snapshot_dir.


POLYGONS_JSON_NAME = "cells_polygons.json"

Preferred polygon JSON filename.

The script still supports fallback auto-detection if this file is absent.


DEBUG_SUBDIR = "crop_cells_debug"

Subfolder under crop output directory where debug artifacts are written.


CROP_EXT = ".png"

Output file format for crop images.

PNG is appropriate because:

  • it is lossless
  • it preserves black masked backgrounds accurately
  • it is well suited for detector input debugging

MASK_POLYGON = True

This is one of the most important behavior switches.

If True

The crop is not just a bounding box.
A polygon mask is applied, and pixels outside the polygon become black.

If False

The crop is the raw bounding-box rectangle around the polygon.

Current configured behavior:

True

This means the detector sees only the valid polygon region, with the outside area blacked out.

This is important because later scripts use the black background for cell-shape interpretation.


Data Model

CellPoly

@dataclass(frozen=True)
class CellPoly:
    cell_id: str
    polygon: np.ndarray

This is the internal lightweight representation of one shelf cell.

cell_id

Expected format:

nn_nn

Example:

03_05

This is not generated here.
It is assumed to already be assigned upstream by the polygon-generation workflow.

polygon

A NumPy array of shape:

(N, 2)

with integer image coordinates.

This polygon defines the exact crop shape.


Directory Handling

_mkdir(p)

Simple helper that ensures required directories exist.

Used for:

  • snapshot dir
  • crop output dir
  • debug dir

The script always prepares its output directories before writing.


JSON Loading

_load_json(path)

Reads a UTF-8 JSON file into a dictionary.

Used for loading the polygons definition.

The script assumes the JSON structure described below and performs no schema version negotiation.


Polygon JSON Discovery

_find_polygons_json(snapshot_dir)

This helper selects the correct polygon JSON file.

Priority order:

1. Preferred exact filename

snapshot_dir / cells_polygons.json

2. Fallback scan

The first *.json file in snapshot_dir whose top-level object contains a non-empty:

"cells": [...]

This fallback is useful if multiple generated versions exist or if the canonical name was changed temporarily.

If no valid file is found, the function raises FileNotFoundError.
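
The priority order above can be sketched as follows. This is a minimal illustration rather than the script's actual code: the function name find_polygons_json is hypothetical, and scanning candidates in sorted filename order is an assumption made so that "first" is deterministic.

```python
import json
from pathlib import Path

POLYGONS_JSON_NAME = "cells_polygons.json"


def find_polygons_json(snapshot_dir: Path) -> Path:
    """Return the preferred file, else the first JSON with a non-empty "cells" list."""
    preferred = snapshot_dir / POLYGONS_JSON_NAME
    if preferred.is_file():
        return preferred

    # Assumption: sorted filename order defines which candidate is "first".
    for candidate in sorted(snapshot_dir.glob("*.json")):
        try:
            data = json.loads(candidate.read_text(encoding="utf-8"))
        except (OSError, ValueError):
            continue  # unreadable or malformed files are not candidates
        if isinstance(data, dict) and isinstance(data.get("cells"), list) and data["cells"]:
            return candidate

    raise FileNotFoundError(f"no polygon JSON with a non-empty 'cells' list in {snapshot_dir}")
```

Catching ValueError covers both json.JSONDecodeError and UnicodeDecodeError, so one broken file in the folder does not hide a valid one.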


Parsing Polygon Cells

_parse_cells(data)

This helper converts raw JSON into validated CellPoly objects.


Expected JSON Structure

Each cell entry should look like:

{
  "id": "01_01",
  "polygon": [[x1, y1], [x2, y2], ...]
}

Validation Rules

A cell entry is accepted only if:

  • item is a dictionary
  • id is a string
  • polygon is a list
  • polygon has at least 3 points
  • each point has exactly two numeric coordinates

Invalid entries are skipped.

If no valid cells remain after parsing, the function raises ValueError.


Coordinate Conversion

Polygon points are rounded and converted to integer coordinates:

int(round(p[0])), int(round(p[1]))

This is appropriate because image coordinates for cropping and OpenCV masks must be discrete pixel coordinates.
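
The validation rules and the rounding step could be combined roughly as shown here. This is a sketch under assumptions: parse_cells and _is_point are illustrative names, and treating booleans as non-numeric is a choice made for this example, not something the script is known to do.

```python
from dataclasses import dataclass
from numbers import Real

import numpy as np


@dataclass(frozen=True)
class CellPoly:
    cell_id: str
    polygon: np.ndarray  # shape (N, 2), integer pixel coordinates


def _is_point(p) -> bool:
    # bool counts as a Real in Python, so it is excluded explicitly (assumption).
    return (
        isinstance(p, (list, tuple))
        and len(p) == 2
        and all(isinstance(v, Real) and not isinstance(v, bool) for v in p)
    )


def parse_cells(data: dict) -> list:
    cells = []
    for item in data.get("cells") or []:
        if not isinstance(item, dict):
            continue
        cell_id, poly = item.get("id"), item.get("polygon")
        if not isinstance(cell_id, str) or not isinstance(poly, list):
            continue
        if len(poly) < 3 or not all(_is_point(p) for p in poly):
            continue
        # Round to the nearest pixel, as in the conversion shown above.
        pts = [(int(round(p[0])), int(round(p[1]))) for p in poly]
        cells.append(CellPoly(cell_id, np.array(pts, dtype=np.int32)))
    if not cells:
        raise ValueError("no valid cells in polygon JSON")
    return cells
```

Note that Python's built-in round() rounds exact halves to the nearest even integer, so a coordinate of 2.5 becomes 2, not 3. For polygon vertices this half-pixel difference is normally irrelevant.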


Image Loading

_safe_imread(path)

Loads the input image using OpenCV.

If the image cannot be read, the function raises FileNotFoundError.

This protects the pipeline from silently operating on None.


Polygon Bounding Box Computation

_poly_bbox(poly, w, h)

This function computes the bounding box of a polygon.

Output:

(x0, y0, x1, y1)

where x1 and y1 are exclusive bounds.


Variable Roles

xs, ys

Coordinate vectors extracted from polygon vertices.

x0, y0

Minimum polygon coordinates.

x1, y1

Maximum polygon coordinates plus one pixel.

This makes the bounds suitable for NumPy slicing.


Boundary Clipping

The computed box is clipped to image dimensions:

  • never below 0
  • never beyond image width/height
  • never returns an empty or inverted slice

This protects against malformed or slightly out-of-range polygons.
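
One way to satisfy all three clipping rules is sketched below. The function name poly_bbox and the exact clamping strategy are assumptions. The sketch requires w and h to be at least 1 and always keeps at least one pixel inside the image.

```python
import numpy as np


def poly_bbox(poly: np.ndarray, w: int, h: int) -> tuple:
    xs, ys = poly[:, 0], poly[:, 1]
    # Clamp the origin so that at least one pixel remains inside the image.
    x0 = int(np.clip(xs.min(), 0, w - 1))
    y0 = int(np.clip(ys.min(), 0, h - 1))
    # +1 turns the inclusive maximum into an exclusive slice bound;
    # the lower limit x0 + 1 / y0 + 1 prevents an empty or inverted slice.
    x1 = int(np.clip(xs.max() + 1, x0 + 1, w))
    y1 = int(np.clip(ys.max() + 1, y0 + 1, h))
    return x0, y0, x1, y1


# A polygon that sticks out on the right side of a 16 x 40 image:
box = poly_bbox(np.array([[5, 5], [20, 5], [20, 30]]), w=16, h=40)
# → (5, 5, 16, 31)
```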


Cropping One Cell

_crop_one(img, cell, mask_polygon)

This is the core crop function.


Step 1 — Compute Bounding Box

The function first computes the polygon’s bounding rectangle inside the image.


Step 2 — Extract ROI

It slices the image:

roi = img[y0:y1, x0:x1].copy()

This creates the local rectangular crop region containing the polygon.


Step 3 — Optional Polygon Masking

If mask_polygon is False, the raw ROI is returned immediately.

If True, a polygon mask is constructed.


Polygon Masking Logic

The polygon is shifted from full-image coordinates into local ROI coordinates:

poly_local[:, 0] -= x0
poly_local[:, 1] -= y0

This is necessary because the ROI origin is no longer the global image origin.

Then a mask is created:

mask = np.zeros((roi_h, roi_w), dtype=np.uint8)
cv2.fillPoly(mask, [poly_local], 255)

Then applied with:

cv2.bitwise_and(roi, roi, mask=mask)

Result:

  • inside polygon stays visible
  • outside polygon becomes black

This black background is important for the later cell-shape logic in onnx_batch.py.


Debug Overlay Drawing

_draw_polys_debug(img, cells)

Creates a visualization of all polygons overlaid on the rectified shelf image.

For each cell:

  • draw green polygon outline
  • draw cell ID label near the first polygon point

This debug image helps verify:

  • polygon alignment
  • ID ordering
  • geometric completeness
  • whether the JSON and shelf image match

Public Pipeline Entry Point

run(cfg)

This is the callable function used by stock_runtime.py.

It performs the full crop generation workflow.


Detailed Execution Flow

flowchart TD

START["run(cfg)"]
SNAP["resolve snapshot_dir"]
IMG["load shelf_rectified.jpg"]
JSON["find and load polygons JSON"]
PARSE["parse cells into CellPoly list"]
DBG["write polygons overlay"]
LOOP["iterate over cells"]
CROP["crop one cell"]
WRITE["write PNG crop"]
SUM["write summary.txt"]
RET["return summary dict"]

START --> SNAP
SNAP --> IMG
IMG --> JSON
JSON --> PARSE
PARSE --> DBG
DBG --> LOOP
LOOP --> CROP
CROP --> WRITE
WRITE --> LOOP
LOOP --> SUM
SUM --> RET

Path Resolution Inside run(cfg)

snapshot_dir

Path(cfg.paths["snapshot"])

Contains:

  • shelf_rectified.jpg
  • cells_polygons.json

out_dir

Path(cfg.paths["crop_out"])

Contains:

  • one crop PNG per cell

debug_dir

out_dir / "crop_cells_debug"

Contains:

  • overlay debug image
  • summary file


Input Image Requirement

The script requires:

snapshot_dir / shelf_rectified.jpg

If missing, execution fails immediately.

That is intentional, because the crop stage must never guess a different source image.


Crop Iteration Order

Cells are processed in:

sorted(cells, key=lambda c: c.cell_id)

This preserves stable ordering based on existing upstream IDs.

It does not reassign, regroup, or reinterpret cell order.
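
Because the sort key is the ID string, the order is lexicographic. It matches row/column order only when IDs use the zero-padded nn_nn format, as this short illustration with made-up IDs shows:

```python
# Zero-padded IDs: string order equals numeric row/column order.
ordered = sorted(["02_01", "01_10", "01_02", "10_01"])
# → ['01_02', '01_10', '02_01', '10_01']

# Unpadded IDs (not the expected format): string order breaks numeric order.
unpadded = sorted(["2_1", "1_10", "1_2", "10_1"])
# → ['10_1', '1_10', '1_2', '2_1']
```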


Crop Output Naming

Each crop is saved as:

<cell_id>.png

This is important because the detector later derives bin_ID from the filename stem.

So stable crop naming is part of the end-to-end identity chain.
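
The round trip from cell ID to filename and back can be illustrated with pathlib. Both helper names are hypothetical. The only behavior taken from this page is that the crop is named <cell_id>.png and that the ID is recovered from the filename stem.

```python
from pathlib import Path

CROP_EXT = ".png"


def crop_path(out_dir: Path, cell_id: str) -> Path:
    # Writer side: the cell ID becomes the filename.
    return out_dir / f"{cell_id}{CROP_EXT}"


def bin_id_from_crop(path: Path) -> str:
    # Reader side: the stem is the filename without its final extension.
    return path.stem


p = crop_path(Path("crops"), "03_05")
```

Here p.name is "03_05.png" and bin_id_from_crop(p) returns "03_05", so any renaming of crop files between the two stages would silently change bin identity.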


Debug Summary File

The script writes a small text summary:

input_image: ...
polygons_json: ...
mask_polygon: True
output_dir: ...
debug_dir: ...
cells_total: ...
crops_written: ...

This is useful for quick manual inspection without opening Python or JSON logs.
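
A "key: value" file of this shape takes only a few lines to write. The helper below is a sketch with a hypothetical name and signature. It relies on dictionaries preserving insertion order, so the lines appear in the order the fields were added.

```python
from pathlib import Path


def write_summary(path: Path, fields: dict) -> None:
    # One "key: value" line per field, in insertion order.
    lines = [f"{key}: {value}" for key, value in fields.items()]
    path.write_text("\n".join(lines) + "\n", encoding="utf-8")
```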


Returned Result Structure

run(cfg) returns a structured dictionary containing:

  • step name
  • input image path
  • polygons JSON path
  • output dir
  • debug dir
  • total parsed cells
  • successfully written crops
  • overlay path
  • summary path

This return structure is designed for orchestration and summary logging in the pipeline runtime.


Manual Debug Mode

main()

For standalone use:

  1. load config
  2. run crop stage
  3. print summary lines

This is useful during:

  • ROI development
  • polygon debugging
  • manual shelf tuning
  • downstream detector preparation


Relationship with Upstream Polygon Preparation

This script depends heavily on the quality of:

cells_polygons.json

which itself comes from:

  • manual blue-line mask creation
  • generate_mask_for_polc.py
  • optional ROI_ADjustment_from_PNG.py

This means cropping accuracy is only as good as the polygon preparation workflow.

The crop stage itself does not try to repair bad polygons.


Relationship with Downstream Detection

The outputs of this script are consumed by:

onnx_batch.py

The detector assumes:

  • one image per cell
  • black background outside masked polygon
  • stable filename-based bin identity
  • reasonably centered cell contents

That means crop_with_json.py is the geometric packaging stage for the ML detector.


Failure Modes

Typical failure cases:

Failure                        Meaning
missing shelf_rectified.jpg    snapshot/rectification stage incomplete
missing polygon JSON           ROI preparation stage incomplete
invalid JSON schema            malformed polygon file
invalid polygons               no valid crop geometry
unreadable image               filesystem or path issue
crop write failure             output storage issue

The script raises explicitly so the pipeline can stop rather than continue with partial data.
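
From the orchestrator's side, raise-on-failure means a failing step aborts everything after it. The runner below is a purely hypothetical sketch of that pattern and does not describe how stock_runtime.py is actually written. run_steps and the (name, callable) step list are invented for illustration.

```python
import logging

log = logging.getLogger("pipeline")


def run_steps(cfg, steps):
    """Run (name, callable) steps in order; the first exception aborts the rest."""
    results = []
    for name, step in steps:
        try:
            results.append(step(cfg))
        except Exception:
            # Log with traceback, then re-raise so later steps never run
            # on missing or partial data.
            log.exception("step %s failed, stopping pipeline", name)
            raise
    return results
```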


Summary

crop_with_json.py is the polygon-to-image extraction stage of the Wine Platform vision pipeline.

Its responsibilities are:

  • load rectified shelf image
  • load validated ROI polygons
  • crop each shelf cell
  • apply polygon masking
  • preserve stable cell identity
  • prepare detector-ready PNG inputs

It transforms the shelf-level geometric model into a practical set of per-bin crop images, making downstream bottle-end detection structured, stable, and debuggable.