AI Label Parsing Pipeline¶

Purpose¶

The AI Label Parsing pipeline extracts structured wine metadata from a bottle label image captured in the Winecellar kiosk.

The system is designed with three layers of intelligence to guarantee reliability:

OCR extraction using Google Cloud Vision
Heuristic parsing using local logic
AI enrichment using OpenAI structured extraction

This architecture ensures the kiosk workflow still functions even if the AI service is unavailable.

High-Level Architecture¶

flowchart TD

A[Label Image Upload] --> B[FastAPI Endpoint /label/parse]

B --> C[label.py Router]

C --> D[Google Vision OCR]
D --> E[Raw OCR Text]

E --> F[Heuristic Parsing]
F --> G[Basic Prefill]

G --> H{AI Enabled}

H -->|No| I[Return OCR Prefill]

H -->|Yes| J[OpenAI Label Parser]

J --> K[Structured Wine Metadata]

K --> L[Merge Results]

L --> M[Return Final Response]

Files Involved¶

Router¶

apps/winecellar/backend/app/api/routers/label.py

Handles the HTTP request and orchestrates the full parsing workflow.

OCR Service¶

apps/winecellar/backend/app/services/vision_ocr.py

Responsible for extracting text from the label image using Google Cloud Vision.

AI Label Parser¶

apps/winecellar/backend/app/services/ai_label_parser.py

Uses OpenAI models to convert OCR text into structured wine metadata.

Backend Configuration¶

apps/winecellar/backend/app/core/config.py

Loads runtime configuration and environment variables.

Detailed Execution Flow¶

1. Image Upload¶

The kiosk or frontend uploads a label image.

Request example:

POST /label/parse?ai=true
Content-Type: multipart/form-data
file=<label_image>

The request is handled by the router:

label.py

2. OCR Stage¶

The router calls the OCR service.

Function:

google_vision_ocr_bytes(image_bytes, language_hints)

This function uses Google Cloud Vision:

vision.ImageAnnotatorClient()
document_text_detection()

Result returned:

{
  "text": "raw OCR extracted text"
}

Important characteristics:

Extracts only text
Does not perform semantic parsing
Uses document_text_detection for high accuracy

3. Heuristic Parsing¶

After OCR, the router performs basic local parsing.

Function:

parse_basic(text)

This logic extracts minimal metadata using local rules.

Typical extracted values:

Field	Method
year	regex detection
wine name	first strong text line
appellation	keyword detection

Example extracted content:

2018
Chateau Margaux
Margaux

This step ensures the system works even without AI services.

4. Prefill Construction¶

Function:

build_prefill_from_basic()

Transforms parsed values into kiosk form fields.

Mapping example:

OCR Data	Kiosk Field
wine_name	description
appellation	region
vintage	year
producer	winery

Example prefill:

{
  "description": "Chateau Margaux",
  "region": "Margaux",
  "year": 2018,
  "color": "red",
  "winery": "Chateau Margaux"
}

5. Optional AI Parsing¶

If the request includes:

?ai=true

the router calls the AI parser.

File:

ai_label_parser.py

Function:

parse_wine_label_from_ocr_text(text)

The function sends OCR text to OpenAI.

Model typically used:

gpt-4o-mini

The prompt instructs the model to return structured JSON metadata.

Example output:

{
  "producer": "Chateau Margaux",
  "wine_name": "Margaux",
  "appellation": "Margaux",
  "region": "Bordeaux",
  "country": "France",
  "vintage": 2018,
  "color": "red",
  "grapes": ["Cabernet Sauvignon", "Merlot"],
  "confidence": 0.93
}

6. Result Merge Strategy¶

Function:

build_prefill_from_ai()

AI results are merged with heuristic results.

Priority order:

AI value
↓
heuristic value
↓
empty

This guarantees that the best available data is used.

API Response Example¶

Typical response:

{
  "provider": "google_vision+openai",

  "text": "2018 Chateau Margaux Margaux",

  "year": 2018,
  "name": "Chateau Margaux",
  "appellation": "Margaux",

  "prefill": {
    "description": "Chateau Margaux",
    "region": "Margaux",
    "year": 2018,
    "color": "red",
    "winery": "Chateau Margaux"
  },

  "ai": {
    "producer": "Chateau Margaux",
    "region": "Bordeaux"
  }
}

Configuration System¶

Configuration is handled by:

config.py

Using:

pydantic-settings

Environment File Discovery¶

config.py loads configuration in this order.

1. Explicit override¶

WINECELLAR_ENV_FILE

2. Shared platform configuration¶

shared/config/shared.env

3. Local development fallback¶

.env

API Credentials¶

The code does not contain hardcoded API keys.

Credentials are read from the runtime environment.

Google Vision Credentials¶

Expected environment variable:

GOOGLE_APPLICATION_CREDENTIALS

Example:

GOOGLE_APPLICATION_CREDENTIALS=/home/pi/secrets/google_vision_service_account.json

Used automatically by:

vision.ImageAnnotatorClient()

OpenAI Credentials¶

Expected environment variable:

OPENAI_API_KEY

Example:

OPENAI_API_KEY=sk-xxxxxxxxxxxxxxxx

Used automatically by:

OpenAI()

Where Environment Variables Come From¶

Two sources can provide environment variables.

1. Shared Environment File¶

Location:

shared/config/shared.env

Example:

OPENAI_API_KEY=xxxx
GOOGLE_APPLICATION_CREDENTIALS=/home/pi/secrets/google.json

Loaded automatically by the backend configuration loader.

2. Systemd Service Environment¶

Location:

/etc/systemd/system/winecellar-api.service

Example configuration:

Environment=OPENAI_API_KEY=xxxx
Environment=GOOGLE_APPLICATION_CREDENTIALS=/path/file.json

Or:

EnvironmentFile=/home/pi/wine_platform/shared/config/shared.env

Configuration Precedence¶

When the backend runs via systemd, environment variables are injected before the application starts.

Runtime environment variables therefore take priority.

Reliability Model¶

Component	Dependency	Impact
Google OCR	Hard dependency	request fails if unavailable
OpenAI parser	Soft dependency	fallback to OCR
Heuristic parsing	Local	always available

Failure Handling¶

OpenAI Failure¶

Possible causes:

missing API key
network issue
quota exceeded

Behavior:

ai_error field returned in response

OCR results are still returned.

Google OCR Failure¶

Possible causes:

missing credentials
invalid credential file
network failure

Behavior:

RuntimeError raised

The request fails.

Troubleshooting¶

Inspect service configuration¶

systemctl cat winecellar-api.service

Look for environment variables:

OPENAI_API_KEY
GOOGLE_APPLICATION_CREDENTIALS

Verify environment file¶

shared/config/shared.env

Test endpoint manually¶

curl -X POST http://localhost:8000/label/parse

Summary¶

Component	Responsibility
label.py	API endpoint and orchestration
vision_ocr.py	OCR text extraction
ai_label_parser.py	AI metadata extraction
config.py	configuration loading

Pipeline:

label image
→ OCR extraction
→ heuristic parsing
→ optional AI enrichment
→ kiosk field prefill