scry/docs/CARD_RECOGNITION.md
Chris Kruining 54ba7496c6
2026-02-05 11:34:57 +01:00

Card Recognition Architecture

This document explores approaches for implementing robust Magic: The Gathering (MTG) card recognition in Scry.

Goals

  1. Robustness - Work reliably across varying lighting, angles, and card conditions
  2. Speed - Fast enough for real-time scanning (<500ms per card)
  3. Accuracy - High precision to avoid misidentifying valuable cards
  4. Offline-capable - Core recognition should work without network

Data Sources

Scryfall API

Scryfall is the de-facto source of truth for MTG card data.

Key endpoints:

| Endpoint | Purpose |
|----------|---------|
| GET /cards/named?fuzzy={name} | Fuzzy name lookup |
| GET /cards/{scryfall_id} | Get card by ID |
| GET /cards/search?q={query} | Full-text search |
| GET /bulk-data | Daily JSON exports |

Rate limits: 50-100ms between requests (~10/sec). Images at *.scryfall.io have no rate limit.
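
The spacing rule above can be enforced in one place so callers never have to think about it. The following is a minimal sketch, not the project's actual client: the class name ThrottledScryfallClient, the 100 ms spacing (the upper end of the range above), and the base URL https://api.scryfall.com are assumptions made for illustration.

```csharp
using System;
using System.Net.Http;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical wrapper that spaces API calls at least 100 ms apart.
public sealed class ThrottledScryfallClient
{
    private static readonly TimeSpan MinSpacing = TimeSpan.FromMilliseconds(100);

    private readonly HttpClient _http;
    private readonly SemaphoreSlim _gate = new SemaphoreSlim(1, 1);
    private DateTimeOffset _lastRequest = DateTimeOffset.MinValue;

    public ThrottledScryfallClient(HttpClient http) => _http = http;

    // Assumption: the API is served from https://api.scryfall.com.
    public static Uri BuildFuzzyNameUri(string name) =>
        new Uri("https://api.scryfall.com/cards/named?fuzzy=" + Uri.EscapeDataString(name));

    public async Task<string> GetCardJsonByFuzzyNameAsync(
        string name, CancellationToken ct = default)
    {
        await _gate.WaitAsync(ct); // serialize requests so the spacing holds
        try
        {
            // Sleep only for whatever part of the spacing has not elapsed yet.
            var wait = _lastRequest + MinSpacing - DateTimeOffset.UtcNow;
            if (wait > TimeSpan.Zero)
                await Task.Delay(wait, ct);

            using var response = await _http.GetAsync(BuildFuzzyNameUri(name), ct);
            _lastRequest = DateTimeOffset.UtcNow;
            response.EnsureSuccessStatusCode();
            return await response.Content.ReadAsStringAsync(ct);
        }
        finally
        {
            _gate.Release();
        }
    }
}
```

Image downloads from *.scryfall.io would bypass this wrapper, since they are not rate limited. Scryfall may also expect identifying request headers such as User-Agent; check the current API documentation before relying on this sketch.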

Bulk data options:

| File | Size | Use Case |
|------|------|----------|
| Oracle Cards | ~161 MB | One card per Oracle ID (recognition) |
| Unique Artwork | ~233 MB | One per unique art (art-based matching) |
| Default Cards | ~501 MB | Every English printing |
| All Cards | ~2.3 GB | Every card, every language |

Recommended approach: Download the "Unique Artwork" bulk data, extract the image URL for each card, and compute a hash from each downloaded image. Refresh weekly or after new set releases.

Card Image Fields

{
  "id": "uuid",
  "oracle_id": "uuid",
  "name": "Lightning Bolt",
  "set": "2xm",
  "collector_number": "129",
  "illustration_id": "uuid",
  "image_uris": {
    "small": "https://cards.scryfall.io/.../small/...",
    "normal": "https://cards.scryfall.io/.../normal/...",
    "large": "https://cards.scryfall.io/.../large/...",
    "art_crop": "https://cards.scryfall.io/.../art_crop/..."
  }
}

Key identifiers:

  • id - Unique per printing
  • oracle_id - Same across reprints (same card conceptually)
  • illustration_id - Same across reprints with identical artwork
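
To illustrate how these identifiers might be used when building the hash database, here is a hedged sketch that reads card objects shaped like the sample above and keeps one entry per illustration_id. It assumes the bulk file is a JSON array of such objects; the names ArtEntry and BulkDataReader are hypothetical. Cards without a top-level image_uris object are simply skipped here.

```csharp
using System.Collections.Generic;
using System.Text.Json;

// Hypothetical record holding just what the hashing step needs.
public sealed record ArtEntry(string CardId, string IllustrationId, string ArtCropUrl);

public static class BulkDataReader
{
    // Returns one entry per illustration_id; the first printing seen wins.
    public static Dictionary<string, ArtEntry> IndexByIllustration(string bulkJson)
    {
        var index = new Dictionary<string, ArtEntry>();
        using var doc = JsonDocument.Parse(bulkJson);

        foreach (var card in doc.RootElement.EnumerateArray())
        {
            // Skip entries that lack any of the fields we rely on.
            if (!card.TryGetProperty("id", out var id) ||
                !card.TryGetProperty("illustration_id", out var illustration) ||
                !card.TryGetProperty("image_uris", out var uris) ||
                uris.ValueKind != JsonValueKind.Object ||
                !uris.TryGetProperty("art_crop", out var artCrop))
            {
                continue;
            }

            var cardId = id.GetString();
            var illustrationId = illustration.GetString();
            var url = artCrop.GetString();
            if (cardId is null || illustrationId is null || url is null)
                continue;

            // TryAdd leaves an existing entry for the same artwork untouched.
            index.TryAdd(illustrationId, new ArtEntry(cardId, illustrationId, url));
        }

        return index;
    }
}
```

JsonDocument.Parse loads the whole document into memory, which is fine for a sample but probably not for a file of several hundred megabytes; a streaming reader would be the more realistic choice there.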

Recognition Approaches

1. Perceptual Hashing

How it works: Convert the image to a fixed-size fingerprint that is resistant to minor transformations.

Algorithm:

  1. Resize image to small size (e.g., 32x32)
  2. Convert to grayscale (or keep RGB for color-aware variant)
  3. Apply DCT (Discrete Cosine Transform)
  4. Keep low-frequency components
  5. Compute hash from median comparison
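
The steps above can be sketched as follows. This is a minimal illustration, assuming the image has already been resized and converted to a square grayscale matrix (steps 1 and 2). The class name PerceptualHash and the hashSize parameter are hypothetical, and the naive DCT is written for clarity, not speed.

```csharp
using System;

public static class PerceptualHash
{
    // gray: square matrix of grayscale values (e.g., 32x32).
    // Returns hashSize * hashSize bits packed into bytes, most significant bit first.
    public static byte[] Compute(double[,] gray, int hashSize = 8)
    {
        int n = gray.GetLength(0);
        if (gray.GetLength(1) != n || hashSize < 1 || hashSize > n)
            throw new ArgumentException("Expected a square image at least hashSize wide.");

        // Step 3 + 4: 2D DCT-II, computed only for the low-frequency block.
        var low = new double[hashSize * hashSize];
        for (int u = 0; u < hashSize; u++)
        for (int v = 0; v < hashSize; v++)
        {
            double sum = 0;
            for (int y = 0; y < n; y++)
            for (int x = 0; x < n; x++)
            {
                sum += gray[y, x]
                     * Math.Cos((2 * y + 1) * u * Math.PI / (2 * n))
                     * Math.Cos((2 * x + 1) * v * Math.PI / (2 * n));
            }
            low[u * hashSize + v] = sum;
        }

        // Step 5: one bit per coefficient, set when it exceeds the median.
        var sorted = (double[])low.Clone();
        Array.Sort(sorted);
        int mid = sorted.Length / 2;
        double median = sorted.Length % 2 == 1
            ? sorted[mid]
            : (sorted[mid - 1] + sorted[mid]) / 2;

        var hash = new byte[(low.Length + 7) / 8];
        for (int i = 0; i < low.Length; i++)
        {
            if (low[i] > median)
                hash[i / 8] |= (byte)(1 << (7 - i % 8));
        }
        return hash;
    }
}
```

Because every bit is a comparison against the median, scaling all pixel values by a constant factor leaves the hash unchanged, which is one reason this family of hashes tolerates exposure differences. Whether to include the DC coefficient in the median varies between implementations; this sketch includes it.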

Variants:

| Type | Description | Use Case |
|------|-------------|----------|
| aHash | Average hash | Fast, less accurate |
| pHash | Perceptual hash | Good balance |
| dHash | Difference hash | Edge-focused |
| wHash | Wavelet hash | Most robust |
| Color pHash | Separate RGB channel hashes | Best for colorful art |

Performance (from MTG Card Detector project):

  • Hash size 16 (256-bit with RGB): ~16ms per comparison
  • Hash size 64: ~65ms per comparison
  • Database of 30k+ cards: still feasible with proper indexing

Implementation:

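// Pseudo-code for color-aware pHash (Resize, ComputePHash and Concat are placeholder helpers)
public byte[] ComputeColorHash(Image image)
{
    var resized = Resize(image, 32, 32);
    var rHash = ComputePHash(resized.RedChannel);
    var gHash = ComputePHash(resized.GreenChannel);
    var bHash = ComputePHash(resized.BlueChannel);
    return Concat(rHash, gHash, bHash); // 768-bit hash (3 x 256 bits)
}

// Requires: using System; using System.Numerics;
public int HammingDistance(byte[] a, byte[] b)
{
    if (a.Length != b.Length)
        throw new ArgumentException("Hashes must have the same length.");

    int distance = 0;
    for (int i = 0; i < a.Length; i++)
        distance += BitOperations.PopCount((uint)(a[i] ^ b[i])); // count differing bits
    return distance;
}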

Matching strategy:

confidence = (mean_distance - best_match_distance) / (4 * std_deviation)

Accept the match if the best match's distance is more than 4 standard deviations below the mean distance to all candidates, which corresponds to a confidence above 1.

2. OCR-Based Recognition (Fallback)

When to use: Stacked/overlapping cards where only name is visible.

Approach:

  1. Detect text regions in image
  2. Run OCR on card name area
  3. Fuzzy match against card database using SymSpell (edit distance ≤6)
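
Step 3 can be illustrated with a plain Levenshtein distance. This is a sketch only: SymSpell itself uses a precomputed delete index to avoid scanning every name, whereas the brute-force loop below just shows the matching rule. The names NameMatcher and FindClosest are hypothetical; the default maxDistance of 6 comes from the threshold above.

```csharp
#nullable enable
using System;
using System.Collections.Generic;

public static class NameMatcher
{
    // Classic two-row Levenshtein edit distance.
    public static int EditDistance(string a, string b)
    {
        var prev = new int[b.Length + 1];
        var curr = new int[b.Length + 1];
        for (int j = 0; j <= b.Length; j++)
            prev[j] = j;

        for (int i = 1; i <= a.Length; i++)
        {
            curr[0] = i;
            for (int j = 1; j <= b.Length; j++)
            {
                int cost = a[i - 1] == b[j - 1] ? 0 : 1;
                curr[j] = Math.Min(
                    Math.Min(curr[j - 1] + 1, prev[j] + 1), // insertion, deletion
                    prev[j - 1] + cost);                    // substitution or match
            }
            (prev, curr) = (curr, prev);
        }
        return prev[b.Length];
    }

    // Returns the closest card name within maxDistance edits, or null if none qualifies.
    public static string? FindClosest(
        string ocrText, IEnumerable<string> cardNames, int maxDistance = 6)
    {
        string? best = null;
        int bestDistance = maxDistance + 1;
        var query = ocrText.Trim().ToLowerInvariant();

        foreach (var name in cardNames)
        {
            int d = EditDistance(query, name.ToLowerInvariant());
            if (d < bestDistance)
            {
                best = name;
                bestDistance = d;
            }
        }
        return best;
    }
}
```

A fixed threshold of 6 is generous for short card names, so scaling the allowed distance with the name length may be worth testing.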

Libraries:

  • Azure Computer Vision / Google Cloud Vision (best accuracy)
  • Tesseract (open source, but poor on stylized MTG fonts)
  • ML Kit (on-device, good for mobile)

Accuracy: ~90% on test sets with cloud OCR.

3. Art-Only Matching

When to use: Cards with same name but different art (reprints).

Approach:

  1. Detect card boundaries
  2. Crop to art box only (known position relative to card frame)
  3. Compute hash of art region
  4. Match against art-specific hash database

Benefits:

  • More robust to frame changes between editions
  • Smaller hash database (unique artwork only)
  • Less affected by card condition (art usually best preserved)

4. Neural Network (Future Enhancement)

Potential approaches:

| Method | Pros | Cons |
|--------|------|------|
| YOLO detection | Finds cards in complex scenes | Slow (~50-60ms/frame) |
| CNN classification | High accuracy | Needs training per card |
| CNN embeddings | Similarity search | Requires pre-trained model |
| Siamese networks | Few-shot learning | Complex training |

Recommendation: Start with pHash, add neural detection for card localization only if contour detection proves insufficient.


Robustness Strategies

Pre-processing Pipeline

Input Image
    │
    ▼
┌─────────────────────┐
│ Resize (max 1000px) │
└─────────────────────┘
    │
    ▼
┌─────────────────────┐
│ CLAHE Normalization │  ← Fixes uneven lighting
│ (LAB color space)   │
└─────────────────────┘
    │
    ▼
┌─────────────────────┐
│ Card Detection      │  ← Contour or ML-based
│ (find boundaries)   │
└─────────────────────┘
    │
    ▼
┌─────────────────────┐
│ Perspective Warp    │  ← Normalize to rectangle
└─────────────────────┘
    │
    ▼
┌─────────────────────┐
│ Hash Computation    │
└─────────────────────┘
    │
    ▼
┌─────────────────────┐
│ Database Matching   │
└─────────────────────┘

CLAHE (Contrast Limited Adaptive Histogram Equalization)

Critical for handling varying lighting:

// Convert to LAB, apply CLAHE to L channel, convert back
var lab = ConvertToLab(image);
lab.L = ApplyCLAHE(lab.L, clipLimit: 2.0, tileSize: 8);
var normalized = ConvertToRgb(lab);

Multi-Threshold Card Detection

Use multiple thresholding approaches in parallel:

  1. Adaptive threshold on grayscale
  2. Separate thresholds on R, G, B channels
  3. Canny edge detection

Combine results to find card contours that appear in multiple methods.
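
One way to combine the results is to vote on bounding boxes: a candidate is kept only if enough methods found a box that overlaps it closely. The sketch below is an assumption about how that could look, not the project's detector. The names Box and DetectionVoting, the minVotes default of 2, and the IoU threshold of 0.8 are all hypothetical, and axis-aligned boxes stand in for real contours.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Axis-aligned bounding box of a detected contour (simplification).
public readonly record struct Box(double X, double Y, double Width, double Height)
{
    public double Area => Width * Height;
}

public static class DetectionVoting
{
    // Intersection over union: 1.0 for identical boxes, 0.0 for disjoint ones.
    public static double IoU(Box a, Box b)
    {
        double w = Math.Min(a.X + a.Width, b.X + b.Width) - Math.Max(a.X, b.X);
        double h = Math.Min(a.Y + a.Height, b.Y + b.Height) - Math.Max(a.Y, b.Y);
        if (w <= 0 || h <= 0)
            return 0;

        double intersection = w * h;
        return intersection / (a.Area + b.Area - intersection);
    }

    // perMethod[i] holds the boxes found by detection method i.
    public static List<Box> Consensus(
        IReadOnlyList<IReadOnlyList<Box>> perMethod, int minVotes = 2, double minIoU = 0.8)
    {
        var accepted = new List<Box>();
        foreach (var boxes in perMethod)
        {
            foreach (var candidate in boxes)
            {
                // An equivalent box was already accepted from an earlier method.
                if (accepted.Any(a => IoU(a, candidate) >= minIoU))
                    continue;

                // Count the methods (including the candidate's own) that agree.
                int votes = perMethod.Count(
                    method => method.Any(b => IoU(b, candidate) >= minIoU));
                if (votes >= minVotes)
                    accepted.Add(candidate);
            }
        }
        return accepted;
    }
}
```

With three methods and minVotes set to 2, a box reported by only one method is dropped, which should filter out spurious contours from any single threshold.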

Confidence Scoring

public class MatchResult
{
    public Card Card { get; set; }
    public float Confidence { get; set; }
    public int HashDistance { get; set; }
    public MatchMethod Method { get; set; }
}

// Requires: using System; using System.Linq;
public MatchResult Match(byte[] queryHash, CardDatabase db)
{
    var distances = db.Cards
        .Select(c => (Card: c, Distance: HammingDistance(queryHash, c.Hash)))
        .OrderBy(x => x.Distance)
        .ToList();

    if (distances.Count == 0)
        throw new InvalidOperationException("Card database is empty.");

    var best = distances[0];
    var mean = distances.Average(x => x.Distance);
    // Population standard deviation of all distances
    var stdDev = Math.Sqrt(distances.Average(x => Math.Pow(x.Distance - mean, 2)));

    // Z-score: how many std devs better than mean (0 if all distances are equal)
    var zScore = stdDev > 0 ? (mean - best.Distance) / stdDev : 0.0;

    return new MatchResult
    {
        Card = best.Card,
        Confidence = (float)Math.Clamp(zScore / 4.0, 0.0, 1.0), // Normalize to 0-1
        HashDistance = best.Distance
    };
}

Edge Cases

| Scenario | Strategy |
|----------|----------|
| Foil cards | Pre-process to reduce glare; may need separate foil hash DB |
| Worn/played | Lower confidence threshold, flag for manual review |
| Foreign language | Match by art hash (language-independent) |
| Tokens/emblems | Include in database with separate type flag |
| Partial visibility | Fall back to OCR on visible portion |
| Similar cards | Color-aware hashing helps; art-only match as tiebreaker |

Phase 1: MVP (pHash + Scryfall)

┌─────────────────────────────────────────────────────┐
│                    Scry App                          │
├─────────────────────────────────────────────────────┤
│  ┌─────────────┐    ┌──────────────────┐            │
│  │ CameraView  │───▶│ CardRecognition  │            │
│  └─────────────┘    │ Service          │            │
│                     ├──────────────────┤            │
│                     │ • PreProcess()   │            │
│                     │ • DetectCard()   │            │
│                     │ • ComputeHash()  │            │
│                     │ • MatchCard()    │            │
│                     └────────┬─────────┘            │
│                              │                      │
│                     ┌────────▼─────────┐            │
│                     │ CardHashDatabase │            │
│                     │ (SQLite)         │            │
│                     └────────┬─────────┘            │
│                              │                      │
└──────────────────────────────┼──────────────────────┘
                               │ Weekly sync
                     ┌─────────▼─────────┐
                     │  Scryfall Bulk    │
                     │  Data API         │
                     └───────────────────┘

Components

  1. CardHashDatabase - SQLite with pre-computed hashes for all cards
  2. ImagePreprocessor - CLAHE, resize, normalize
  3. CardDetector - Contour detection, perspective correction
  4. HashComputer - Color-aware pHash implementation
  5. CardMatcher - Hamming distance search with confidence scoring
  6. ScryfallSyncService - Downloads bulk data, computes hashes, updates DB

Database Schema

The schema mirrors Scryfall's data model with three main tables:

-- Abstract game cards (oracle)
CREATE TABLE oracles (
    id TEXT PRIMARY KEY,          -- Scryfall oracle_id
    name TEXT NOT NULL,
    mana_cost TEXT,
    cmc REAL,
    type_line TEXT,
    oracle_text TEXT,
    colors TEXT,                  -- JSON array
    color_identity TEXT,          -- JSON array
    keywords TEXT,                -- JSON array
    reserved INTEGER DEFAULT 0,
    legalities TEXT,              -- JSON object
    power TEXT,
    toughness TEXT
);

-- MTG sets
CREATE TABLE sets (
    id TEXT PRIMARY KEY,          -- Scryfall set id
    code TEXT NOT NULL UNIQUE,    -- e.g., "lea", "mh2"
    name TEXT NOT NULL,           -- e.g., "Limited Edition Alpha"
    set_type TEXT,                -- e.g., "expansion", "core"
    released_at TEXT,
    card_count INTEGER,
    icon_svg_uri TEXT,
    digital INTEGER DEFAULT 0,
    parent_set_code TEXT,
    block TEXT
);

-- Card printings with perceptual hashes
CREATE TABLE cards (
    id TEXT PRIMARY KEY,          -- Scryfall card ID (printing)
    oracle_id TEXT NOT NULL,      -- FK to oracles
    set_id TEXT NOT NULL,         -- FK to sets
    set_code TEXT,
    name TEXT NOT NULL,
    collector_number TEXT,
    rarity TEXT,
    artist TEXT,
    illustration_id TEXT,         -- Same across printings with identical art
    image_uri TEXT,
    hash BLOB,                    -- Perceptual hash for matching
    lang TEXT DEFAULT 'en',
    prices_usd REAL,
    prices_usd_foil REAL,
    FOREIGN KEY (oracle_id) REFERENCES oracles(id),
    FOREIGN KEY (set_id) REFERENCES sets(id)
);

CREATE INDEX idx_cards_oracle_id ON cards(oracle_id);
CREATE INDEX idx_cards_set_id ON cards(set_id);
CREATE INDEX idx_cards_name ON cards(name);

Phase 2: Enhanced (Add OCR Fallback)

Add ML Kit or Tesseract for OCR when hash matching confidence is low.

Phase 3: Advanced (Neural Detection)

Replace contour-based card detection with YOLO or similar for complex scenes (multiple overlapping cards, cluttered backgrounds).


Libraries & Tools

.NET/MAUI Compatible

| Library | Purpose | Platform |
|---------|---------|----------|
| SkiaSharp | Image processing | All |
| OpenCvSharp4 | Advanced CV | Android/iOS/Windows |
| ImageSharp | Image manipulation | All |
| Emgu.CV | OpenCV wrapper | All |
| ML.NET | Machine learning | All |
| Plugin.Maui.OCR | On-device OCR | Android/iOS |

<PackageReference Include="SkiaSharp" Version="2.88.7" />
<PackageReference Include="SkiaSharp.Views.Maui.Controls" Version="2.88.7" />
<PackageReference Include="ImageHash" Version="3.1.0" />  <!-- If porting from Python -->
<PackageReference Include="Microsoft.Data.Sqlite" Version="9.0.0" />

For perceptual hashing in C#, we'll need to implement it using SkiaSharp (no direct port of Python's imagehash exists).


Test Image Categories

The TestImages/ directory contains reference images for testing:

TestImages/
├── varying_quality/     # Different lighting, blur, exposure
│   ├── black.jpg
│   ├── counterspell_bgs.jpg
│   ├── dragon_whelp.jpg
│   ├── evil_eye.jpg
│   ├── instill_energy.jpg
│   ├── ruby.jpg
│   ├── card_in_plastic_case.jpg
│   ├── test1.jpg
│   ├── test2.jpg
│   └── test3.jpg
├── hands/               # Cards held in hand (partial visibility)
│   ├── hand_of_card_1.png
│   ├── hand_of_card_green_1.jpg
│   ├── hand_of_card_green_2.jpeg
│   ├── hand_of_card_ktk.png
│   ├── hand_of_card_red.jpeg
│   └── hand_of_card_tron.png
├── angled/              # Perspective distortion
│   ├── tilted_card_1.jpg
│   └── tilted_card_2.jpg
└── multiple_cards/      # Multiple cards in frame
    ├── alpha_deck.jpg
    ├── geyser_twister_fireball.jpg
    ├── lands_and_fatties.jpg
    ├── pro_tour_table.png
    └── pro_tour_side.png

Test Scenarios to Add

  • Foil cards with glare
  • Heavily played/worn cards
  • Cards under glass/sleeve
  • Low-light conditions
  • Overexposed images
  • Cards with shadows across them
  • Non-English cards
  • Tokens and emblems
  • Old frame vs new frame cards

References