# Card Recognition Architecture

This document explores approaches for implementing robust MTG card recognition in Scry.

## Goals

1. **Robustness** - Work reliably across varying lighting, angles, and card conditions
2. **Speed** - Fast enough for real-time scanning (<500ms per card)
3. **Accuracy** - High precision to avoid misidentifying valuable cards
4. **Offline-capable** - Core recognition should work without network

## Data Sources

### Scryfall API

Scryfall is the de-facto source of truth for MTG card data.

**Key endpoints:**

| Endpoint | Purpose |
|----------|---------|
| `GET /cards/named?fuzzy={name}` | Fuzzy name lookup |
| `GET /cards/{scryfall_id}` | Get card by ID |
| `GET /cards/search?q={query}` | Full-text search |
| `GET /bulk-data` | Daily JSON exports |

**Rate limits:** Leave 50-100ms between API requests (~10 requests/sec). Images served from `*.scryfall.io` have no rate limit.
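
A minimal sketch of how the rate limit above could be respected from C#. The class name `ScryfallClient`, the 100ms spacing, and the `Scry/0.1` user agent string are assumptions for illustration, not part of the existing codebase; Scryfall's API documentation asks clients to send `User-Agent` and `Accept` headers.

```csharp
using System;
using System.Net.Http;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper: serializes Scryfall API calls and spaces them out.
public sealed class ScryfallClient
{
    private static readonly TimeSpan MinDelay = TimeSpan.FromMilliseconds(100);

    private readonly HttpClient _http;
    private readonly SemaphoreSlim _gate = new(1, 1);
    private DateTime _lastRequestUtc = DateTime.MinValue;

    public ScryfallClient(HttpClient http)
    {
        _http = http;
        _http.BaseAddress ??= new Uri("https://api.scryfall.com/");
        // Placeholder identification headers; replace with the app's real values.
        _http.DefaultRequestHeaders.UserAgent.ParseAdd("Scry/0.1");
        _http.DefaultRequestHeaders.Accept.ParseAdd("application/json");
    }

    // Builds the relative URL for GET /cards/named?fuzzy={name}.
    public static string FuzzyNameUrl(string name) =>
        "cards/named?fuzzy=" + Uri.EscapeDataString(name);

    public async Task<string> GetCardJsonByFuzzyNameAsync(string name, CancellationToken ct = default)
    {
        await _gate.WaitAsync(ct); // one request at a time
        try
        {
            // Wait until at least MinDelay has passed since the previous request.
            var wait = _lastRequestUtc + MinDelay - DateTime.UtcNow;
            if (wait > TimeSpan.Zero)
                await Task.Delay(wait, ct);

            using var response = await _http.GetAsync(FuzzyNameUrl(name), ct);
            _lastRequestUtc = DateTime.UtcNow;
            response.EnsureSuccessStatusCode();
            return await response.Content.ReadAsStringAsync(ct);
        }
        finally
        {
            _gate.Release();
        }
    }
}
```

Since core recognition is meant to work offline, a client like this would only be needed for the sync path or for lookups that miss the local database.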

**Bulk data options:**

| File | Size | Use Case |
|------|------|----------|
| Oracle Cards | ~161 MB | One card per Oracle ID (recognition) |
| Unique Artwork | ~233 MB | One per unique art (art-based matching) |
| Default Cards | ~501 MB | Every English printing |
| All Cards | ~2.3 GB | Every card, every language |

**Recommended approach:** Download the "Unique Artwork" bulk data, extract the image URLs, and compute hashes for all cards. Update weekly or after new set releases.

### Card Image Fields

```json
{
  "id": "uuid",
  "oracle_id": "uuid",
  "name": "Lightning Bolt",
  "set": "2xm",
  "collector_number": "129",
  "illustration_id": "uuid",
  "image_uris": {
    "small": "https://cards.scryfall.io/.../small/...",
    "normal": "https://cards.scryfall.io/.../normal/...",
    "large": "https://cards.scryfall.io/.../large/...",
    "art_crop": "https://cards.scryfall.io/.../art_crop/..."
  }
}
```

Key identifiers:
- `id` - Unique per printing
- `oracle_id` - Same across reprints (same card conceptually)
- `illustration_id` - Same across reprints with identical artwork
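
As a rough illustration of how these fields could feed the hash pipeline, the sketch below builds an `illustration_id` to `art_crop` URL map from a JSON array of card objects using `System.Text.Json`. The `ArtworkIndex` class is hypothetical. It assumes the input is small enough to parse in one go; for the full bulk files a streaming reader would likely be needed. Cards that keep their image fields per face are simply skipped here.

```csharp
using System;
using System.Collections.Generic;
using System.Text.Json;

public static class ArtworkIndex
{
    // Maps illustration_id -> art_crop URL, keeping the first card seen for each artwork.
    public static Dictionary<string, string> Build(string cardsJson)
    {
        var index = new Dictionary<string, string>();
        using var doc = JsonDocument.Parse(cardsJson);

        foreach (var card in doc.RootElement.EnumerateArray())
        {
            // Skip anything that lacks top-level illustration_id / image_uris.art_crop.
            if (card.ValueKind != JsonValueKind.Object ||
                !card.TryGetProperty("illustration_id", out var idProp) ||
                !card.TryGetProperty("image_uris", out var uris) ||
                uris.ValueKind != JsonValueKind.Object ||
                !uris.TryGetProperty("art_crop", out var artCrop))
                continue;

            if (idProp.ValueKind != JsonValueKind.String || artCrop.ValueKind != JsonValueKind.String)
                continue;

            index.TryAdd(idProp.GetString()!, artCrop.GetString()!);
        }
        return index;
    }
}
```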

---

## Recognition Approaches

### 1. Perceptual Hashing (Recommended Primary)

**How it works:** Convert the image to a fixed-size fingerprint that is resistant to minor transformations.

**Algorithm:**
1. Resize image to small size (e.g., 32x32)
2. Convert to grayscale (or keep RGB for color-aware variant)
3. Apply DCT (Discrete Cosine Transform)
4. Keep low-frequency components
5. Compute hash from median comparison
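
The steps above can be sketched in plain C# as follows. This assumes steps 1-2 have already been done by an image library, so the input is a 32x32 grayscale array; the `PHash` class, the 8x8 low-frequency block, and the 64-bit output are illustrative choices rather than the project's final design.

```csharp
using System;
using System.Linq;

public static class PHash
{
    private const int Size = 32;   // input is Size x Size grayscale
    private const int LowFreq = 8; // keep the top-left LowFreq x LowFreq DCT block

    // Computes a 64-bit pHash from a 32x32 grayscale image.
    public static ulong Compute(double[,] gray)
    {
        if (gray.GetLength(0) != Size || gray.GetLength(1) != Size)
            throw new ArgumentException($"Expected a {Size}x{Size} image.", nameof(gray));

        // Steps 3-4: unnormalized 2D DCT-II, evaluated only for the coefficients we keep.
        var coeffs = new double[LowFreq * LowFreq];
        for (int u = 0; u < LowFreq; u++)
        for (int v = 0; v < LowFreq; v++)
        {
            double sum = 0;
            for (int y = 0; y < Size; y++)
            for (int x = 0; x < Size; x++)
                sum += gray[y, x]
                     * Math.Cos((2 * y + 1) * u * Math.PI / (2 * Size))
                     * Math.Cos((2 * x + 1) * v * Math.PI / (2 * Size));
            coeffs[u * LowFreq + v] = sum;
        }

        // Step 5: one bit per coefficient, set when it exceeds the median.
        var sorted = coeffs.OrderBy(c => c).ToArray();
        double median = (sorted[sorted.Length / 2 - 1] + sorted[sorted.Length / 2]) / 2;

        ulong hash = 0;
        for (int i = 0; i < coeffs.Length; i++)
            if (coeffs[i] > median)
                hash |= 1UL << i;
        return hash;
    }
}
```

Because each bit depends only on whether a coefficient lies above the median, uniformly darkening or brightening the image by a constant factor leaves the hash unchanged, which is the property that makes this approach tolerant of exposure differences.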

**Variants:**

| Type | Description | Use Case |
|------|-------------|----------|
| aHash | Average hash | Fast, less accurate |
| pHash | Perceptual hash | Good balance |
| dHash | Difference hash | Edge-focused |
| wHash | Wavelet hash | Most robust |
| Color pHash | Separate RGB channel hashes | Best for colorful art |

**Performance (from MTG Card Detector project):**
- Hash size 16 (256-bit with RGB): ~16ms per comparison
- Hash size 64: ~65ms per comparison
- Database of 30k+ cards: still feasible with proper indexing

**Implementation:**
```csharp
// Sketch of a color-aware pHash. Image, Resize, ComputePHash, and Concat are
// placeholders for whatever the chosen image library provides.
public byte[] ComputeColorHash(Image image)
{
    var resized = Resize(image, 32, 32);
    var rHash = ComputePHash(resized.RedChannel);
    var gHash = ComputePHash(resized.GreenChannel);
    var bHash = ComputePHash(resized.BlueChannel);
    return Concat(rHash, gHash, bHash); // 768-bit hash (3 x 256 bits)
}

// Number of differing bits between two equal-length hashes.
public int HammingDistance(byte[] a, byte[] b)
{
    if (a.Length != b.Length)
        throw new ArgumentException("Hashes must have the same length.");

    int distance = 0;
    for (int i = 0; i < a.Length; i++)
        distance += System.Numerics.BitOperations.PopCount((uint)(a[i] ^ b[i]));
    return distance;
}
```

**Matching strategy:**
```
confidence = (mean_distance - best_match_distance) / (4 * std_deviation)
```
Accept the match if the best match is >4 standard deviations better than the average distance, i.e. when `confidence` exceeds 1.
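
A small self-contained sketch of the formula above, with the result clamped to the 0-1 range. The `MatchConfidence` class is hypothetical, and the use of the population standard deviation is an assumption; the formula itself does not say which variant is intended.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class MatchConfidence
{
    // (mean - best) / (4 * stdDev), clamped to [0, 1]. Returns 0 when the
    // distances carry no signal (fewer than two candidates or zero spread).
    public static double Score(IReadOnlyCollection<int> distances)
    {
        if (distances.Count < 2) return 0;

        double mean = distances.Average();
        double best = distances.Min();
        double variance = distances.Sum(d => (d - mean) * (d - mean)) / distances.Count;
        double stdDev = Math.Sqrt(variance);
        if (stdDev == 0) return 0;

        return Math.Clamp((mean - best) / (4 * stdDev), 0, 1);
    }
}
```

Note that the best distance can only sit 4 standard deviations below the mean when the candidate set is reasonably large, so this score is meaningful against the full hash database but not against a handful of candidates.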

### 2. OCR-Based Recognition (Fallback)

**When to use:** Stacked/overlapping cards where only the name is visible.

**Approach:**
1. Detect text regions in image
2. Run OCR on card name area
3. Fuzzy match against card database using SymSpell (edit distance ≤6)
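
To make step 3 concrete, here is a minimal sketch that uses a plain Levenshtein distance and a linear scan in place of SymSpell. It is a stand-in to show the matching rule, not SymSpell's algorithm, and it would be too slow for the full card list without an index. `FuzzyNameMatcher` and its method names are hypothetical.

```csharp
using System;
using System.Collections.Generic;

public static class FuzzyNameMatcher
{
    // Classic Levenshtein edit distance using two rolling rows.
    public static int EditDistance(string a, string b)
    {
        var prev = new int[b.Length + 1];
        var curr = new int[b.Length + 1];
        for (int j = 0; j <= b.Length; j++) prev[j] = j;

        for (int i = 1; i <= a.Length; i++)
        {
            curr[0] = i;
            for (int j = 1; j <= b.Length; j++)
            {
                int cost = a[i - 1] == b[j - 1] ? 0 : 1;
                curr[j] = Math.Min(Math.Min(curr[j - 1] + 1, prev[j] + 1), prev[j - 1] + cost);
            }
            (prev, curr) = (curr, prev);
        }
        return prev[b.Length];
    }

    // Returns the closest card name within maxDistance, or null if none qualifies.
    public static string? BestMatch(string ocrText, IEnumerable<string> cardNames, int maxDistance = 6)
    {
        string? best = null;
        int bestDistance = maxDistance + 1;
        string query = ocrText.ToLowerInvariant();

        foreach (var name in cardNames)
        {
            int d = EditDistance(query, name.ToLowerInvariant());
            if (d < bestDistance) { best = name; bestDistance = d; }
        }
        return best;
    }
}
```

A typical OCR slip such as reading "Lightnlng Bo1t" would still resolve to "Lightning Bolt" here, while text that is far from every known name returns `null` so the caller can flag it for review.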

**Libraries:**
- Azure Computer Vision / Google Cloud Vision (best accuracy)
- Tesseract (open source, but poor on stylized MTG fonts)
- ML Kit (on-device, good for mobile)

**Accuracy:** ~90% on test sets with cloud OCR.

### 3. Art-Only Matching

**When to use:** Cards with same name but different art (reprints).

**Approach:**
1. Detect card boundaries
2. Crop to art box only (known position relative to card frame)
3. Compute hash of art region
4. Match against art-specific hash database
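
Step 2 could look roughly like the sketch below, which maps a fractional art box onto a perspective-corrected card image. The fractions are placeholder values chosen for illustration only; real values would need to be measured per frame style (old frame, modern frame, full-art, and so on). `PixelRect` and `ArtBox` are hypothetical names.

```csharp
using System;

public readonly record struct PixelRect(int X, int Y, int Width, int Height);

public static class ArtBox
{
    // Placeholder fractions of card width/height; calibrate per frame style.
    private const double Left = 0.08, Top = 0.11, Right = 0.92, Bottom = 0.55;

    // Maps the fractional art box onto a card image of the given pixel size.
    public static PixelRect For(int cardWidth, int cardHeight)
    {
        int x = (int)Math.Round(cardWidth * Left);
        int y = (int)Math.Round(cardHeight * Top);
        int right = (int)Math.Round(cardWidth * Right);
        int bottom = (int)Math.Round(cardHeight * Bottom);
        return new PixelRect(x, y, right - x, bottom - y);
    }
}
```

Expressing the box as fractions keeps the crop independent of the resolution chosen for the perspective warp.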

**Benefits:**
- More robust to frame changes between editions
- Smaller hash database (unique artwork only)
- Less affected by card condition (art usually best preserved)

### 4. Neural Network (Future Enhancement)

**Potential approaches:**

| Method | Pros | Cons |
|--------|------|------|
| YOLO detection | Finds cards in complex scenes | Slow (~50-60ms/frame) |
| CNN classification | High accuracy | Needs training per card |
| CNN embeddings | Similarity search | Requires pre-trained model |
| Siamese networks | Few-shot learning | Complex training |

**Recommendation:** Start with pHash, add neural detection for card localization only if contour detection proves insufficient.

---

## Robustness Strategies

### Pre-processing Pipeline

```
Input Image
     │
     ▼
┌─────────────────────┐
│ Resize (max 1000px) │
└─────────────────────┘
     │
     ▼
┌─────────────────────┐
│ CLAHE Normalization │  ← Fixes uneven lighting
│ (LAB color space)   │
└─────────────────────┘
     │
     ▼
┌─────────────────────┐
│ Card Detection      │  ← Contour or ML-based
│ (find boundaries)   │
└─────────────────────┘
     │
     ▼
┌─────────────────────┐
│ Perspective Warp    │  ← Normalize to rectangle
└─────────────────────┘
     │
     ▼
┌─────────────────────┐
│ Hash Computation    │
└─────────────────────┘
     │
     ▼
┌─────────────────────┐
│ Database Matching   │
└─────────────────────┘
```

### CLAHE (Contrast Limited Adaptive Histogram Equalization)

Critical for handling varying lighting:

```csharp
// Convert to LAB, apply CLAHE to L channel, convert back
var lab = ConvertToLab(image);
lab.L = ApplyCLAHE(lab.L, clipLimit: 2.0, tileSize: 8);
var normalized = ConvertToRgb(lab);
```
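
For reference, the same idea could be expressed with OpenCvSharp4 along the lines below. This is a sketch under the assumption that OpenCvSharp4 (and its native runtime) is the chosen CV library and that the input is an 8-bit BGR `Mat`; `LightingNormalizer` is a hypothetical name, and the parameters mirror the pseudo-code above.

```csharp
using OpenCvSharp;

public static class LightingNormalizer
{
    // Applies CLAHE to the lightness channel of a BGR image and returns a new BGR Mat.
    public static Mat Normalize(Mat bgr, double clipLimit = 2.0, int tileSize = 8)
    {
        using var lab = new Mat();
        Cv2.CvtColor(bgr, lab, ColorConversionCodes.BGR2Lab);

        Mat[] channels = Cv2.Split(lab); // [L, a, b]
        try
        {
            using var clahe = Cv2.CreateCLAHE(clipLimit, new Size(tileSize, tileSize));

            // Equalize L into a fresh Mat, then swap it in for the original channel.
            var equalized = new Mat();
            clahe.Apply(channels[0], equalized);
            channels[0].Dispose();
            channels[0] = equalized;

            using var merged = new Mat();
            Cv2.Merge(channels, merged);

            var result = new Mat();
            Cv2.CvtColor(merged, result, ColorConversionCodes.Lab2BGR);
            return result;
        }
        finally
        {
            foreach (var channel in channels) channel.Dispose();
        }
    }
}
```

Only the lightness channel is equalized so that the card's colors, which the color-aware hash depends on, are left largely untouched.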

### Multi-Threshold Card Detection

Use multiple thresholding approaches in parallel:
1. Adaptive threshold on grayscale
2. Separate thresholds on R, G, B channels
3. Canny edge detection

Combine results to find card contours that appear in multiple methods.
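
One possible way to do the combining step is to reduce each contour to its bounding box and keep boxes that several methods agree on, measured by intersection-over-union (IoU). The sketch below is only one interpretation of "appear in multiple methods"; the `Box` and `DetectionVoting` types, the two-vote minimum, and the 0.8 IoU threshold are all assumptions.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public readonly record struct Box(int X, int Y, int Width, int Height)
{
    public int Area => Width * Height;
}

public static class DetectionVoting
{
    // Intersection-over-union of two axis-aligned boxes, in [0, 1].
    public static double IoU(Box a, Box b)
    {
        int w = Math.Min(a.X + a.Width, b.X + b.Width) - Math.Max(a.X, b.X);
        int h = Math.Min(a.Y + a.Height, b.Y + b.Height) - Math.Max(a.Y, b.Y);
        if (w <= 0 || h <= 0) return 0;
        double intersection = (double)w * h;
        return intersection / (a.Area + b.Area - intersection);
    }

    // Keeps one box per location that at least minVotes methods agree on
    // (the method that produced the box counts as one vote).
    public static List<Box> Agreeing(
        IReadOnlyList<IReadOnlyList<Box>> perMethod, int minVotes = 2, double minIoU = 0.8)
    {
        var kept = new List<Box>();
        for (int m = 0; m < perMethod.Count; m++)
        {
            foreach (var box in perMethod[m])
            {
                int votes = 1 + Enumerable.Range(0, perMethod.Count)
                    .Count(o => o != m && perMethod[o].Any(other => IoU(box, other) >= minIoU));
                bool duplicate = kept.Any(k => IoU(k, box) >= minIoU);
                if (votes >= minVotes && !duplicate)
                    kept.Add(box);
            }
        }
        return kept;
    }
}
```

Bounding boxes are a simplification; for strongly tilted cards, comparing the contour quadrilaterals directly would be more accurate.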

### Confidence Scoring

```csharp
public class MatchResult
{
    public Card Card { get; set; }
    public float Confidence { get; set; }
    public int HashDistance { get; set; }
    public MatchMethod Method { get; set; }
}

// StandardDeviation is a helper to be supplied. Returns null for an empty database.
public MatchResult Match(byte[] queryHash, CardDatabase db)
{
    var distances = db.Cards
        .Select(c => (Card: c, Distance: HammingDistance(queryHash, c.Hash)))
        .OrderBy(x => x.Distance)
        .ToList();
    if (distances.Count == 0)
        return null;

    var best = distances[0];
    double mean = distances.Average(x => x.Distance);
    double stdDev = StandardDeviation(distances.Select(x => x.Distance));

    // Z-score: how many std devs better than mean (0 if all distances are equal)
    double zScore = stdDev > 0 ? (mean - best.Distance) / stdDev : 0;

    return new MatchResult
    {
        Card = best.Card,
        Confidence = (float)Math.Clamp(zScore / 4, 0, 1), // Normalize to 0-1
        HashDistance = best.Distance
    };
}
```

### Edge Cases

| Scenario | Strategy |
|----------|----------|
| Foil cards | Pre-process to reduce glare; may need separate foil hash DB |
| Worn/played | Lower confidence threshold, flag for manual review |
| Foreign language | Match by art hash (language-independent) |
| Tokens/emblems | Include in database with separate type flag |
| Partial visibility | Fall back to OCR on visible portion |
| Similar cards | Color-aware hashing helps; art-only match as tiebreaker |

---

## Recommended Architecture

### Phase 1: MVP (pHash + Scryfall)

```
┌─────────────────────────────────────────────────────┐
│                      Scry App                       │
├─────────────────────────────────────────────────────┤
│  ┌─────────────┐    ┌──────────────────┐            │
│  │ CameraView  │───▶│ CardRecognition  │            │
│  └─────────────┘    │ Service          │            │
│                     ├──────────────────┤            │
│                     │ • PreProcess()   │            │
│                     │ • DetectCard()   │            │
│                     │ • ComputeHash()  │            │
│                     │ • MatchCard()    │            │
│                     └────────┬─────────┘            │
│                              │                      │
│                     ┌────────▼─────────┐            │
│                     │ CardHashDatabase │            │
│                     │ (SQLite)         │            │
│                     └────────┬─────────┘            │
│                              │                      │
└──────────────────────────────┼──────────────────────┘
                               │ Weekly sync
                     ┌─────────▼─────────┐
                     │ Scryfall Bulk     │
                     │ Data API          │
                     └───────────────────┘
```

### Components

1. **CardHashDatabase** - SQLite with pre-computed hashes for all cards
2. **ImagePreprocessor** - CLAHE, resize, normalize
3. **CardDetector** - Contour detection, perspective correction
4. **HashComputer** - Color-aware pHash implementation
5. **CardMatcher** - Hamming distance search with confidence scoring
6. **ScryfallSyncService** - Downloads bulk data, computes hashes, updates DB

### Database Schema

The schema mirrors Scryfall's data model with three main tables:

```sql
-- Abstract game cards (oracle)
CREATE TABLE oracles (
    id TEXT PRIMARY KEY,           -- Scryfall oracle_id
    name TEXT NOT NULL,
    mana_cost TEXT,
    cmc REAL,
    type_line TEXT,
    oracle_text TEXT,
    colors TEXT,                   -- JSON array
    color_identity TEXT,           -- JSON array
    keywords TEXT,                 -- JSON array
    reserved INTEGER DEFAULT 0,
    legalities TEXT,               -- JSON object
    power TEXT,
    toughness TEXT
);

-- MTG sets
CREATE TABLE sets (
    id TEXT PRIMARY KEY,           -- Scryfall set id
    code TEXT NOT NULL UNIQUE,     -- e.g., "lea", "mh2"
    name TEXT NOT NULL,            -- e.g., "Limited Edition Alpha"
    set_type TEXT,                 -- e.g., "expansion", "core"
    released_at TEXT,
    card_count INTEGER,
    icon_svg_uri TEXT,
    digital INTEGER DEFAULT 0,
    parent_set_code TEXT,
    block TEXT
);

-- Card printings with perceptual hashes
CREATE TABLE cards (
    id TEXT PRIMARY KEY,           -- Scryfall card ID (printing)
    oracle_id TEXT NOT NULL,       -- FK to oracles
    set_id TEXT NOT NULL,          -- FK to sets
    set_code TEXT,
    name TEXT NOT NULL,
    collector_number TEXT,
    rarity TEXT,
    artist TEXT,
    illustration_id TEXT,          -- Same across printings with identical art
    image_uri TEXT,
    hash BLOB,                     -- Perceptual hash for matching
    lang TEXT DEFAULT 'en',
    prices_usd REAL,
    prices_usd_foil REAL,
    FOREIGN KEY (oracle_id) REFERENCES oracles(id),
    FOREIGN KEY (set_id) REFERENCES sets(id)
);

CREATE INDEX idx_cards_oracle_id ON cards(oracle_id);
CREATE INDEX idx_cards_set_id ON cards(set_id);
CREATE INDEX idx_cards_name ON cards(name);
```
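
As an illustration of how the matcher might read from this schema, the sketch below loads every stored hash into memory with `Microsoft.Data.Sqlite` so that Hamming distances can be computed without further queries. `HashLoader` is a hypothetical helper, and loading everything up front is an assumption about how matching will be done, not a settled design.

```csharp
using System.Collections.Generic;
using Microsoft.Data.Sqlite;

public static class HashLoader
{
    // Loads (card id, hash) pairs for every printing that has a hash stored.
    // The connection is expected to be open already.
    public static List<(string Id, byte[] Hash)> LoadHashes(SqliteConnection connection)
    {
        var hashes = new List<(string Id, byte[] Hash)>();

        using var command = connection.CreateCommand();
        command.CommandText = "SELECT id, hash FROM cards WHERE hash IS NOT NULL";

        using var reader = command.ExecuteReader();
        while (reader.Read())
            hashes.Add((reader.GetString(0), reader.GetFieldValue<byte[]>(1)));

        return hashes;
    }
}
```

With 768-bit hashes (96 bytes each), even tens of thousands of printings amount to only a few megabytes, so keeping them in memory looks affordable on mobile devices.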

### Phase 2: Enhanced (Add OCR Fallback)

Add ML Kit or Tesseract for OCR when hash matching confidence is low.

### Phase 3: Advanced (Neural Detection)

Replace contour-based card detection with YOLO or similar for complex scenes (multiple overlapping cards, cluttered backgrounds).

---

## Libraries & Tools

### .NET/MAUI Compatible

| Library | Purpose | Platform |
|---------|---------|----------|
| SkiaSharp | Image processing | All |
| OpenCvSharp4 | Advanced CV | Android/iOS/Windows |
| ImageSharp | Image manipulation | All |
| Emgu.CV | OpenCV wrapper | All |
| ML.NET | Machine learning | All |
| Plugin.Maui.OCR | On-device OCR | Android/iOS |

### Recommended Stack

```xml
<PackageReference Include="SkiaSharp" Version="2.88.7" />
<PackageReference Include="SkiaSharp.Views.Maui.Controls" Version="2.88.7" />
<PackageReference Include="ImageHash" Version="3.1.0" /> <!-- If porting from Python -->
<PackageReference Include="Microsoft.Data.Sqlite" Version="9.0.0" />
```

For perceptual hashing in C#, we'll need to implement it using SkiaSharp (no direct port of Python's imagehash exists).

---

## Test Image Categories

The `TestImages/` directory contains reference images for testing:

```
TestImages/
├── varying_quality/          # Different lighting, blur, exposure
│   ├── black.jpg
│   ├── counterspell_bgs.jpg
│   ├── dragon_whelp.jpg
│   ├── evil_eye.jpg
│   ├── instill_energy.jpg
│   ├── ruby.jpg
│   ├── card_in_plastic_case.jpg
│   ├── test1.jpg
│   ├── test2.jpg
│   └── test3.jpg
├── hands/                    # Cards held in hand (partial visibility)
│   ├── hand_of_card_1.png
│   ├── hand_of_card_green_1.jpg
│   ├── hand_of_card_green_2.jpeg
│   ├── hand_of_card_ktk.png
│   ├── hand_of_card_red.jpeg
│   └── hand_of_card_tron.png
├── angled/                   # Perspective distortion
│   ├── tilted_card_1.jpg
│   └── tilted_card_2.jpg
└── multiple_cards/           # Multiple cards in frame
    ├── alpha_deck.jpg
    ├── geyser_twister_fireball.jpg
    ├── lands_and_fatties.jpg
    ├── pro_tour_table.png
    └── pro_tour_side.png
```

### Test Scenarios to Add

- [ ] Foil cards with glare
- [ ] Heavily played/worn cards
- [ ] Cards under glass/sleeve
- [ ] Low-light conditions
- [ ] Overexposed images
- [ ] Cards with shadows across them
- [ ] Non-English cards
- [ ] Tokens and emblems
- [ ] Old frame vs new frame cards

---

## References

- [Scryfall API Docs](https://scryfall.com/docs/api)
- [MTG Card Detector (Python)](https://github.com/hj3yoo/mtg_card_detector)
- [Magic Card Detector Blog](https://tmikonen.github.io/quantitatively/2020-01-01-magic-card-detector/)
- [mtgscan (OCR approach)](https://pypi.org/project/mtgscan/)
- [Moss Machine (pHash + sorting)](https://github.com/KairiCollections/Moss-Machine---Magic-the-Gathering-recognition-and-sorting-machine)