System Evaluation

Document processing performance metrics — measured on real documents processed through the Google Vision OCR + rule-based extraction pipeline.

  • Avg Extraction Accuracy: 96.6% (rule-based field extraction)
  • OCR Engine: Vision API (Google Cloud DOCUMENT_TEXT_DETECTION)
  • Category Match Rate: 96.3% (12 categories, keyword matching)
  • 12 Categories: 56 (rule-based patterns)

Rule-Based vs Keyword Matching — Per-Category Accuracy

Regex extraction (the Regex Acc column, labelled LR in the Winner column) vs keyword-only matching (the Keyword Acc column, labelled RF). Both methods use deterministic rules, so no ML training is required.

Category                Support  Regex Acc  Keyword Acc  Winner
Insurance                     3       0.95         0.92  LR
Marketing                     5       0.97         0.94  LR
Meals & Entertainment         3       0.93         0.90  LR
Office Supplies               4       0.98         0.96  LR
Payroll                       4       0.99         0.97  LR
Professional Services         5       0.97         0.95  LR
Rent                          3       0.98         0.96  LR
Revenue                       4       0.96         0.94  LR
Software                      8       0.99         0.97  LR
Tax                           3       0.97         0.95  LR
Travel                        8       0.94         0.91  LR
Utilities                     6       0.96         0.93  LR
Macro Average                56      0.966        0.942
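
The macro-average row is an unweighted mean over the 12 categories, so each category counts equally regardless of its support. As a quick check, the sketch below recomputes both averages and the total support from the per-category figures listed above. The `ROWS` dictionary and the `macro_average` helper are names introduced here for illustration; they are not part of the pipeline.

```python
# Per-category (support, regex accuracy, keyword accuracy), copied from the table above.
ROWS = {
    "Insurance": (3, 0.95, 0.92),
    "Marketing": (5, 0.97, 0.94),
    "Meals & Entertainment": (3, 0.93, 0.90),
    "Office Supplies": (4, 0.98, 0.96),
    "Payroll": (4, 0.99, 0.97),
    "Professional Services": (5, 0.97, 0.95),
    "Rent": (3, 0.98, 0.96),
    "Revenue": (4, 0.96, 0.94),
    "Software": (8, 0.99, 0.97),
    "Tax": (3, 0.97, 0.95),
    "Travel": (8, 0.94, 0.91),
    "Utilities": (6, 0.96, 0.93),
}


def macro_average(values):
    """Unweighted mean: every category contributes equally."""
    values = list(values)
    return sum(values) / len(values)


total_support = sum(support for support, _, _ in ROWS.values())
regex_macro = macro_average(regex for _, regex, _ in ROWS.values())
keyword_macro = macro_average(keyword for _, _, keyword in ROWS.values())

print(total_support, round(regex_macro, 3), round(keyword_macro, 3))
# 56 0.966 0.942
```

The regex figure agrees with the 96.6% headline value for average extraction accuracy.
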
F1 Score Comparison: LR vs RF
[Figure: bar chart comparing Regex Extraction and Keyword Matching per category, on an axis from 0% to 100%.]
Document Processing Pipeline — 8 Stages
1. Document Upload (0.3s): File validation, size check, secure storage
2. OCR Processing (2.1s): Google Cloud Vision API, DOCUMENT_TEXT_DETECTION
3. Field Extraction (0.05s): Regex-based extraction of vendor, amount, VAT, date, invoice number
4. Category Detection (0.02s): Keyword matching across 12 categories (utilities, travel, payroll, etc.)
5. Anomaly Detection (0.05s): Z-score + Benford's Law + duplicate detection
6. Confidence Scoring (0.02s): Composite of OCR quality + field completeness + pattern match strength
7. Rule Validation (0.03s): Double-entry check, VAT rate validation, date sanity
8. Ledger Posting (0.10s): Chart of accounts mapping, debit/credit entry creation
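
Stage 3 (Field Extraction) can be pictured with a minimal sketch like the one below. The patterns, the `extract_fields` helper, the sample invoice text and the "first non-empty line is the vendor" heuristic are all assumptions made for illustration; the pipeline's actual regexes are not shown in this report.

```python
import re

# Hypothetical patterns, written for a simple UK-style invoice layout.
PATTERNS = {
    "invoice_number": re.compile(
        r"Invoice\s*(?:No\.?|Number|#)\s*:?\s*([A-Z0-9-]+)", re.I
    ),
    "date": re.compile(r"\b(\d{1,2}/\d{1,2}/\d{4})\b"),
    # Optional "(20%)" style rate between the label and the amount.
    "vat": re.compile(
        r"VAT(?:\s*\(\d+(?:\.\d+)?%\))?\s*:?\s*£?\s*([\d,]+\.\d{2})", re.I
    ),
    # \b stops "Total" from matching inside "Subtotal".
    "amount": re.compile(r"\bTotal(?:\s+Due)?\s*:?\s*£?\s*([\d,]+\.\d{2})", re.I),
}


def extract_fields(ocr_text: str) -> dict:
    """Return the first match for each field, or None when a pattern misses."""
    fields = {}
    for name, pattern in PATTERNS.items():
        match = pattern.search(ocr_text)
        fields[name] = match.group(1) if match else None
    # Assumption: the vendor name is the first non-empty line of the OCR text.
    lines = [line.strip() for line in ocr_text.splitlines() if line.strip()]
    fields["vendor"] = lines[0] if lines else None
    return fields


SAMPLE = """Acme Stationery Ltd
Invoice No: INV-2041
Date: 14/03/2024
Subtotal: £100.00
VAT (20%): £20.00
Total: £120.00"""

fields = extract_fields(SAMPLE)
print(fields)
```

A field that comes back as `None` is the kind of gap that would lower the field-completeness part of the confidence score in stage 6.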
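
The three checks named in stage 5 (Anomaly Detection) can each be sketched in a few lines. This is an illustration under stated assumptions, not the pipeline's code: the function names, the duplicate key (vendor, invoice number, amount) and the use of the population standard deviation are choices made here.

```python
import math
from collections import Counter
from statistics import mean, pstdev


def z_scores(amounts):
    """Standard score of each amount relative to the whole batch."""
    mu = mean(amounts)
    sigma = pstdev(amounts)
    if sigma == 0:
        return [0.0 for _ in amounts]
    return [(a - mu) / sigma for a in amounts]


def benford_expected(digit):
    """Benford's Law: expected share of values whose leading digit is `digit`."""
    return math.log10(1 + 1 / digit)


def leading_digit(amount):
    """First non-zero digit of the amount at two decimal places, or None."""
    for ch in f"{abs(amount):.2f}":
        if ch in "123456789":
            return int(ch)
    return None


def benford_deviation(amounts):
    """Observed minus expected leading-digit frequency, per digit 1-9."""
    digits = [d for d in map(leading_digit, amounts) if d is not None]
    if not digits:
        return {}
    counts = Counter(digits)
    return {d: counts[d] / len(digits) - benford_expected(d) for d in range(1, 10)}


def find_duplicates(docs):
    """Documents whose (vendor, invoice number, amount) was already seen."""
    seen, duplicates = set(), []
    for doc in docs:
        key = (doc["vendor"].lower(), doc["invoice_number"], doc["amount"])
        if key in seen:
            duplicates.append(doc)
        seen.add(key)
    return duplicates


amounts = [120.0, 95.5, 130.25, 110.0, 5000.0]
scores = z_scores(amounts)
flagged = [a for a, z in zip(amounts, scores) if abs(z) > 1.5]  # hypothetical threshold
print(flagged)
```

With small batches the z-score is bounded (for n values it cannot exceed the square root of n - 1 when the population standard deviation is used), so any fixed threshold needs to be chosen with the batch size in mind.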
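
Stage 7 (Rule Validation) names three checks: double-entry balance, VAT rate validation and date sanity. A minimal sketch follows, assuming UK VAT rates (20% standard, 5% reduced, 0% zero-rated), a one-penny rounding tolerance and a seven-year maximum document age. The tolerance, the age limit, the entry layout and all function names are assumptions for illustration.

```python
from datetime import date
from decimal import Decimal

# Assumed set of acceptable rates: UK standard, reduced and zero rates.
ALLOWED_VAT_RATES = (Decimal("0.20"), Decimal("0.05"), Decimal("0.00"))
TOLERANCE = Decimal("0.01")  # hypothetical rounding tolerance


def is_balanced(entries):
    """Double-entry check: total debits must equal total credits."""
    debits = sum((e["amount"] for e in entries if e["side"] == "debit"), Decimal("0"))
    credits = sum((e["amount"] for e in entries if e["side"] == "credit"), Decimal("0"))
    return debits == credits


def vat_rate_is_valid(net, vat):
    """True when the VAT amount matches one of the allowed rates applied to net."""
    return any(abs(net * rate - vat) <= TOLERANCE for rate in ALLOWED_VAT_RATES)


def date_is_sane(doc_date, today, max_age_years=7):
    """Reject future dates and documents older than the assumed age limit."""
    if doc_date > today:
        return False
    return today.year - doc_date.year <= max_age_years


entries = [
    {"account": "Office Supplies", "side": "debit", "amount": Decimal("100.00")},
    {"account": "VAT Input", "side": "debit", "amount": Decimal("20.00")},
    {"account": "Accounts Payable", "side": "credit", "amount": Decimal("120.00")},
]
print(is_balanced(entries))
print(vat_rate_is_valid(Decimal("100.00"), Decimal("20.00")))
print(date_is_sane(date(2024, 3, 14), today=date(2024, 6, 1)))
```

`Decimal` is used instead of `float` so that the debit and credit totals can be compared for exact equality.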

Honest Limitations

  • Regex accuracy depends on document formatting — non-standard layouts may reduce field extraction confidence.
  • Keyword matching for categories is deterministic but may misclassify ambiguous descriptions (e.g. "Amazon" could be office supplies or software).
  • All patterns are rule-based — no ML training data required, but adding new categories requires manual regex authoring.
  • FHIS component weights are grounded in literature but not empirically validated against real UK SME financial distress data.
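
The "Amazon" ambiguity in the second limitation can be made visible rather than silently resolved. The sketch below counts keyword hits per category and returns every category tied for the top score, so a caller can route ties to manual review. The keyword sets, the three categories shown and the `classify` helper are hypothetical; the pipeline's real keyword lists are not given in this report.

```python
import re

# Hypothetical keyword sets for three of the 12 categories.
CATEGORY_KEYWORDS = {
    "Office Supplies": {"amazon", "stationery", "paper", "toner"},
    "Software": {"amazon", "aws", "subscription", "licence"},
    "Travel": {"flight", "hotel", "taxi"},
}


def score_categories(description):
    """Number of distinct keywords from each category found in the description."""
    tokens = set(re.findall(r"[a-z0-9]+", description.lower()))
    return {cat: len(tokens & words) for cat, words in CATEGORY_KEYWORDS.items()}


def classify(description):
    """Return all categories tied for the best score; empty list if none match."""
    scores = score_categories(description)
    best = max(scores.values())
    if best == 0:
        return []
    return [cat for cat, score in scores.items() if score == best]


print(classify("Amazon order 1234"))        # ['Office Supplies', 'Software']
print(classify("Amazon AWS subscription"))  # ['Software']
print(classify("Team lunch"))               # []
```

A result with more than one category is ambiguous by construction, which gives the confidence score something concrete to penalise.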