Document Intelligence: From 40,000 Unclassified Files to Structured Data
The Problem
A small business lending company came to us with a familiar story: years of accumulated loan documents — promissory notes, guarantees, UCC filings, insurance certificates — sitting in a shared drive with no classification, no indexing, and no way to answer basic questions like "which borrowers have expiring insurance?"
40,000 documents. Zero structure.
The Approach
Rather than building a custom NLP pipeline from scratch, we deployed a three-stage process:
Stage 1: Classification
An AI model trained on lending document archetypes classified each file into one of 23 document types. Accuracy after the first pass: 91%. After a human-in-the-loop correction cycle on the edge cases: 97%.
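The case study does not say how edge cases were chosen for human review. One common pattern, sketched below as an assumption rather than the method actually used, is to route any prediction below a confidence threshold to a reviewer. The `Prediction` type, the `route` helper, the 0.80 threshold, and the sample file names are all hypothetical.

```python
from dataclasses import dataclass

# Hypothetical cutoff; the case study does not state one.
REVIEW_THRESHOLD = 0.80


@dataclass
class Prediction:
    file_name: str
    doc_type: str      # one of the 23 document types
    confidence: float  # model score in [0, 1]


def route(predictions, threshold=REVIEW_THRESHOLD):
    """Split predictions into auto-accepted and needs-human-review lists."""
    accepted, review = [], []
    for p in predictions:
        if p.confidence >= threshold:
            accepted.append(p)
        else:
            review.append(p)
    return accepted, review


# Made-up sample predictions for illustration only.
sample = [
    Prediction("note_001.pdf", "promissory_note", 0.97),
    Prediction("scan_778.pdf", "ucc_filing", 0.55),
    Prediction("cert_042.pdf", "insurance_certificate", 0.80),
]
accepted, review = route(sample)
print([p.file_name for p in accepted])
print([p.file_name for p in review])
```

Lowering the threshold sends fewer files to reviewers at the cost of more uncorrected errors, so in practice it would be tuned against a labeled sample.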
Stage 2: Extraction
Key fields were pulled from each document (borrower name, loan number, dates, amounts, covenants) and mapped to the existing loan management system schema.
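To make the mapping step concrete, here is a minimal sketch that assumes the extraction stage has already produced "Label: value" text. A production extractor would likely use a model rather than simple label parsing. The column names in `SCHEMA_MAP`, the date format, and the sample document are hypothetical, not the client's actual schema.

```python
from datetime import date, datetime
from decimal import Decimal

# Hypothetical mapping from extracted labels to loan-system columns.
SCHEMA_MAP = {
    "borrower name": "borrower_name",
    "loan number": "loan_id",
    "maturity date": "maturity_date",
    "principal amount": "principal_amount",
}


def parse_value(column, raw):
    """Convert a raw string into the type the target column expects."""
    raw = raw.strip()
    if column.endswith("_date"):
        # Assumes US-style MM/DD/YYYY dates.
        return datetime.strptime(raw, "%m/%d/%Y").date()
    if column.endswith("_amount"):
        return Decimal(raw.replace("$", "").replace(",", ""))
    return raw


def extract_fields(text):
    """Map 'Label: value' lines onto schema columns, ignoring unknown labels."""
    record = {}
    for line in text.splitlines():
        label, sep, value = line.partition(":")
        column = SCHEMA_MAP.get(label.strip().lower())
        if sep and column:
            record[column] = parse_value(column, value)
    return record


# Made-up sample text for illustration only.
sample_text = """
Borrower Name: Acme Bakery LLC
Loan Number: SB-10442
Maturity Date: 06/30/2027
Principal Amount: $250,000.00
Notes: not mapped to any column
"""
record = extract_fields(sample_text)
print(record)
```

Parsing dates and amounts into typed values at this stage is what makes the later monitoring queries possible.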
Stage 3: Monitoring
Automated alerts flagged expiring documents, missing stipulations ("stips"), and covenant triggers. The ops team went from reactive firefighting to proactive portfolio management.
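Once expiration dates are structured, the question from the start ("which borrowers have expiring insurance?") becomes a simple filter. The sketch below assumes each extracted record carries an `expires_on` date. The record layout, the 30-day window, and the sample borrowers are hypothetical.

```python
from datetime import date, timedelta


def expiring_soon(documents, today, window_days=30):
    """Return documents already expired or expiring within the window,
    soonest first. Documents with no expiration date are skipped."""
    cutoff = today + timedelta(days=window_days)
    due = [
        d for d in documents
        if d["expires_on"] is not None and d["expires_on"] <= cutoff
    ]
    return sorted(due, key=lambda d: d["expires_on"])


# Made-up sample records for illustration only.
documents = [
    {"borrower": "Acme Bakery LLC", "doc_type": "insurance_certificate",
     "expires_on": date(2025, 1, 20)},
    {"borrower": "Birch Auto Repair", "doc_type": "insurance_certificate",
     "expires_on": date(2025, 6, 1)},
    {"borrower": "Cedar Dental", "doc_type": "ucc_filing",
     "expires_on": None},
]

alerts = expiring_soon(documents, today=date(2025, 1, 5))
for d in alerts:
    print(f"ALERT: {d['borrower']} {d['doc_type']} expires {d['expires_on']}")
```

Passing `today` in explicitly, rather than calling `date.today()` inside the function, keeps the check deterministic and easy to test.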
The Results
- Processing time: 40,000 documents classified in 72 hours
- Accuracy: 97% classification accuracy, up from 91% on the first pass
- Ongoing value: Automated monitoring catches 15+ exceptions per week that were previously missed
- Team impact: 2 FTEs reallocated from manual review to higher-value work
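For a sense of scale, the headline figures can be turned into rough counts. This is back-of-the-envelope arithmetic: it assumes the 72 hours were continuous wall-clock time and that both accuracy figures apply to the full 40,000-document set, neither of which is stated explicitly.

```python
TOTAL_DOCS = 40_000
HOURS = 72
FIRST_PASS_ACCURACY_PCT = 91
FINAL_ACCURACY_PCT = 97

# Average throughput, assuming 72 continuous hours.
docs_per_hour = TOTAL_DOCS / HOURS

# Implied misclassification counts, assuming accuracy is measured
# over all 40,000 documents.
first_pass_errors = TOTAL_DOCS * (100 - FIRST_PASS_ACCURACY_PCT) // 100
final_errors = TOTAL_DOCS * (100 - FINAL_ACCURACY_PCT) // 100
corrected_in_review = first_pass_errors - final_errors

print(f"Throughput: about {docs_per_hour:.0f} documents per hour")
print(f"First-pass errors: {first_pass_errors}")
print(f"Errors after review: {final_errors}")
print(f"Corrected by reviewers: {corrected_in_review}")
```

Under those assumptions, the review cycle corrected about 2,400 of roughly 3,600 first-pass misclassifications, leaving about 1,200.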
Key Takeaway
Document intelligence isn't about replacing people. It's about giving your team the structured data they need to make better decisions faster.