Intelligent Invoice Data Extraction 📋 Architected

NFSe-IA

Azure Functions | Durable Functions | Document Intelligence | Donut ONNX | AI Search | Cosmos DB | Service Bus | .NET 8

The Challenge

The client, Momentum, needed to automate the extraction, validation, and categorization of thousands of service invoices whose layouts vary across 4+ municipalities, while keeping the cost competitive (< R$ 0.55/invoice) and the accuracy high enough for ERP integration.

The Solution

A serverless Azure system with a dual AI pipeline: Document Intelligence is the primary engine, and Donut ONNX is the fallback when confidence is below 0.7. The system includes active learning with automatic nightly retraining and separate DEV/HML/PROD environments with cost controls.
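
The fallback rule described above can be sketched in a few lines. This is a minimal illustration in Python, not the project's .NET code. The `Extraction` type, the `extract_with_fallback` function, and the two engine callables are hypothetical names, and the sketch assumes a single document-level confidence score.

```python
from dataclasses import dataclass
from typing import Callable, Dict

# Threshold taken from the text: fall back when confidence < 0.7.
CONFIDENCE_THRESHOLD = 0.7


@dataclass
class Extraction:
    """Hypothetical result type: extracted fields plus one confidence score."""
    fields: Dict[str, str]
    confidence: float
    engine: str


def extract_with_fallback(
    pdf_bytes: bytes,
    primary: Callable[[bytes], Extraction],
    fallback: Callable[[bytes], Extraction],
    threshold: float = CONFIDENCE_THRESHOLD,
) -> Extraction:
    """Return the primary result unless its confidence is below the threshold."""
    result = primary(pdf_bytes)
    if result.confidence < threshold:
        # Primary engine was not confident enough: use the fallback engine.
        return fallback(pdf_bytes)
    return result
```

Passing the engines in as callables keeps the routing rule testable without calling either service. A result with confidence exactly 0.7 stays on the primary engine, because the stated condition is strictly "less than".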

Architecture

  • Azure Functions (.NET 8 Isolated) for PDF ingestion
  • Durable Functions (PipelineOrchestrator) for orchestration
  • Azure Service Bus Standard (nfse-main queue)
  • Azure Document Intelligence container v3.1 (runs only in the 08:00-16:00 window)
  • Donut-pt-invoice ONNX as fallback (confidence < 0.7)
  • Azure AI Search Basic Vector for semantic categorization
  • Cosmos DB Serverless with 5-year TTL (LGPD)
  • Hot Blob Storage for PDFs (30-day retention)
  • Key Vault with Managed Identity for secrets
  • Application Insights (25% sampling) for telemetry
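
Two values in the list above reduce to simple arithmetic: the Document Intelligence availability window and the Cosmos DB retention period, which Cosmos DB expresses as a TTL in seconds. The sketch below is illustrative only. It assumes that the window is evaluated in local time, that 16:00 is exclusive, and that a year counts as 365 days. All names are hypothetical.

```python
from datetime import datetime, time

# Window from the text (08-16h). Time zone and exclusive end are assumptions.
DI_WINDOW_START = time(8, 0)
DI_WINDOW_END = time(16, 0)

# Cosmos DB TTL values are given in seconds. 5 years, assuming 365-day years.
SECONDS_PER_YEAR = 365 * 24 * 60 * 60
COSMOS_TTL_SECONDS = 5 * SECONDS_PER_YEAR


def di_container_available(now: datetime) -> bool:
    """True when `now` falls inside the scheduled container window."""
    return DI_WINDOW_START <= now.time() < DI_WINDOW_END
```

Under these assumptions the TTL is 157,680,000 seconds. An invoice arriving at 07:59 or 16:00 would fall outside the window.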

Metrics

  • Extraction accuracy: ≥ 95% (F1 score)
  • Categorization accuracy: ≥ 90%
  • Average latency: ≤ 8 seconds
  • Unit cost: R$ 0.45-0.55/invoice (2 pages)
  • Fixed infra cost: R$ 420-720/month
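
For reference, the F1 score in the extraction target is the harmonic mean of precision and recall. The sketch below shows the standard formula. It assumes the counts are tallied per extracted field, which the source does not state, and the `f1_score` name and example counts are hypothetical.

```python
def f1_score(true_pos: int, false_pos: int, false_neg: int) -> float:
    """Standard F1: harmonic mean of precision and recall."""
    if true_pos == 0:
        # No correct extractions: precision and recall are zero or undefined.
        return 0.0
    precision = true_pos / (true_pos + false_pos)
    recall = true_pos / (true_pos + false_neg)
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts: 95 correct fields, 5 wrong, 5 missed.
example = f1_score(true_pos=95, false_pos=5, false_neg=5)
meets_target = example >= 0.95 - 1e-9  # compare against the 95% target
```

With these hypothetical counts, precision and recall are both 0.95, so the F1 score is approximately 0.95 and sits right at the target.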

Differentiators

  • Intelligent fallback: DI → Donut ONNX when confidence < 0.7
  • Active learning with nightly retrain via GitHub Actions
  • Scheduled DI container (08-16h) for cost savings
  • Multi-environment with budgets and alerts (DEV R$300, PROD R$20k)
  • Automatic mocks in DEV (USE_MOCK_DOCINT=true)
  • VNet + Private Endpoints + TLS 1.2+ + LGPD compliant
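
The DEV mock switch can be pictured as a small factory keyed on the `USE_MOCK_DOCINT` variable named above. This is a sketch under assumptions: the class names, the `analyze` method, and the returned field are hypothetical, and the real client is left as a placeholder rather than a guess at the actual SDK calls.

```python
import os
from typing import Mapping, Optional


class MockDocIntClient:
    """Hypothetical stand-in that returns canned data without calling Azure."""

    def analyze(self, pdf_bytes: bytes) -> dict:
        return {"fields": {"invoice_number": "MOCK-0001"}, "confidence": 1.0}


class RealDocIntClient:
    """Placeholder for the real Document Intelligence client."""

    def analyze(self, pdf_bytes: bytes) -> dict:
        raise NotImplementedError("wire up the real service call here")


def use_mock_docint(env: Optional[Mapping[str, str]] = None) -> bool:
    """Read the USE_MOCK_DOCINT flag. Anything other than 'true' means off."""
    env = os.environ if env is None else env
    return env.get("USE_MOCK_DOCINT", "false").strip().lower() == "true"


def build_client(env: Optional[Mapping[str, str]] = None):
    """Pick the mock in DEV-style runs, the real client otherwise."""
    return MockDocIntClient() if use_mock_docint(env) else RealDocIntClient()
```

Defaulting to the real client when the variable is missing means a misconfigured PROD deployment cannot silently serve mock data.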

Results

  • Volume processed: 2,317 invoices/month (April)
  • Support for 4 different municipality layouts
  • Timeline: 8-12 business days to reach target accuracy
  • 7 delivery milestones defined

Scale: 2,317 invoices/month | 4 layouts | Dataset of 1,000+ PDFs | Client: Momentum