How Azure Handles Large File Uploads: From Blob Storage to Event-Driven Processing (and What Breaks at 2AM)

Uploading a large file to Azure sounds simple — until you need to process it reliably, at scale, with retries, alerts, and zero surprises at 2AM.

This article walks through how Azure actually handles large file uploads, using a 10-GB video as a concrete example, and then dives into real-world failure modes that show up only in production.

We’ll cover:

  • How Azure uploads large files safely
  • When and how events are emitted
  • How Functions and queues fit together
  • Why retries and poison queues exist
  • What silently breaks when nobody is watching

Azure Blob Storage: Large Files, Small Pieces

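Azure Blob Storage supports extremely large files, but large ones are not uploaded in a single request. A single Put Blob call tops out at about 5,000 MiB on current service versions, and in practice most sizeable uploads go up in pieces well before that limit.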

Most files are stored as block blobs, which are composed of many independently uploaded blocks.

Block blob limits (the important ones)

  • Max block size: 4,000 MiB (about 3.9 GiB)
  • Max blocks per blob: 50,000
  • Max blob size: ~190.7 TiB (4,000 MiB × 50,000 blocks)

Example: Uploading a 10-GB video

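At the maximum block size, a 10-GB video would fit in just three blocks:

  • Block 1: ~4 GB
  • Block 2: ~4 GB
  • Block 3: ~2 GB

In practice, clients pick much smaller blocks (often a few MiB up to around 100 MiB) and upload them in parallel, so the same video usually becomes hundreds or thousands of blocks. Smaller blocks also mean a failed request costs only a small retry.

Each block is uploaded with Put Block, and once all blocks are present, a final Put Block List call commits the blob.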

Key insight: Blocks are an upload implementation detail. Once committed, the blob is treated as a single file.

Client tools like AzCopy, Azure SDKs, and Storage Explorer handle this chunking automatically.
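
To make the block arithmetic concrete, here is a minimal sketch of how a client might plan the blocks for an upload. It does not call Azure at all; plan_blocks is a hypothetical helper, and the 8 MiB block size is an illustrative assumption rather than something Azure mandates.

```python
import base64
import math

MAX_BLOCKS = 50_000                    # service limit per block blob
MAX_BLOCK_SIZE = 4000 * 1024 * 1024    # 4,000 MiB service limit per block


def plan_blocks(file_size: int, block_size: int) -> list[tuple[str, int, int]]:
    """Return (block_id, offset, length) for each block of a file."""
    if not 0 < block_size <= MAX_BLOCK_SIZE:
        raise ValueError("block_size outside the allowed range")
    count = math.ceil(file_size / block_size)
    if count > MAX_BLOCKS:
        raise ValueError(f"{count} blocks exceeds the {MAX_BLOCKS}-block limit")
    plan = []
    for i in range(count):
        # Block IDs must be base64 strings of equal length within one blob.
        block_id = base64.b64encode(f"{i:08d}".encode()).decode()
        offset = i * block_size
        plan.append((block_id, offset, min(block_size, file_size - offset)))
    return plan


video = 10 * 1000**3                     # 10 GB
plan = plan_blocks(video, 8 * 1024**2)   # 8 MiB blocks (assumed)
print(len(plan), plan[-1][2])            # 1193 779264
```

Each tuple corresponds to one Put Block request, and the ordered list of block IDs is what the final Put Block List call would commit.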


When Does Azure Emit an Event?

Uploading blocks does not trigger processing.

Events are emitted only after the blob is fully committed.

This is where Azure Event Grid comes in.

BlobCreated event flow

  1. Final Put Block List completes
  2. Blob Storage emits a BlobCreated event
  3. Event Grid routes the event to subscribers

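Important: the BlobCreated event is raised per committed blob, not per block.

Uncommitted blocks are invisible to readers, so downstream systems never see a half-uploaded block blob. Delivery itself is at-least-once, though, so the same event can still arrive more than once.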
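
A subscriber can make the "committed blobs only" rule explicit by checking the api field that Blob Storage includes in the event data. The sketch below assumes the Event Grid event schema (eventType, data.api); is_committed_upload is a hypothetical helper and the sample values are made up for illustration.

```python
def is_committed_upload(event: dict) -> bool:
    """True for BlobCreated events that represent a finished blob write."""
    if event.get("eventType") != "Microsoft.Storage.BlobCreated":
        return False
    # PutBlob = single-request upload, PutBlockList = commit of a chunked upload.
    return event.get("data", {}).get("api") in {"PutBlob", "PutBlockList"}


sample_event = {  # illustrative values only
    "id": "00000000-0000-0000-0000-000000000001",
    "eventType": "Microsoft.Storage.BlobCreated",
    "subject": "/blobServices/default/containers/videos/blobs/demo.mp4",
    "data": {
        "api": "PutBlockList",
        "url": "https://example.blob.core.windows.net/videos/demo.mp4",
        "eTag": "0x8D0000000000000",
        "contentLength": 10_000_000_000,
        "blobType": "BlockBlob",
    },
}
print(is_committed_upload(sample_event))  # True
```

Filtering on api is optional for plain block blob uploads, but it documents intent and guards against other operations that also raise BlobCreated, such as copies.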


Azure Functions: Reacting to Blob Uploads

Azure Functions does not poll Blob Storage in modern designs. Instead, it reacts to events.

Two trigger models (only one you should use)

  • Event Grid trigger (recommended)
    Push-based, near real-time, scalable
  • Classic Blob trigger (legacy)
    Polling-based, slower, less predictable

In production architectures, Event Grid–based triggers are the standard.


Why Queues Are Inserted into the Pipeline

Direct processing works — until load increases or dependencies slow down.

This is why many designs add a queue:

Azure Storage Queue

Blob uploaded
   ↓
Event Grid event
   ↓
Azure Function
   ↓
Message written to queue

Queues provide:

  • Backpressure
  • Retry handling
  • Isolation between ingestion and processing
  • Protection against traffic spikes
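
The function in the middle of this pipeline usually does very little: it turns the event into a small work item and enqueues it. Below is a rough sketch of that translation step only, without the Azure SDK. The field names in the work item (eventId, blobUrl, and so on) are assumptions chosen for this example.

```python
import json

MAX_QUEUE_MESSAGE_BYTES = 64 * 1024  # Storage Queue limit (less room if base64-encoded)


def to_queue_message(event: dict) -> str:
    """Build a small JSON work item that points at the blob instead of embedding it."""
    data = event["data"]
    message = json.dumps({
        "eventId": event["id"],
        "blobUrl": data["url"],
        "eTag": data["eTag"],
        "contentLength": data["contentLength"],
    })
    if len(message.encode("utf-8")) > MAX_QUEUE_MESSAGE_BYTES:
        raise ValueError("work item too large for a Storage Queue message")
    return message


event = {  # illustrative values only
    "id": "00000000-0000-0000-0000-000000000001",
    "data": {
        "url": "https://example.blob.core.windows.net/videos/demo.mp4",
        "eTag": "0x8D0000000000000",
        "contentLength": 10_000_000_000,
    },
}
message = to_queue_message(event)
print(json.loads(message)["blobUrl"])
```

The message carries a pointer to the 10-GB blob, never the blob itself, which keeps the queue cheap and lets the worker decide how to stream the data.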

Visibility Timeouts: How Retries Actually Work

Storage queues have no separate acknowledge/reject protocol. Instead, they rely on visibility timeouts plus an explicit delete by the consumer.

What is a visibility timeout?

When a worker dequeues a message:

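  • The message becomes invisible to other consumers for a configured period
  • If processing succeeds → the worker deletes the message
  • If processing fails or the worker crashes → the message becomes visible again once the timeout expires

Every dequeue increments the message's DequeueCount, so the count tells you how many attempts have been made.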

This is the foundation of retry behavior in Azure Storage Queues.
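
The mechanics are easier to see in a toy model. The class below is an in-memory stand-in for a queue, not the Azure SDK; SimQueue and Message are hypothetical names, and time is passed in explicitly so the behaviour is deterministic.

```python
import itertools
from dataclasses import dataclass
from typing import Optional


@dataclass(eq=False)
class Message:
    body: str
    dequeue_count: int = 0
    visible_at: float = 0.0   # earliest time the message may be received again
    pop_receipt: int = 0      # changes on every receive, like a real pop receipt


class SimQueue:
    """In-memory stand-in for a Storage Queue; not the Azure SDK."""

    def __init__(self) -> None:
        self._messages: list[Message] = []
        self._receipts = itertools.count(1)

    def put(self, body: str) -> None:
        self._messages.append(Message(body))

    def receive(self, now: float, visibility_timeout: float) -> Optional[Message]:
        for msg in self._messages:
            if msg.visible_at <= now:
                msg.dequeue_count += 1
                msg.visible_at = now + visibility_timeout
                msg.pop_receipt = next(self._receipts)
                return msg
        return None

    def delete(self, msg: Message, pop_receipt: int) -> bool:
        # A stale receipt (the message was re-delivered meanwhile) is rejected.
        if msg in self._messages and msg.pop_receipt == pop_receipt:
            self._messages.remove(msg)
            return True
        return False


q = SimQueue()
q.put("process demo.mp4")
first = q.receive(now=0, visibility_timeout=30)
stale_receipt = first.pop_receipt
hidden = q.receive(now=10, visibility_timeout=30)   # still invisible
second = q.receive(now=31, visibility_timeout=30)   # first worker never deleted it
print(hidden, second.dequeue_count)                 # None 2
```

Note that the first worker's receipt is no longer valid after the redelivery, which is why a slow worker that finishes late cannot delete a message that someone else is now processing.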


Poison Queues: When Retries Must Stop

Retries should never be infinite.

With Azure Functions + Storage Queues:

  • Once a message has failed maxDequeueCount times (default: 5, configurable in host.json)
  • The Functions runtime automatically moves it to: <queue-name>-poison

Poison queues:

  • Prevent endless retry loops
  • Preserve failed messages for investigation
  • Enable alerting and replay workflows
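
The decision the runtime makes after each attempt can be sketched as a small function. This models the behaviour described above and is not the Functions runtime's actual code; next_action and poison_queue_name are hypothetical helpers, and the default of 5 mirrors the maxDequeueCount default.

```python
def poison_queue_name(queue_name: str) -> str:
    return f"{queue_name}-poison"


def next_action(succeeded: bool, dequeue_count: int, max_dequeue_count: int = 5) -> str:
    """Decide what happens to a message after one processing attempt."""
    if succeeded:
        return "delete"
    if dequeue_count >= max_dequeue_count:
        return "move-to-poison"
    return "retry"


# A message that fails every time: four retries, then the poison queue.
attempts = [next_action(False, n) for n in range(1, 6)]
print(attempts)
print(poison_queue_name("video-jobs"))  # video-jobs-poison
```

A worker that is not built on Azure Functions has to implement this check itself, because Storage Queues alone will keep redelivering a failing message until it expires.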

Failure Modes: “What Breaks at 2AM?”

This is where happy-path demos and production-ready architectures part ways.

Most failures don’t look like outages — they look like silent degradation.


1️⃣ Event Grid Delivery Failures

Symptom: Blob exists, but processing never starts.

Cause

  • Subscription misconfiguration
  • Endpoint unavailable
  • Permission or auth issues

Mitigation

  • Enable Event Grid dead-lettering
  • Monitor delivery failure metrics
  • Build replay logic

2AM reality: Files are uploaded — nothing processes them.


2️⃣ Duplicate Event Delivery

Symptom: Same file processed twice.

Why
Event Grid guarantees at-least-once delivery, not exactly-once.

Mitigation

  • Idempotent processing
  • Track blob names, ETags, or IDs
  • Reject duplicates at the application layer

2AM reality: Duplicate records, duplicate invoices, duplicate emails.
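
One common way to get idempotent processing is to claim a key before doing the work. The sketch below uses blob URL plus ETag as the key, which is an assumption for this example; ProcessedStore is an in-memory stand-in for whatever durable store you would really use (a table, a database row with a unique constraint, a cache with atomic insert).

```python
from typing import Callable


class ProcessedStore:
    """In-memory stand-in for a durable store with atomic insert-if-absent."""

    def __init__(self) -> None:
        self._seen: set[str] = set()

    def claim(self, key: str) -> bool:
        """Return True the first time a key is seen, False afterwards."""
        if key in self._seen:
            return False
        self._seen.add(key)
        return True


def idempotency_key(event: dict) -> str:
    # URL + ETag identifies one specific version of one blob.
    return f'{event["data"]["url"]}|{event["data"]["eTag"]}'


def handle(event: dict, store: ProcessedStore, process: Callable[[dict], None]) -> str:
    if not store.claim(idempotency_key(event)):
        return "skipped-duplicate"
    process(event)
    return "processed"


event = {"data": {"url": "https://example.blob.core.windows.net/videos/demo.mp4",
                  "eTag": "0x8D0000000000000"}}
store = ProcessedStore()
processed = []
results = [handle(event, store, processed.append) for _ in range(2)]
print(results)  # ['processed', 'skipped-duplicate']
```

Claiming before processing means a crash mid-way leaves a claimed but unfinished item, so a real implementation usually records a status (claimed, done) or releases the claim on failure.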


3️⃣ Function Timeouts on Large Files

Symptom: Processing restarts or never completes.

Cause

  • Large file downloads
  • CPU-heavy transformations
  • Insufficient plan sizing

Mitigation

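  • Raise functionTimeout where the plan allows it (the Consumption plan defaults to 5 minutes and caps at 10), and for custom workers keep the visibility timeout longer than the worst-case processing time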
  • Stream blobs instead of loading into memory
  • Offload heavy work to batch or container jobs

2AM reality: Queue backlog grows quietly.
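
Streaming is the cheapest of these mitigations to adopt. The sketch below shows the general pattern on ordinary file-like objects, so memory use is bounded by the chunk size instead of the file size. The helper names and the 4 MiB default are assumptions; with a real blob you would read from the SDK's download stream instead of io.BytesIO.

```python
import io
from typing import BinaryIO, Iterator


def iter_chunks(stream: BinaryIO, chunk_size: int = 4 * 1024 * 1024) -> Iterator[bytes]:
    """Yield successive chunks until the stream is exhausted."""
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            return
        yield chunk


def copy_stream(src: BinaryIO, dst: BinaryIO, chunk_size: int = 4 * 1024 * 1024) -> int:
    """Copy src to dst one chunk at a time and return the byte count."""
    total = 0
    for chunk in iter_chunks(src, chunk_size):
        dst.write(chunk)
        total += len(chunk)
    return total


source = io.BytesIO(b"0123456789")
sink = io.BytesIO()
copied = copy_stream(source, sink, chunk_size=4)
print(copied, sink.getvalue() == b"0123456789")  # 10 True
```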


4️⃣ Queue Backlog Explosion

Symptom: Queue depth grows uncontrollably.

Cause

  • Ingestion spikes
  • Downstream throttling
  • Scaling limits

Mitigation

  • Monitor queue length and age
  • Scale consumers
  • Add rate limiting or backpressure

2AM reality: Customers ask why files are “stuck.”
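
A little arithmetic helps decide whether a backlog will clear on its own. The sketch below assumes steady arrival and processing rates, which real traffic rarely has, so treat the output as a rough estimate. Both function names and all the numbers are illustrative.

```python
import math


def drain_time_seconds(depth: int, arrival_per_s: float,
                       per_worker_per_s: float, workers: int) -> float:
    """Seconds until the queue is empty, or infinity if it never drains."""
    net_rate = workers * per_worker_per_s - arrival_per_s
    if net_rate <= 0:
        return math.inf
    return depth / net_rate


def workers_needed(depth: int, arrival_per_s: float,
                   per_worker_per_s: float, target_seconds: float) -> int:
    """Workers required to clear the backlog within target_seconds."""
    required_rate = arrival_per_s + depth / target_seconds
    return math.ceil(required_rate / per_worker_per_s)


# 12,000 queued messages, 5 arriving per second, each worker handles 2 per second.
print(drain_time_seconds(12_000, 5, 2, workers=4))      # 4000.0
print(drain_time_seconds(12_000, 5, 2, workers=2))      # inf
print(workers_needed(12_000, 5, 2, target_seconds=600)) # 13
```

The second case is the dangerous one: consumers are running and healthy, yet the queue can only grow.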


5️⃣ Poison Queue Flood

Symptom: Many messages land in -poison.

Cause

  • Bad file formats
  • Schema changes
  • Logic bugs

Mitigation

  • Alert on poison queue count > 0
  • Log full failure context
  • Build replay workflows

2AM reality: Work is failing — but nobody is alerted.


6️⃣ Storage Cost Spikes from Retries

Symptom: Azure Storage bill jumps unexpectedly.

Cause

  • Short visibility timeouts
  • Repeated blob downloads
  • Excessive retries

Mitigation

  • Tune visibility timeouts
  • Cache progress
  • Monitor transaction counts, not just data size

2AM reality: Finance notices before engineering does.


7️⃣ Partial or Corrupted Uploads

Symptom: Function triggers but input file is invalid.

Cause

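  • Clients that abort mid-upload but still commit what they have
  • Block lists committed with missing or wrong blocks
  • Non-atomic upload logic (for example, creating a placeholder blob and writing the content later)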

Mitigation

  • Validate file size and checksum
  • Enforce minimum size thresholds
  • Delay processing until integrity checks pass
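
A size-and-checksum gate might look like the sketch below. It assumes the uploader publishes the expected size and a SHA-256 digest somewhere the processor can read them (blob metadata or a manifest, for example); that convention, the validate_upload name, and the parameters are all assumptions for illustration.

```python
import hashlib
import io
from typing import BinaryIO


def validate_upload(stream: BinaryIO, expected_size: int, expected_sha256: str,
                    min_size: int = 1, chunk_size: int = 4 * 1024 * 1024) -> list[str]:
    """Return a list of problems; an empty list means the file passed."""
    digest = hashlib.sha256()
    size = 0
    while chunk := stream.read(chunk_size):   # hash incrementally, never load it all
        digest.update(chunk)
        size += len(chunk)
    problems = []
    if size < min_size:
        problems.append(f"file too small: {size} bytes")
    if size != expected_size:
        problems.append(f"size mismatch: expected {expected_size}, got {size}")
    if digest.hexdigest() != expected_sha256.lower():
        problems.append("checksum mismatch")
    return problems


payload = b"x" * 1000
expected = hashlib.sha256(payload).hexdigest()
print(validate_upload(io.BytesIO(payload), 1000, expected))        # []
print(validate_upload(io.BytesIO(payload[:400]), 1000, expected))  # two problems
```

Files that fail the gate are good candidates for a quarantine container rather than a retry, since retrying will not repair the upload.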

8️⃣ Downstream Dependency Failures

Symptom: Upload succeeds — final destination fails (SharePoint, APIs, DBs).

Mitigation

  • Exponential backoff
  • Dead-letter after max retries
  • Store intermediate results for replay

2AM reality: Azure is healthy — the external system isn’t.
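
Exponential backoff is simple to sketch. The version below uses "full jitter" (a random delay between zero and the exponential ceiling), which is one common variant and a choice made for this example. call_with_retries is a hypothetical helper; the sleep function is injectable so the example runs instantly.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def call_with_retries(func: Callable[[], T], max_attempts: int = 5,
                      base: float = 1.0, cap: float = 60.0,
                      sleep: Callable[[float], None] = time.sleep) -> T:
    """Call func, retrying on failure with exponential backoff and full jitter."""
    for attempt in range(max_attempts):
        try:
            return func()
        except Exception:  # in real code, catch only transient errors
            if attempt == max_attempts - 1:
                raise  # out of attempts: let the caller dead-letter the work
            sleep(random.uniform(0, min(cap, base * 2 ** attempt)))
    raise ValueError("max_attempts must be at least 1")


calls = 0


def flaky_upload() -> str:
    """Stand-in for a downstream call that fails twice, then succeeds."""
    global calls
    calls += 1
    if calls < 3:
        raise ConnectionError("downstream unavailable")
    return "ok"


delays: list[float] = []
result = call_with_retries(flaky_upload, sleep=delays.append)
print(result, calls, len(delays))  # ok 3 2
```

When the final attempt fails, the exception propagates, which is the point where a queue-based worker would let the message go to its dead-letter or poison destination.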


9️⃣ Silent Failure (The Worst One)

Symptom: System is broken — nobody knows.

Fix
Monitor:

  • Function failure rates
  • Queue depth and age
  • Poison queue counts
  • Event Grid delivery failures

Final Takeaway

Large files in Azure Blob Storage are uploaded in blocks, but Event Grid emits a single event only after the blob is fully committed. Azure Functions react to that event, often enqueueing work for durable processing. Visibility timeouts handle retries, poison queues stop infinite failures, and production readiness depends on designing for duplicate events, backlogs, cost creep, and observability — not just the happy path.

Author: Shahzad Khan

Software developer / Architect