ETL vs API vs Events: Integration Pattern Selection

Overview

Different integration patterns serve different use cases. This decision framework helps architects and developers choose among ETL, API, and event-based integration according to data volume, latency, directionality, system coupling, and transformation complexity.

Core Principle: No single pattern fits every integration. Match the pattern to the requirements listed above rather than to whichever tool is most convenient.

Prerequisites

Required Knowledge:

Recommended Reading:

When to Use Each Pattern

Use ETL (Extract, Transform, Load) When

Common Use Cases:

Use API (REST/SOAP) When

Common Use Cases:

Use Events (Platform Events, CDC, Streaming API) When

Common Use Cases:

Decision Framework

Step 1: Assess Data Volume

High Volume (Millions of records):

Medium Volume (Thousands to hundreds of thousands):

Low Volume (Hundreds to thousands):

Step 2: Assess Latency Requirements

Real-time (Seconds to minutes):

Near-real-time (Minutes to hours):

Batch (Hours to days):

Step 3: Assess Directionality

Unidirectional (Salesforce → External):

Unidirectional (External → Salesforce):

Bidirectional:

Step 4: Assess System Coupling

Tight Coupling Acceptable:

Loose Coupling Required:

Step 5: Assess Transformation Complexity

Simple Mapping:

Complex Transformation:

Pattern Comparison Matrix

| Criteria | ETL | API | Events |
|---|---|---|---|
| Data Volume | High (millions) | Low-Medium (thousands) | Medium-High (thousands-millions) |
| Latency | Batch (hours-days) | Real-time (seconds) | Near-real-time (minutes) |
| Directionality | Bidirectional | Unidirectional/Bidirectional | Unidirectional (typically) |
| Coupling | Medium | Tight | Loose |
| Transformation | Complex | Simple-Medium | Simple |
| Error Handling | Batch retry | Immediate feedback | Async retry |
| Cost | Low (bulk operations) | Medium (per API call) | Low (event subscriptions) |
| Scalability | High | Medium | High |
| Complexity | Medium-High | Low-Medium | Medium |
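
The matrix can be read as a rough set of rules. The following is a minimal sketch in Python, not a definitive selector: the `Requirements` class, the `recommend_pattern` function, and the numeric thresholds (one million records, one hour) are assumptions chosen to mirror the rows above, and a real decision would weigh more factors.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Requirements:
    """Hypothetical summary of an integration's needs (not a Salesforce type)."""
    records_per_run: int          # approximate records moved per sync
    max_latency_seconds: float    # how stale the target system may be
    needs_sync_response: bool     # caller must get an immediate result
    needs_loose_coupling: bool    # source must not know its consumers


def recommend_pattern(req: Requirements) -> str:
    """Return "ETL", "API", or "Events" using the matrix as rough rules."""
    # Assumed threshold: at millions of records, per-call API use is ruled out.
    if req.records_per_run >= 1_000_000:
        return "ETL" if req.max_latency_seconds >= 3600 else "Events"
    # A synchronous request-response need points to API.
    if req.needs_sync_response:
        return "API"
    # Decoupled producers and consumers point to Events.
    if req.needs_loose_coupling:
        return "Events"
    # Batch latency tolerated: ETL. Otherwise low-volume, low-latency API.
    if req.max_latency_seconds >= 3600:
        return "ETL"
    return "API"
```

For instance, a nightly sync of five million records maps to ETL, while a user-initiated call that needs an immediate result maps to API.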

Hybrid Approaches

ETL + Events

Pattern: Use ETL for initial bulk load, then Events for incremental updates.

Use Case: Migrate historical data via ETL, then use CDC for real-time incremental sync.

Example:
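
One way to picture the two phases is a target store that is first bulk-loaded and then kept current by change events. This is a minimal in-memory sketch: `TargetStore` and the event shape (`replay_id`, `record_id`, `change_type`, `fields`) are hypothetical simplifications rather than the actual CDC payload, and it assumes replay IDs only ever increase.

```python
from typing import Any, Dict, Iterable, Mapping


class TargetStore:
    """In-memory stand-in for the external system receiving the data."""

    def __init__(self) -> None:
        self.rows: Dict[str, Dict[str, Any]] = {}
        self.last_replay_id = -1  # highest change event applied so far

    def bulk_load(self, records: Iterable[Mapping[str, Any]]) -> int:
        """Phase 1 (ETL): load the historical snapshot in one pass."""
        count = 0
        for rec in records:
            self.rows[rec["Id"]] = dict(rec)
            count += 1
        return count

    def apply_change(self, event: Mapping[str, Any]) -> bool:
        """Phase 2 (Events): apply one change event; skip ones already seen."""
        if event["replay_id"] <= self.last_replay_id:
            return False  # duplicate or replayed event, ignore it
        record_id = event["record_id"]
        if event["change_type"] == "DELETE":
            self.rows.pop(record_id, None)
        else:
            # CREATE and UPDATE both merge the changed fields into the row.
            row = self.rows.setdefault(record_id, {"Id": record_id})
            row.update(event["fields"])
        self.last_replay_id = event["replay_id"]
        return True
```

Tracking the last applied replay ID makes the incremental phase safe to re-run after a reconnect. In practice the subscription would usually start from a position captured before the snapshot, so that no change falls between the two phases.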

API + Events

Pattern: Use API for synchronous operations, Events for asynchronous notifications.

Use Case: Create record via API (immediate response), publish event for downstream systems.

Example:
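
The split between the synchronous response and the asynchronous notification can be sketched as follows. This is an illustrative in-process model only: `EventBus`, `create_order`, and the `OrderCreated` event name are hypothetical and do not correspond to a Salesforce API.

```python
from typing import Any, Callable, Dict, List

Event = Dict[str, Any]


class EventBus:
    """Minimal publish/subscribe bus standing in for an event channel."""

    def __init__(self) -> None:
        self._subscribers: List[Callable[[Event], None]] = []
        self._queue: List[Event] = []

    def subscribe(self, handler: Callable[[Event], None]) -> None:
        self._subscribers.append(handler)

    def publish(self, event: Event) -> None:
        # Publishing only enqueues, so the caller is never blocked by subscribers.
        self._queue.append(event)

    def deliver_pending(self) -> int:
        """Deliver queued events to every subscriber, outside the request path."""
        delivered = 0
        while self._queue:
            event = self._queue.pop(0)
            for handler in self._subscribers:
                handler(event)
            delivered += 1
        return delivered


def create_order(orders: Dict[str, Event], bus: EventBus,
                 order_id: str, amount: float) -> Event:
    """API-style operation: validate, persist, publish, respond immediately."""
    if amount <= 0:
        return {"status": 400, "error": "amount must be positive"}
    orders[order_id] = {"id": order_id, "amount": amount}
    bus.publish({"type": "OrderCreated", "order_id": order_id})
    return {"status": 201, "id": order_id}
```

The caller gets its success or error response right away, while downstream systems see the event only when `deliver_pending` runs, which is the decoupling this hybrid is meant to provide.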

ETL + API

Pattern: Use ETL for bulk operations, API for real-time operations.

Use Case: Bulk import via ETL, real-time updates via API.

Example:
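
A simple way to combine the two is to route each job by size and split bulk jobs into batches. The sketch below is illustrative: `choose_channel` and `chunk` are hypothetical helpers, and both `BULK_THRESHOLD` and `BATCH_SIZE` are assumed values that would need tuning against the limits of the actual org and tooling.

```python
from typing import Iterator, List, Sequence, TypeVar

T = TypeVar("T")

BULK_THRESHOLD = 2_000   # assumed cut-over point between API and bulk load
BATCH_SIZE = 10_000      # assumed number of records per bulk batch


def choose_channel(record_count: int) -> str:
    """Route small jobs to per-record API calls and large jobs to bulk load."""
    return "bulk" if record_count >= BULK_THRESHOLD else "api"


def chunk(records: Sequence[T], size: int = BATCH_SIZE) -> Iterator[List[T]]:
    """Split records into consecutive batches of at most `size` items."""
    if size <= 0:
        raise ValueError("size must be positive")
    for start in range(0, len(records), size):
        yield list(records[start:start + size])
```

With these assumed values, 25,000 records become three bulk batches instead of 25,000 individual calls.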

Common Anti-Patterns

❌ Using API for High-Volume Batch Operations

Problem: Making millions of API calls for bulk data operations.

Solution: Use ETL tools with Bulk API or Data Loader.

Impact: Governor limit violations, performance issues, high costs.

❌ Using ETL for Real-Time Requirements

Problem: Using batch ETL when real-time sync is required.

Solution: Use API or Events for real-time requirements.

Impact: Business delays, poor user experience.

❌ Using Events for Simple Request-Response

Problem: Using events when synchronous request-response is needed.

Solution: Use API for synchronous operations.

Impact: Unnecessary complexity, delayed responses.

❌ Tight Coupling with Events

Problem: Subscribers directly depend on event structure.

Solution: Use event versioning and transformation layers.

Impact: Brittle integrations, difficult maintenance.
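
A transformation layer of this kind can be as small as one adapter per event version. The sketch below is an assumption-laden illustration: the `schema_version` field, both payload shapes, and the `normalize` function are invented for the example and are not part of any Salesforce event format.

```python
from typing import Any, Callable, Dict, Mapping

CanonicalOrder = Dict[str, Any]


def _from_v1(payload: Mapping[str, Any]) -> CanonicalOrder:
    # Hypothetical v1 shape: flat customer name, amount in cents.
    return {
        "order_id": payload["id"],
        "customer_name": payload["customer"],
        "amount": payload["amount_cents"] / 100,
    }


def _from_v2(payload: Mapping[str, Any]) -> CanonicalOrder:
    # Hypothetical v2 shape: nested customer object, amount in currency units.
    return {
        "order_id": payload["order_id"],
        "customer_name": payload["customer"]["name"],
        "amount": payload["amount"],
    }


_ADAPTERS: Dict[int, Callable[[Mapping[str, Any]], CanonicalOrder]] = {
    1: _from_v1,
    2: _from_v2,
}


def normalize(event: Mapping[str, Any]) -> CanonicalOrder:
    """Translate any supported event version into one canonical shape."""
    version = event.get("schema_version", 1)  # assume unversioned events are v1
    adapter = _ADAPTERS.get(version)
    if adapter is None:
        raise ValueError(f"unsupported schema_version: {version}")
    return adapter(event["payload"])
```

Subscribers then depend only on the canonical shape, so a new event version needs one new adapter rather than a change in every consumer.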

Q&A

Q: When should I use ETL instead of API?

A: Use ETL when: (1) High data volume (millions of records), (2) Batch processing acceptable (hourly/daily syncs), (3) Complex transformations needed, (4) Bidirectional sync required, (5) Cost optimization important (bulk operations are cheaper). ETL is designed for bulk data movement and transformation.

Q: When should I use API instead of ETL?

A: Use API when: (1) Real-time requirements (immediate synchronization), (2) Low to medium volume (thousands of records), (3) Transactional operations (create orders, process payments), (4) User-initiated actions (user interactions), (5) Immediate error feedback needed. API provides synchronous request-response interactions.

Q: When should I use Events instead of API?

A: Use Events when: (1) Event-driven architecture (decoupled systems), (2) Multiple subscribers (multiple systems react to same event), (3) Near-real-time notifications (minutes latency acceptable), (4) System decoupling required (source doesn't know about subscribers), (5) Scalability important (high event volumes). Events enable asynchronous, decoupled communication.

Q: Can I use multiple integration patterns together?

A: Yes, hybrid approaches are common: (1) ETL + Events (bulk load via ETL, incremental via events), (2) API + Events (synchronous operations via API, async notifications via events), (3) ETL + API (bulk operations via ETL, real-time via API). Choose patterns based on specific use case requirements.

Q: How do I choose between REST API and SOAP API?

A: Choose REST API for: (1) Modern integrations (simpler, JSON-based), (2) Mobile apps (lightweight), (3) Stateless operations (REST is stateless). Choose SOAP API for: (1) Enterprise integrations (WS-Security, transactions), (2) Complex operations (SOAP supports complex types), (3) Legacy systems (many legacy systems use SOAP). REST is generally preferred for new integrations.

Q: How do I choose between Platform Events and Change Data Capture?

A: Choose Platform Events for: (1) Custom events (business events, not just data changes), (2) Event orchestration (complex event-driven workflows), (3) Custom payloads (structured event data). Choose Change Data Capture for: (1) Record changes on supported standard and custom objects (automatic change notifications), (2) Data synchronization (sync data to external systems), (3) Audit and compliance (track all data changes). Use CDC for data changes, Platform Events for business events.

Q: What are common mistakes when choosing integration patterns?

A: Common mistakes: (1) Using API for high-volume batch (use ETL instead), (2) Using ETL for real-time requirements (use API/Events), (3) Using Events for simple request-response (use API), (4) Tight coupling with events (use event versioning), (5) Not considering hybrid approaches (combine patterns when needed). Choose patterns based on requirements, not convenience.

Q: How do I handle errors in each integration pattern?

A: ETL: (1) Batch retry (retry failed records in next batch), (2) Error logging (log errors to error object), (3) Manual intervention (review error reports). API: (1) Immediate feedback (return errors in response), (2) Retry logic (implement retry with exponential backoff), (3) Error handling (try-catch blocks, error responses). Events: (1) Async retry (subscribers implement retry), (2) Dead letter queue (failed events to DLQ), (3) Event replay (replay events for recovery).
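
Retry with exponential backoff and a dead letter queue, both mentioned in the answer above, can be sketched together. This is a minimal illustration under stated assumptions: `process_with_retry`, `backoff_delays`, the default delays, and the list used as a dead letter queue are hypothetical, and the `sleep` parameter exists only so the wait can be replaced in tests.

```python
import time
from typing import Any, Callable, Dict, List


def backoff_delays(max_attempts: int, base_delay: float = 0.5,
                   factor: float = 2.0) -> List[float]:
    """Delays to wait between attempts: base, base*factor, base*factor^2, ..."""
    return [base_delay * factor ** i for i in range(max_attempts - 1)]


def process_with_retry(handler: Callable[[Any], None], message: Any,
                       dead_letters: List[Dict[str, Any]],
                       max_attempts: int = 3, base_delay: float = 0.5,
                       sleep: Callable[[float], None] = time.sleep) -> bool:
    """Try the handler up to max_attempts times, then dead-letter the message."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    delays = backoff_delays(max_attempts, base_delay)
    for attempt in range(max_attempts):
        try:
            handler(message)
            return True
        except Exception as exc:  # broad on purpose: any failure triggers a retry
            if attempt == max_attempts - 1:
                # Out of attempts: park the message for later inspection or replay.
                dead_letters.append({"message": message, "error": str(exc)})
                return False
            sleep(delays[attempt])
    return False
```

A message that fails every attempt ends up in `dead_letters` with its last error, where it can be reviewed or replayed instead of being lost.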