Why Microservices?
After building and maintaining several monolithic applications, I learned the hard way when to (and when not to) use microservices. Here’s what 5+ years of experience taught me.
When to Choose Microservices
Good Candidates ✅
- Large teams (10+ developers)
- Multiple business domains (e-commerce: inventory, orders, payments)
- Different scaling needs (payment service needs 10x more resources)
- Independent deployments required
- Technology diversity needed (Python for ML, Go for performance)
Stay Monolithic ❌
- Small teams (<5 developers)
- Single domain problem
- Tight coupling between features
- Limited resources for DevOps
- MVP or prototype stage
Core Principles
1. Single Responsibility
Each service should do ONE thing well:
✅ Good:
- user-service (authentication, profiles)
- order-service (order management)
- payment-service (payment processing)
❌ Bad:
- core-service (everything)
- api-service (all endpoints)
2. Database per Service
Each service owns its data:
// user-service/database
interface UserDB {
id: string;
name: string;
email: string;
passwordHash: string;
}
// order-service/database
interface OrderDB {
id: string;
userId: string; // Reference only, no JOIN
items: OrderItem[];
total: number;
}
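Since order-service keeps only a `userId`, anything that needs both order and user data has to combine them in application code instead of a SQL JOIN. Here is a minimal sketch of that idea (often called API composition). The `UserClient` and `OrderClient` interfaces and the in-memory fakes are hypothetical stand-ins I made up for illustration; in a real system each would call its service over the network.

```typescript
// Hypothetical client interfaces: in a real system each would call its service's API.
interface User { id: string; name: string; email: string }
interface Order { id: string; userId: string; total: number }

interface UserClient { getUser(id: string): Promise<User | undefined> }
interface OrderClient { getOrder(id: string): Promise<Order | undefined> }

interface OrderView { orderId: string; total: number; customerName: string }

// API composition: fetch from each owning service, then join in memory.
async function getOrderView(
  orders: OrderClient,
  users: UserClient,
  orderId: string
): Promise<OrderView | undefined> {
  const order = await orders.getOrder(orderId);
  if (!order) return undefined;
  const user = await users.getUser(order.userId);
  return {
    orderId: order.id,
    total: order.total,
    // The user may no longer exist in user-service; degrade instead of failing.
    customerName: user?.name ?? 'unknown',
  };
}

// In-memory stand-ins so the sketch runs without any network.
const fakeUsers: UserClient = {
  getUser: async (id) =>
    id === 'u1' ? { id, name: 'Ada', email: 'ada@example.com' } : undefined,
};
const fakeOrders: OrderClient = {
  getOrder: async (id) =>
    id === 'o1' ? { id, userId: 'u1', total: 42 } : undefined,
};
```

The trade-off is an extra network hop per lookup, which is usually the price of letting each service change its schema independently.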
3. API Gateway Pattern
Centralize routing and cross-cutting concerns:
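// gateway/routes.ts
import express from 'express';
import rateLimit from 'express-rate-limit'; // assuming the express-rate-limit package
// validateToken and proxyTo are app-specific helpers defined elsewhere (not shown)
const app = express();
// Rate limiting: register before the routes, otherwise proxied requests never reach it
app.use(rateLimit({
  windowMs: 15 * 60 * 1000, // 15-minute window
  max: 100 // requests per window per client
}));
// Authentication middleware
app.use(async (req, res, next) => {
  try {
    const token = req.headers.authorization;
    if (!token) return res.status(401).json({ error: 'Missing token' });
    req.user = await validateToken(token);
    next();
  } catch (error) {
    // An invalid or expired token should produce a 401, not an unhandled rejection
    res.status(401).json({ error: 'Invalid token' });
  }
});
// Route to services
app.use('/api/users', proxyTo('http://user-service:3001'));
app.use('/api/orders', proxyTo('http://order-service:3002'));
app.use('/api/payments', proxyTo('http://payment-service:3003'));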
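If you are wondering what `windowMs` and `max` actually mean, they describe a fixed-window counter: each client gets `max` requests per window, and the counter resets when the window expires. The sketch below shows that idea in memory. `FixedWindowLimiter` is a hypothetical name of mine, not how any particular library implements it, and a real gateway running several replicas would need a shared store such as Redis.

```typescript
// Hypothetical in-memory fixed-window limiter, for illustration only.
class FixedWindowLimiter {
  private readonly windowMs: number;
  private readonly max: number;
  private readonly hits = new Map<string, { windowStart: number; count: number }>();

  constructor(windowMs: number, max: number) {
    this.windowMs = windowMs;
    this.max = max;
  }

  // Returns true if the request is allowed. `now` is injectable for testing.
  allow(clientKey: string, now: number = Date.now()): boolean {
    const entry = this.hits.get(clientKey);
    if (!entry || now - entry.windowStart >= this.windowMs) {
      // First request, or the previous window has expired: start a new window.
      this.hits.set(clientKey, { windowStart: now, count: 1 });
      return true;
    }
    if (entry.count >= this.max) return false; // over the limit for this window
    entry.count += 1;
    return true;
  }
}
```

A gateway would typically call `allow()` with the client IP or API key and answer with HTTP 429 when it returns false.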
Communication Patterns
Synchronous: REST/gRPC
For real-time request-response:
// order-service calling payment-service
async function createOrder(orderData: CreateOrderDTO) {
  // 1. Create order
  const order = await db.orders.create(orderData);
  // 2. Process payment synchronously
  try {
    const payment = await fetch('http://payment-service:3003/api/payments', {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify({
        orderId: order.id,
        amount: order.total,
        userId: order.userId
      })
    });
    if (!payment.ok) {
      throw new Error(`Payment failed with status ${payment.status}`);
    }
    return order;
  } catch (error) {
    // Roll back the order exactly once, whether the request failed or was rejected
    await db.orders.delete(order.id);
    throw error;
  }
}
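The weak spot of synchronous calls is that a slow or failing payment-service drags order-service down with it. A circuit breaker is one common guard: after a number of consecutive failures it stops calling the dependency for a while and fails fast instead. This is a minimal sketch under my own assumptions (the class name, threshold, and timeout are illustrative, and it does not handle concurrent trial calls); production code would more likely use an existing library.

```typescript
type BreakerState = 'closed' | 'open' | 'half-open';

class CircuitBreaker {
  private failures = 0;
  private openedAt = 0;
  private state: BreakerState = 'closed';
  private readonly failureThreshold: number;
  private readonly resetTimeoutMs: number;
  private readonly now: () => number;

  // `now` is injectable so the timing can be controlled in tests.
  constructor(failureThreshold: number, resetTimeoutMs: number, now: () => number = Date.now) {
    this.failureThreshold = failureThreshold;
    this.resetTimeoutMs = resetTimeoutMs;
    this.now = now;
  }

  getState(): BreakerState {
    return this.state;
  }

  async call<T>(fn: () => Promise<T>): Promise<T> {
    if (this.state === 'open') {
      if (this.now() - this.openedAt < this.resetTimeoutMs) {
        // Fail fast without touching the struggling dependency.
        throw new Error('Circuit open: call skipped');
      }
      this.state = 'half-open'; // timeout elapsed: allow a trial call
    }
    try {
      const result = await fn();
      this.failures = 0;
      this.state = 'closed';
      return result;
    } catch (error) {
      this.failures += 1;
      // A failed trial call, or too many consecutive failures, opens the circuit.
      if (this.state === 'half-open' || this.failures >= this.failureThreshold) {
        this.state = 'open';
        this.openedAt = this.now();
      }
      throw error;
    }
  }
}
```

In `createOrder`, the `fetch` to payment-service would be wrapped as `breaker.call(() => fetch(...))`, with one breaker instance shared across requests.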
Asynchronous: Message Queue
For eventual consistency and resilience:
// Using RabbitMQ/Redis/Kafka
import { publishEvent, subscribeEvent } from './messageQueue';
// order-service: Publish event
async function createOrder(orderData: CreateOrderDTO) {
const order = await db.orders.create({
...orderData,
status: 'pending'
});
// Publish event for other services
await publishEvent('order.created', {
orderId: order.id,
userId: order.userId,
total: order.total
});
return order;
}
// payment-service: Subscribe to event
subscribeEvent('order.created', async (event) => {
const { orderId, userId, total } = event;
const payment = await processPayment(userId, total);
// Publish result
await publishEvent(
payment.success ? 'payment.completed' : 'payment.failed',
{ orderId, paymentId: payment.id }
);
});
// notification-service: Subscribe to event
subscribeEvent('payment.completed', async (event) => {
await sendEmail(event.orderId, 'Payment successful!');
});
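One thing to watch with this pattern: brokers such as RabbitMQ and Kafka are commonly run with at-least-once delivery, so a subscriber can see the same event twice, and charging a customer twice is not acceptable. The sketch below uses an in-memory stand-in for `publishEvent`/`subscribeEvent` (the real `./messageQueue` module is not shown in this post) plus a deduplicating wrapper. The `eventId` field is an assumption of mine; the events above do not carry one.

```typescript
// Assumes every event carries a unique eventId (not part of the events shown earlier).
interface BusEvent { eventId: string; [key: string]: unknown }
type Handler = (event: BusEvent) => Promise<void>;

const handlers = new Map<string, Handler[]>();

// In-memory stand-in for a broker subscription.
function subscribeEvent(topic: string, handler: Handler): void {
  const list = handlers.get(topic) ?? [];
  list.push(handler);
  handlers.set(topic, list);
}

// Delivers the event to each subscriber in turn.
async function publishEvent(topic: string, event: BusEvent): Promise<void> {
  for (const handler of handlers.get(topic) ?? []) {
    await handler(event);
  }
}

// Wraps a handler so a redelivered event (same eventId) is processed only once.
function idempotent(handler: Handler): Handler {
  const seen = new Set<string>(); // a real service would persist this in its own database
  return async (event) => {
    if (seen.has(event.eventId)) return; // duplicate delivery: ignore
    await handler(event);
    seen.add(event.eventId); // recorded only after success, so failures can be retried
  };
}

// Example: count how many times a payment would be charged.
let charges = 0;
subscribeEvent('order.created', idempotent(async () => { charges += 1; }));
```

Publishing the same `eventId` twice leaves `charges` at one. With a real broker the seen-set has to live in durable storage and be checked atomically with the side effect.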
Service Discovery
Using Docker Compose (Development)
version: '3.8'
services:
  api-gateway:
    build: ./gateway
    ports:
      - "3000:3000"
    environment:
      USER_SERVICE_URL: http://user-service:3001
      ORDER_SERVICE_URL: http://order-service:3002
  user-service:
    build: ./services/user
    ports:
      - "3001:3001"
    depends_on:
      - postgres-users
  order-service:
    build: ./services/order
    ports:
      - "3002:3002"
    depends_on:
      - postgres-orders
      - redis
  postgres-users:
    image: postgres:16
    environment:
      POSTGRES_DB: users
  postgres-orders:
    image: postgres:16
    environment:
      POSTGRES_DB: orders
  redis:
    image: redis:7-alpine
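On the gateway side, those `USER_SERVICE_URL` and `ORDER_SERVICE_URL` variables still have to be read somewhere. I like to validate them once at startup so a missing variable fails loudly instead of surfacing as a confusing proxy error later. A small sketch, where `loadGatewayConfig` and the `GatewayConfig` shape are hypothetical names:

```typescript
interface GatewayConfig { userServiceUrl: string; orderServiceUrl: string }

// `env` is passed in (rather than reading process.env directly) to keep this testable.
function loadGatewayConfig(env: Record<string, string | undefined>): GatewayConfig {
  const required = (name: string): string => {
    const value = env[name];
    if (!value) {
      throw new Error(`Missing required environment variable: ${name}`);
    }
    if (!/^https?:\/\//.test(value)) {
      throw new Error(`${name} must be an http(s) URL, got: ${value}`);
    }
    return value;
  };
  return {
    userServiceUrl: required('USER_SERVICE_URL'),
    orderServiceUrl: required('ORDER_SERVICE_URL'),
  };
}
```

At startup the gateway would call `loadGatewayConfig(process.env)` and pass the resulting URLs to its proxy routes instead of hard-coding them.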
Using Kubernetes (Production)
# user-service-deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: user-service
spec:
  replicas: 3
  selector:
    matchLabels:
      app: user-service
  template:
    metadata:
      labels:
        app: user-service
    spec:
      containers:
        - name: user-service
          image: myregistry/user-service:latest
          ports:
            - containerPort: 3001
          env:
            - name: DATABASE_URL
              valueFrom:
                secretKeyRef:
                  name: user-db-secret
                  key: url
          resources:
            requests:
              memory: "256Mi"
              cpu: "250m"
            limits:
              memory: "512Mi"
              cpu: "500m"
---
apiVersion: v1
kind: Service
metadata:
  name: user-service
spec:
  selector:
    app: user-service
  ports:
    - port: 3001
      targetPort: 3001
  type: ClusterIP
Monitoring & Observability
Health Checks
// Every service should have health endpoints
app.get('/health', (req, res) => {
res.json({ status: 'healthy', timestamp: new Date() });
});
app.get('/health/ready', async (req, res) => {
try {
await db.raw('SELECT 1');
res.json({ status: 'ready' });
} catch (error) {
res.status(503).json({ status: 'not ready', error: error.message });
}
});
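Most services depend on more than one thing (database, cache, message broker), and a readiness check that hangs is almost as bad as no check at all. Here is a sketch of a readiness helper that runs several checks in parallel with a timeout on each. The names `checkReadiness` and `withTimeout` and the 1-second default are my own assumptions, not part of any framework.

```typescript
type Check = () => Promise<void>;

// Rejects if `promise` does not settle within `ms` milliseconds.
function withTimeout<T>(promise: Promise<T>, ms: number): Promise<T> {
  return new Promise<T>((resolve, reject) => {
    const timer = setTimeout(() => reject(new Error(`timed out after ${ms}ms`)), ms);
    promise.then(
      (value) => { clearTimeout(timer); resolve(value); },
      (error) => { clearTimeout(timer); reject(error); }
    );
  });
}

interface ReadinessReport { ready: boolean; checks: Record<string, string> }

// Runs all checks in parallel; the service is ready only if every check passes in time.
async function checkReadiness(
  checks: Record<string, Check>,
  timeoutMs = 1000
): Promise<ReadinessReport> {
  const report: ReadinessReport = { ready: true, checks: {} };
  await Promise.all(
    Object.entries(checks).map(async ([name, check]) => {
      try {
        await withTimeout(check(), timeoutMs);
        report.checks[name] = 'ok';
      } catch (error) {
        report.ready = false;
        report.checks[name] = error instanceof Error ? error.message : String(error);
      }
    })
  );
  return report;
}
```

The `/health/ready` handler would then respond with status 200 or 503 depending on `report.ready` and return `report.checks` as the body, which makes it obvious which dependency is the problem.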
Distributed Tracing
// Using OpenTelemetry
import { context, trace, SpanStatusCode } from '@opentelemetry/api';
const tracer = trace.getTracer('order-service');
async function createOrder(orderData: CreateOrderDTO) {
  const span = tracer.startSpan('createOrder');
  try {
    // Add attributes
    span.setAttribute('user.id', orderData.userId);
    span.setAttribute('order.total', orderData.total);
    const order = await db.orders.create(orderData);
    // Child span for payment: the parent is passed via a context, not a span option
    const parentContext = trace.setSpan(context.active(), span);
    const paymentSpan = tracer.startSpan('processPayment', undefined, parentContext);
    try {
      await processPayment(order);
    } finally {
      paymentSpan.end(); // end the child span even if the payment throws
    }
    span.setStatus({ code: SpanStatusCode.OK });
    return order;
  } catch (error) {
    span.setStatus({
      code: SpanStatusCode.ERROR,
      message: error instanceof Error ? error.message : String(error)
    });
    throw error;
  } finally {
    span.end();
  }
}
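For the trace to continue inside payment-service, the trace context has to travel with the request. OpenTelemetry's default propagator does this with the W3C Trace Context `traceparent` header. You would normally let the SDK handle it, but the format is simple enough to show by hand. The sketch below is hand-rolled for illustration only and covers just version `00` of the header.

```typescript
interface TraceContext { traceId: string; spanId: string; sampled: boolean }

// traceparent layout: version "00", 32-hex trace id, 16-hex parent span id, 2-hex flags.
function formatTraceparent(ctx: TraceContext): string {
  return `00-${ctx.traceId}-${ctx.spanId}-${ctx.sampled ? '01' : '00'}`;
}

// Returns undefined for headers that do not match the version-00 layout.
function parseTraceparent(header: string): TraceContext | undefined {
  const match = /^00-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$/.exec(header);
  if (!match) return undefined;
  const [, traceId, spanId, flags] = match;
  // All-zero ids are invalid per the specification.
  if (/^0+$/.test(traceId) || /^0+$/.test(spanId)) return undefined;
  // The lowest flag bit marks the trace as sampled.
  return { traceId, spanId, sampled: (parseInt(flags, 16) & 1) === 1 };
}
```

The receiving service keeps the same trace id and uses the incoming span id as the parent of its own spans, which is what lets the tracing backend stitch both services into one trace.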
Centralized Logging
// Structured logging with correlation IDs
import { randomUUID } from 'node:crypto';
import winston from 'winston';
const logger = winston.createLogger({
  format: winston.format.json(),
  defaultMeta: { service: 'order-service' },
  transports: [
    new winston.transports.File({ filename: 'error.log', level: 'error' }),
    new winston.transports.File({ filename: 'combined.log' })
  ]
});
// Add correlation ID middleware
app.use((req, res, next) => {
  // Reuse the caller's ID if present (a header can arrive as an array), otherwise create one
  const incoming = req.headers['x-correlation-id'];
  req.correlationId = (Array.isArray(incoming) ? incoming[0] : incoming) || randomUUID();
  res.setHeader('x-correlation-id', req.correlationId);
  next();
});
// Log with correlation ID (inside a request handler, where req and order are in scope)
logger.info('Order created', {
  correlationId: req.correlationId,
  orderId: order.id,
  userId: order.userId
});
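A correlation ID only pays off if it is forwarded on every outbound call, otherwise the trail ends at the first service. One way to make that hard to forget is a small client that is created per request and always attaches the header. This is a sketch; `createServiceClient` and `FetchLike` are hypothetical names, and the fetch implementation is injected so the example runs without a network.

```typescript
// Minimal shape of the fetch function this sketch needs.
type FetchLike = (
  url: string,
  init: { method: string; headers: Record<string, string>; body?: string }
) => Promise<unknown>;

// Creates a client bound to one request's correlation ID.
function createServiceClient(fetchImpl: FetchLike, correlationId: string) {
  return {
    post(url: string, payload: unknown): Promise<unknown> {
      return fetchImpl(url, {
        method: 'POST',
        headers: {
          'content-type': 'application/json',
          'x-correlation-id': correlationId, // same ID the middleware put on the request
        },
        body: JSON.stringify(payload),
      });
    },
  };
}
```

Inside a handler this would look like `createServiceClient(fetch, req.correlationId).post(url, body)`, so downstream logs carry the same ID as the gateway's.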
Common Challenges & Solutions
1. Data Consistency
Problem: Distributed transactions across services
Solution: Saga pattern
// Order saga orchestrator
async function createOrderSaga(orderData: CreateOrderDTO) {
const sagaId = crypto.randomUUID();
const compensation: (() => Promise<void>)[] = [];
try {
// Step 1: Create order
const order = await orderService.create(orderData);
compensation.push(() => orderService.delete(order.id));
// Step 2: Reserve inventory
await inventoryService.reserve(order.items);
compensation.push(() => inventoryService.release(order.items));
// Step 3: Process payment
const payment = await paymentService.charge(order.total);
compensation.push(() => paymentService.refund(payment.id));
// Success!
await orderService.confirm(order.id);
return order;
} catch (error) {
// Compensate in reverse order
for (const compensate of compensation.reverse()) {
await compensate().catch(err =>
logger.error('Compensation failed', { sagaId, error: err })
);
}
throw error;
}
}
2. Service Discovery
Problem: Services need to find each other dynamically
Solution: Service mesh (Istio, Linkerd) or DNS-based discovery
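With DNS-based discovery in Kubernetes, the `user-service` Service defined earlier gets a stable DNS name, and callers never need to know pod IPs. Within the same namespace the short name `user-service` is enough; across namespaces the fully qualified form is used. A tiny helper to illustrate the naming scheme, assuming the default cluster domain `cluster.local` (clusters can be configured differently):

```typescript
// Builds the in-cluster URL for a Kubernetes Service.
// Assumes the default cluster domain "cluster.local".
function serviceUrl(
  name: string,
  port: number,
  namespace = 'default',
  clusterDomain = 'cluster.local'
): string {
  return `http://${name}.${namespace}.svc.${clusterDomain}:${port}`;
}
```

For example, `serviceUrl('user-service', 3001)` yields `http://user-service.default.svc.cluster.local:3001`.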
3. Testing
Problem: Integration testing is complex
Solution: Contract testing + component testing
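// Contract test using Pact (legacy V2 consumer interface)
import { Pact } from '@pact-foundation/pact';
describe('Order Service -> Payment Service', () => {
  const provider = new Pact({
    consumer: 'order-service',
    provider: 'payment-service'
  });
  // Start the mock provider, verify interactions after each test, write the pact file at the end
  beforeAll(() => provider.setup());
  afterEach(() => provider.verify());
  afterAll(() => provider.finalize());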
it('should process payment successfully', async () => {
await provider.addInteraction({
state: 'user has valid payment method',
uponReceiving: 'a payment request',
withRequest: {
method: 'POST',
path: '/api/payments',
body: { amount: 99.99, userId: 'user-123' }
},
willRespondWith: {
status: 200,
body: { paymentId: 'pay-456', status: 'completed' }
}
});
// Test against contract
const result = await orderService.processPayment({
amount: 99.99,
userId: 'user-123'
});
expect(result.status).toBe('completed');
});
});
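Contract tests cover the boundary between services; component tests cover one service's own logic with its neighbours replaced by fakes. Here is a sketch of the second kind. `placeOrder`, `PaymentClient`, and the fakes are hypothetical and much simpler than a real order-service, but the shape is the point: the dependency is injected, so the test never touches the network.

```typescript
interface PaymentClient {
  charge(amount: number, userId: string): Promise<{ status: 'completed' | 'failed' }>;
}

interface OrderRecord {
  id: string;
  userId: string;
  total: number;
  status: 'pending' | 'paid' | 'cancelled';
}

// Hypothetical order logic with its dependency injected so a test can replace it.
async function placeOrder(
  payments: PaymentClient,
  userId: string,
  total: number
): Promise<OrderRecord> {
  const order: OrderRecord = { id: 'order-1', userId, total, status: 'pending' };
  const result = await payments.charge(total, userId);
  // The order is only marked paid when payment-service reports success.
  order.status = result.status === 'completed' ? 'paid' : 'cancelled';
  return order;
}

// Fakes standing in for payment-service.
const approvingPayments: PaymentClient = {
  charge: async () => ({ status: 'completed' }),
};
const decliningPayments: PaymentClient = {
  charge: async () => ({ status: 'failed' }),
};
```

A test then calls `placeOrder(decliningPayments, 'user-123', 99.99)` and asserts the order ends up `cancelled`. The contract test is what keeps the fake honest about the real payment-service's responses.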
Real-World Example: E-Learning Platform
I migrated our monolithic LMS to microservices. Here’s the architecture:
Services
- auth-service - JWT tokens, OAuth
- user-service - Profiles, enrollments
- course-service - Course content, chapters
- submission-service - Code submissions, grading
- notification-service - Email, push notifications
- analytics-service - Usage tracking, reports
Results
- Deployment frequency: 1/month → 10/day
- Bug fix time: 2 days → 4 hours
- Scalability: Manual → Auto-scaling
- Team productivity: +40%
- Uptime: 99.5% → 99.9%
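The uptime numbers look close together, so it helps to convert them into hours. Assuming the percentages are measured over a full year (the measurement period is my assumption), the arithmetic is:

```typescript
const HOURS_PER_YEAR = 365 * 24; // 8760, ignoring leap years

// Expected downtime per year, in hours, for a given uptime percentage.
function downtimeHoursPerYear(uptimePercent: number): number {
  return HOURS_PER_YEAR * (1 - uptimePercent / 100);
}

const before = downtimeHoursPerYear(99.5); // roughly 43.8 hours
const after = downtimeHoursPerYear(99.9);  // roughly 8.8 hours
```

So the move from 99.5% to 99.9% cuts yearly downtime from almost two days to under nine hours, a fivefold reduction.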
Conclusion
Microservices are powerful but complex. Key takeaways:
- Start with a monolith, migrate when needed
- Design for failure (circuit breakers, retries)
- Invest in observability from day one
- Automate everything (CI/CD, testing, deployment)
- Document service contracts
- Choose async communication when possible
Remember: Microservices solve organizational problems, not technical ones. If your team isn’t growing, a monolith might be the right choice.
Happy architecting! 🏗️