Security & Performance
Scaling Applications
Application scaling ensures your system can handle growth in traffic, data, and users while maintaining performance and reliability.
Horizontal vs Vertical Scaling
Scaling Strategies:
- Vertical Scaling (Scale Up): Increase resources on a single server (more CPU, RAM, disk)
- Horizontal Scaling (Scale Out): Add more servers to distribute load
- Hybrid: Combination of both approaches
// Vertical Scaling Pros & Cons
Pros:
- Simpler to implement
- No application changes needed
- Consistent data (no synchronization)
Cons:
- Hardware limits (maximum capacity)
- Single point of failure
- Downtime during upgrades
- More expensive at scale
// Horizontal Scaling Pros & Cons
Pros:
- Nearly unlimited scalability
- High availability (redundancy)
- Cost-effective (commodity hardware)
- No downtime for scaling
Cons:
- Application complexity (stateless design)
- Data synchronization challenges
- Load balancing required
- Network overhead
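To make the trade-off concrete, here is a small capacity estimate in Python. The per-server throughput and traffic numbers are illustrative assumptions, not benchmarks:

```python
import math

def servers_needed(peak_rps: float, rps_per_server: float, headroom: float = 0.7) -> int:
    """Estimate instance count for horizontal scaling.

    headroom: target utilization (0.7 = keep each server at ~70% capacity
    so traffic spikes don't saturate it).
    """
    return math.ceil(peak_rps / (rps_per_server * headroom))

# Assumed numbers for illustration only:
# one commodity server handles ~500 req/s, peak traffic is 4,200 req/s.
print(servers_needed(4200, 500))  # 12 servers at ~70% utilization
```

The same arithmetic shows why vertical scaling hits a wall: once peak traffic exceeds what the largest available machine can serve, only adding servers helps.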
Load Balancing
Distribute traffic across multiple servers:
# Nginx load balancer configuration
upstream app_servers {
    least_conn;                           # Route to server with fewest connections
    server app1.example.com:8000 weight=3;
    server app2.example.com:8000 weight=2;
    server app3.example.com:8000 weight=1;
    server app4.example.com:8000 backup;  # Failover server

    # Keep idle connections to upstreams open for reuse
    keepalive 32;
}

server {
    listen 80;
    server_name example.com;

    location / {
        proxy_pass http://app_servers;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;

        # Connection timeouts
        proxy_connect_timeout 60s;
        proxy_send_timeout 60s;
        proxy_read_timeout 60s;
    }
}
Load Balancing Algorithms
// Round Robin - Distribute requests sequentially
upstream app {
    server app1:8000;
    server app2:8000;
    server app3:8000;
}

// Least Connections - Route to the least busy server
upstream app {
    least_conn;
    server app1:8000;
    server app2:8000;
}

// IP Hash - Same client always goes to the same server
upstream app {
    ip_hash;
    server app1:8000;
    server app2:8000;
}

// Weighted - Distribute by server capacity
upstream app {
    server app1:8000 weight=3;  # High capacity
    server app2:8000 weight=1;  # Low capacity
}
Auto-Scaling
Automatically adjust capacity based on demand:
# AWS Auto Scaling Policy (example)
Resources:
  AppAutoScalingTarget:
    Type: AWS::ApplicationAutoScaling::ScalableTarget
    Properties:
      MaxCapacity: 10
      MinCapacity: 2
      ResourceId: service/my-cluster/my-service
      ScalableDimension: ecs:service:DesiredCount
      ServiceNamespace: ecs

  AppScalingPolicy:
    Type: AWS::ApplicationAutoScaling::ScalingPolicy
    Properties:
      PolicyName: CpuTargetTracking
      PolicyType: TargetTrackingScaling
      ScalingTargetId: !Ref AppAutoScalingTarget
      TargetTrackingScalingPolicyConfiguration:
        PredefinedMetricSpecification:
          PredefinedMetricType: ECSServiceAverageCPUUtilization
        TargetValue: 70.0  # Keep average CPU near 70% (scales out and in)
        ScaleInCooldown: 300
        ScaleOutCooldown: 60
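Target tracking follows a simple proportion: desired capacity ≈ current capacity × (current metric / target metric), rounded up and clamped to the min/max bounds. A Python sketch of one evaluation of that rule (the numbers are illustrative):

```python
import math

def desired_capacity(current: int, metric: float, target: float,
                     min_cap: int = 2, max_cap: int = 10) -> int:
    """Approximate one target-tracking evaluation: scale capacity in
    proportion to how far the metric is from its target."""
    desired = math.ceil(current * metric / target)
    return max(min_cap, min(max_cap, desired))

# 4 tasks averaging 90% CPU against a 70% target -> scale out.
print(desired_capacity(4, 90.0, 70.0))  # 6
# 4 tasks at 30% CPU -> scale in, but never below MinCapacity.
print(desired_capacity(4, 30.0, 70.0))  # 2
```

The cooldowns in the policy exist because this proportion is re-evaluated continuously; without them, a spike would trigger oscillating scale-out/scale-in cycles.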
Stateless Architecture
Design for horizontal scalability:
// ❌ BAD: Stateful - stores data in server memory
class SessionController
{
    private $userData = []; // In-memory storage

    public function store($userId, $data)
    {
        $this->userData[$userId] = $data;
    }
}

// ✅ GOOD: Stateless - stores data in shared cache
class SessionController
{
    public function store($userId, $data)
    {
        Cache::put("user_session_{$userId}", $data, 3600); // TTL in seconds
    }

    public function get($userId)
    {
        return Cache::get("user_session_{$userId}");
    }
}

// Configure Laravel for stateless sessions
// .env
SESSION_DRIVER=redis
CACHE_DRIVER=redis
QUEUE_CONNECTION=redis

// config/session.php
'driver' => env('SESSION_DRIVER', 'redis'),
Database Scaling: Replication
Scale reads with read replicas:
// config/database.php - Read/write split configuration
'mysql' => [
    'read' => [
        'host' => [
            '192.168.1.2', // Read replica 1
            '192.168.1.3', // Read replica 2
        ],
    ],
    'write' => [
        'host' => ['192.168.1.1'], // Primary
    ],
    'sticky' => true, // Read your own writes within the same request
],

// SELECTs use a read replica automatically
$users = DB::connection('mysql')->table('users')->get();

// Force the write (primary) connection for a single query
$user = DB::connection('mysql')->table('users')->useWritePdo()->find($id);
Database Scaling: Sharding
Partition data across multiple databases:
// Horizontal sharding by user ID
class ShardedDatabase
{
    const SHARD_COUNT = 4;

    public static function getShardForUser($userId)
    {
        $shardId = $userId % self::SHARD_COUNT;
        return "mysql_shard_{$shardId}";
    }

    public static function getUserData($userId)
    {
        $connection = self::getShardForUser($userId);
        return DB::connection($connection)
            ->table('users')
            ->where('id', $userId)
            ->first();
    }
}

// config/database.php - Multiple shards
'connections' => [
    'mysql_shard_0' => [...],
    'mysql_shard_1' => [...],
    'mysql_shard_2' => [...],
    'mysql_shard_3' => [...],
],
Sharding Challenges: Cross-shard queries are expensive, joins across shards are difficult, rebalancing shards is complex, and choosing the right shard key is critical.
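One common answer to the rebalancing problem is consistent hashing: adding a shard moves only a fraction of the keys, whereas `userId % SHARD_COUNT` remaps almost every user when the count changes. A minimal Python sketch (shard names and the virtual-node count are arbitrary choices):

```python
import bisect
import hashlib

class ConsistentHashRing:
    """Map keys to shards on a hash ring; adding a shard only moves
    roughly 1/N of the keys."""

    def __init__(self, shards, vnodes=100):
        ring = []
        for shard in shards:
            # Virtual nodes spread each shard around the ring for balance.
            for i in range(vnodes):
                h = int(hashlib.md5(f"{shard}:{i}".encode()).hexdigest(), 16)
                ring.append((h, shard))
        ring.sort()
        self._ring = ring
        self._keys = [h for h, _ in ring]

    def shard_for(self, key) -> str:
        h = int(hashlib.md5(str(key).encode()).hexdigest(), 16)
        # Walk clockwise to the next virtual node, wrapping at the end.
        idx = bisect.bisect(self._keys, h) % len(self._ring)
        return self._ring[idx][1]

ring = ConsistentHashRing([f"mysql_shard_{i}" for i in range(4)])
print(ring.shard_for(42))  # same key always maps to the same shard
```

With modulo sharding, going from 4 to 5 shards remaps ~80% of users; with the ring above, only about 1 in 5 keys moves.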
Microservices Scaling
Scale services independently:
# docker-compose.yml - Independent service scaling
version: '3.8'

services:
  web:
    image: my-app/web
    deploy:
      replicas: 3
      resources:
        limits:
          cpus: '0.5'
          memory: 512M

  api:
    image: my-app/api
    deploy:
      replicas: 5  # API needs more instances
      resources:
        limits:
          cpus: '1.0'
          memory: 1G

  worker:
    image: my-app/worker
    deploy:
      replicas: 2
      resources:
        limits:
          cpus: '2.0'
          memory: 2G
CDN Scaling
Scale static content delivery globally:
// CloudFront distribution settings (illustrative pseudocode, not the exact SDK API)
const cloudfront = new CloudFront({
  origins: [
    {
      id: 'S3-my-bucket',
      domainName: 'my-bucket.s3.amazonaws.com',
    },
  ],
  defaultCacheBehavior: {
    targetOriginId: 'S3-my-bucket',
    viewerProtocolPolicy: 'redirect-to-https',
    allowedMethods: ['GET', 'HEAD', 'OPTIONS'],
    cachedMethods: ['GET', 'HEAD'],
    compress: true,
    minTTL: 0,
    defaultTTL: 86400,   // 24 hours
    maxTTL: 31536000,    // 1 year
  },
  priceClass: 'PriceClass_All', // Global distribution
});

// Laravel - Serve assets from CDN
// config/app.php
'asset_url' => env('ASSET_URL', 'https://cdn.example.com'),

// In Blade templates
<img src="{{ asset('images/logo.png') }}">
// Outputs: https://cdn.example.com/images/logo.png
Queue-Based Scaling
Process background jobs with scalable workers:
// Dispatch jobs to queue
ProcessVideoJob::dispatch($video)->onQueue('videos');
SendEmailJob::dispatch($email)->onQueue('emails');
// Scale workers independently by queue

# Start 10 workers for video processing
php artisan queue:work --queue=videos --tries=3 &
php artisan queue:work --queue=videos --tries=3 &
# ... (repeat 10 times)

# Start 5 workers for email sending
php artisan queue:work --queue=emails --tries=3 &
# ... (repeat 5 times)

# Supervisor config for auto-restart
[program:laravel-worker]
process_name=%(program_name)s_%(process_num)02d
command=php /var/www/artisan queue:work --sleep=3 --tries=3
autostart=true
autorestart=true
; Number of worker processes
numprocs=8
user=www-data
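How many workers is "enough"? A useful rule of thumb: workers ≈ backlog × average job time / target drain time. This Python sketch applies that arithmetic (the job times, drain target, and bounds are illustrative assumptions):

```python
import math

def workers_needed(queue_size: int, avg_job_secs: float,
                   drain_target_secs: float = 60.0,
                   min_workers: int = 1, max_workers: int = 20) -> int:
    """How many queue workers to run so the current backlog drains
    within drain_target_secs, clamped to a sane range."""
    desired = math.ceil(queue_size * avg_job_secs / drain_target_secs)
    return max(min_workers, min(max_workers, desired))

# 150 queued video jobs at ~4s each, drained within a minute -> 10 workers.
print(workers_needed(150, 4.0))  # 10
```

The same formula explains why video and email queues scale differently above: slow jobs need many more workers per queued item than fast ones.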
Caching for Scale
Reduce database load with multi-layer caching:
// Layer 1: Application cache (Redis)
$users = Cache::remember('active_users', 3600, function () {
    return DB::table('users')->where('active', 1)->get();
});

// Layer 2: Query result cache (modern Laravel has no query-builder
// remember() helper, so cache the result explicitly)
$posts = Cache::remember('published_posts', 3600, function () {
    return DB::table('posts')->where('published', 1)->get();
});

// Layer 3: HTTP cache headers
return response()->view('posts.index', compact('posts'))
    ->header('Cache-Control', 'public, max-age=3600');

// Layer 4: Reverse proxy cache (Varnish/Nginx)
// Nginx configuration
proxy_cache_path /var/cache/nginx levels=1:2 keys_zone=my_cache:10m;

location / {
    proxy_cache my_cache;
    proxy_cache_valid 200 1h;
    proxy_pass http://app_servers;
}
Monitoring Scaling Metrics
// Key metrics to monitor when scaling
$metrics = [
    // Infrastructure
    'cpu_usage_percent'          => 70,    // Trigger: > 80%
    'memory_usage_percent'       => 65,    // Trigger: > 85%
    'disk_usage_percent'         => 50,    // Trigger: > 90%

    // Application
    'requests_per_second'        => 1200,  // Capacity planning
    'avg_response_time_ms'       => 150,   // Trigger: > 500ms
    'error_rate_percent'         => 0.1,   // Trigger: > 1%

    // Database
    'db_connections_used'        => 45,    // Max: 100
    'query_time_p95_ms'          => 200,   // Trigger: > 1000ms
    'cache_hit_rate_percent'     => 92,    // Target: > 90%

    // Queue
    'queue_size'                 => 150,   // Trigger: > 1000
    'queue_processing_time_ms'   => 500,
];
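The trigger values in those comments can drive a simple automated check. A Python sketch that flags any metric past its scaling trigger (threshold values copied from the comments above; metric names are the same):

```python
# Scaling triggers, taken from the metric comments above.
THRESHOLDS = {
    "cpu_usage_percent": 80,
    "memory_usage_percent": 85,
    "avg_response_time_ms": 500,
    "error_rate_percent": 1,
    "queue_size": 1000,
}

def breached(metrics: dict) -> list:
    """Return the names of metrics that exceed their scaling trigger."""
    return [name for name, limit in THRESHOLDS.items()
            if metrics.get(name, 0) > limit]

current = {"cpu_usage_percent": 70, "avg_response_time_ms": 620, "queue_size": 150}
print(breached(current))  # ['avg_response_time_ms']
```

In practice this logic lives in your monitoring stack (CloudWatch alarms, Prometheus alerting rules), but the decision rule is exactly this comparison.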
Best Practice: Start with vertical scaling for simplicity, migrate to horizontal scaling before hitting hardware limits, design for statelessness early, use managed services (RDS, ElastiCache, SQS) to reduce operational complexity, and always load test before production.
Exercise: Design a scaling architecture for your application. Identify stateful components and plan how to make them stateless. Configure session storage in Redis. Set up a simple load balancer with Nginx for 2+ app instances. Test horizontal scaling by measuring performance improvements.