From health monitoring to automated alerting, discover how teams use the Pharos SDK to build more reliable applications and respond faster to incidents.
Track the health of your microservices across multiple environments. Send periodic health pings to ensure services are alive and responding.
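For a long-running service it helps to be able to stop the ping loop on shutdown and to keep a failed ping from taking the process down. A minimal sketch, assuming only the `health.ping()` call shown on this page; `startHeartbeat` and its parameters are illustrative names, not part of the SDK:

```javascript
// Hypothetical helper (not part of the Pharos SDK): sends one ping right away,
// then repeats on a fixed interval, and returns a function that stops the loop.
function startHeartbeat(ping, intervalMs, onError = console.error) {
  const tick = async () => {
    try {
      await ping();
    } catch (error) {
      // A failed ping is reported but never thrown, so the loop keeps running.
      onError(error);
    }
  };

  tick();
  const timer = setInterval(tick, intervalMs);
  return () => clearInterval(timer);
}
```

Usage might look like `const stop = startHeartbeat(() => pharos.health.ping(), 5 * 60 * 1000);` followed by `process.on('SIGTERM', stop);`.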
import { PharosClient } from 'pharos-sdk';

const pharos = new PharosClient({
  apiKey: process.env.PHAROS_API_KEY,
  serviceName: 'payment-service',
  environment: 'production'
});

// Send a health ping every 5 minutes
setInterval(async () => {
  try {
    await pharos.health.ping();
  } catch (error) {
    // Log and keep going so one failed ping does not become an unhandled rejection
    console.error('Health ping failed:', error.message);
  }
}, 5 * 60 * 1000);

Monitor all your background jobs, cron tasks, and async operations. Track success rates, failure patterns, and execution durations.
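The start/complete/fail pattern can be wrapped once and reused for every job. A minimal sketch, assuming a `jobs` object with the `start`, `complete`, and `fail` methods used on this page (for example `pharos.jobs`); `trackJob` itself is a hypothetical helper, not an SDK function:

```javascript
// Hypothetical wrapper (not part of the Pharos SDK). `jobs` is expected to expose
// start/complete/fail as used elsewhere on this page, e.g. `pharos.jobs`.
async function trackJob(jobs, jobName, work) {
  const startedAt = Date.now();
  await jobs.start({ jobName });

  let result;
  try {
    result = await work();
  } catch (error) {
    await jobs.fail({
      jobName,
      error: error.message,
      duration: Date.now() - startedAt
    });
    throw error; // Let the caller decide how to handle the failure
  }

  // Reported outside the try block so a reporting error is not mistaken for a job failure
  await jobs.complete({ jobName, duration: Date.now() - startedAt });
  return result;
}
```

A call such as `await trackJob(pharos.jobs, 'data-export', exportDataToS3)` then reports the measured duration in milliseconds on both success and failure.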
// Track any background job
const startedAt = Date.now();
await pharos.jobs.start({ jobName: 'data-export' });
try {
  await exportDataToS3();
  await pharos.jobs.complete({
    jobName: 'data-export',
    duration: Date.now() - startedAt // Measured, in milliseconds
  });
} catch (error) {
  await pharos.jobs.fail({
    jobName: 'data-export',
    error: error.message
  });
}

Automatically create incidents when critical errors occur in your application. No manual incident creation needed.
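When many call sites raise incidents from caught errors, building the payload in one place keeps titles and severities consistent. A minimal sketch that produces an object in the shape passed to `pharos.incidents.create` on this page; the helper name and the error-code-to-severity table are illustrative assumptions, not SDK behavior:

```javascript
// Hypothetical mapping from Node.js error codes to incident severity.
const SEVERITY_BY_CODE = new Map([
  ['ECONNREFUSED', 'critical'],
  ['ETIMEDOUT', 'high']
]);

// Hypothetical helper: builds an incident payload (title, severity, description,
// metadata) from a caught error. Unknown or missing codes fall back to 'medium'.
function incidentFromError(title, error, serviceName) {
  return {
    title,
    severity: SEVERITY_BY_CODE.get(error.code) ?? 'medium',
    description: `${title}: ${error.message}`,
    metadata: { serviceName, errorCode: error.code ?? null }
  };
}
```

The result can be passed straight through, for example `await pharos.incidents.create(incidentFromError('Database Connection Failed', error, 'api-server'))`.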
// Automatically create incident on critical error
try {
  await connectToDatabase();
} catch (error) {
  await pharos.incidents.create({
    title: 'Database Connection Failed',
    severity: 'critical',
    description: `Unable to connect: ${error.message}`,
    metadata: {
      serviceName: 'api-server',
      errorCode: error.code
    }
  });
  throw error;
}

Get your entire team notified when incidents occur or jobs fail. Email alerts are sent automatically to all users and contacts.
// All team members receive email when this fails
await pharos.jobs.fail({
  jobName: 'email-batch-send',
  error: 'SMTP timeout after 30s',
  duration: 30000
});
// Email subject: [Alert] Job Failed: email-batch-send
// All active users + contacts notified automatically

Monitor database connection pools, query performance, and availability. Create incidents when database issues are detected.
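The threshold check can be separated from the SDK call so it is easy to test on its own. A minimal sketch, assuming a pool object with the `activeConnections` and `maxConnections` fields used on this page; `poolIncident` is a hypothetical helper, and raising the severity to `critical` at full capacity is an illustrative choice:

```javascript
// Hypothetical helper: returns an incident payload when pool usage exceeds
// `threshold` (a fraction between 0 and 1), otherwise null.
function poolIncident(pool, threshold = 0.9) {
  if (pool.maxConnections <= 0) {
    throw new RangeError('maxConnections must be positive');
  }
  const usage = pool.activeConnections / pool.maxConnections;
  if (usage <= threshold) return null;

  return {
    title: 'Database Pool Near Limit',
    // Assumption for illustration: a completely full pool is treated as critical
    severity: usage >= 1 ? 'critical' : 'high',
    description: `Pool at ${pool.activeConnections}/${pool.maxConnections} (${Math.round(usage * 100)}%)`
  };
}
```

A periodic check could then call `pharos.incidents.create(incident)` only when the helper returns a non-null value.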
// Monitor database connection pool
const pool = await getConnectionPool();
if (pool.activeConnections > pool.maxConnections * 0.9) {
  await pharos.incidents.create({
    title: 'Database Pool Near Limit',
    severity: 'high',
    description: `Pool at ${pool.activeConnections}/${pool.maxConnections}`
  });
}

Track job execution times and detect when performance degrades. Set up automated alerts when operations take longer than expected.
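A fixed limit works when the expected duration is known in advance. Where it is not, one option is to compare each run against a rolling average of earlier runs. A minimal sketch of that idea; `DurationBaseline`, its window size, and its factor are hypothetical and independent of the SDK:

```javascript
// Hypothetical helper: keeps the last `size` durations for a job and flags a run
// as degraded when it exceeds `factor` times the average of the earlier runs.
class DurationBaseline {
  constructor(size = 20, factor = 2) {
    this.size = size;
    this.factor = factor;
    this.samples = [];
  }

  // Records a duration in milliseconds; returns true if it looks degraded.
  record(durationMs) {
    const hasBaseline = this.samples.length > 0;
    const average = hasBaseline
      ? this.samples.reduce((sum, d) => sum + d, 0) / this.samples.length
      : 0;
    const degraded = hasBaseline && durationMs > average * this.factor;

    this.samples.push(durationMs);
    if (this.samples.length > this.size) this.samples.shift();
    return degraded;
  }
}
```

The first run never counts as degraded because there is nothing to compare it with. A caller could create an incident whenever `record(duration)` returns `true`.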
const startTime = Date.now();
await pharos.jobs.start({ jobName: 'report-generation' });
try {
  await generateReport();
} catch (error) {
  // Report the failure so the run is not left without an outcome
  await pharos.jobs.fail({ jobName: 'report-generation', error: error.message });
  throw error;
}
const duration = Date.now() - startTime;

// Alert if the job takes longer than 5 seconds
if (duration > 5000) {
  await pharos.incidents.create({
    title: 'Report Generation Slow',
    severity: 'medium',
    description: `Took ${duration}ms (expected <5000ms)`
  });
}
await pharos.jobs.complete({ jobName: 'report-generation', duration });

Track API usage and create incidents when approaching rate limits. Prevent service disruptions from hitting third-party limits.
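If a usage check runs on a schedule, it will keep opening the same incident for as long as the quota stays low. A minimal sketch of a per-key cooldown that lets one alert through and suppresses repeats for a while; `createAlertCooldown` is a hypothetical helper, and whether Pharos deduplicates incidents on its own is not assumed either way:

```javascript
// Hypothetical helper: returns a function that answers "may I alert for this key now?"
// and allows at most one alert per key within `cooldownMs`.
// `now` is injectable so the behavior can be tested without waiting.
function createAlertCooldown(cooldownMs, now = Date.now) {
  const lastAlertAt = new Map();

  return function shouldAlert(key) {
    const current = now();
    const previous = lastAlertAt.get(key);
    if (previous !== undefined && current - previous < cooldownMs) {
      return false;
    }
    lastAlertAt.set(key, current);
    return true;
  };
}
```

For example, with `const shouldAlert = createAlertCooldown(15 * 60 * 1000);`, the rate-limit check would create an incident only when `shouldAlert('stripe-rate-limit')` returns `true`.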
// Monitor third-party API usage
const usage = await checkStripeAPIUsage();
if (usage.remaining < usage.limit * 0.1) {
  await pharos.incidents.create({
    title: 'Stripe API Rate Limit Warning',
    severity: 'medium',
    description: `Only ${usage.remaining} requests remaining`
  });
}

Send health pings after deployments to verify services started correctly. Create incidents if post-deployment health checks fail.
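A service can take a variable amount of time to become ready, so a single check after a fixed wait may report a failure too early. A minimal sketch of a retry loop around any async check, such as `() => pharos.health.ping()`; `waitUntilHealthy` and its options are hypothetical names, not SDK features:

```javascript
// Hypothetical helper: calls `check` up to `attempts` times, waiting `delayMs`
// between tries. Resolves with the attempt number that passed, or rejects
// with the last error once every attempt has failed.
async function waitUntilHealthy(check, { attempts = 5, delayMs = 2000 } = {}) {
  let lastError = new Error('attempts must be at least 1');

  for (let attempt = 1; attempt <= attempts; attempt++) {
    try {
      await check();
      return attempt;
    } catch (error) {
      lastError = error;
      if (attempt < attempts) {
        await new Promise((resolve) => setTimeout(resolve, delayMs));
      }
    }
  }
  throw lastError;
}
```

The incident would then be created only in the `catch` around `await waitUntilHealthy(...)`, after all retries are exhausted.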
// After deployment
async function verifyDeployment() {
  await new Promise(resolve => setTimeout(resolve, 10000)); // Wait 10s
  try {
    await pharos.health.ping();
    console.log('✅ Deployment successful');
  } catch (error) {
    await pharos.incidents.create({
      title: 'Deployment Failed - Service Unhealthy',
      severity: 'critical',
      description: 'Service failed health check post-deployment'
    });
  }
}

Track the same services across production, staging, and development. Identify environment-specific issues quickly.
// Production instance
const pharosProd = new PharosClient({
  apiKey: process.env.PHAROS_API_KEY,
  serviceName: 'api-server',
  environment: 'production'
});

// Staging instance
const pharosStaging = new PharosClient({
  apiKey: process.env.PHAROS_API_KEY,
  serviceName: 'api-server',
  environment: 'staging'
});
// Both appear separately in dashboard

Create high-severity incidents when security events are detected. Track failed login attempts, unauthorized access, or suspicious activity.
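A counter that never resets will eventually flag any long-lived client, so failed attempts are usually counted within a time window. A minimal sketch of a sliding-window counter kept in memory; `FailedAttemptWindow` is a hypothetical helper, and a multi-instance deployment would need shared storage instead:

```javascript
// Hypothetical helper: counts failures per key (for example, an IP address)
// inside a sliding time window, so old attempts stop counting.
class FailedAttemptWindow {
  constructor(windowMs) {
    this.windowMs = windowMs;
    this.attempts = new Map(); // key -> array of timestamps in ms
  }

  // Records one failure and returns how many fall inside the current window.
  record(key, now = Date.now()) {
    const cutoff = now - this.windowMs;
    const recent = (this.attempts.get(key) ?? []).filter((t) => t > cutoff);
    recent.push(now);
    this.attempts.set(key, recent);
    return recent.length;
  }
}
```

A login handler could call `record(ip)` on each failure and create an incident once the returned count passes its limit.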
// Track security events
const failedAttempts = new Map(); // In-memory counter for illustration: ip -> failed logins

async function handleFailedLogin(username, ip) {
  const count = (failedAttempts.get(ip) ?? 0) + 1;
  failedAttempts.set(ip, count);
  if (count > 5) {
    await pharos.incidents.create({
      title: 'Possible Brute Force Attack',
      severity: 'high',
      description: `${count} failed attempts from ${ip}`,
      metadata: { username, ip, timestamp: Date.now() }
    });
  }
}

When the SDK creates an incident, all team members are notified. Use the dashboard to track resolution status and collaborate.
// SDK creates incident - all team members notified
await pharos.incidents.create({
  title: 'Payment Gateway Timeout',
  severity: 'critical',
  description: 'Stripe API timing out after 30s'
});
// Team sees incident in dashboard
// Updates status: investigating → identified → resolved
// All team members receive status change emails

Monitor service uptime and job success rates. Track SLA compliance and identify services that need attention.
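Two small calculations tend to come up when turning these numbers into an SLA report: the downtime budget implied by an availability target, and a success rate from run counts. A minimal sketch of both as plain arithmetic; the function names are illustrative, and this is not how the dashboard itself is known to compute its figures:

```javascript
// Downtime budget, in minutes, for an availability target over a period.
function allowedDowntimeMinutes(targetPercent, periodDays) {
  const totalMinutes = periodDays * 24 * 60;
  return totalMinutes * (1 - targetPercent / 100);
}

// Success rate as a percentage, or null when there were no runs at all.
function successRate(succeeded, failed) {
  const total = succeeded + failed;
  return total === 0 ? null : (succeeded / total) * 100;
}
```

For instance, a 99.9% target over 30 days leaves a budget of about 43.2 minutes of downtime.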
// SDK Activity dashboard shows:
// - Job success rates (95%, 87%, 100%)
// - Last run times for each job
// - Service health status (healthy, stale, inactive)
// - Incident history and resolution times
// Perfect for SLA reporting and customer communication

Track any custom events or metrics using the SDK. Store metadata for later analysis and debugging.
// Track custom events with metadata
await pharos.jobs.complete({
  jobName: 'data-sync',
  duration: 2500,
  metadata: {
    recordsProcessed: 15000,
    dataSource: 'postgres',
    compressionRatio: 0.73,
    s3Bucket: 'backups-prod'
  }
});
// View metadata in SDK Activity dashboard

Use the SDK Activity dashboard to debug production issues. See exactly when jobs failed, what errors occurred, and service health status.
// SDK Activity dashboard shows:
// 1. Health Pings - Which services are alive
// 2. Jobs - Which jobs failed and why
// 3. Incidents - What went wrong and when
// Example: "Why did data-export fail yesterday?"
// → Check Jobs tab → See failure at 3:42 AM
// → Error: "S3 bucket permission denied"
// → Duration: 5000ms (usually 1500ms)

Install the Pharos SDK and start monitoring your services in minutes.
Install via npm:
npm install pharos-sdk