skysignal:agent

v1.0.1 (published yesterday)

SkySignal Agent

Official APM agent for monitoring Meteor.js applications with SkySignal.

Features

  • System Metrics Monitoring - CPU, memory, disk, and network usage
  • Method Performance Traces - Track Meteor Method execution with operation-level profiling
  • Publication Monitoring - Monitor publication performance and subscriptions
  • Error Tracking - Automatic error capture with client context
  • HTTP Request Monitoring - Track outgoing HTTP requests
  • Database Query Monitoring - MongoDB query performance tracking
  • Real User Monitoring (RUM) - Browser-side Core Web Vitals (LCP, FID, CLS, TTFB, FCP, TTI) with automatic performance warnings
  • SPA Route Tracking - Automatic performance collection on every route change
  • Session Tracking - 30-minute user sessions with localStorage persistence
  • Browser Context - Automatic device, browser, OS, and network information collection
  • Batch Processing - Efficient batching and async delivery to minimize performance impact

Installation

Add the package to your Meteor application:

meteor add skysignal:agent

Quick Start

1. Get Your API Key

Sign up at SkySignal and create a new site to get your API key.

2. Configure the Agent

In your Meteor server startup code (e.g., server/main.js):

import { Meteor } from 'meteor/meteor';
import { SkySignalAgent } from 'meteor/skysignal:agent';

Meteor.startup(() => {
  // Configure the agent
  SkySignalAgent.configure({
    apiKey: process.env.SKYSIGNAL_API_KEY || 'your-api-key-here',
    enabled: true,
    host: 'my-app-server-1', // Optional: defaults to hostname
    appVersion: '1.2.3', // Optional: auto-detected from package.json

    // Optional: Customize collection intervals
    systemMetricsInterval: 60000, // 1 minute (default)
    flushInterval: 10000, // 10 seconds (default)
    batchSize: 50, // Max items per batch (default)

    // Optional: Sampling for high-traffic apps
    traceSampleRate: 1.0, // 100% of traces (reduce for high volume)

    // Optional: Feature toggles
    collectTraces: true,
    collectMongoPool: true,
    collectDDPConnections: true,
    collectJobs: true
  });

  // Start monitoring
  SkySignalAgent.start();
});

3. Add to Settings File

For production, use Meteor settings. The agent auto-initializes from settings if configured:

settings-production.json:

{
  "skysignal": {
    "apiKey": "sk_your_api_key_here",
    "enabled": true,
    "host": "production-server-1",
    "appVersion": "1.2.3",
    "traceSampleRate": 0.5,
    "collectTraces": true,
    "collectMongoPool": true,
    "collectDDPConnections": true,
    "collectJobs": true,
    "captureIndexUsage": true,
    "indexUsageSampleRate": 0.05
  },
  "public": {
    "skysignal": {
      "publicKey": "pk_your_public_key_here",
      "rum": {
        "enabled": true,
        "sampleRate": 0.5
      }
    }
  }
}

The agent auto-starts when it finds valid configuration in Meteor.settings.skysignal.

Manual initialization (optional):

import { Meteor } from 'meteor/meteor';
import { SkySignalAgent } from 'meteor/skysignal:agent';

Meteor.startup(() => {
  // Only needed if not using settings auto-initialization
  const config = Meteor.settings.skysignal;

  if (config && config.apiKey) {
    SkySignalAgent.configure(config);
    SkySignalAgent.start();
  } else {
    console.warn('⚠️ SkySignal not configured - monitoring disabled');
  }
});

Configuration Options

API Configuration

| Option | Type | Default | Description |
| --- | --- | --- | --- |
| apiKey | String | required | Your SkySignal API key (sk_ prefix) |
| endpoint | String | https://dash.skysignal.app | SkySignal API endpoint |
| enabled | Boolean | true | Enable/disable the agent |

Host & Version Identification

| Option | Type | Default | Description |
| --- | --- | --- | --- |
| host | String | os.hostname() | Host identifier for this instance |
| appVersion | String | Auto-detect | App version from package.json or manually configured |
| buildHash | String | Auto-detect | Build hash for source map lookup. Auto-detects from BUILD_HASH or GIT_SHA environment variables |
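
As a rough illustration of the buildHash auto-detection described above, the lookup could resemble the sketch below. The helper name resolveBuildHash is hypothetical, and the precedence (explicit config first, then BUILD_HASH, then GIT_SHA) is an assumption; the README does not state the order the agent actually uses.

```javascript
// Illustrative only: resolve a build hash from config or environment variables.
// resolveBuildHash is a hypothetical helper; the precedence order is an assumption.
function resolveBuildHash(config = {}, env = process.env) {
  return config.buildHash || env.BUILD_HASH || env.GIT_SHA || null;
}
```

Setting BUILD_HASH (or GIT_SHA) in the deploy environment is then enough for errors to carry a hash that can be matched to source maps.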

Batching Configuration

| Option | Type | Default | Description |
| --- | --- | --- | --- |
| batchSize | Number | 50 | Max items per batch before auto-flush |
| batchSizeBytes | Number | 262144 | Max bytes (256KB) per batch |
| flushInterval | Number | 10000 | Interval (ms) to flush batched data |
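
To picture how batchSize and batchSizeBytes interact, here is a minimal sketch of a batcher that flushes on either limit. It is not the agent's implementation: createBatcher and onFlush are hypothetical names, and the byte count is approximated from each item's JSON encoding.

```javascript
// Illustrative only: a batcher bounded by item count and by approximate byte size.
// createBatcher and onFlush are hypothetical names, not part of the SkySignal API.
function createBatcher({ batchSize = 50, batchSizeBytes = 262144, onFlush }) {
  const encoder = new TextEncoder();
  let items = [];
  let bytes = 0;

  function flush() {
    if (items.length === 0) return;
    const batch = items;
    items = [];
    bytes = 0;
    onFlush(batch);
  }

  function add(item) {
    // Approximate the payload size from the item's JSON encoding.
    const size = encoder.encode(JSON.stringify(item)).length;
    // Flush first if this item would push the batch past the byte limit.
    if (items.length > 0 && bytes + size > batchSizeBytes) flush();
    items.push(item);
    bytes += size;
    // Flush as soon as the item-count limit is reached.
    if (items.length >= batchSize) flush();
  }

  return { add, flush };
}
```

A timer that calls flush() every flushInterval milliseconds would cover the time-based flush, so a quiet application still delivers partial batches.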

Sampling Rates

| Option | Type | Default | Description |
| --- | --- | --- | --- |
| traceSampleRate | Number | 1.0 | Server trace sample rate (0-1). Set to 0.1 for 10% |
| rumSampleRate | Number | 0.5 | RUM sample rate (0-1). 50% by default for high-volume |
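
A sample rate between 0 and 1 is usually applied as a per-event random check. The sketch below shows that idea only; shouldSample is a hypothetical helper, and the agent may sample differently (for example per session rather than per event).

```javascript
// Illustrative only: decide whether to keep an event for a given sample rate.
// shouldSample is a hypothetical helper, not part of the SkySignal API.
function shouldSample(sampleRate, random = Math.random) {
  if (sampleRate >= 1) return true;  // 1.0 keeps everything
  if (sampleRate <= 0) return false; // 0 drops everything
  return random() < sampleRate;      // otherwise keep roughly sampleRate of events
}
```

With traceSampleRate set to 0.1, about one trace in ten would pass this check on average.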

Collection Intervals

| Option | Type | Default | Description |
| --- | --- | --- | --- |
| systemMetricsInterval | Number | 60000 | System metrics collection interval (1 minute) |
| mongoPoolInterval | Number | 60000 | MongoDB pool metrics interval (1 minute) |
| collectionStatsInterval | Number | 300000 | Collection stats interval (5 minutes) |
| ddpConnectionsInterval | Number | 30000 | DDP connection updates interval (30 seconds) |
| jobsInterval | Number | 30000 | Background job stats interval (30 seconds) |

Feature Flags

| Option | Type | Default | Description |
| --- | --- | --- | --- |
| collectSystemMetrics | Boolean | true | Collect system metrics (CPU, memory, disk, network) |
| collectTraces | Boolean | true | Collect method/publication traces |
| collectErrors | Boolean | true | Collect errors and exceptions |
| collectHttpRequests | Boolean | true | Collect HTTP request metrics |
| collectMongoPool | Boolean | true | Collect MongoDB connection pool metrics |
| collectCollectionStats | Boolean | true | Collect MongoDB collection statistics |
| collectDDPConnections | Boolean | true | Collect DDP/WebSocket connection metrics |
| collectLiveQueries | Boolean | true | Collect Meteor live query (oplog/polling) metrics |
| collectJobs | Boolean | true | Collect background job metrics |
| collectRUM | Boolean | false | Client-side RUM (disabled by default, requires publicKey) |

MongoDB Pool Configuration

| Option | Type | Default | Description |
| --- | --- | --- | --- |
| mongoPoolFixedConnectionMemory | Number | null | Optional: fixed bytes per connection for memory estimation |

Method Tracing Configuration

| Option | Type | Default | Description |
| --- | --- | --- | --- |
| traceMethodArguments | Boolean | true | Capture method arguments (sanitized) |
| maxArgLength | Number | 1000 | Max string length for arguments |
| traceMethodOperations | Boolean | true | Capture detailed operation timeline |

Index Usage Tracking

| Option | Type | Default | Description |
| --- | --- | --- | --- |
| captureIndexUsage | Boolean | true | Capture MongoDB index usage via explain() |
| indexUsageSampleRate | Number | 0.05 | Sample 5% of queries for explain() |
| explainVerbosity | String | executionStats | One of queryPlanner, executionStats, allPlansExecution |
| explainSlowQueriesOnly | Boolean | false | Only explain queries exceeding slow threshold |

Performance Safeguards

| Option | Type | Default | Description |
| --- | --- | --- | --- |
| maxBatchRetries | Number | 3 | Max retries for failed batches |
| requestTimeout | Number | 3000 | API request timeout (3 seconds) |
| maxMemoryMB | Number | 50 | Max memory (MB) for batches |
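
The following sketch shows one way maxBatchRetries and requestTimeout could combine: each attempt is raced against a timer, and a failed attempt is retried up to the limit. sendWithRetry and send are hypothetical names, and reading maxBatchRetries as "retries after the first attempt" is an assumption rather than documented behavior.

```javascript
// Illustrative only: retry a batch send with a per-attempt timeout.
// sendWithRetry and send are hypothetical names, not part of the SkySignal API.
async function sendWithRetry(send, batch, { maxBatchRetries = 3, requestTimeout = 3000 } = {}) {
  let lastError;
  // One initial attempt plus up to maxBatchRetries retries (assumed semantics).
  for (let attempt = 0; attempt <= maxBatchRetries; attempt++) {
    let timer;
    const timeout = new Promise((_, reject) => {
      timer = setTimeout(() => reject(new Error('request timed out')), requestTimeout);
    });
    try {
      // Whichever settles first wins: the send or the timeout.
      return await Promise.race([send(batch), timeout]);
    } catch (error) {
      lastError = error;
    } finally {
      clearTimeout(timer);
    }
  }
  throw lastError;
}
```

A production sender would typically also wait between attempts (backoff) and drop the batch once a memory limit such as maxMemoryMB is reached; both are omitted here for brevity.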

Worker Offload (Large Pools)

| Option | Type | Default | Description |
| --- | --- | --- | --- |
| useWorkerThread | Boolean | false | Enable worker thread for large pools |
| workerThreshold | Number | 50 | Spawn worker if pool size exceeds this |

Background Job Monitoring

| Option | Type | Default | Description |
| --- | --- | --- | --- |
| collectJobs | Boolean | true | Enable background job monitoring |
| jobsInterval | Number | 30000 | Job stats collection interval (30 seconds) |
| jobsPackage | String | null | Auto-detect, or specify: "msavin:sjobs" |

What Gets Monitored

System Metrics (Automatic)

The agent automatically collects:

  • CPU Usage - Overall CPU utilization percentage
  • CPU Cores - Number of CPU cores available
  • Load Average - 1m, 5m, 15m load averages
  • Memory Usage - Total, used, free, and percentage
  • Disk Usage - Disk space utilization (platform-dependent)
  • Network Traffic - Bytes in/out (platform-dependent)
  • Process Count - Number of running processes (platform-dependent)

Collected every 60 seconds by default.

Method Traces

Automatic instrumentation of Meteor Methods:

  • Method name and execution time
  • Operation-level breakdown (DB queries, async operations, compute time)
  • Detailed MongoDB operation tracking with explain() support
  • N+1 query detection and slow query analysis
  • this.unblock() analysis with optimization recommendations
  • Wait time tracking (DDP queue, connection pool)
  • Error tracking with stack traces
  • User context and session correlation

Publication Monitoring

Track publication performance:

  • Publication name and execution time
  • Subscription lifecycle tracking
  • Document counts (added, changed, removed)
  • Data transfer size estimation
  • Live query efficiency (oplog vs polling)

DDP Connection Monitoring

Real-time WebSocket connection tracking:

  • Active connection count and status
  • Message volume (sent/received) by type
  • Bandwidth usage per connection
  • Latency measurements (ping/pong)
  • Subscription tracking per connection

MongoDB Pool Monitoring

Connection pool health and performance:

  • Pool configuration (min/max size, timeouts)
  • Active vs available connections
  • Checkout wait times (avg, max, P95)
  • Queue length and timeout tracking
  • Memory usage estimation

Live Query Monitoring

Meteor reactive query tracking:

  • Observer count by collection
  • Oplog vs polling efficiency
  • Document update rates
  • Performance ratings (optimal/good/slow)
  • Query signature deduplication

Background Job Monitoring

Track msavin:sjobs (Steve Jobs) and other job packages:

  • Job execution times and status
  • Queue length and worker utilization
  • Failed job tracking with error details
  • Job type categorization

Error Tracking

Automatic error capture:

  • Server-side errors with stack traces
  • Client-side errors with browser context
  • Error grouping and fingerprinting
  • Affected users and methods
  • Build hash correlation for source maps

Real User Monitoring (RUM) - Client-Side

Automatic browser-side performance monitoring collecting Core Web Vitals and providing PageSpeed-style performance warnings.

What Gets Collected

Core Web Vitals:

  • LCP (Largest Contentful Paint) - Measures loading performance
    • Good: <2.5s | Needs Improvement: 2.5-4s | Poor: >4s
  • FID (First Input Delay) - Measures interactivity
    • Good: <100ms | Needs Improvement: 100-300ms | Poor: >300ms
  • CLS (Cumulative Layout Shift) - Measures visual stability
    • Good: <0.1 | Needs Improvement: 0.1-0.25 | Poor: >0.25
  • TTFB (Time to First Byte) - Measures server response time
    • Good: <800ms | Needs Improvement: 800-1800ms | Poor: >1800ms
  • FCP (First Contentful Paint) - Measures perceived load speed
    • Good: <1.8s | Needs Improvement: 1.8-3s | Poor: >3s
  • TTI (Time to Interactive) - Measures time until page is fully interactive
    • Good: <3.8s | Needs Improvement: 3.8-7.3s | Poor: >7.3s
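
The thresholds above translate directly into a small lookup. The sketch below is only an illustration of that mapping; rateMetric is a hypothetical helper, time-based values are expressed in milliseconds, and CLS is unitless.

```javascript
// Illustrative only: classify a metric against the thresholds listed above.
// rateMetric is a hypothetical helper; each entry is [good limit, poor limit].
const THRESHOLDS = {
  LCP: [2500, 4000],
  FID: [100, 300],
  CLS: [0.1, 0.25],
  TTFB: [800, 1800],
  FCP: [1800, 3000],
  TTI: [3800, 7300],
};

function rateMetric(name, value) {
  const limits = THRESHOLDS[name];
  if (!limits) throw new Error(`Unknown metric: ${name}`);
  const [good, poor] = limits;
  if (value < good) return 'good';
  if (value <= poor) return 'needs-improvement';
  return 'poor';
}
```

For example, rateMetric('LCP', 4200) falls above the 4s limit and is rated 'poor'.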

Additional Context:

  • Browser name and version
  • Device type (mobile, tablet, desktop)
  • Operating system
  • Network connection type, downlink speed, RTT
  • Viewport and screen dimensions
  • User ID (via Meteor.userId() for correlation with server-side traces)
  • Session ID (30-minute sessions with localStorage persistence)
  • Page route and referrer
  • Top 10 slowest resources

Configuration

RUM monitoring auto-initializes from your Meteor settings.

settings-development.json:

{
  "skysignal": {
    "apiKey": "sk_your_server_api_key_here",
    "endpoint": "http://localhost:3000"
  },
  "public": {
    "skysignal": {
      "publicKey": "pk_your_public_key_here",
      "endpoint": "http://localhost:3000",
      "rum": {
        "enabled": true,
        "sampleRate": 1.0,
        "debug": false
      }
    }
  }
}

Configuration Options:

| Option | Type | Default | Description |
| --- | --- | --- | --- |
| publicKey | String | required | SkySignal Public Key (pk_ prefix) - Safe for client-side use |
| endpoint | String | (same origin) | Base URL of SkySignal API (e.g., http://localhost:3000 or https://dash.skysignal.app) |
| rum.enabled | Boolean | true | Enable/disable RUM collection |
| rum.sampleRate | Number | Auto | Sample rate (0-1). Auto: 100% for localhost, 50% for production |
| rum.debug | Boolean | false | Enable console logging for debugging |

Key Security Note:

  • API Key (sk_ prefix): Server-side only, keep in private settings.skysignal. Used for server-to-server communication.
  • Public Key (pk_ prefix): Client-side safe, can be in settings.public.skysignal. Used for browser RUM collection.
  • This follows the Stripe pattern of separating public/private keys for security.

The agent automatically:

  • Collects Core Web Vitals using Google's web-vitals library
  • Tracks SPA route changes and collects metrics for each route
  • Batches measurements and sends via fire-and-forget HTTP with keepalive: true
  • Provides PageSpeed-style console warnings for poor performance
  • Correlates metrics with server-side traces via Meteor.userId()

SPA Route Change Tracking

The RUM client automatically detects route changes in single-page applications by:

  • Overriding history.pushState and history.replaceState
  • Listening for popstate events (browser back/forward)
  • Listening for hashchange events (hash-based routing)

Each route change triggers a new performance collection, allowing you to track performance across your entire application navigation flow.
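
The three detection mechanisms listed above can be sketched as follows. This is a generic illustration of the technique, not the RUM client's code: watchRouteChanges and onChange are hypothetical names, and the function takes a window-like object so it can be exercised outside a browser.

```javascript
// Illustrative only: detect SPA route changes on a window-like object.
// watchRouteChanges and onChange are hypothetical names, not part of the SkySignal API.
function watchRouteChanges(win, onChange) {
  const notify = () => onChange(win.location.pathname + win.location.hash);

  // Wrap pushState/replaceState so programmatic navigation is observed.
  for (const method of ['pushState', 'replaceState']) {
    const original = win.history[method];
    win.history[method] = function (...args) {
      const result = original.apply(this, args);
      notify();
      return result;
    };
  }

  // Back/forward navigation and hash-based routing fire events instead.
  win.addEventListener('popstate', notify);
  win.addEventListener('hashchange', notify);
}
```

In a browser this would be called as watchRouteChanges(window, (route) => { ... }), with the callback starting a fresh round of metric collection for the new route.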

Performance Warnings

When Core Web Vitals exceed recommended thresholds, the RUM collector logs PageSpeed-style warnings to the console:

[SkySignal RUM] Largest Contentful Paint (LCP) is slow: 4200ms. LCP should be under 2.5s for good user experience. Consider optimizing images, removing render-blocking resources, and improving server response times.

These warnings help developers identify performance issues during development and testing.

Manual Usage (Advanced)

While RUM auto-initializes, you can also use it manually:

import { SkySignalRUM } from 'meteor/skysignal:agent';

// Check if initialized
if (SkySignalRUM.isInitialized()) {
  // Get current session ID
  const sessionId = SkySignalRUM.getSessionId();

  // Get current metrics (for debugging)
  const metrics = SkySignalRUM.getMetrics();

  // Get performance warnings (for debugging)
  const warnings = SkySignalRUM.getWarnings();

  // Manually track a page view (for custom routing)
  SkySignalRUM.trackPageView('/custom-route');
}

How It Works

  1. Session Management - Creates a 30-minute session in localStorage, renews on user activity
  2. Core Web Vitals Collection - Uses Google's web-vitals library for accurate measurements
  3. Browser Context Collection - Detects browser, device, OS, network info from user agent and Navigator API
  4. Performance Warnings - Compares metrics against PageSpeed thresholds and logs warnings
  5. Batching - Batches measurements (default: 10 per batch, 5-second flush interval)
  6. HTTP Transmission - Sends to /api/v1/rum endpoint with keepalive: true for reliability
  7. SPA Detection - Automatically resets and re-collects metrics on route changes
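
Step 1 above (a 30-minute session that renews on activity) can be illustrated with a sliding expiry stored in a localStorage-like object. The helper getSessionId, the storage key, and the ID format are all hypothetical; the agent's actual key and ID scheme are not documented here.

```javascript
// Illustrative only: a 30-minute sliding session kept in a localStorage-like store.
// getSessionId, SESSION_KEY, and the ID format are hypothetical, not the agent's own.
const SESSION_TTL_MS = 30 * 60 * 1000;
const SESSION_KEY = 'example_rum_session';

function getSessionId(storage, now = Date.now()) {
  let session = null;
  try {
    session = JSON.parse(storage.getItem(SESSION_KEY));
  } catch (error) {
    session = null; // corrupt entry: start a fresh session
  }

  // Start a new session if none exists or the last activity is too old.
  if (!session || now - session.lastActivity > SESSION_TTL_MS) {
    session = { id: `${now}-${Math.random().toString(36).slice(2)}`, lastActivity: now };
  }

  // Renew on every call so ongoing activity keeps the session alive.
  session.lastActivity = now;
  storage.setItem(SESSION_KEY, JSON.stringify(session));
  return session.id;
}
```

In a browser the first argument would be window.localStorage; calling the function on each user interaction gives the renew-on-activity behavior.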

Advanced Usage

Manual Metric Collection

Send custom metrics to SkySignal:

import { SkySignalAgent } from 'meteor/skysignal:agent';

// Send a custom metric
SkySignalAgent.client.addCustomMetric({
  timestamp: new Date(),
  metricName: 'checkout_completed',
  metricValue: 1,
  unit: 'count',
  tags: {
    product: 'premium',
    region: 'us-east-1'
  }
});

Manual Trace Submission

Track custom operations:

import { SkySignalAgent } from 'meteor/skysignal:agent';

const startTime = Date.now();

// Your code here...

SkySignalAgent.client.addTrace({
  traceType: 'method',
  methodName: 'myCustomOperation',
  timestamp: new Date(startTime),
  duration: Date.now() - startTime, // elapsed milliseconds
  userId: this.userId, // only defined inside a Meteor Method or publication handler
  operations: [
    { type: 'start', time: 0, details: {} },
    { type: 'db', time: 50, details: { collection: 'users', func: 'findOne' } },
    { type: 'complete', time: 150, details: {} }
  ]
});

Stopping the Agent

To gracefully stop the agent (e.g., during shutdown):

SkySignalAgent.stop();

This will:

  1. Stop all collectors
  2. Flush any remaining batched data
  3. Clear all intervals
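
One common place to call stop() is a process signal handler. The sketch below assumes only that the agent object has a stop() method; registerShutdown is a hypothetical helper, and whether stop() returns a promise or finishes flushing synchronously is not stated in this README, so the code handles both cases.

```javascript
// Illustrative only: stop an agent-like object when the process is asked to exit.
// registerShutdown is a hypothetical helper; `agent` is anything with a stop() method.
function registerShutdown(agent, proc = process) {
  const handler = () => {
    // Promise.resolve().then(...) covers a synchronous or promise-returning stop().
    Promise.resolve()
      .then(() => agent.stop())
      .catch((error) => console.error('Agent stop failed:', error))
      .finally(() => proc.exit(0));
  };
  proc.once('SIGTERM', handler);
  proc.once('SIGINT', handler);
}
```

On the server this would be wired up once at startup, for example registerShutdown(SkySignalAgent).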

Performance Impact

The agent is designed to have minimal performance impact:

  • Batching - Data is batched and sent asynchronously
  • Non-blocking - HTTP requests use Meteor.defer() to avoid blocking
  • Configurable Intervals - Adjust collection frequency based on your needs
  • Automatic Retries - Failed requests are re-queued automatically

Typical overhead:

  • CPU: < 1% additional usage
  • Memory: ~10-20MB for batching
  • Network: ~1KB per metric, sent in batches

Troubleshooting

Agent Not Sending Data

  1. Check that your API key is correct
  2. Verify enabled: true in configuration
  3. Check server logs for error messages
  4. Verify network connectivity to SkySignal API
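
The first two checks can be automated with a small sanity check on the settings object. checkAgentConfig is a hypothetical helper written for this example; it relies only on the documented sk_ prefix and the enabled option.

```javascript
// Illustrative only: sanity-check a settings object before digging into logs.
// checkAgentConfig is a hypothetical helper, not part of the SkySignal API.
function checkAgentConfig(config) {
  if (!config) return ['No skysignal settings block found'];
  const problems = [];
  if (!config.apiKey) {
    problems.push('apiKey is missing');
  } else if (!config.apiKey.startsWith('sk_')) {
    problems.push('apiKey should start with sk_ (a pk_ key is for the browser only)');
  }
  if (config.enabled === false) problems.push('enabled is set to false');
  return problems;
}
```

Running checkAgentConfig(Meteor.settings.skysignal) on the server and logging the result gives a quick first answer before checking connectivity.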

High Memory Usage

If you notice high memory usage:

  1. Reduce batchSize so that batches flush sooner and hold less data
  2. Collect less often by increasing the collection intervals (e.g., systemMetricsInterval)
  3. Disable collectors you don't need

Missing System Metrics

Some system metrics (disk, network, process count) require platform-specific APIs:

  • Use the systeminformation npm package for comprehensive cross-platform metrics
  • These metrics may return null on certain platforms

API Reference

SkySignalAgent

Main agent singleton instance.

Methods

  • configure(options) - Configure the agent with options
  • start() - Start all collectors and monitoring
  • stop() - Stop all collectors and flush data

Properties

  • client - HTTP client instance for manual data submission
  • config - Current configuration object
  • collectors - Active collector instances

Support

Changelog

v2.0.0 (Full APM Release)

  • Complete Method Tracing - Automatic instrumentation with operation-level profiling
  • MongoDB Query Analysis - explain() support, N+1 detection, slow query analysis
  • this.unblock() Analysis - Optimization recommendations for blocking methods
  • DDP Connection Monitoring - Real-time WebSocket tracking with latency metrics
  • MongoDB Pool Monitoring - Connection pool health, checkout times, queue tracking
  • Live Query Monitoring - Oplog vs polling efficiency tracking
  • Background Job Monitoring - Support for msavin:sjobs with extensible adapter system
  • HTTP Request Monitoring - Automatic tracking of server HTTP requests
  • Collection Stats - MongoDB collection size and index statistics
  • App Version Tracking - Auto-detection from package.json with manual override
  • Build Hash Tracking - Source map correlation via BUILD_HASH/GIT_SHA env vars
  • Performance Safeguards - Memory limits, request timeouts, batch retries

v1.1.0 (RUM Release)

  • Real User Monitoring (RUM) - Client-side Core Web Vitals collection (LCP, FID, CLS, TTFB, FCP, TTI)
  • PageSpeed-Style Warnings - Automatic performance threshold warnings in console
  • SPA Route Tracking - Automatic performance collection on every route change
  • Session Management - 30-minute sessions with localStorage persistence
  • Browser Context Collection - Automatic device, browser, OS, network information
  • User Correlation - Uses Meteor.userId() to correlate with server-side traces
  • Fire-and-Forget HTTP - Reliable transmission with keepalive during page unload
  • Configurable Sampling - Auto-detects environment (100% dev, 50% prod) or manual configuration
  • web-vitals Integration - Uses Google's official Core Web Vitals library

v1.0.0 (Initial Release)

  • System metrics monitoring (CPU, memory, load average)
  • HTTP client with batching and auto-flush
  • Configurable collection intervals
  • Basic error handling and retry logic
  • Multi-tenant ready architecture