Testing Strategy

The Ominis Cluster Manager uses a compartmentalized testing approach that separates tests by scope, speed, and dependencies. This enables fast development cycles while maintaining comprehensive coverage for production deployments.

Overview

The testing strategy is built on three core principles:

  1. Fast Feedback: Unit and API tests run in milliseconds, enabling rapid iteration
  2. Comprehensive Coverage: ~100 endpoints tested, including error scenarios
  3. Infrastructure Testing: E2E tests validate Kubernetes pod lifecycle separately

All tests use production-ready defaults (e.g., longest-idle-agent strategy, 1000 max sessions) to catch configuration issues early.
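
One way to keep these defaults consistent across the suite is a shared fixture. The sketch below is illustrative only: the fixture name queue_data and its placement in tests/conftest.py are assumptions, but the values mirror the Queue Configuration section later in this guide.

# tests/conftest.py (illustrative sketch -- fixture name and location are assumptions)
import pytest


@pytest.fixture
def queue_data() -> dict:
    """Queue payload with production defaults, shared across tests."""
    return {
        "name": "sales-queue",
        "strategy": "longest-idle-agent",  # production routing strategy
        "max_sessions": 1000,              # realistic ceiling, not 10000
        "max_wait_time": 300,              # 5 minutes
        "tier_rules_apply": True,
        "agent_timeout": 30,
    }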

Test Structure

The test suite is organized into four compartments, each serving a specific purpose:

Directory Structure

tests/
├── unit/                                  # Unit tests (no external dependencies)
│   ├── test_auth.py                       # API key authentication logic
│   ├── test_error_handlers.py             # Error handling and response formatting
│   └── test_middleware.py                 # Middleware (CORS, Prometheus)
│
├── api/                                   # API contract tests (mocked backends)
│   ├── test_queues_comprehensive.py       # Queue CRUD endpoints
│   ├── test_extensions_comprehensive.py   # Extension management
│   ├── test_callcenter_comprehensive.py   # Agents/tiers/members
│   ├── test_telephony_comprehensive.py    # Call control endpoints
│   ├── test_campaigns_comprehensive.py    # Campaign management
│   ├── test_acl_comprehensive.py          # ACL management
│   ├── test_call_control_comprehensive.py # n8n call control API
│   ├── test_directory_comprehensive.py    # FreeSWITCH XML-CURL
│   ├── test_ivr_n8n_comprehensive.py      # IVR n8n integration
│   └── test_channels_comprehensive.py     # Channel monitoring
│
├── e2e/                                   # End-to-end infrastructure tests
│   ├── test_01_health.py                  # Health/connectivity checks
│   ├── test_02_queue_lifecycle.py         # Queue pod creation/deletion (K8s)
│   ├── test_03_acl_reload.py              # ACL ConfigMap + FreeSWITCH reload
│   └── test_04_ivr_lifecycle.py           # IVR pod creation/deletion (K8s)
│
├── integration/                           # Multi-step workflow tests
│   ├── test_queue_workflow.py             # Create queue → agent → tier → call
│   ├── test_extension_workflow.py         # Create extension → register → call
│   └── test_campaign_workflow.py          # Create campaign → contacts → monitor
│
└── helpers/                               # Shared test utilities
    ├── endpoint_registry.py               # Complete endpoint registry
    └── schema_validators.py               # Schema validation helpers
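
The helpers are shared by the API suite and the coverage report. As an orientation, here is a minimal sketch of what tests/helpers/endpoint_registry.py might contain; the exact structure and entries are assumptions, not the project's actual code.

# tests/helpers/endpoint_registry.py (illustrative sketch -- actual structure may differ)
from dataclasses import dataclass


@dataclass(frozen=True)
class Endpoint:
    method: str
    path: str
    backend: str  # "xml_rpc", "database", "kubernetes", "hybrid", or "static"


# One entry per API route; the coverage report compares this list
# against the endpoints exercised by the test run.
ENDPOINTS = [
    Endpoint("GET", "/v1/queues", "kubernetes"),
    Endpoint("POST", "/v1/extensions", "database"),
    Endpoint("POST", "/v1/telephony/originate", "xml_rpc"),
    # ... remaining routes
]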

Test Types

Unit Tests

Purpose: Test individual functions/classes in isolation

  • Speed: Very fast (<10ms per test)
  • Dependencies: None (fully mocked)
  • Coverage: Business logic, validators, utilities
  • Location: tests/unit/

@pytest.mark.unit
@pytest.mark.asyncio
class TestAPIAuthentication:
    async def test_missing_api_key_returns_401(self, unauthenticated_client):
        response = await unauthenticated_client.get("/v1/queues")

        assert response.status_code == status.HTTP_401_UNAUTHORIZED
        data = response.json()
        assert data["code"] == "MISSING_API_KEY"

API Tests

Purpose: Test API contracts, schemas, and error handling

  • Speed: Fast (10-100ms per test)
  • Dependencies: Mocked (database, XML-RPC, Kubernetes)
  • Coverage: Every endpoint has happy path + error scenarios
  • Location: tests/api/

@pytest.mark.api
@pytest.mark.kubernetes
@pytest.mark.asyncio
class TestQueuesListEndpoint:
    async def test_list_queues_success(self, api_client):
        response = await api_client.get("/v1/queues")

        assert response.status_code == status.HTTP_200_OK
        data = response.json()
        assert "queues" in data
        assert "total" in data

E2E Tests

Purpose: Test infrastructure integration (Kubernetes, FreeSWITCH)

  • Speed: Slow (1-5 minutes per test with pod creation)
  • Dependencies: Real infrastructure required
  • Coverage: Pod lifecycle, ConfigMap changes, system readiness
  • Location: tests/e2e/

@pytest.mark.asyncio
@pytest.mark.timeout(300)
async def test_queue_pod_lifecycle(api_client):
    # Create queue
    create_response = await api_client.post("/v1/queues", json=queue_data)
    assert create_response.status_code == 201

    # Wait for pod ready
    await wait_for_pod_ready("queue-sales", timeout=120)

    # Verify pod exists
    status_response = await api_client.get("/v1/queues/sales/status")
    assert status_response.json()["pod_status"] == "Running"

Integration Tests

Purpose: Test multi-step workflows across endpoints

  • Speed: Medium (100ms-1s per test)
  • Dependencies: Real or mocked depending on workflow
  • Coverage: User journeys and cross-cutting concerns
  • Location: tests/integration/

@pytest.mark.integration
@pytest.mark.asyncio
async def test_complete_queue_workflow(api_client):
    # 1. Create queue
    queue_response = await api_client.post("/v1/queues", json=queue_data)
    queue_name = queue_response.json()["name"]

    # 2. Add agent
    agent_response = await api_client.post("/v1/agents", json=agent_data)

    # 3. Create tier
    tier_response = await api_client.post("/v1/tiers", json={
        "queue": queue_name,
        "agent": agent_data["name"]
    })

    # 4. Verify call can be placed
    call_response = await api_client.post("/v1/telephony/originate",
                                          json=originate_data)
    assert call_response.status_code == 200

Backend Classification

Tests are tagged by backend type to track infrastructure dependencies and identify bottlenecks:

Marker Definitions

| Marker                  | Backend              | Endpoint Count | Example Endpoints                                    |
|-------------------------|----------------------|----------------|------------------------------------------------------|
| @pytest.mark.xml_rpc    | FreeSWITCH XML-RPC   | ~35            | /v1/telephony/originate, /v1/campaigns/{id}/start    |
| @pytest.mark.database   | PostgreSQL           | ~40            | /v1/extensions, /v1/agents                           |
| @pytest.mark.kubernetes | K8s API              | ~15            | /v1/queues, /v1/acl                                  |
| @pytest.mark.hybrid     | Multiple backends    | ~20            | /v1/freeswitch/directory, /v1/extensions/{id}/reload |
| @pytest.mark.static     | Configuration/health | ~10            | /health, /metrics                                    |

Usage Example

@pytest.mark.api
@pytest.mark.xml_rpc  # This test uses FreeSWITCH XML-RPC
@pytest.mark.asyncio
async def test_originate_call_success(api_client):
    response = await api_client.post("/v1/telephony/originate", json={
        "destination": "1001@default",
        "caller_id_number": "5551234"
    })
    assert response.status_code == 200

Running Tests

Quick Commands

# Fast development loop (recommended)
pytest tests/api/ -v # All API tests (~10-100ms each)
pytest tests/api/test_queues_comprehensive.py -v # Specific router

# Run by test type
pytest tests/unit/ -v # Unit tests only
pytest tests/e2e/ -v --timeout=300 # E2E tests (slow)
pytest tests/integration/ -v # Integration workflows

# Run all tests
make test # Uses pytest -q
pytest -v # All tests with verbose output

Run by Backend Type

Filter tests by infrastructure dependency:

# Run only database tests
pytest -m database -v

# Run only XML-RPC tests
pytest -m xml_rpc -v

# Run only Kubernetes tests
pytest -m kubernetes -v

# Run hybrid backend tests
pytest -m hybrid -v

# Skip slow E2E tests
pytest -m "not e2e" -v

Makefile Integration

# Test targets in Makefile
make test # Run all tests (pytest -q)
make lint # Run linters (ruff + black)
make doctor # Environment checks

# Combined workflow
make lint && make test

Test Execution Flow

A typical run works through the compartments in order of cost: unit and API tests first for fast feedback, then integration workflows, and finally E2E tests against real infrastructure.

Coverage Strategy

Endpoint Coverage Goals

  • 100% endpoint coverage: Every endpoint tested
  • Error scenarios: 401, 404, 400, 409, 500 (one pattern for covering these is sketched below)
  • 90%+ line coverage: On router files
  • Schema validation: All fields validated
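
One way to keep error-scenario coverage uniform without duplicating boilerplate is to parametrize over the failing paths. The sketch below shows the 404 case; the specific paths are illustrative, not an exhaustive list from the real API.

import pytest
from fastapi import status


@pytest.mark.api
@pytest.mark.asyncio
@pytest.mark.parametrize("path", [
    "/v1/queues/nonexistent",
    "/v1/extensions/nonexistent",
    "/v1/agents/nonexistent",
])
async def test_missing_resources_return_404(api_client, path):
    """Unknown resources return 404 with the standard ErrorResponse shape."""
    response = await api_client.get(path)

    assert response.status_code == status.HTTP_404_NOT_FOUND
    data = response.json()
    assert "code" in data
    assert "message" in data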

Coverage by Router

| Router            | Endpoints | Happy Path | Error Cases | Status   |
|-------------------|-----------|------------|-------------|----------|
| Queues            | 8         | ✓          | ✓           | Complete |
| Extensions        | 12        | ✓          | ✓           | Complete |
| Callcenter Direct | 15        | ✓          | ✓           | Complete |
| Telephony         | 25+       | ✓          | ✓           | Complete |
| Campaigns         | 10        | ✓          | ✓           | Complete |
| ACL               | 4         | ✓          | ✓           | Complete |
| Call Control      | 13        | ✓          | ✓           | Complete |
| Directory         | 1         | ✓          | ✓           | Complete |
| IVR n8n           | 7         | ✓          | ✓           | Complete |
| Channels          | 3         | ✓          | ✓           | Complete |
| TOTAL             | 98+       | ✓          | ✓           | Complete |

Generate Coverage Report

# Run tests with coverage tracking
pytest --coverage-report=reports/test-coverage-report.json

# View summary
cat reports/test-coverage-report.json | jq '.summary'

# View backend breakdown
cat reports/test-coverage-report.json | jq '.backends'

# View endpoint details
cat reports/test-coverage-report.json | jq '.endpoints[] | select(.tested == false)'

Example Coverage Report Output

{
  "summary": {
    "total_endpoints": 98,
    "tested_endpoints": 98,
    "coverage_percentage": 100.0,
    "test_count": 156,
    "passed": 156,
    "failed": 0
  },
  "backends": {
    "xml_rpc": {
      "count": 35,
      "tested": 35,
      "coverage": 100.0
    },
    "database": {
      "count": 40,
      "tested": 40,
      "coverage": 100.0
    },
    "kubernetes": {
      "count": 15,
      "tested": 15,
      "coverage": 100.0
    }
  }
}
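
If you want CI to fail when coverage regresses, a small gate script can read the summary block. The sketch below assumes the report format shown above; the script path scripts/check_coverage.py is illustrative.

# scripts/check_coverage.py (illustrative sketch, assumes the report format above)
import json
import sys

REPORT = "reports/test-coverage-report.json"
THRESHOLD = 100.0  # every endpoint must be tested


def main() -> int:
    with open(REPORT) as fh:
        summary = json.load(fh)["summary"]

    if summary["failed"] or summary["coverage_percentage"] < THRESHOLD:
        print(f"Coverage gate failed: {summary}")
        return 1

    print(f"Coverage OK: {summary['coverage_percentage']}% "
          f"({summary['tested_endpoints']}/{summary['total_endpoints']} endpoints)")
    return 0


if __name__ == "__main__":
    sys.exit(main())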

Production-Ready Test Data

All test data uses production-appropriate values to catch configuration issues early:

Queue Configuration

# ✅ CORRECT - Production defaults
queue_data = {
    "name": "sales-queue",
    "strategy": "longest-idle-agent",  # NOT "ring-all"
    "max_sessions": 1000,              # NOT 10000
    "max_wait_time": 300,              # 5 minutes
    "tier_rules_apply": True,
    "agent_timeout": 30
}

# ❌ WRONG - Non-production values
queue_data = {
    "strategy": "ring-all",    # Testing artifact
    "max_sessions": 10000      # Unrealistic
}

Extension Configuration

# ✅ CORRECT - Strong passwords
extension_data = {
    "extension_number": "1001",
    "password": "SecurePass123!",  # 12+ characters, strong
    "display_name": "John Doe",
    "domain": "default"
}

# ❌ WRONG - Weak passwords
extension_data = {
    "password": "1234"  # Too weak for production
}

Schema Validation

Every test validates:

  • Required fields present
  • Optional fields have sensible defaults
  • Types match Pydantic models
  • Constraints enforced (min/max, regex)

# Example schema validation (SchemaValidator lives in tests/helpers/schema_validators.py)
import warnings

from tests.helpers.schema_validators import SchemaValidator

validator = SchemaValidator()
issues = validator.validate_queue_config(queue_data)

# Warn on non-production values
if issues:
    warnings.warn(f"Non-production config detected: {issues}")
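
The validator itself can stay small. Below is a sketch of what tests/helpers/schema_validators.py might implement; the field names follow the queue payload above, but the specific checks are assumptions rather than the project's actual rules.

# tests/helpers/schema_validators.py (illustrative sketch)
class SchemaValidator:
    """Flags test payloads that drift from production defaults."""

    def validate_queue_config(self, queue_data: dict) -> list[str]:
        issues = []
        if queue_data.get("strategy") == "ring-all":
            issues.append("strategy 'ring-all' is a testing artifact")
        if queue_data.get("max_sessions", 0) > 1000:
            issues.append("max_sessions above the production ceiling of 1000")
        if not queue_data.get("tier_rules_apply", True):
            issues.append("tier_rules_apply disabled")
        return issues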

CI/CD Integration

GitHub Actions Example

name: Test Suite

on: [push, pull_request]

jobs:
  test:
    runs-on: ubuntu-latest

    steps:
      - uses: actions/checkout@v3

      - name: Set up Python
        uses: actions/setup-python@v4
        with:
          python-version: '3.12'

      - name: Install dependencies
        run: |
          python -m pip install --upgrade pip
          pip install -r requirements.txt

      - name: Run linters
        run: make lint

      - name: Run unit tests
        run: pytest tests/unit/ -v

      - name: Run API tests
        run: pytest tests/api/ -v

      - name: Generate coverage report
        run: pytest --coverage-report=reports/test-coverage-report.json

      - name: Upload coverage report
        uses: actions/upload-artifact@v3
        with:
          name: test-coverage-report
          path: reports/test-coverage-report.json

Pre-commit Hooks

# .pre-commit-config.yaml
repos:
  - repo: local
    hooks:
      - id: pytest-unit
        name: pytest unit tests
        entry: pytest tests/unit/ -v
        language: system
        pass_filenames: false
        always_run: true

      - id: pytest-api
        name: pytest api tests
        entry: pytest tests/api/ -v
        language: system
        pass_filenames: false
        always_run: true

Test Examples

Unit Test Example

"""Unit tests for error handling."""

import pytest
from fastapi import status

@pytest.mark.unit
@pytest.mark.asyncio
class TestErrorHandlers:
async def test_http_exception_handler(self, api_client):
"""Test HTTPException is formatted correctly."""
# Trigger 404 error
response = await api_client.get("/v1/queues/nonexistent")

assert response.status_code == status.HTTP_404_NOT_FOUND
data = response.json()

# Validate ErrorResponse structure
assert "code" in data
assert "message" in data
assert "details" in data
assert data["code"] == "QUEUE_NOT_FOUND"

API Test Example

"""Comprehensive API tests for Extensions."""

import pytest
from fastapi import status

@pytest.mark.api
@pytest.mark.database
@pytest.mark.asyncio
class TestExtensionsCreateEndpoint:
async def test_create_extension_success(self, api_client, test_extension):
"""Test creating extension with valid data."""
response = await api_client.post("/v1/extensions", json=test_extension)

assert response.status_code == status.HTTP_201_CREATED
data = response.json()

# Validate response structure
assert data["extension_number"] == test_extension["extension_number"]
assert data["domain"] == test_extension["domain"]
assert "created_at" in data

async def test_create_extension_duplicate_returns_409(self, api_client):
"""Test creating duplicate extension returns 409 Conflict."""
extension_data = {
"extension_number": "1001",
"password": "SecurePass123!",
"domain": "default"
}

# Create first time
await api_client.post("/v1/extensions", json=extension_data)

# Try to create duplicate
response = await api_client.post("/v1/extensions", json=extension_data)

assert response.status_code == status.HTTP_409_CONFLICT
data = response.json()
assert data["code"] == "EXTENSION_EXISTS"

E2E Test Example

"""End-to-end infrastructure tests."""

import pytest
import asyncio

@pytest.mark.e2e
@pytest.mark.asyncio
@pytest.mark.timeout(300)
async def test_queue_pod_lifecycle(api_client, k8s_helper):
"""Test complete queue pod creation and deletion."""
queue_name = f"test-queue-{int(time.time())}"

# 1. Create queue (triggers pod creation)
create_response = await api_client.post("/v1/queues", json={
"name": queue_name,
"strategy": "longest-idle-agent"
})
assert create_response.status_code == 201

# 2. Wait for pod to be ready
pod_ready = await k8s_helper.wait_for_pod_ready(
f"queue-{queue_name}",
namespace="client-demo-client",
timeout=120
)
assert pod_ready

# 3. Verify pod is running
status_response = await api_client.get(f"/v1/queues/{queue_name}/status")
assert status_response.json()["pod_status"] == "Running"

# 4. Delete queue (triggers pod deletion)
delete_response = await api_client.delete(f"/v1/queues/{queue_name}")
assert delete_response.status_code == 200

# 5. Verify pod is deleted
pod_deleted = await k8s_helper.wait_for_pod_deleted(
f"queue-{queue_name}",
namespace="client-demo-client",
timeout=60
)
assert pod_deleted

Integration Workflow Example

"""Integration workflow tests."""

import pytest

@pytest.mark.integration
@pytest.mark.asyncio
async def test_complete_extension_workflow(api_client):
"""Test complete extension lifecycle workflow."""
extension_number = f"test-{int(time.time())}"

# 1. Create extension
create_response = await api_client.post("/v1/extensions", json={
"extension_number": extension_number,
"password": "SecurePass123!",
"display_name": "Test User",
"domain": "default"
})
assert create_response.status_code == 201

# 2. Verify extension exists
get_response = await api_client.get(f"/v1/extensions/{extension_number}")
assert get_response.status_code == 200

# 3. Update extension
update_response = await api_client.put(
f"/v1/extensions/{extension_number}",
json={"display_name": "Updated User"}
)
assert update_response.status_code == 200

# 4. Reload Sofia profile (triggers FreeSWITCH reload)
reload_response = await api_client.post(
f"/v1/extensions/{extension_number}/reload"
)
assert reload_response.status_code == 200

# 5. Delete extension
delete_response = await api_client.delete(
f"/v1/extensions/{extension_number}"
)
assert delete_response.status_code == 200

# 6. Verify deletion
get_deleted_response = await api_client.get(
f"/v1/extensions/{extension_number}"
)
assert get_deleted_response.status_code == 404

Testing Best Practices

1. Use Production-Ready Defaults

Always use realistic configuration values:

# ✅ GOOD
queue_data = {
    "strategy": "longest-idle-agent",
    "max_sessions": 1000,
    "agent_timeout": 30
}

# ❌ BAD
queue_data = {
    "strategy": "ring-all",    # Testing artifact
    "max_sessions": 10000      # Unrealistic
}

2. Test Error Scenarios

Every endpoint should test error cases:

async def test_get_queue_success(api_client):
    """Test happy path."""
    pass

async def test_get_queue_not_found(api_client):
    """Test 404 error."""
    pass

async def test_get_queue_unauthorized(unauthenticated_client):
    """Test 401 error."""
    pass

3. Use Appropriate Markers

Tag tests with correct backend markers:

@pytest.mark.api
@pytest.mark.database  # Uses PostgreSQL
@pytest.mark.asyncio
async def test_create_extension(api_client):
    pass

4. Keep Tests Fast

Use mocking for external dependencies:

@pytest.mark.api
@pytest.mark.asyncio
async def test_originate_call_mock(api_client, mock_xmlrpc):
    """Test call origination with mocked XML-RPC."""
    mock_xmlrpc.return_value = {"status": "success"}

    response = await api_client.post("/v1/telephony/originate", json=data)
    assert response.status_code == 200
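
The mock_xmlrpc fixture used above is not defined in this guide. One way to provide it is to patch the XML-RPC client at the service boundary; the patch target app.services.xmlrpc_client.call below is an assumption and should be adjusted to the real module path.

# tests/conftest.py (illustrative sketch -- the patch target is an assumption)
from unittest.mock import AsyncMock, patch

import pytest


@pytest.fixture
def mock_xmlrpc():
    """Replace the FreeSWITCH XML-RPC call with an async mock."""
    with patch("app.services.xmlrpc_client.call", new_callable=AsyncMock) as mocked:
        yield mocked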

5. Document Test Purpose

Use clear docstrings:

async def test_create_queue_duplicate_returns_409(api_client):
    """
    Test creating duplicate queue returns 409 Conflict.

    Validates:
    - Duplicate detection works
    - Error response format is correct
    - Error code is QUEUE_EXISTS
    """
    pass

ADR: Test Organization Strategy

Status: Accepted

Context: Need a testing approach that enables fast development cycles while ensuring production quality.

Decision: Compartmentalized test structure with unit/API/E2E/integration layers.

Consequences:

  • ✅ Fast feedback loop (unit + API tests run in ~6 seconds)
  • ✅ 100% endpoint coverage with error scenarios
  • ✅ Clear separation of concerns
  • ✅ Production-ready test data catches config issues early
  • ⚠️ Requires discipline to maintain separation
  • ⚠️ E2E tests are slow (infrastructure-dependent)

Alternatives Considered:

  1. Monolithic E2E only: Rejected due to slow feedback
  2. Unit tests only: Rejected due to lack of integration coverage
  3. Mixed structure: Rejected due to confusion about test scope

Pytest Configuration

The test suite uses the following pytest configuration:

[pytest]
timeout = 300
timeout_method = thread
addopts = -p no:warnings --color=yes --strict-markers
pythonpath = .
log_cli_level = WARNING

markers =
    unit: Unit tests
    api: API integration tests
    e2e: End-to-end tests
    integration: Integration workflow tests
    xml_rpc: Tests that use XML-RPC backend
    database: Tests that use PostgreSQL backend
    kubernetes: Tests that use Kubernetes backend
    hybrid: Tests that use multiple backends
    static: Tests for static configuration/health endpoints
    slow: Marks tests as slow (deselect with '-m "not slow"')

Troubleshooting

Tests Running Slowly

# Run only fast tests (skip E2E)
pytest -m "not e2e" -v

# Run specific test type
pytest tests/unit/ tests/api/ -v # Skip slow E2E tests

Test Failures in CI

# Run tests with verbose output
pytest -v --tb=long

# Run with debug logging
pytest -v -s --log-cli-level=DEBUG

Missing Test Coverage

# Generate coverage report to find gaps
pytest --coverage-report=reports/test-coverage-report.json

# View untested endpoints
cat reports/test-coverage-report.json | jq '.endpoints[] | select(.tested == false)'

Fixture Not Found

Ensure you're using the correct fixture:

# ✅ CORRECT
async def test_something(api_client):  # Uses authenticated client
    pass

async def test_auth_error(unauthenticated_client):  # Uses unauthenticated client
    pass

# ❌ WRONG
async def test_something(client):  # No such fixture
    pass
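
These client fixtures are typically defined once in tests/conftest.py. The sketch below uses httpx against the FastAPI app in-process; the app import path, the API key header name, and the key value are assumptions, not the project's actual configuration.

# tests/conftest.py (illustrative sketch -- import path and header name are assumptions)
import httpx
import pytest_asyncio

from app.main import app  # hypothetical application module


@pytest_asyncio.fixture
async def api_client():
    """HTTP client with a valid API key."""
    transport = httpx.ASGITransport(app=app)
    headers = {"X-API-Key": "test-api-key"}
    async with httpx.AsyncClient(transport=transport, base_url="http://test",
                                 headers=headers) as client:
        yield client


@pytest_asyncio.fixture
async def unauthenticated_client():
    """HTTP client without credentials, used for 401 tests."""
    transport = httpx.ASGITransport(app=app)
    async with httpx.AsyncClient(transport=transport, base_url="http://test") as client:
        yield client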

Next Steps

  1. Run the test suite to validate your changes:

     pytest tests/api/ tests/unit/ -v

  2. Generate a coverage report to identify gaps:

     pytest --coverage-report=reports/test-coverage-report.json
     cat reports/test-coverage-report.json | jq '.summary'

  3. Add integration tests for complex workflows (optional)

  4. Set up CI/CD to run tests on every commit