
CI/CD

GitHub Actions

Set up the workflow

.github/workflows/llm-tests.yml
```yaml
name: LLM Tests

on:
  push:
    branches: [main]
  pull_request:

jobs:
  unit-tests:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: "3.12"
      - run: pip install -e ".[dev]"
      - run: pytest tests/ -v --ignore=tests/test_real_*

  llm-tests:
    runs-on: ubuntu-latest
    if: github.ref == 'refs/heads/main'
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: "3.12"
      - run: pip install -e ".[dev,openai,anthropic]"
      - run: pytest -m llmtest --tb=short
        env:
          OPENAI_API_KEY: ${{ secrets.OPENAI_API_KEY }}
```

Add test markers

test_ci.py
```python
import pytest

@pytest.mark.llmtest
def test_with_llm(llm):
    output = llm("Hello", model="gpt-4o-mini")
    assert output.content
```
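Unregistered markers trigger a `PytestUnknownMarkWarning`, so it is worth declaring `llmtest` in your pytest configuration. A minimal sketch, assuming you keep pytest settings in `pyproject.toml`:

```toml
[tool.pytest.ini_options]
markers = [
    "llmtest: tests that call a real LLM API (cost money, skipped by default filters)",
]
```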

Run selectively

Terminal
```shell
pytest -m llmtest        # only LLM tests
pytest -m "not llmtest"  # everything except LLM tests
```

Cost control — LLM tests cost money. Only run them on main or use cheap models like gpt-4o-mini in CI.
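One way to enforce the cheap-model rule is to resolve the model name from an environment variable, so CI pins `gpt-4o-mini` while developers can override locally. A minimal sketch; the `LLM_TEST_MODEL` variable and `test_model` helper are our own convention here, not part of assertllm:

```python
import os

def test_model(default: str = "gpt-4o-mini") -> str:
    """Return the model to use in LLM tests.

    Reads LLM_TEST_MODEL (a hypothetical convention, not an assertllm
    built-in) so CI can pin a cheap model while local runs override it.
    """
    return os.environ.get("LLM_TEST_MODEL", default)
```

Your CI workflow would then leave `LLM_TEST_MODEL` unset (or set it explicitly to a cheap model), and tests would call `llm("Hello", model=test_model())`.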

Cost Control

test_budget.py
```python
from assertllm import expect, llm_test

@llm_test(
    expect.cost_under(0.01),
    expect.latency_under(10000),
    model="gpt-4o-mini",
)
def test_ci_safe(llm):
    llm("What is 2+2?")
```
Output
```
test_budget.py::test_ci_safe
  "4"
  ✓ cost_under(0.01) — $0.000005
  ✓ latency_under(10000) — 234ms
PASSED [0.2s]
```

Use expect.cost_under() to set a hard budget per test. Tests that exceed the limit will fail immediately.
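A budget check like this can be computed from token counts and per-token pricing. A minimal sketch of the idea; the price table and function names below are illustrative assumptions, not assertllm internals, and real prices change, so check your provider's pricing page:

```python
# Assumed USD prices per 1M tokens -- illustrative only, not live pricing.
PRICES = {"gpt-4o-mini": {"input": 0.15, "output": 0.60}}

def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Estimate a single call's cost in USD from its token counts."""
    p = PRICES[model]
    return (input_tokens * p["input"] + output_tokens * p["output"]) / 1_000_000

def cost_under(limit: float, model: str, input_tokens: int, output_tokens: int) -> bool:
    """Hard budget check: True iff the estimated cost stays below the limit."""
    return estimate_cost(model, input_tokens, output_tokens) < limit
```

For a tiny prompt like "What is 2+2?", roughly 10 input and 5 output tokens come out to about $0.0000045 on these assumed prices, which is why a one-cent budget comfortably passes.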
