Validating OpenAPI Specs in a CI/CD Pipeline: A Guide to Ensuring API Quality

An API should operate as intended. This requires both an expressed intent and verification that the API does, in fact, fulfill that intent.

Where We've Been

Expressing your API's intent enables product teams to verify that a critical gateway into your product's data operates as expected, and with integrity. More broadly, this puts into practice what Dr. Steven Spear calls a "problem-solving discipline" in The High-Velocity Edge, in which verifiable predictions form the core of a learning organization:

Problem solving is done in a disciplined fashion. Assumptions about cause and effect are made explicit and are stated clearly, then they are tested in a rigorous fashion so improvement efforts both make processes better and deepen process knowledge.

In creating and maintaining an API specification, we make our intent explicit. This stands in contrast to the scenario in which an API's actual behavior simply "is what it is." With explicit intent, we gain the opportunity to compare "what is" with what we wanted, and in that gap lies the learning that helps improve our products.

In previous articles, I've covered the advantages of capturing an API's intent in an OpenAPI specification. This includes the following:

  • Creating browsable API documentation (see article series)

  • Validating (locally) that an API behaves as intended (see article)

Building on the last point above, in this article I'll show how to validate your API against its spec continuously, in a CI pipeline built with GitHub Actions.

Following Along

All code I present here is hosted on GitHub at christherama/dino-api. There are three ways you can follow along with the implementation I'll discuss:

  1. Clone the git repo and check out the tag at our starting point: git checkout tags/v0.0.1 -b validate-spec-in-ci

  2. Download an archive of the project at our starting point here

  3. See the final outcome at the v1.0.0 tag here

Because I'll be using GitHub Actions to run the API spec validations, you'll need your own repository if you want to do the same. You could fork the repo above; however, because special rules apply to workflows run from forked repositories, I recommend creating your own GitHub repository, then, after cloning the repository referenced above, pointing its remote URL at the one you've created:

git clone git@github.com:christherama/dino-api.git
git remote set-url origin git@github.com:<USERNAME>/<REPO>.git
git push --set-upstream origin main

# If you want to begin from our starting point
git checkout tags/v0.0.1 -b validate-spec-in-ci

Refresher on Validating Locally

In an earlier article, I showed how to validate an API spec locally with dredd. In summary, we did the following:

  1. Delete the existing database. When running the app locally, we used a SQLite database. To run our tests from a known starting point, we deleted the database altogether.

  2. Create a new database and migrate. Following the first step, we applied migrations to a new database so that it reflected the latest state of the Django data models we'd created.

  3. Run the Django container. Before validating our API according to spec, we must first have a running API.

  4. Run the dredd container. This creates and executes tests based on the API spec.

To unlock continuous validation, we'll need to map each one of these steps to GitHub Actions so that we can validate in pull requests, on commits to our main branch, or wherever else we desire.
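
Concretely, the local flow from that article looked roughly like the following. This is a sketch assuming a docker-compose setup with django and dredd services; your service names and paths may differ:

# Hypothetical local equivalent of the four steps above
rm -f db/db.sqlite3                                      # 1. delete the existing database
docker compose run --rm django python manage.py migrate  # 2. create a new database and migrate
docker compose up -d django                              # 3. run the Django container
docker compose run --rm dredd                            # 4. run the dredd container (test command configured in compose)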

GitHub Actions for Validation

To incorporate API validation into GitHub Actions, first we'll use a service container to run the Django API. A common use case for a service container is to run a database for the execution of tests in a job. Our service container will run the Django app as a whole, and since we'll use a SQLite (file-based) database for validation, it will contain our database as well.

Second, we'll run the command that generates and executes contract tests on a specified job container, using the apiaryio/dredd Docker image.

Let's put this into a GitHub Actions workflow, starting with two jobs to build the API image:

  1. image-tag: Determines the image tag to use when building and pushing the API's Docker image, and later when pulling that image to run the API during contract tests.

  2. build-api-image: Builds and pushes a Docker image for the API to the GitHub Container Registry (GHCR).

We'll store all this in a workflow at .github/workflows/pr.yaml, to be triggered upon opening a PR and with each commit thereafter:

name: Pull Request
on: [pull_request]

jobs:
  image-tag:
    name: Extract Image Tag
    runs-on: ubuntu-22.04
    outputs:
      tag: ${{ steps.tag.outputs.TAG }}
    steps:
      - uses: actions/checkout@v3
      - id: tag
        run: |
          TAG="${GITHUB_HEAD_REF}-$(git rev-parse --short HEAD)"
          echo "::notice title=Docker Image Tag::${TAG}"
          echo "TAG=${TAG}" >> $GITHUB_OUTPUT
  build-api-image:
    name: Build and Push API Image
    runs-on: ubuntu-22.04
    permissions:
      contents: read
      packages: write
    needs:
      - image-tag
    steps:
      - uses: actions/checkout@v3
      - name: Log in to GHCR
        uses: docker/login-action@v2
        with:
          registry: ghcr.io
          username: ${{ github.actor }}
          password: ${{ secrets.GITHUB_TOKEN }}
      - name: Set up Docker Buildx
        uses: docker/setup-buildx-action@v2
      - name: Build and push image
        uses: docker/build-push-action@v3
        with:
          context: .
          file: Dockerfile
          push: true
          cache-from: type=gha
          cache-to: type=gha,mode=max
          tags: ghcr.io/${{ github.repository }}/django:${{ needs.image-tag.outputs.tag }}
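
As a sanity check of what these two jobs produce, here's what pulling the resulting image might look like locally (username, repo, branch, and SHA are illustrative):

# The tag format is <branch>-<short-sha>, per the image-tag job above
docker login ghcr.io -u <USERNAME>   # authenticate with a personal access token that has read:packages
docker pull ghcr.io/<USERNAME>/<REPO>/django:validate-spec-in-ci-abc1234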

Next, let's add a job to do the following:

  1. Start the API as a service from the image built above, and

  2. Generate and run contract tests on an apiaryio/dredd container

Here's the above file continued:

name: Pull Request
on: [pull_request]

jobs:
  image-tag: {...}
  build-api-image: {...}
  run-contract-tests:
    name: Run API Contract Tests
    runs-on: ubuntu-22.04
    needs:
      - image-tag
      - build-api-image
    services:
      django:
        image: ghcr.io/${{ github.repository }}/django:${{ needs.image-tag.outputs.tag }}
        credentials:
          username: ${{ github.actor }}
          password: ${{ secrets.GITHUB_TOKEN }}
    container:
      image: apiaryio/dredd:14.0.0
    steps:
      - uses: actions/checkout@v3
      - name: Run tests
        run: dredd api/docs/openapi.yaml http://django:8000
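
dredd's two positional arguments are the path to the API description and the base URL of the running API. If a run fails and you want to debug it outside of CI, you can invoke the same command locally against a locally running API (the flags below are assumptions about a typical local setup, not taken from the repo):

# Assumes the API is already listening on localhost:8000
# (--network host works on Linux; on macOS/Windows, target http://host.docker.internal:8000 instead)
docker run --rm --network host \
  -v "$(pwd)":/workdir -w /workdir \
  --entrypoint dredd \
  apiaryio/dredd:14.0.0 \
  api/docs/openapi.yaml http://localhost:8000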

After committing and pushing these changes to the remote repository, I open a pull request on GitHub. Navigating to the Actions tab of the repo, I see that the workflow has run and that the contract tests job has failed.

Upon drilling into the failing job, I see this error buried in the HTML response to a request from dredd while running tests:

<h1>OperationalError at /api/dinosaurs/</h1>
<pre class="exception_value">unable to open database file</pre>

This tells me that we don't have an accessible database. We're using a SQLite database that is expected to exist (or that Django can create) on the Django container, which is running as a service container in this job. The root cause of this particular error is that the directory where the database file is expected, the db directory of the db/db.sqlite3 path configured in dino/settings.py, doesn't exist. That directory must exist so that Django can create the db.sqlite3 file inside it.
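
For context, the SQLite configuration in dino/settings.py presumably follows the standard Django pattern, something like this sketch (the exact code in the repo may differ):

# Sketch only; see dino/settings.py for the actual configuration
DATABASES = {
    "default": {
        "ENGINE": "django.db.backends.sqlite3",
        "NAME": BASE_DIR / "db" / "db.sqlite3",  # Django won't create the db/ directory itself
    }
}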

To solve this problem, we'll create an alternate Docker entrypoint that creates the database directory, applies migrations, and starts the app, then use the service container's options key to specify the new entrypoint with --entrypoint. See the Docker docs for all available options.

Here's the entrypoint we'll create at bin/migrate-and-run:

#!/usr/bin/env bash
set -e

# Create the directory for the SQLite database if it doesn't already exist
mkdir -p db

# Apply migrations, then start the app
python manage.py migrate
uwsgi --ini uwsgi.ini --http :8000

Don't forget to make this executable:

chmod +x bin/migrate-and-run
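
Before pushing, you can smoke-test the new entrypoint against the image locally (the image reference and tag are illustrative; /usr/app matches the path we'll reference in the workflow below):

docker run --rm -p 8000:8000 \
  --entrypoint /usr/app/bin/migrate-and-run \
  ghcr.io/<USERNAME>/<REPO>/django:<TAG>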

With this in place, let's use our new entrypoint:

name: Pull Request
on: [pull_request]

jobs:
  image-tag: {...}
  build-api-image: {...}
  run-contract-tests:
    ...
    services:
      django:
        ...
        options: --entrypoint /usr/app/bin/migrate-and-run
    ...
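
For reference, here's the fully expanded run-contract-tests job with the new entrypoint option in place:

name: Pull Request
on: [pull_request]

jobs:
  image-tag: {...}
  build-api-image: {...}
  run-contract-tests:
    name: Run API Contract Tests
    runs-on: ubuntu-22.04
    needs:
      - image-tag
      - build-api-image
    services:
      django:
        image: ghcr.io/${{ github.repository }}/django:${{ needs.image-tag.outputs.tag }}
        credentials:
          username: ${{ github.actor }}
          password: ${{ secrets.GITHUB_TOKEN }}
        options: --entrypoint /usr/app/bin/migrate-and-run
    container:
      image: apiaryio/dredd:14.0.0
    steps:
      - uses: actions/checkout@v3
      - name: Run tests
        run: dredd api/docs/openapi.yaml http://django:8000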

After committing and pushing to the same remote branch of the open PR, we see that the job now succeeds.

Next Steps

From here, there are a few places you could go:

  1. Add a similar workflow to be executed on commits to main (see the sketch after this list). This ensures that everything that lands on main is fully validated, whether because main isn't a protected branch that requires pull requests, or because main has changed since a PR was validated, leaving the merged result with failing contract tests.

  2. Use a PostgreSQL database instead of SQLite. This requires a more sophisticated approach throughout the development lifecycle, and this GitHub Actions implementation is no exception. I intend to cover this in a future article.

  3. Start incorporating this into your own engineering workflows.
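
For the first of these, the trigger for a main workflow (say, .github/workflows/main.yaml) could be as minimal as the sketch below. Note that GITHUB_HEAD_REF is only populated on pull_request events, so the image-tag job would need to derive its tag from GITHUB_REF_NAME (or the commit SHA alone) on push events:

name: Main
on:
  push:
    branches:
      - main

jobs:
  image-tag: {...}        # adjust the TAG derivation for push events
  build-api-image: {...}
  run-contract-tests: {...}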

Regardless of your app's language and framework, and regardless of the CI tool you use, what's ultimately critical is having a high level of confidence that your data services (REST API or otherwise) operate as advertised. Automate that confidence by automating the validation behind it.