JSON Logging with Django

What is observability?

Improving an application's observability can be one of the most empowering endeavors as an engineer. Imagine tiptoeing across a messy child's room in the middle of the night with the goal of not stepping on a Lego. Now imagine the simple change of turning on a light. The room isn't any less messy, but the mess is in sight. Such are the outcomes of improved observability.

Before diving into the implementation of JSON logging, I want to clarify what I mean by "observability".

Observability is not the act of looking at running software. It doesn’t refer to the tools used for creating dashboards or for forwarding errors to a messaging channel for engineers. That is all the subject of monitoring.

Observability refers to the presence, accuracy and usefulness of contextual information emitted by an application. Improvements in observability increase the chances that when we look at our application, we see something useful about its internal state.

With this in mind, using JSON logging can support these traits of observability:

  • Presence: log messages will be emitted
  • Usefulness: log messages will contain contextual information in a structured, parsable format

Let's get started.

Build a sample Django app with JSON logging

Create a Django project and app

We'll start by creating a Django project with a couple dependencies installed:

❯ pip install django==4.1.3
❯ django-admin startproject django_json_logging
❯ mv django_json_logging django-json-logging
❯ cd django-json-logging
❯ pip freeze > requirements.txt

At this point, your directory structure should look like this:

django-json-logging
├── django_json_logging
│   ├── __init__.py
│   ├── asgi.py
│   ├── settings.py
│   ├── urls.py
│   └── wsgi.py
├── manage.py
└── requirements.txt

Create a sample endpoint to demonstrate logging

To test the output of our logs as we work, let's create a simple endpoint where we can emit log messages.

First, create a view in a new file at django_json_logging/views.py:

import json
import logging

from django.http import HttpResponse

logger = logging.getLogger(__name__)


def log(request):
    logger.info("Sample info log")
    return HttpResponse(
        json.dumps({"status": "OK"}),
        headers={"content-type": "application/json"}
    )

Now we'll route requests to /log/ to this view by adding an entry to django_json_logging/urls.py:

from django.contrib import admin
from django.urls import path

from django_json_logging import views

urlpatterns = [
    path('admin/', admin.site.urls),
    path("log/", views.log),
]

Test the endpoint

To test the endpoint, first start the app in a terminal:

python manage.py runserver

Then in another terminal window or tab, you can use cURL to make a request:

❯ curl -w "\n" localhost:8000/log/
{"status": "OK"}

Alternatively, you can use a browser to visit the same URL.

Looking back at the terminal window or tab where you ran runserver, you won't see the log message added in the view, only the Django startup and log messages:

Django version 4.1.3, using settings 'django_json_logging.settings'
Starting development server at http://127.0.0.1:8000/
Quit the server with CONTROL-C.
[12/Nov/2022 13:11:10] "GET /log/ HTTP/1.1" 200 16

Note: Though I included it above, for purposes of brevity, I'll omit the Django startup messages in subsequent log samples.

Configure Django to log to stdout

Before we begin adding JSON logging, I want to touch on the destination of our logs. In accordance with The Twelve Factor App, we will write logs to standard out. In our case, this will be in JSON format.

A twelve-factor app never concerns itself with routing or storage of its output stream. It should not attempt to write to or manage logfiles. Instead, each running process writes its event stream, unbuffered, to stdout.

- 12factor.net/logs

As a primer on configuring logging in Django, see the Django logging documentation. We'll start with a basic dictConfig. In django_json_logging/settings.py, add the following:

LOGGING = {
    "version": 1,
    "disable_existing_loggers": False,
    "handlers": {
        "stdout": {
            "class": "logging.StreamHandler",
        },
    },
    "loggers": {
        "django": {
            "level": "ERROR",
            "handlers": ["stdout"],
            "propagate": False,
        },
        "": {
            "level": "INFO",
            "handlers": ["stdout"],
            "propagate": False,
        }
    }
}

In the loggers entry of the LOGGING dict, I've included two items:

  1. "django": this item configures Python logging so that log records originating from Django are passed to a handler only when their level is "ERROR" or higher. I find this keeps my local terminal focused on messages from the codebase I'm helping to author, rather than letting that signal be drowned out by framework noise. This reduced noise carries over to deployed environments.
  2. "": this item configures Python logging such that log records originating from all other loggers will be passed to the indicated handler if they are at "INFO" or higher, thereby showing all logger.info calls that my team and I have authored.
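To see this filtering in isolation, here's a minimal, self-contained sketch that applies the same dictConfig shape with the standard library's logging.config (Django calls dictConfig for you at startup; the logger names and the StringIO stream here are purely illustrative):

```python
import io
import logging
import logging.config

# Capture output in memory so we can inspect it; Django would use a
# real StreamHandler writing to stdout instead.
stream = io.StringIO()

logging.config.dictConfig({
    "version": 1,
    "disable_existing_loggers": False,
    "handlers": {
        "stdout": {"class": "logging.StreamHandler", "stream": stream},
    },
    "loggers": {
        # Framework loggers: only ERROR and above reach the handler.
        "django": {"level": "ERROR", "handlers": ["stdout"], "propagate": False},
        # Everything else: INFO and above.
        "": {"level": "INFO", "handlers": ["stdout"], "propagate": False},
    },
})

# "django.request" inherits the ERROR level from "django", so this is dropped.
logging.getLogger("django.request").info("framework chatter")
# An application logger propagates to the root ("") config, so this is emitted.
logging.getLogger("myapp.views").info("app message")

print(stream.getvalue())
```

The child logger "django.request" has no level of its own, so it inherits the effective "ERROR" level from "django"; that inheritance is why a single "django" entry quiets the whole framework.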

If you're still running runserver, upon saving, your changes will be reloaded. Otherwise, after running it again and visiting /log/, you should see your log message:

Sample info log

For a bit more context in log messages, we can use a log formatter to display more information:

LOGGING = {
    "version": 1,
    "disable_existing_loggers": False,
    "formatters": {
        "console": {
            "format": "{levelname} {asctime} {pathname} - line {lineno}: {message}",
            "style": "{",
        }
    },
    "handlers": {
        "stdout": {
            "class": "logging.StreamHandler",
            "formatter": "console"
        },
    },
    "loggers": {
        "django": {
            "level": "ERROR",
            "handlers": ["stdout"],
            "propagate": False,
        },
        "": {
            "level": "INFO",
            "handlers": ["stdout"],
            "propagate": False,
        }
    }
}

With this change, we now see the following:

INFO 2022-11-12 20:59:10,061 /Users/rama/workspace/personal/django-json-logging/django_json_logging/views.py - line 10: Sample info log

For all out-of-the-box options of what you can include with a log formatter, see the Python docs for LogRecord attributes.
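For example, assuming you also want the process ID and the function name, a formatter using those attributes can be exercised directly against a hand-built LogRecord (the record's field values below are illustrative):

```python
import logging

# Same {}-style format as the "console" formatter above, with two more
# LogRecord attributes: {process} and {funcName}.
formatter = logging.Formatter(
    "{levelname} {asctime} {process} {pathname}:{lineno} in {funcName}: {message}",
    style="{",
)

# Build a record by hand to see what the formatter produces.
record = logging.LogRecord(
    name="demo",
    level=logging.INFO,
    pathname="views.py",
    lineno=10,
    msg="Sample info log",
    args=None,
    exc_info=None,
    func="log",
)

print(formatter.format(record))
```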

Format Django logs as JSON

Now that we've got logs being written to stdout, it's time to structure them as JSON. Whether you're sending logs to AWS CloudWatch and using the CloudWatch query console, using a node-level fluentd logging agent to forward logs to a third-party service such as Sumo Logic, or simply tailing logs in a terminal and parsing with a tool like jq, having a structured format will help parse, query, and filter logs.

For a quick solution to start JSON-formatting our log records, we'll use the Python library json-log-formatter:

pip install json-log-formatter==0.5.1

With this change, we can now take advantage of the json_log_formatter.JSONFormatter class:

LOGGING = {
    "version": 1,
    "disable_existing_loggers": False,
    "formatters": {
        "console": {
            "format": "{levelname} {asctime} {pathname} - line {lineno}: {message}",
            "style": "{",
        },
        "json": {
            "()": "json_log_formatter.JSONFormatter",  # Define the JSON formatter
        },
    },
    "handlers": {
        "stdout": {
            "class": "logging.StreamHandler",
            "formatter": "json"                        # Use the JSON formatter
        },
    },
    "loggers": {
        "django": {
            "level": "ERROR",
            "handlers": ["stdout"],
            "propagate": False,
        },
        "": {
            "level": "INFO",
            "handlers": ["stdout"],
            "propagate": False,
        }
    }
}

When runserver reloads our app, a request to /log/ will yield the following:

{"message": "Sample info log", "time": "2022-11-12T21:49:45.710234"}
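Because each log line is now a single JSON object, any consumer can parse it; as a quick sanity check with Python's json module (the sample line is copied from the output above):

```python
import json

# One line of handler output, exactly as emitted above.
line = '{"message": "Sample info log", "time": "2022-11-12T21:49:45.710234"}'

entry = json.loads(line)
print(entry["message"])  # -> Sample info log
```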

Customize JSON log messages

For full customization of JSON log messages, we can define our own log formatter class as a subclass of json_log_formatter.JSONFormatter. Let's do this in django_json_logging/logs.py:

import json_log_formatter


class JSONFormatter(json_log_formatter.JSONFormatter):
    def json_record(self, message, extra, record):
        extra.update(
            {
                "source": "django",
                "level": record.levelname,
                "pathname": record.pathname,
                "lineno": record.lineno,
            }
        )
        return super().json_record(message, extra, record)
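If you'd rather avoid the dependency, a stdlib-only formatter with roughly the same output shape might look like this (a sketch for comparison, not a drop-in replacement for json-log-formatter):

```python
import io
import json
import logging


class StdlibJSONFormatter(logging.Formatter):
    """Stdlib-only sketch: serialize a few LogRecord attributes as JSON."""

    def format(self, record):
        payload = {
            "source": "django",
            "level": record.levelname,
            "pathname": record.pathname,
            "lineno": record.lineno,
            "message": record.getMessage(),
            "time": self.formatTime(record),
        }
        return json.dumps(payload)


# Wire it to a handler and emit one record, capturing output in memory.
stream = io.StringIO()
handler = logging.StreamHandler(stream)
handler.setFormatter(StdlibJSONFormatter())

logger = logging.getLogger("stdlib-json-demo")
logger.addHandler(handler)
logger.setLevel(logging.INFO)
logger.propagate = False

logger.info("Sample info log")

parsed = json.loads(stream.getvalue())
print(parsed["level"], parsed["message"])  # -> INFO Sample info log
```

The trade-off is that json-log-formatter also handles the extra dict and non-serializable values for you, which this sketch does not.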

Now in django_json_logging/settings.py, we can update the formatter from this:

"json": {
  "()": "json_log_formatter.JSONFormatter"
}

To this:

"json": {
  "()": "django_json_logging.logs.JSONFormatter"
}

Now with a request to /log/, we'll see a message with similar context as our console logs, except formatted as JSON:

{"source": "django", "level": "INFO", "pathname": "/Users/rama/workspace/personal/django-json-logging/django_json_logging/views.py", "lineno": 10, "message": "Sample info log", "time": "2022-11-12T21:57:21.015559"}

Add context with extra

An added bonus of this approach is that we can customize our log messages with helpful context. For example, if we update our logger call to this:

logger.info("Sample info log", extra={"user": "me"})

Then upon making a request to /log/, we see this:

{"user": "me", "source": "django", "level": "INFO", "pathname": "/Users/rama/workspace/personal/django-json-logging/django_json_logging/views.py", "lineno": 10, "message": "Sample info log", "time": "2022-11-17T22:53:29.408593"}
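Under the hood, keys passed via extra become attributes on the LogRecord itself, which is how json-log-formatter populates the extra dict handed to json_record. A stdlib-only sketch of that mechanism (the Capture handler and logger name are illustrative):

```python
import logging

records = []


class Capture(logging.Handler):
    """Toy handler that stores records so we can inspect them."""

    def emit(self, record):
        records.append(record)


logger = logging.getLogger("extra-demo")
logger.addHandler(Capture())
logger.setLevel(logging.INFO)
logger.propagate = False

# Keys in `extra` become attributes on the emitted LogRecord.
logger.info("Sample info log", extra={"user": "me"})

print(records[0].user)          # -> me
print(records[0].getMessage())  # -> Sample info log
```

One caveat: extra keys must not collide with built-in LogRecord attribute names such as message or asctime; stdlib logging rejects those with a KeyError.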

Conclusion and Up Next

The above message is parsable as JSON, can contain a useful message (though "Sample info log" is probably not helpful), contains contextual information about where the log message originated, and can hold any additional contextual data in the extra kwarg of the logger call.

In my next article, I'll cover how to format uWSGI request and server log messages as JSON in a way that ensures that all log messages, regardless of their Django or uWSGI origin, are emitted in a uniform format.