Simple JSON data mapping in Python

Use Python properties to perform simple data mapping from JSON payloads to Python objects.

Over the last few weeks I've been integrating with a third-party electronic health records (EHR) integration API. The low-fidelity summary of the workflow is as follows:

  1. Accept an event as a JSON payload in an HTTP POST request
  2. Validate the event payload
  3. Deserialize into a Python object
  4. Perform business logic with that object

Because I want the deserialized events to have discoverable, typed attributes, I started with a simple @property decorator:

class SampleEvent:
  def __init__(self, json_dict):
    self._json = json_dict

  @property
  def first_name(self) -> str:
    return self._json["FirstLevel"]["SecondLevel"]["FirstName"]

Not bad. Then I encountered fields that might not be present. Consider the scenario in which $.FirstLevel.SecondLevel.FirstName is optional in the data model for this integration API. Here's one implementation:

from typing import Optional


class SampleEvent:
  def __init__(self, json_dict):
    self._json = json_dict

  @property
  def first_name(self) -> Optional[str]:
    return self._json["FirstLevel"]["SecondLevel"].get("FirstName")
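
To see what .get buys us, here is a quick check against two hypothetical payloads (the shapes are made up for illustration): one with FirstName omitted, and one with SecondLevel omitted entirely.

```python
from typing import Optional


class SampleEvent:
    def __init__(self, json_dict):
        self._json = json_dict

    @property
    def first_name(self) -> Optional[str]:
        return self._json["FirstLevel"]["SecondLevel"].get("FirstName")


# Hypothetical payloads, made up for illustration.
leaf_missing = SampleEvent({"FirstLevel": {"SecondLevel": {}}})
parent_missing = SampleEvent({"FirstLevel": {}})

# A missing leaf key is handled: .get falls back to None.
leaf_result = leaf_missing.first_name

# A missing parent is not: the ["SecondLevel"] lookup raises before .get runs.
try:
    parent_missing.first_name
    parent_raised = False
except KeyError:
    parent_raised = True
```

The leaf case comes back as None, while the second payload still raises a KeyError.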

Cool. And what if, as I encountered, one of the ancestors of FirstName is optional? Rather than chain a bunch of calls to .get or write a massive if-block, we could do this:

class SampleEvent:
  def __init__(self, json_dict):
    self._json = json_dict

  @property
  def first_name(self) -> Optional[str]:
    try:
      return self._json["FirstLevel"]["SecondLevel"]["FirstName"]
    except (KeyError, TypeError):
      return None
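
Both exception types matter here. As a small sketch with hypothetical payloads: a missing ancestor key raises KeyError, while an ancestor that is present but null (None after deserialization) raises TypeError, because None can't be subscripted.

```python
from typing import Optional


class SampleEvent:
    def __init__(self, json_dict):
        self._json = json_dict

    @property
    def first_name(self) -> Optional[str]:
        try:
            return self._json["FirstLevel"]["SecondLevel"]["FirstName"]
        except (KeyError, TypeError):
            return None


# Hypothetical payloads, made up for illustration.
complete = SampleEvent({"FirstLevel": {"SecondLevel": {"FirstName": "Ada"}}})
ancestor_missing = SampleEvent({"FirstLevel": {}})  # KeyError inside the property
ancestor_null = SampleEvent({"FirstLevel": None})   # TypeError inside the property

results = [
    complete.first_name,
    ancestor_missing.first_name,
    ancestor_null.first_name,
]
```

Only the complete payload yields a name; the other two quietly resolve to None.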

It's not...the worst code I've seen. However, even if I can accept its verbosity, we have scores of properties like this. In addition, translating between our data provider's documentation, which uses JSON paths (e.g. $.FirstLevel.SecondLevel.FirstName), and our Python implementation's chained dictionary lookups (e.g. self._json["FirstLevel"]["SecondLevel"]["FirstName"]) costs a cognitive cycle that I'd rather not expend.

Ideally, what I'd have is an implementation that does the following:

  • Expresses property origins as JSON paths
  • Exposes typed properties for discoverability by other engineers, including my future self
  • Allows for parsing into custom types
  • Avoids duplicated complexity/verbosity with each property addition

Luckily, I was able to achieve this without too much trouble using jsonpath-ng and the Python property function (though not as a decorator) when declaring attributes:

import jsonpath_ng

from datetime import date, datetime
from typing import Optional, Any, Dict, Callable


def json_property(json_path: str, converter: Callable[[Any], Any] = str) -> property:
    """
    Builds a read-only property that looks up json_path in the owning object's _json dict.
    If a value is found, the provided converter function will be used to generate a value.
    If no value is found (or the value is JSON null), the property returns None.
    """
    # Parse the path once, when the property is declared, rather than on every access.
    jsonpath_expr = jsonpath_ng.parse(json_path)

    def _json_attribute(self) -> Optional[Any]:
        matches = jsonpath_expr.find(self._json)
        if not matches:
            return None
        value = matches[0].value
        if value is None:
            # Don't convert None to "None"
            return None
        return converter(value)

    return property(_json_attribute)


def _date_from_event_string(date_str: str) -> date:
    return datetime.strptime(date_str, "%Y-%m-%d").date()


class SampleEvent:
  first_name: Optional[str] = json_property("$.FirstLevel.SecondLevel.FirstName")
  dob: Optional[date] = json_property("$.FirstLevel.SecondLevel.DOB", _date_from_event_string)

  def __init__(self, json_dict: Dict):
    self._json = json_dict
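
Here's roughly how that reads from the calling side. So that the snippet runs without installing anything, this sketch swaps jsonpath-ng for a hand-rolled helper, _lookup, that only understands plain dotted paths like $.A.B.C. That helper, the payloads, and their values are my own illustration and are not part of jsonpath-ng.

```python
from datetime import date, datetime
from typing import Any, Callable, Dict, Optional


def _lookup(data: Any, path: str) -> Optional[Any]:
    """Hypothetical stand-in for jsonpath-ng: walks a "$.A.B.C" path through nested dicts."""
    current = data
    for key in path.split(".")[1:]:  # drop the leading "$"
        if not isinstance(current, dict) or key not in current:
            return None
        current = current[key]
    return current


def json_property(json_path: str, converter: Callable[[Any], Any] = str) -> property:
    def _getter(self) -> Optional[Any]:
        value = _lookup(self._json, json_path)
        # Missing and null values both come back as None, unconverted.
        return None if value is None else converter(value)

    return property(_getter)


def _date_from_event_string(date_str: str) -> date:
    return datetime.strptime(date_str, "%Y-%m-%d").date()


class SampleEvent:
    first_name: Optional[str] = json_property("$.FirstLevel.SecondLevel.FirstName")
    dob: Optional[date] = json_property("$.FirstLevel.SecondLevel.DOB", _date_from_event_string)

    def __init__(self, json_dict: Dict):
        self._json = json_dict


# Hypothetical payloads, made up for illustration.
full = SampleEvent({"FirstLevel": {"SecondLevel": {"FirstName": "Ada", "DOB": "1815-12-10"}}})
sparse = SampleEvent({"FirstLevel": {}})

full_values = (full.first_name, full.dob)
sparse_values = (sparse.first_name, sparse.dob)
```

With the full payload, DOB arrives as a date rather than a string. With the sparse one, both properties resolve to None without any try/except at the call site.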