Skip to main content

Command Palette

Search for a command to run...

Simple JSON data mapping in Python

Use Python properties to perform simple data mapping from JSON payloads to Python objects.

Published
2 min read

Over the last few weeks I've been working on integrating with a third-party electronic health records integration API. The low-fidelity summary is as follows:

  1. Accept an event as a JSON payload in an HTTP POST request
  2. Validate the event payload
  3. Deserialize into a Python object
  4. Perform business logic with that object

Because I want the deserialized events to have discoverable, typed attributes, I started with a simple @property decorator:

class SampleEvent:
  def __init__(self, json_dict):
    self._json = json_dict

  @property
  def first_name(self) -> str:
    return self._json["FirstLevel"]["SecondLevel"]["FirstName"]

Not bad. Then I encountered fields that might be present. Consider the scenario in which $.FirstLevel.SecondLevel.FirstName in the data model for this integration API is optionally present. Here's one implementation:

class SampleEvent:
  def __init__(self, json_dict):
    self._json = json_dict

  @property
  def first_name(self) -> Optional[str]:
    return self._json["FirstLevel"]["SecondLevel"].get("FirstName")

Cool. And what if (as I encountered), one of the ancestors of FirstName is optional? Rather than chain a bunch of calls to getattr or have a massive if-block, we could do this:

class SampleEvent:
  def __init__(self, json_dict):
    self._json = json_dict

  @property
  def first_name(self) -> Optional[str]:
    try:
      return self._json["FirstLevel"]["SecondLevel"]["FirstName"]
    except (KeyError, TypeError):
      return None

It's not...the worst code I've seen. However, even if I can accept its verbosity, we have scores of properties like this. In addition, the mental mapping between our data provider's documentation using JSON paths (e.g. $.FirstLevel.SecondLevel.FirstName) and our Python implementation's arrays (e.g. self._json["FirstLevel"]["SecondLevel"]["FirstName"]) uses a cognitive cycle that I'd rather not expend.

Ideally, what I'd have is an implementation that does the following:

  • Expresses property origins as JSON paths
  • Exposes typed properties for discoverability by other engineers, including my future self
  • Allows for parsing into custom types
  • Avoids duplicated complexity/verbosity with each property addition

Luckily, I was able to achieve this without too much trouble using jsonpath-ng and the Python property function (though not as a decorator) when declaring attributes:

import jsonpath_ng

from datetime import date, datetime
from typing import Optional, Any, Dict, Callable


def json_property(json_path: str, converter: Optional[Callable] = str):
    def _json_attribute(value_dict: Dict, path: str, converter: Callable) -> Optional[Any]:
        """
        Gets a value from the provided dict using the provided json_path. If a value is found, the provided
        converter function will be used to generate a value. If no value is found, the return value is None.
        """
        jsonpath_expr = jsonpath_ng.parse(path)
        try:
            value = jsonpath_expr.find(value_dict)[0].value
            if value is None:
                # Don't convert None to "None"
                return value
            return converter(value)
        except IndexError:
            return None

    return property(
        lambda self: _json_attribute(
            self._json, json_path, converter
        )
    )


def _date_from_event_string(date_str: str) -> date:
    return datetime.strptime(date_str, "%Y-%m-%d").date()


class SampleEvent:
  first_name: Optional[str] = json_property("$.FirstLevel.SecondLevel.FirstName")
  dob: Optional[date] = json_property("$.FirstLevel.SecondLevel.DOB", _date_from_event_string)

  def __init__(self, json_dict: Dict):
    self._json = json_dict