nadap: Namespace-Aware Data Validation and Pre-Processing

This Python module validates data against a data schema. The schema describes the structure, the data types, and all value restrictions that the data must satisfy.

In addition, data values at points defined in the data schema can be referenced against each other. They can be tested for uniqueness, or for whether a value at one point in the data (consumer) equals a value located at another point in the data (producer). For more details, see the Reference Feature Documentation.
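The two reference checks described above can be sketched in plain Python. This is only a conceptual illustration, not nadap's implementation: the helper names `check_unique` and `check_consumers` and the wording of the consumer finding are assumptions made for this sketch.

```python
def check_unique(entries):
    """Report every value that was already seen at an earlier path.

    `entries` is a list of (path, value) pairs (hypothetical input format).
    """
    first_seen = {}
    findings = []
    for path, value in entries:
        if value in first_seen:
            findings.append(
                f"{path}: Reference already defined at {first_seen[value]}"
            )
        else:
            first_seen[value] = path
    return findings


def check_consumers(producers, consumers):
    """Report every consumer value that no producer provides."""
    produced = {value for _, value in producers}
    return [
        f"{path}: No producer found for value {value!r}"  # assumed wording
        for path, value in consumers
        if value not in produced
    ]


ids = [("[0].id", 1), ("[1].id", 1)]
print(check_unique(ids))
print(check_consumers(ids, [("[0].owner", 1), ("[1].owner", 7)]))
```

The uniqueness check remembers the first path at which each value appears, which is why a finding can name both the duplicate and the original location.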

Furthermore, input data can be enriched with default values, and values can be converted (e.g. into another data type). For more details, see the Conversion Feature Documentation.
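As a rough illustration of what enrichment and conversion mean, the following plain-Python sketch fills in missing keys and converts a value to another type. It does not use nadap; `apply_defaults` and `convert_value` are hypothetical helpers written only for this example.

```python
def apply_defaults(record, defaults):
    """Return a copy of `record` with missing keys filled from `defaults`."""
    enriched = dict(defaults)
    enriched.update(record)  # values present in the input take precedence
    return enriched


def convert_value(value, target_type):
    """Convert `value` by calling the target type, e.g. int("42")."""
    return target_type(value)


person = apply_defaults({"id": "3", "name": "Unknown"}, {"healthy": True})
person["id"] = convert_value(person["id"], int)
print(person)
```

Values already present in the input are never overwritten by a default; only absent keys are added.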

Code Example

import yaml
import nadap

schema_definition_yaml = """
root:
  type: list
  elements:
    type: dict
    restrictions:
        required: ["id", "name"]
    keys:
        id:
          type: int
          reference: person_id
        name: str
        healthy: bool
"""

# Correct data
data1_yaml = """
- id: 1
  name: Nadap
  healthy: true
- id: 2
  name: Other
  healthy: false
- id: 3
  name: Unknown
"""

# Wrong type for 'name'
data2_yaml = """
- id: 1
  name: 1
"""

# 'id' of 'Other' is not unique; it is already used by 'Nadap'
data3_yaml = """
- id: 1
  name: Nadap
  healthy: true
- id: 1
  name: Other
  healthy: false
"""

schema_def = yaml.load(schema_definition_yaml, Loader=yaml.SafeLoader)
n = nadap.Nadap()
n.schema = schema_def

data1 = yaml.load(data1_yaml, Loader=yaml.SafeLoader)
try:
    n.validate(data1)
except nadap.DataValidationError:
    print("Data1 fails:")
    for finding in n.findings:
        print(finding)

# Recreate a Nadap instance to clear referencing cache
n = nadap.Nadap()
n.schema = schema_def
data2 = yaml.load(data2_yaml, Loader=yaml.SafeLoader)
try:
    n.validate(data2)
except nadap.DataValidationError:
    print("Data2 fails:")
    for finding in n.findings:
        print(finding)

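# Recreate a Nadap instance to clear referencing cache.
# As the output below shows, the duplicate reference does not raise
# DataValidationError here; it is reported through n.findings instead.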
n = nadap.Nadap()
n.schema = schema_def
data3 = yaml.load(data3_yaml, Loader=yaml.SafeLoader)
try:
    n.validate(data3)
except nadap.DataValidationError:
    print("Data3 fails:")
    for finding in n.findings:
        print(finding)
else:
    if n.findings:
        print("Data3 referencing fails:")
        for finding in n.findings:
            print(finding)

... will print this output:

Data2 fails:
[0].name: Data is not an instance of 'str'
Data3 referencing fails:
[1].id: Reference already defined at [0].id