Operators

Abstract

Operators are functions that perform operations on Mappings.

combine

Combines two recursive tree structures using a binary operator op to resolve conflicts at the leaf nodes.

This is a generalization of merge. It recursively walks two tree structures and applies op only when it encounters a conflict at the leaf nodes (e.g., two scalars at the same path, or a structural mismatch such as a dict vs. a list).

You can pass a custom callable (old, new) -> resolved or use one of the pre-built strategies from the resolver enums: Resolver (structural), LogicalResolver (bitwise/sets), or NumericResolver (math).

If you just need the standard "last-wins" behavior, you can use the simpler merge operator, which is essentially combine(tree1, tree2, op=Resolver.LAST), except for how it handles list vs. scalar conflicts.

Combining with different conflict resolutions

from mappingtools.operators import combine
from mappingtools.resolvers import NumericResolver, Resolver

tree1 = {"a": 1, "b": {"c": 10}}
tree2 = {"a": 2, "b": {"c": 20}, "d": 5}

# 1. Sum conflicts
summed = combine(tree1, tree2, op=NumericResolver.SUM)
print(summed)
# output: {'a': 3, 'b': {'c': 30}, 'd': 5}

# 2. Keep original values on conflict
first_wins = combine(tree1, tree2, op=Resolver.FIRST)
print(first_wins)
# output: {'a': 1, 'b': {'c': 10}, 'd': 5}
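Passing a custom callable

The custom (old, new) -> resolved form mentioned above can be illustrated without the library. The following is a minimal sketch, not the library's implementation: combine_sketch is a hypothetical stand-in that only recurses into mappings and treats everything else (including lists) as a leaf, so the real combine may behave differently for lists.

```python
from collections.abc import Mapping


def combine_sketch(left, right, op):
    """Hypothetical stand-in: recurse into mappings, call op(old, new) on conflicts."""
    if isinstance(left, Mapping) and isinstance(right, Mapping):
        result = dict(left)
        for key, new in right.items():
            # Keys on both sides recurse; keys unique to one side pass through.
            result[key] = combine_sketch(left[key], new, op) if key in left else new
        return result
    # Two leaves at the same path, or a structural mismatch: defer to the resolver.
    return op(left, right)


tree1 = {"a": 1, "b": {"c": 10}}
tree2 = {"a": 2, "b": {"c": 20}, "d": 5}

# Any two-argument callable works as a resolver, including builtins.
maximum = combine_sketch(tree1, tree2, op=max)
print(maximum)
# output: {'a': 2, 'b': {'c': 20}, 'd': 5}

# A lambda that keeps both values as a string.
joined = combine_sketch(tree1, tree2, op=lambda old, new: f"{old}|{new}")
print(joined)
# output: {'a': '1|2', 'b': {'c': '10|20'}, 'd': 5}
```

Note that "d" never reaches the resolver: it exists in only one tree, so there is no conflict to resolve.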

distinct

Yields distinct values for a specified key across multiple mappings.

Example

from mappingtools.operators import distinct

mappings = [
    {'a': 1, 'b': 2},
    {'a': 2, 'b': 3},
    {'a': 1, 'b': 4}
]
distinct_values = list(distinct('a', *mappings))
print(distinct_values)
# output: [1, 2]
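How it could work

To make the behavior concrete, here is a rough pure-Python equivalent. It is a sketch under assumptions rather than the library's code: distinct_sketch is a hypothetical name, values are assumed to be hashable, and mappings that lack the key are assumed to be skipped.

```python
def distinct_sketch(key, *mappings):
    """Yield each value stored under key once, in order of first appearance."""
    seen = set()
    for mapping in mappings:
        if key not in mapping:
            continue  # assumption: mappings without the key are skipped
        value = mapping[key]
        if value not in seen:  # requires hashable values
            seen.add(value)
            yield value


rows = [{'a': 1, 'b': 2}, {'a': 2, 'b': 3}, {'a': 1, 'b': 4}, {'b': 5}]
print(list(distinct_sketch('a', *rows)))
# output: [1, 2]
print(list(distinct_sketch('b', *rows)))
# output: [2, 3, 4, 5]
```

Because the sketch is a generator, nothing is computed until the result is consumed, which is why the example wraps it in list().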

flatten

The flatten function takes a nested mapping structure and converts it into a single-level dictionary by flattening the keys into tuples.

Example

from mappingtools.operators import flatten

nested_dict = {
    'a': {'b': 1, 'c': {'d': 2}},
    'e': 3
}
flat_dict = flatten(nested_dict)
print(flat_dict)
# output: {('a', 'b'): 1, ('a', 'c', 'd'): 2, ('e',): 3}
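How it could work

The tuple keys are the paths from the root to each leaf. A minimal recursive sketch of that idea follows; flatten_sketch is a hypothetical helper, and keeping empty mappings as leaf values is an assumption that may not match the library.

```python
from collections.abc import Mapping


def flatten_sketch(tree, prefix=()):
    """Map each leaf to the tuple of keys that leads to it."""
    flat = {}
    for key, value in tree.items():
        path = prefix + (key,)
        if isinstance(value, Mapping) and value:
            # Non-empty mappings are walked recursively.
            flat.update(flatten_sketch(value, path))
        else:
            # Scalars (and, by assumption, empty mappings) are leaves.
            flat[path] = value
    return flat


nested = {'a': {'b': 1, 'c': {'d': 2}}, 'e': 3}
flat = flatten_sketch(nested)
print(flat)
# output: {('a', 'b'): 1, ('a', 'c', 'd'): 2, ('e',): 3}
print(flat[('a', 'c', 'd')])
# output: 2
```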

inverse

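Swaps keys and values in a mapping. As the example shows, each value is a collection of items: every item becomes a key in the result, mapped to the set of original keys that contained it.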

Example

from mappingtools.operators import inverse

original_mapping = {'a': {1, 2}, 'b': {3}}
inverted_mapping = inverse(original_mapping)
print(inverted_mapping)
# output: defaultdict(<class 'set'>, {1: {'a'}, 2: {'a'}, 3: {'b'}})
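How it could work

The output above suggests a many-to-many inversion. The sketch below reproduces that shape with the standard library only; inverse_sketch is a hypothetical name, and it assumes every value is an iterable of hashable items.

```python
from collections import defaultdict


def inverse_sketch(mapping):
    """Map each item to the set of keys whose value collection contains it."""
    inverted = defaultdict(set)
    for key, values in mapping.items():
        for value in values:
            inverted[value].add(key)
    return inverted


tags = {'post1': {'python', 'docs'}, 'post2': {'python'}}
by_tag = inverse_sketch(tags)
print(sorted(by_tag['python']))
# output: ['post1', 'post2']
print(sorted(by_tag['docs']))
# output: ['post1']
```

One consequence of returning a defaultdict is that looking up an absent key creates an empty set instead of raising KeyError.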

merge

A pure function (Monoid operation) that deeply merges two recursive tree structures. Conflicts are resolved by overwriting existing values with new ones (right-side precedence), unless a list conflicts with a non-list value, in which case the value is appended or prepended to the list.

Mathematically, this operation forms a composite Monoid:

Last Monoid (Scalar Fallback): When resolving conflicts between simple values, the right-hand side (tree2) wins.
Pointwise Monoid (Dictionary Merge): If the values are dictionaries, they are merged by key, recursively calling merge on the values.
Zip Monoid (List Merge): If both are lists, they are zipped and merged positionally, substituting MISSING for missing indices.
Free Monoid (Mixed List/Scalar): If one is a list and the other is a scalar/dict, it concatenates (appends/prepends).

Because it forms a Monoid, this function can be used with functools.reduce to fold an iterable of trees into a single structure.

Merging two trees directly

from mappingtools.operators import merge

tree1 = {"a": 1, "b": [1, 2]}
tree2 = {"b": [3], "c": 4}

merged = merge(tree1, tree2)
print(merged)
# output: {'a': 1, 'b': [3, 2], 'c': 4}
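Sketching the four rules

The four cases listed above can be written out in a few lines. This is an illustrative sketch, not the library's source: merge_sketch and its MISSING sentinel are hypothetical stand-ins, and the append/prepend direction for mixed list/scalar conflicts (list order follows argument order) is an assumption.

```python
from itertools import zip_longest

MISSING = object()  # hypothetical stand-in for the library's MISSING marker


def merge_sketch(left, right):
    """Deep merge following the Last, Pointwise, Zip and Free rules."""
    if left is MISSING:
        return right
    if right is MISSING:
        return left
    if isinstance(left, dict) and isinstance(right, dict):
        # Pointwise: merge by key, recursing into shared keys.
        merged = dict(left)
        for key, value in right.items():
            merged[key] = merge_sketch(left.get(key, MISSING), value)
        return merged
    if isinstance(left, list) and isinstance(right, list):
        # Zip: merge positionally, padding the shorter list with MISSING.
        pairs = zip_longest(left, right, fillvalue=MISSING)
        return [merge_sketch(a, b) for a, b in pairs]
    if isinstance(left, list):
        return left + [right]  # Free: append the non-list value (assumed order)
    if isinstance(right, list):
        return [left] + right  # Free: prepend the non-list value (assumed order)
    return right  # Last: right-hand side wins


print(merge_sketch({"a": 1, "b": [1, 2]}, {"b": [3], "c": 4}))
# output: {'a': 1, 'b': [3, 2], 'c': 4}
print(merge_sketch({"tags": ["x"]}, {"tags": "y"}))
# output: {'tags': ['x', 'y']}
print(merge_sketch({"tags": "x"}, {"tags": ["y"]}))
# output: {'tags': ['x', 'y']}
```

The first call reproduces the documented result: index 0 is a scalar conflict (3 wins), while index 1 exists only on the left and is kept.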

Reducing an iterable of trees

Using Python's standard functools.reduce, we can easily merge an entire sequence of nested structures.

from functools import reduce
from mappingtools.operators import merge

trees = [
    {"a": 1, "b": {"c": 2}},
    {"b": {"d": 3}},
    {"a": 10},  # Overwrites previous "a"
]

merged = reduce(merge, trees)
print(merged)
# output: {'a': 10, 'b': {'c': 2, 'd': 3}}
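Handling an empty iterable

functools.reduce raises TypeError when the iterable is empty and no initial value is given. Passing an identity element as the third argument avoids this. The sketch below assumes the empty dict acts as that identity for mappings, and uses merge_dicts, a hypothetical dict-only stand-in for merge, so that it runs without the library.

```python
from functools import reduce


def merge_dicts(left, right):
    """Stand-in for merge: nested dicts merge by key, otherwise right wins."""
    if isinstance(left, dict) and isinstance(right, dict):
        result = dict(left)
        for key, value in right.items():
            result[key] = merge_dicts(left[key], value) if key in left else value
        return result
    return right


no_trees = []

try:
    reduce(merge_dicts, no_trees)
except TypeError as exc:
    # Without an initial value, reduce has nothing to return.
    print(f"reduce failed: {exc}")

# With an initial value, the empty case falls back to the identity.
print(reduce(merge_dicts, no_trees, {}))
# output: {}

print(reduce(merge_dicts, [{"a": {"b": 1}}, {"a": {"c": 2}}], {}))
# output: {'a': {'b': 1, 'c': 2}}
```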

Deep merging with Lenses

If you need to merge data into a specific, deeply nested location of a larger tree, you can compose the merge function with an Optic (Lens). This avoids modifying the pure merge function with path traversal logic.

from mappingtools.operators import merge
from mappingtools.optics import Lens

system_state = {"system": {"config": {"retries": 3}}}
new_config = {"timeout": 30}

# Focus specifically on the 'config' node inside 'system'
config_lens = Lens.path("system", "config")

# Apply the merge function OVER the focused node
new_state = config_lens.modify(
    system_state, 
    lambda old: merge(old, new_config)
)

print(new_state)
# output: {'system': {'config': {'retries': 3, 'timeout': 30}}}
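What the lens does conceptually

For readers unfamiliar with optics, the modify step can be approximated by a small path-based helper. This is only a sketch of the idea: modify_at is a hypothetical function, it handles nested dicts only, and it assumes the update should be non-destructive (the original tree is left untouched), which may or may not match how Lens.modify behaves.

```python
def modify_at(tree, path, fn):
    """Return a copy of tree with fn applied to the node found at path."""
    if not path:
        return fn(tree)
    head, *rest = path
    updated = dict(tree)  # shallow copy of this level only
    updated[head] = modify_at(tree[head], rest, fn)
    return updated


system_state = {"system": {"config": {"retries": 3}}, "other": {"x": 1}}

new_state = modify_at(
    system_state,
    ["system", "config"],
    lambda old: {**old, "timeout": 30},
)

print(new_state)
# output: {'system': {'config': {'retries': 3, 'timeout': 30}}, 'other': {'x': 1}}
print(system_state["system"]["config"])
# output: {'retries': 3}
print(new_state["other"] is system_state["other"])
# output: True
```

Only the dictionaries along the path are copied; untouched branches such as "other" are shared between the old and new trees.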

pivot

Reshapes a list of mappings into a nested dictionary based on index and column keys. Supports different aggregation modes via Aggregation.

Example

from mappingtools.operators import pivot
from mappingtools.aggregations import Aggregation

data = [
    {"city": "NYC", "month": "Jan", "temp": 10},
    {"city": "NYC", "month": "Feb", "temp": 12},
    {"city": "LON", "month": "Jan", "temp": 5},
    {"city": "NYC", "month": "Jan", "temp": 20},  # Duplicate
]

# Default mode (LAST wins)
result = pivot(data, index="city", columns="month", values="temp")
print(result)
# output: {'NYC': {'Jan': 20, 'Feb': 12}, 'LON': {'Jan': 5}}

# Aggregation mode: ALL (collect list)
result_all = pivot(data, index="city", columns="month", values="temp", aggregation=Aggregation.ALL)
print(result_all["NYC"]["Jan"])
# output: [10, 20]
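How it could work

The two modes shown above differ only in what happens when a cell is written twice. A plain-Python sketch of that difference follows; pivot_sketch and its collect_all flag are hypothetical and stand in for the library's Aggregation enum, whose other members are not covered here.

```python
def pivot_sketch(rows, index, columns, values, collect_all=False):
    """Build {index_value: {column_value: cell}} from a list of row mappings."""
    result = {}
    for row in rows:
        cells = result.setdefault(row[index], {})
        if collect_all:
            # Keep every value seen for this cell, in input order.
            cells.setdefault(row[columns], []).append(row[values])
        else:
            # Last-wins: a later row overwrites an earlier one.
            cells[row[columns]] = row[values]
    return result


data = [
    {"city": "NYC", "month": "Jan", "temp": 10},
    {"city": "NYC", "month": "Feb", "temp": 12},
    {"city": "LON", "month": "Jan", "temp": 5},
    {"city": "NYC", "month": "Jan", "temp": 20},
]

print(pivot_sketch(data, "city", "month", "temp"))
# output: {'NYC': {'Jan': 20, 'Feb': 12}, 'LON': {'Jan': 5}}
print(pivot_sketch(data, "city", "month", "temp", collect_all=True)["NYC"])
# output: {'Jan': [10, 20], 'Feb': [12]}
```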

reshape

A generalization of pivot that creates nested dictionaries (tensors) of arbitrary depth. While pivot is limited to 2 dimensions (Index, Columns), reshape accepts a sequence of keys to define the hierarchy.

Example

from mappingtools.operators import reshape
from mappingtools.aggregations import Aggregation

data = [
    {"country": "US", "state": "NY", "city": "NYC", "pop": 8.4},
    {"country": "US", "state": "CA", "city": "LA", "pop": 3.9},
    {"country": "UK", "state": "ENG", "city": "LON", "pop": 8.9},
    {"country": "US", "state": "NY", "city": "Albany", "pop": 0.1},
]

# 3-Level Hierarchy: Country -> State -> City
tree = reshape(data, keys=["country", "state", "city"], value="pop")

print(tree["US"]["NY"]["NYC"])
# output: 8.4

# Aggregation: Sum population by Country -> State
# (City is marginalized/ignored)
state_pop = reshape(
    data, 
    keys=["country", "state"], 
    value="pop", 
    aggregation=Aggregation.SUM
)

print(state_pop["US"]["NY"])
# output: 8.5

# Deep keys (using Lenses or callables)
# If your data is nested, you can pass callables that extract each key from a row.
# The library's `Lens` or the standard `operator.itemgetter` can serve as such callables.

nested_data = [
    {"id": 1, "meta": {"region": "US"}, "val": 10},
    {"id": 2, "meta": {"region": "UK"}, "val": 20},
]

# Group by meta.region
deep_tree = reshape(
    nested_data,
    keys=[lambda x: x["meta"]["region"]],
    value="val"
)
print(deep_tree)
# output: {'US': 10, 'UK': 20}
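How it could work

The arbitrary-depth hierarchy amounts to walking one key per level and writing the value at the last level. Here is a sketch of that loop; reshape_sketch and its reducer parameter are hypothetical (the library uses Aggregation instead), and integer values are used so the sums are exact.

```python
from operator import add


def reshape_sketch(rows, keys, value, reducer=None):
    """Nest rows by a sequence of keys; each key is a field name or a callable."""
    tree = {}
    for row in rows:
        path = [key(row) if callable(key) else row[key] for key in keys]
        node = tree
        for part in path[:-1]:
            node = node.setdefault(part, {})  # descend, creating levels as needed
        leaf = path[-1]
        if reducer is not None and leaf in node:
            node[leaf] = reducer(node[leaf], row[value])  # fold into existing cell
        else:
            node[leaf] = row[value]  # first write, or last-wins overwrite
    return tree


rows = [
    {"region": "EU", "product": "A", "units": 3},
    {"region": "EU", "product": "B", "units": 5},
    {"region": "US", "product": "A", "units": 2},
    {"region": "EU", "product": "A", "units": 4},
]

nested = reshape_sketch(rows, ["region", "product"], "units")
print(nested)
# output: {'EU': {'A': 4, 'B': 5}, 'US': {'A': 2}}

totals = reshape_sketch(rows, ["region"], "units", reducer=add)
print(totals)
# output: {'EU': 12, 'US': 2}

by_initial = reshape_sketch(rows, [lambda row: row["region"][0]], "units", reducer=add)
print(by_initial)
# output: {'E': 12, 'U': 2}
```

Dropping "product" from the key list is what marginalizes it: all rows for a region land in the same cell and are folded together by the reducer.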

rekey

Transforms keys of a mapping based on a factory function of (key, value). This allows "re-indexing" a mapping where the new key depends on the content of the value or a combination of the old key and value. Collisions are handled according to the specified aggregation.

Example

from mappingtools.operators import rekey
from mappingtools.aggregations import Aggregation

mapping = {
    "alice": {"dept": "IT", "id": 1},
    "bob": {"dept": "HR", "id": 2},
    "charlie": {"dept": "IT", "id": 3},
}

# Re-index by 'id'
by_id = rekey(mapping, lambda k, v: v["id"])
print(by_id[1])
# output: {'dept': 'IT', 'id': 1}

# Group by 'dept' using Aggregation.ALL
by_dept = rekey(mapping, lambda k, v: v["dept"], aggregation=Aggregation.ALL)
print(list(by_dept.keys()))
# output: ['IT', 'HR']
print(len(by_dept["IT"]))
# output: 2
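How it could work

The description mentions keys built from a combination of the old key and the value, which the example above does not show. The sketch below covers that case with a hypothetical rekey_sketch helper; its collect_all flag is a stand-in for Aggregation.ALL, and last-wins as the default collision behavior is an assumption.

```python
def rekey_sketch(mapping, key_factory, collect_all=False):
    """Rebuild a mapping under new keys computed from (old_key, value)."""
    result = {}
    for key, value in mapping.items():
        new_key = key_factory(key, value)
        if collect_all:
            result.setdefault(new_key, []).append(value)  # keep all colliding values
        else:
            result[new_key] = value  # assumed default: last value wins
    return result


staff = {
    "alice": {"dept": "IT", "id": 1},
    "bob": {"dept": "HR", "id": 2},
    "charlie": {"dept": "IT", "id": 3},
}

# New key combines the value's department with the old key.
by_dept_and_name = rekey_sketch(staff, lambda k, v: (v["dept"], k))
print(list(by_dept_and_name))
# output: [('IT', 'alice'), ('HR', 'bob'), ('IT', 'charlie')]

# Colliding keys collected into lists.
grouped = rekey_sketch(staff, lambda k, v: v["dept"], collect_all=True)
print([person["id"] for person in grouped["IT"]])
# output: [1, 3]

# Colliding keys with last-wins.
latest = rekey_sketch(staff, lambda k, v: v["dept"])
print(latest["IT"]["id"])
# output: 3
```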

rename

Renames keys in a mapping based on a mapper (Mapping or Callable). If a key is not present in the mapper, it remains unchanged. Collisions are handled according to the specified aggregation.

Example

from mappingtools.operators import rename

data = {"usr_id": 1, "usr_name": "Alice", "email": "alice@example.com"}

# Using a mapping
renamed = rename(data, {"usr_id": "id", "usr_name": "name"})
print(list(renamed.keys()))
# output: ['id', 'name', 'email']

# Using a callable
renamed_upper = rename(data, str.upper)
print(list(renamed_upper.keys()))
# output: ['USR_ID', 'USR_NAME', 'EMAIL']
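How it could work

Both mapper forms reduce to a single key-translation function. The following sketch shows that, plus what a collision looks like; rename_sketch is a hypothetical helper, and resolving collisions by last-wins is an assumption, since the library delegates that choice to the aggregation argument.

```python
from collections.abc import Mapping


def rename_sketch(data, mapper):
    """Rename keys via a mapping (missing keys unchanged) or a callable."""
    def translate(key):
        if isinstance(mapper, Mapping):
            return mapper.get(key, key)  # keys absent from the mapper stay as-is
        return mapper(key)

    # A dict comprehension keeps the last value when two keys collide.
    return {translate(key): value for key, value in data.items()}


data = {"usr_id": 1, "usr_name": "Alice", "email": "alice@example.com"}

print(rename_sketch(data, {"usr_id": "id"}))
# output: {'id': 1, 'usr_name': 'Alice', 'email': 'alice@example.com'}

print(rename_sketch(data, lambda key: key.replace("usr_", "")))
# output: {'id': 1, 'name': 'Alice', 'email': 'alice@example.com'}

# Two keys that map to the same name collide.
print(rename_sketch({"a": 1, "A": 2}, str.lower))
# output: {'a': 2}
```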