Overview

The MappingTools library is organized into several namespaces, each containing specific functionalities for manipulating and transforming data structures.

Library's Content

Below is a brief description of the main namespaces within the library:

This namespace contains classes and functions for collecting and categorizing data items into mappings.

| Class | Description |
| --- | --- |
| AutoMapper | A Mapping-like class that automatically generates and assigns unique, minified string values for any new keys accessed. |
| CategoryCollector | A generalized collector that aggregates data into categories (2D structure). Supports various aggregation modes. |
| CategoryCounter | A specialized CategoryCollector for counting occurrences (Aggregation.COUNT). |
| DictOperation | An enumeration of dictionary operations that can be tracked by MeteredDict. |
| MappingCollector | Collects key-value pairs into an internal mapping based on different modes (ALL, COUNT, DISTINCT, FIRST, LAST). |
| MappingCollectorMode | An enumeration of modes for the MappingCollector. |
| MeteredDict | A dictionary that tracks changes made to it. |
| nested_defaultdict | Creates a nested defaultdict with specified depth and factory. |
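
To make the collecting idea concrete, here is a minimal standard-library sketch of the behavior described for nested_defaultdict and CategoryCounter. The helper names `make_nested_defaultdict` and `count_by_category` are hypothetical stand-ins, not the library's API, and the library's own signatures and depth convention may differ.

```python
from collections import defaultdict


def make_nested_defaultdict(depth, factory):
    """Hypothetical helper: build a defaultdict nested `depth` levels deep.

    depth=0 gives a plain defaultdict(factory); each extra level wraps it
    in another defaultdict. (The depth convention is an assumption.)
    """
    if depth < 0:
        raise ValueError("depth must be non-negative")
    if depth == 0:
        return defaultdict(factory)
    return defaultdict(lambda: make_nested_defaultdict(depth - 1, factory))


def count_by_category(items, categories):
    """Hypothetical helper: count items per category name, then per category value.

    `categories` maps a category name to a function of the item.
    """
    counts = make_nested_defaultdict(1, int)
    for item in items:
        for name, categorize in categories.items():
            counts[name][categorize(item)] += 1
    return counts


counts = count_by_category(
    ["apple", "avocado", "banana"],
    {"first_letter": lambda s: s[0], "length": len},
)
print(dict(counts["first_letter"]))  # {'a': 2, 'b': 1}
print(dict(counts["length"]))        # {5: 1, 7: 1, 6: 1}
```

The result is the two-level (category, value) structure that the table calls a 2D structure; swapping `int` for `list` as the factory would collect the items instead of counting them.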

This namespace provides functions that perform operations on mappings.

| Function | Description |
| --- | --- |
| combine | Generalizes merge by using a binary operator to resolve conflicts over deeply nested structures. |
| distinct | Yields distinct values for a specified key across multiple mappings. |
| flatten | Converts a nested mapping structure into a single-level dictionary by flattening the keys into tuples. |
| inverse | Generates an inverse Mapping by swapping keys and values. |
| merge | Deeply merges two recursive tree structures. |
| pivot | Reshapes a list of mappings into a nested dictionary based on index and column keys. |
| rekey | Transforms keys based on a function of (key, value). Supports aggregation. |
| rename | Renames keys based on a mapping or callable. Supports aggregation. |
| reshape | Reshapes a stream of mappings into a nested dictionary of arbitrary depth. |
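
The following plain-Python sketch illustrates what three of these operations mean: flattening keys into tuples, deep merging, and inversion. The functions `flatten_keys`, `deep_merge`, and `invert` are hypothetical illustrations written for this page, not the library's implementations; in particular, the "right value wins" conflict rule and the set-valued inverse are assumptions.

```python
def flatten_keys(mapping, prefix=()):
    """Flatten nested dicts into one level keyed by tuples of the path."""
    flat = {}
    for key, value in mapping.items():
        path = prefix + (key,)
        if isinstance(value, dict):
            flat.update(flatten_keys(value, path))
        else:
            flat[path] = value
    return flat


def deep_merge(left, right):
    """Recursively merge two nested dicts; on conflict the right value wins."""
    merged = dict(left)
    for key, value in right.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = value
    return merged


def invert(mapping):
    """Swap keys and values, grouping keys that share a value into a set."""
    inverted = {}
    for key, value in mapping.items():
        inverted.setdefault(value, set()).add(key)
    return inverted


config = {"db": {"host": "localhost", "port": 5432}}
override = {"db": {"port": 6543}, "debug": True}

merged = deep_merge(config, override)
print(flatten_keys(merged))
# {('db', 'host'): 'localhost', ('db', 'port'): 6543, ('debug',): True}

by_value = invert({"a": 1, "b": 1, "c": 2})
print(by_value[2])  # {'c'}
```

A combine-style operation, as described in the table, would replace the fixed "right wins" rule with a caller-supplied binary operator.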

This namespace provides functional, immutable tools for accessing and modifying deeply nested data structures.

| Class | Description |
| --- | --- |
| Lens | A functional optic for immutable access and modification of nested data structures. Supports composition via /. |

| Function | Description |
| --- | --- |
| patch | Applies a set of changes to a data structure immutably using dot-separated paths or Lenses. |
| project | Projects a data structure into a new dictionary shape based on a schema of dot-separated paths or Lenses. |
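
As a rough illustration of immutable updates through dot-separated paths, the sketch below handles nested dicts only. The names `get_path`, `set_path`, `patch_paths`, and `project_paths` are hypothetical, and the schema direction (new key mapped to source path) is an assumption; the library's patch and project may behave differently.

```python
def get_path(data, path):
    """Read the value at a dot-separated path such as 'address.city'."""
    node = data
    for part in path.split("."):
        node = node[part]
    return node


def set_path(data, path, value):
    """Return a copy of `data` with `path` set to `value`.

    Only the dicts along the path are copied; untouched branches are shared
    with the original, and the original itself is never mutated.
    """
    head, _, rest = path.partition(".")
    updated = dict(data)  # shallow copy of this level only
    updated[head] = set_path(data.get(head, {}), rest, value) if rest else value
    return updated


def patch_paths(data, changes):
    """Apply several {path: value} changes, returning a new structure."""
    for path, value in changes.items():
        data = set_path(data, path, value)
    return data


def project_paths(data, schema):
    """Build a new dict from a {new_key: path} schema."""
    return {name: get_path(data, path) for name, path in schema.items()}


user = {"name": "Ada", "address": {"city": "London", "zip": "N1"}}
moved = patch_paths(user, {"address.city": "Paris"})

print(user["address"]["city"])   # London (original is unchanged)
print(moved["address"]["city"])  # Paris
print(project_paths(moved, {"who": "name", "where": "address.city"}))
# {'who': 'Ada', 'where': 'Paris'}
```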

This namespace provides advanced, dictionary-like data structures that act as proxies or containers for collections of objects.

| Class | Description |
| --- | --- |
| Dictifier | A strict, type-safe container that proxies method calls and attribute access to a collection of objects. It requires an explicit type and enables deep proxying with type hints. For convenience, it offers an auto() factory method to enable type inference. |
| LazyDictifier | A lazy version of Dictifier that defers execution until results are accessed. Ideal for large datasets or streaming pipelines where memory efficiency is critical. |

| Function | Description |
| --- | --- |
| dictify | A class decorator that transforms a class definition into a specialized Dictifier collection, providing a declarative way to define object collections with optimized performance. |
| map_objects | A factory function that provides a unified entry point for creating Dictifier or LazyDictifier instances based on the desired behavior (strict, auto, or lazy). |
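
The proxying idea can be sketched in a few lines of plain Python. `ProxyDict` and `Greeter` below are hypothetical names used only for illustration; this eager, untyped sketch does not reproduce the type checking of Dictifier or the deferred execution of LazyDictifier.

```python
class ProxyDict(dict):
    """Hypothetical sketch: a dict whose attribute access fans out to its values."""

    def __getattr__(self, name):
        # Called only when normal lookup fails, so dict methods such as
        # .items() and .keys() keep their usual behavior.
        if name.startswith("__"):
            raise AttributeError(name)
        attrs = {key: getattr(obj, name) for key, obj in self.items()}
        if attrs and all(callable(attr) for attr in attrs.values()):
            # Return a callable that invokes the method on every object
            # and collects the results under the same keys.
            def call_all(*args, **kwargs):
                return ProxyDict(
                    (key, method(*args, **kwargs)) for key, method in attrs.items()
                )
            return call_all
        return ProxyDict(attrs)


class Greeter:
    def __init__(self, name):
        self.name = name

    def greet(self, greeting):
        return f"{greeting}, {self.name}!"


greeters = ProxyDict(a=Greeter("Ada"), b=Greeter("Bob"))
print(greeters.name)            # {'a': 'Ada', 'b': 'Bob'}
print(greeters.greet("Hello"))  # {'a': 'Hello, Ada!', 'b': 'Hello, Bob!'}
```

Because each result is itself a `ProxyDict`, calls can be chained, for example `greeters.name.upper()`.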

This namespace includes functions that reshape objects while maintaining the consistency of their structure.

| Class | Description |
| --- | --- |
| Transformer | A base class for creating reusable, composable data transformers. |

| Function | Description |
| --- | --- |
| listify | Transforms complex objects into a list of dictionaries with key and value pairs. |
| minify | Shortens the keys of an object using a specified alphabet. |
| simplify | Converts objects to strictly structured dictionaries. |
| strictify | Applies a strict structural conversion to an object using optional converters for keys and values. |
| stringify | Converts an object into a string representation by recursively processing it based on its type. |
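
To show what shortening keys with an alphabet can look like, here is a small sketch built on itertools. `short_names` and `minify_keys` are hypothetical helpers, and the naming order (a, b, ..., z, aa, ab, ...) is an assumption rather than the scheme the library's minify actually uses.

```python
from itertools import count, product
from string import ascii_lowercase


def short_names(alphabet=ascii_lowercase):
    """Yield 'a', 'b', ..., 'z', 'aa', 'ab', ... from the given alphabet."""
    for length in count(1):
        for letters in product(alphabet, repeat=length):
            yield "".join(letters)


def minify_keys(obj, names=None, table=None):
    """Hypothetical sketch: recursively replace dict keys with short names.

    The same original key always maps to the same short name; `table`
    records the original-to-short mapping so it can be reversed later.
    """
    names = short_names() if names is None else names
    table = {} if table is None else table
    if isinstance(obj, dict):
        result = {}
        for key, value in obj.items():
            if key not in table:
                table[key] = next(names)
            result[table[key]] = minify_keys(value, names, table)
        return result
    if isinstance(obj, list):
        return [minify_keys(item, names, table) for item in obj]
    return obj


data = {
    "users": [
        {"first_name": "Ada", "last_name": "Lovelace"},
        {"first_name": "Alan", "last_name": "Turing"},
    ]
}
table = {}
small = minify_keys(data, table=table)
print(small)
# {'a': [{'b': 'Ada', 'c': 'Lovelace'}, {'b': 'Alan', 'c': 'Turing'}]}
print(table)
# {'users': 'a', 'first_name': 'b', 'last_name': 'c'}
```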

This namespace provides custom type hints and utility functions for working with types within the library.

| Type | Description |
| --- | --- |
| Tree | A recursive type representing a tree structure where each node can be of a generic type T, a list of subtrees, or a dictionary mapping strings to subtrees. |
| JsonScalar | Represents the basic scalar types found in JSON data (None, bool, int, float, str). |
| JsonTree | A recursive type representing a JSON-like tree structure where each node can be a JsonScalar, a list of JsonTrees, or a dictionary mapping strings to JsonTrees. |
| EnhancedJsonTree | A recursive type that extends JsonTree by allowing each node to also be of a generic type T, in addition to JsonScalars, lists, or dictionaries. |
| MISSING | A sentinel object used to distinguish an explicit missing value from an actual None value, particularly in the operators.merge function. |
| Combine | A protocol for a callable that takes two objects of type T and combines them, returning T. Used by the combine operator. |
| Handler | A protocol for a callable that takes an object of a generic type T and returns any value. |
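
The recursive JSON types and the sentinel pattern can be approximated with the standard typing module. The definitions below are local stand-ins that reuse the names from the table for readability; they are assumptions about the shape of these types, not the library's source. `is_json_tree` and `pick` are hypothetical helpers.

```python
from typing import Dict, List, Union

# Local stand-ins mirroring the descriptions in the table above.
JsonScalar = Union[None, bool, int, float, str]
JsonTree = Union[JsonScalar, List["JsonTree"], Dict[str, "JsonTree"]]


def is_json_tree(node) -> bool:
    """Runtime check that `node` matches the JsonTree description."""
    if node is None or isinstance(node, (bool, int, float, str)):
        return True
    if isinstance(node, list):
        return all(is_json_tree(item) for item in node)
    if isinstance(node, dict):
        return all(
            isinstance(key, str) and is_json_tree(value)
            for key, value in node.items()
        )
    return False


class _MissingType:
    """Sentinel type: its single instance means 'no value was supplied'."""

    def __repr__(self):
        return "MISSING"


MISSING = _MissingType()


def pick(left, right=MISSING):
    """Return `right` unless it was not supplied at all; None counts as a value."""
    return left if right is MISSING else right


print(is_json_tree({"a": [1, 2.5, None, {"b": "c"}]}))  # True
print(is_json_tree({"a": {1, 2}}))                      # False (sets are not JSON)
print(pick(1))        # 1
print(pick(1, None))  # None
```

The last two lines show why a sentinel is needed: with `None` as the default, an explicit `None` override could not be told apart from an absent argument.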

Comparison with Other Libraries

MappingTools occupies a unique niche in the Python ecosystem. Here is how it compares to other well-known libraries.

1. GraphBLAS (python-graphblas)

  • Domain: High-Performance Sparse Linear Algebra.
  • Backend: C (SuiteSparse).
  • Keys: Integer indices only (\(0 \dots N-1\)).
  • Comparison:
    • GraphBLAS is the "F1 Car." It is optimized for massive scale (billions of edges) and raw speed on integer matrices.
    • MappingTools is the "All-Terrain Vehicle." It works with Symbolic Keys (strings, tuples, objects) directly, without needing to map them to integers first. It is pure Python and zero-dependency.
  • Verdict: Use GraphBLAS for massive numerical graphs. Use MappingTools for NLP, Knowledge Graphs, and symbolic prototyping.

2. NumPy (numpy)

  • Domain: Dense Linear Algebra.
  • Backend: C/Fortran.
  • Comparison:
    • NumPy stores matrices as dense arrays. It is unbeatable for dense data (density > 50%).
    • MappingTools stores matrices as sparse dictionaries. It can be orders of magnitude faster for very sparse data (density < 1%), because only the nonzero entries are stored and visited.
  • Verdict: Use NumPy for images and dense tensors. Use MappingTools for sparse feature vectors and graphs.

3. SciPy Sparse (scipy.sparse)

  • Domain: Numerical Sparse Linear Algebra.
  • Backend: C/C++.
  • Keys: Integer indices only.
  • Comparison:
    • SciPy is the industry standard for numerical sparse matrices. However, it is limited to standard arithmetic (\(+, \times\)).
    • MappingTools supports Generalized Semirings (Tropical, Boolean, Expectation), allowing you to solve Shortest Path, Reachability, and Viterbi Decoding using the same "matrix multiplication" code.
  • Verdict: Use SciPy for standard numerical solvers (eigenvalues, linear systems). Use MappingTools for generalized algebraic problems.
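
The semiring idea can be sketched independently of any library: one multiplication routine over dict-of-dicts matrices, parameterized by the "add" and "multiply" operations. The function `semiring_dot` below is a hypothetical illustration, not the MappingTools dot function, whose signature may differ.

```python
import operator


def semiring_dot(a, b, add, mul):
    """Multiply two sparse dict-of-dicts matrices over a semiring.

    Missing entries stand for the semiring's additive identity, so they are
    never visited. Keys can be any hashable value, not just integers.
    """
    result = {}
    for i, row in a.items():
        for k, a_ik in row.items():
            for j, b_kj in b.get(k, {}).items():
                term = mul(a_ik, b_kj)
                out_row = result.setdefault(i, {})
                out_row[j] = add(out_row[j], term) if j in out_row else term
    return result


# Tropical semiring (min, +): cheapest paths that use exactly two edges.
edges = {"A": {"B": 1, "C": 5}, "B": {"C": 2}, "C": {"D": 1}}
print(semiring_dot(edges, edges, min, operator.add))
# {'A': {'C': 3, 'D': 6}, 'B': {'D': 3}}

# Boolean semiring (or, and): which nodes are reachable in exactly two steps.
links = {"A": {"B": True}, "B": {"C": True}}
print(semiring_dot(links, links, operator.or_, operator.and_))
# {'A': {'C': True}}

# Standard arithmetic (+, *): ordinary sparse matrix multiplication.
print(semiring_dot({"x": {"y": 2}}, {"y": {"z": 3}}, operator.add, operator.mul))
# {'x': {'z': 6}}
```

Only the two operators change between the three calls; the traversal code is identical. Full shortest-path or reachability results would additionally require repeating the product (and including zero-length paths), which this sketch leaves out.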

4. NetworkX (networkx)

  • Domain: Graph Algorithms.
  • Backend: Pure Python.
  • Comparison:
    • NetworkX is object-oriented. You manipulate Graph objects, add nodes, and run algorithms like shortest_path.
    • MappingTools is algebraic. You represent the graph as a sparse matrix and run dot (matrix multiplication) to find paths.
  • Verdict: Use NetworkX for complex graph algorithms (clustering, community detection) and visualization. Use MappingTools when you want to express graph problems as linear algebra equations.