
Overview

The MappingTools library is organized into several namespaces, each containing specific functionalities for manipulating and transforming data structures.

Below is a brief description of the main namespaces within the library:

This namespace contains classes and functions for collecting and categorizing data items into mappings.

| Class | Description |
|---|---|
| AutoMapper | A Mapping-like class that automatically generates and assigns unique, minified string values for any new keys accessed. |
| CategoryCollector | A generalized collector that aggregates data into categories (2D structure). Supports various aggregation modes. |
| CategoryCounter | A specialized CategoryCollector for counting occurrences (Aggregation.COUNT). |
| MappingCollector | Collects key-value pairs into an internal mapping based on different modes (ALL, COUNT, DISTINCT, FIRST, LAST). |
| MeteredDict | A dictionary that tracks changes made to it. |
| nested_defaultdict | Creates a nested defaultdict with specified depth and factory. |
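To make the idea of collecting items into categories concrete, here is a minimal standard-library sketch. It is not the library's implementation: `nested_defaultdict_sketch` is a hypothetical helper, and its reading of "depth" (the number of nesting levels below the top-level mapping) is an assumption made for this example.

```python
from collections import defaultdict


def nested_defaultdict_sketch(depth, factory):
    """Hypothetical helper: a defaultdict nested `depth` levels below the top level.

    Assumption for this sketch: depth=0 is a plain defaultdict(factory).
    """
    if depth <= 0:
        return defaultdict(factory)
    return defaultdict(lambda: nested_defaultdict_sketch(depth - 1, factory))


# Count occurrences per (category, item) pair, a 2D structure.
records = [("fruit", "apple"), ("fruit", "apple"), ("veg", "kale")]
counts = nested_defaultdict_sketch(1, int)
for category, item in records:
    counts[category][item] += 1
```

Missing categories and items are created on first access, which is what lets the loop body stay a single line.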

This namespace provides functions that perform operations on mappings.

| Function | Description |
|---|---|
| distinct | Yields distinct values for a specified key across multiple mappings. |
| flatten | Converts a nested mapping structure into a single-level dictionary by flattening the keys into tuples. |
| inverse | Generates an inverse Mapping by swapping keys and values. |
| keep (deprecated) | Yields subsets of mappings by retaining the specified keys. |
| pivot | Reshapes a list of mappings into a nested dictionary based on index and column keys. |
| rekey | Transforms keys based on a function of (key, value). Supports aggregation. |
| remove (deprecated) | Yields subsets of mappings with the specified keys removed. |
| rename | Renames keys based on a mapping or callable. Supports aggregation. |
| stream (deprecated) | Generates items from a mapping, optionally applying a factory function to each key-value pair. |
| stream_dict_records (deprecated) | Generates dictionary records from a mapping with customizable key and value names. |
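The following sketch illustrates what flattening keys into tuples and inverting a mapping mean in practice. The functions `flatten_sketch` and `inverse_sketch` are hypothetical stand-ins written with the standard library only; the library's own signatures and return types may differ.

```python
from collections import defaultdict
from collections.abc import Mapping


def flatten_sketch(mapping, prefix=()):
    """Flatten nested mappings into one level, using tuples of keys as paths."""
    flat = {}
    for key, value in mapping.items():
        path = prefix + (key,)
        if isinstance(value, Mapping) and value:
            flat.update(flatten_sketch(value, path))
        else:
            flat[path] = value
    return flat


def inverse_sketch(mapping):
    """Swap keys and values; keys sharing a value are grouped into a set.

    Values must be hashable for this to work.
    """
    inverted = defaultdict(set)
    for key, value in mapping.items():
        inverted[value].add(key)
    return dict(inverted)


nested = {"a": {"b": 1, "c": {"d": 2}}, "e": 3}
flat = flatten_sketch(nested)
# flat == {("a", "b"): 1, ("a", "c", "d"): 2, ("e",): 3}

by_value = inverse_sketch({"x": 1, "y": 1, "z": 2})
# by_value == {1: {"x", "y"}, 2: {"z"}}
```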

This namespace provides functional, immutable tools for accessing and modifying deeply nested data structures.

| Class | Description |
|---|---|
| Lens | A functional optic for immutable access and modification of nested data structures. Supports composition via /. |

| Function | Description |
|---|---|
| patch | Applies a set of changes to a data structure immutably using dot-separated paths or Lenses. |
| project | Projects a data structure into a new dictionary shape based on a schema of dot-separated paths or Lenses. |
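As an illustration of immutable updates through dot-separated paths, the sketch below rebuilds only the dictionaries along each changed path and leaves the input untouched. `patch_sketch` and `_set_path` are hypothetical names for this example and handle plain dicts only; they are not the library's `patch` or `Lens`.

```python
def _set_path(node, keys, value):
    """Return a copy of `node` with `value` placed at the path `keys`."""
    if not keys:
        return value
    head, *rest = keys
    source = node if isinstance(node, dict) else {}
    updated = dict(source)  # shallow copy; untouched siblings are shared
    updated[head] = _set_path(source.get(head, {}), rest, value)
    return updated


def patch_sketch(data, changes):
    """Apply {dot.path: value} changes without mutating `data`."""
    result = data
    for path, value in changes.items():
        result = _set_path(result, path.split("."), value)
    return result


original = {"user": {"name": "Ada", "address": {"city": "London"}}}
patched = patch_sketch(original, {"user.address.city": "Paris", "user.age": 36})
```

Because only the dictionaries on the changed path are copied, `original` still reports London after the call, while `patched` carries both changes.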

This namespace provides advanced, dictionary-like data structures that act as proxies or containers for collections of objects.

| Class | Description |
|---|---|
| Dictifier | A strict, type-safe container that proxies method calls and attribute access to a collection of objects. It requires an explicit type and enables deep proxying with type hints. For convenience, it offers an auto() factory method to enable type inference. |
| LazyDictifier | A lazy version of Dictifier that defers execution until results are accessed. Ideal for large datasets or streaming pipelines where memory efficiency is critical. |

| Function | Description |
|---|---|
| dictify | A class decorator that transforms a class definition into a specialized Dictifier collection, providing a declarative way to define object collections with optimized performance. |
| map_objects | A factory function that provides a unified entry point for creating Dictifier or LazyDictifier instances based on the desired behavior (strict, auto, or lazy). |
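The proxying idea can be sketched with `__getattr__`: an attribute or method requested on the container is looked up on every contained object, and the results are gathered under the same keys. `ProxySketch` below is a hypothetical, untyped, eager illustration; it does not reproduce the type checking of Dictifier or the deferred execution of LazyDictifier.

```python
class ProxySketch:
    """Hypothetical stand-in for a proxying container; not the library's Dictifier."""

    def __init__(self, objects):
        self._objects = dict(objects)

    def __getattr__(self, name):
        # Called only when normal lookup fails; private names are never forwarded.
        if name.startswith("_"):
            raise AttributeError(name)
        attributes = {key: getattr(obj, name) for key, obj in self._objects.items()}
        if attributes and all(callable(attr) for attr in attributes.values()):
            # Forward a method call to every object and collect the results.
            def call_each(*args, **kwargs):
                return ProxySketch(
                    {key: attr(*args, **kwargs) for key, attr in attributes.items()}
                )
            return call_each
        # Plain attributes are collected directly.
        return ProxySketch(attributes)

    def to_dict(self):
        return dict(self._objects)


words = ProxySketch({"greeting": "hello", "target": "world"})
shouted = words.upper().to_dict()

points = ProxySketch({"p": 1 + 2j, "q": 3 - 1j})
real_parts = points.real.to_dict()
```

Each call returns another `ProxySketch`, so calls can be chained before converting back to a plain dictionary.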

This namespace includes functions that reshape objects while maintaining the consistency of their structure.

| Function | Description |
|---|---|
| listify | Transforms complex objects into a list of dictionaries with key and value pairs. |
| minify | Shortens the keys of an object using a specified alphabet. |
| simplify | Converts objects to strictly structured dictionaries. |
| strictify | Applies a strict structural conversion to an object using optional converters for keys and values. |
| stringify | Converts an object into a string representation by recursively processing it based on its type. |
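To show what shortening keys with an alphabet can look like, here is a small sketch that assigns the shortest unused name to each distinct key while preserving the shape of the object. `minify_sketch` and `short_names` are hypothetical helpers; returning the key table alongside the result is a choice made for this example, not a statement about the library's `minify`.

```python
from itertools import count, product
from string import ascii_uppercase


def short_names(alphabet):
    """Yield ever-longer names over `alphabet`: A, B, ..., Z, AA, AB, ..."""
    for length in count(1):
        for letters in product(alphabet, repeat=length):
            yield "".join(letters)


def minify_sketch(obj, alphabet=ascii_uppercase):
    """Replace dictionary keys with short names; structure and values are kept."""
    names = short_names(alphabet)
    assigned = {}

    def walk(node):
        if isinstance(node, dict):
            shortened = {}
            for key, value in node.items():
                if key not in assigned:
                    assigned[key] = next(names)
                shortened[assigned[key]] = walk(value)
            return shortened
        if isinstance(node, list):
            return [walk(item) for item in node]
        return node

    return walk(obj), assigned


record = {"name": "x", "tags": [{"name": "y"}]}
minified, key_table = minify_sketch(record)
# minified == {"A": "x", "B": [{"A": "y"}]}
# key_table == {"name": "A", "tags": "B"}
```

The same original key always maps to the same short name, so the structure stays consistent and can be reversed with the key table.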

Comparison with Other Libraries

MappingTools occupies a unique niche in the Python ecosystem. Here is how it compares to other well-known libraries.

1. GraphBLAS (python-graphblas)

  • Domain: High-Performance Sparse Linear Algebra.
  • Backend: C (SuiteSparse).
  • Keys: Integer indices only ($0 \dots N-1$).
  • Comparison:
    • GraphBLAS is the "F1 Car." It is optimized for massive scale (billions of edges) and raw speed on integer matrices.
    • MappingTools is the "All-Terrain Vehicle." It works with Symbolic Keys (strings, tuples, objects) directly, without needing to map them to integers first. It is pure Python and zero-dependency.
  • Verdict: Use GraphBLAS for massive numerical graphs. Use MappingTools for NLP, Knowledge Graphs, and symbolic prototyping.
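The difference between symbolic keys and integer indices can be sketched with plain dictionaries. The example below uses only the standard library and made-up triples; it shows the extra index-building step that an integer-indexed representation needs, and does not call GraphBLAS or MappingTools.

```python
# Hypothetical knowledge-graph triples: (subject, relation, object).
triples = [("Paris", "capital_of", "France"), ("France", "member_of", "EU")]

# Symbolic route: the node names themselves are the keys.
adjacency = {}
for subject, _, obj in triples:
    adjacency.setdefault(subject, {})[obj] = 1

# Integer route: first assign each node an index, then store coordinates.
index_of = {}
for subject, _, obj in triples:
    for node in (subject, obj):
        index_of.setdefault(node, len(index_of))
coordinates = [(index_of[s], index_of[o], 1) for s, _, o in triples]
```

With the symbolic route, results can be read back directly by name; with the integer route, `index_of` has to be kept around and inverted to interpret any output.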

2. NumPy (numpy)

  • Domain: Dense Linear Algebra.
  • Backend: C/Fortran.
  • Comparison:
    • NumPy stores matrices as dense arrays. It is unbeatable for dense data (density > 50%).
    • MappingTools stores matrices as sparse dictionaries. It is orders of magnitude faster for very sparse data (density < 1%).
  • Verdict: Use NumPy for images and dense tensors. Use MappingTools for sparse feature vectors and graphs.
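A dictionary-of-dictionaries matrix only ever visits the entries that are actually stored, which is where the advantage on very sparse data comes from. The sketch below is a standard-library illustration with a hypothetical `sparse_matvec` helper; it is not the library's own routine and makes no performance claim of its own.

```python
def sparse_matvec(matrix, vector):
    """Multiply a dict-of-dicts matrix by a dict vector, skipping absent entries."""
    result = {}
    for row_key, row in matrix.items():
        total = sum(
            weight * vector[col] for col, weight in row.items() if col in vector
        )
        if total:  # keep the result sparse: drop rows that sum to zero
            result[row_key] = total
    return result


# Sparse feature vectors keyed by symbolic names.
matrix = {"doc1": {"cat": 2.0, "dog": 1.0}, "doc2": {"fish": 3.0}}
vector = {"cat": 0.5, "fish": 2.0}
scores = sparse_matvec(matrix, vector)
# scores == {"doc1": 1.0, "doc2": 6.0}
```

The work done is proportional to the number of stored entries rather than to rows times columns, so a dense array of the same logical shape would spend most of its time on zeros.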

3. SciPy Sparse (scipy.sparse)

  • Domain: Numerical Sparse Linear Algebra.
  • Backend: C/C++.
  • Keys: Integer indices only.
  • Comparison:
    • SciPy is the industry standard for numerical sparse matrices. However, it is limited to standard arithmetic ($+, \times$).
    • MappingTools supports Generalized Semirings (Tropical, Boolean, Expectation), allowing you to solve Shortest Path, Reachability, and Viterbi Decoding using the same "matrix multiplication" code.
  • Verdict: Use SciPy for standard numerical solvers (eigenvalues, linear systems). Use MappingTools for generalized algebraic problems.
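The semiring idea can be sketched by passing the "addition" and "multiplication" operators into one generic product routine. `semiring_matmul` below is a hypothetical standard-library illustration over dict-of-dicts matrices; the names and signature are assumptions and do not reflect the MappingTools API.

```python
import math
import operator


def semiring_matmul(a, b, add, mul, zero):
    """Generalized product: result[i][j] = add over k of mul(a[i][k], b[k][j])."""
    result = {}
    for i, row in a.items():
        out_row = {}
        for k, a_ik in row.items():
            for j, b_kj in b.get(k, {}).items():
                out_row[j] = add(out_row.get(j, zero), mul(a_ik, b_kj))
        if out_row:
            result[i] = out_row
    return result


edges = {"A": {"B": 1, "C": 5}, "B": {"C": 2}, "C": {"D": 1}}

# Tropical semiring (min, +): cheapest cost over paths of exactly two edges.
cheapest = semiring_matmul(edges, edges, min, operator.add, math.inf)
# cheapest == {"A": {"C": 3, "D": 6}, "B": {"D": 3}}

# Boolean semiring (or, and): which nodes are reachable in exactly two edges.
links = {u: {v: True for v in row} for u, row in edges.items()}
reachable = semiring_matmul(links, links, operator.or_, operator.and_, False)
# reachable == {"A": {"C": True, "D": True}, "B": {"D": True}}
```

Only the three arguments `add`, `mul`, and `zero` change between the two problems; the loop structure is identical, which is the point of the semiring formulation.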

4. NetworkX (networkx)

  • Domain: Graph Algorithms.
  • Backend: Pure Python.
  • Comparison:
    • NetworkX is object-oriented. You manipulate Graph objects, add nodes, and run algorithms like shortest_path.
    • MappingTools is algebraic. You represent the graph as a sparse matrix and run dot (matrix multiplication) to find paths.
  • Verdict: Use NetworkX for complex graph algorithms (clustering, community detection) and visualization. Use MappingTools when you want to express graph problems as linear algebra equations.
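To illustrate the algebraic view of path finding, the sketch below computes single-source shortest distances by repeatedly applying a min-plus vector-matrix product until nothing changes. It is a standard-library illustration with hypothetical helper names, assumes non-negative edge weights, and does not use the library's `dot`.

```python
import math


def min_plus_step(dist, edges):
    """One min-plus product: relax every edge leaving a node with a known distance."""
    updated = dict(dist)
    for node, distance in dist.items():
        for neighbor, weight in edges.get(node, {}).items():
            candidate = distance + weight
            if candidate < updated.get(neighbor, math.inf):
                updated[neighbor] = candidate
    return updated


def shortest_distances(edges, source):
    """Iterate min-plus products from `source` until a fixed point is reached."""
    dist = {source: 0}
    while True:
        updated = min_plus_step(dist, edges)
        if updated == dist:
            return dist
        dist = updated


edges = {"A": {"B": 1, "C": 5}, "B": {"C": 2}, "C": {"D": 1}}
distances = shortest_distances(edges, "A")
# distances == {"A": 0, "B": 1, "C": 3, "D": 4}
```

The graph is never wrapped in a graph object: it stays a sparse matrix of weights, and the algorithm is a repeated product, which is the contrast with the object-oriented style drawn above.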