torchcache.torchcache

torchcache(*, enabled: bool = True, memory_cache_device: str = 'cpu', subsample_count: int = 10000, persistent: bool = False, persistent_cache_dir: str | None = None, persistent_module_hash: str | None = None, max_persistent_cache_size: int = 10000000000, max_memory_cache_size: int = 1000000000, zstd_compression: bool = False, zstd_compression_level: int = 3, zstd_compression_threads: int = 1, cache_dtype: dtype | None = None, use_mmap_on_load: bool = False) → callable

Polymorphic cache decorator for nn.Module subclasses or pure Tensor functions.

As a class decorator: caches Module.forward outputs. As a function decorator: wraps the function in an nn.Module and caches its outputs.

Always invoke the decorator with parentheses, even if no arguments are passed. For example:

@torchcache()
class CachedModule(nn.Module):
    pass
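The parentheses matter because every parameter in the signature above is keyword-only (note the leading `*`). The following is a minimal pure-Python sketch of that pattern, not torchcache's implementation; `cache_sketch` and the `cache_enabled` attribute are hypothetical stand-ins used only to show what happens when the parentheses are left out.

```python
def cache_sketch(*, enabled=True):
    """Hypothetical stand-in for a keyword-only decorator factory."""
    def wrap(obj):
        # Tag the decorated object so the decorator's effect is observable.
        obj.cache_enabled = enabled
        return obj
    return wrap


@cache_sketch()  # correct: the factory is called, then its result decorates Good
class Good:
    pass


try:
    @cache_sketch  # missing parentheses: the class is passed positionally
    class Bad:
        pass
except TypeError as exc:
    # A keyword-only signature rejects the positional argument.
    error = str(exc)
else:
    error = None
```

In this sketch the bare `@cache_sketch` form fails at class-definition time with a `TypeError`, because the class itself is handed to a function that accepts no positional arguments.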

You can also override the decorator's arguments by setting attributes on the module whose names start with “torchcache_”. For example, to set the cache directory for a module (persistent_cache_dir), you can do:

@torchcache(persistent=True)
class CachedModule(nn.Module):
    def __init__(self, cache_dir: str | Path):
        # Initialize nn.Module before setting any attributes.
        super().__init__()
        self.torchcache_persistent_cache_dir = cache_dir
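One way to picture the prefix-based override is the sketch below. It is an illustration under assumptions, not torchcache's actual code: `resolve_options`, `PREFIX`, and `PlainModule` are hypothetical names, and the precedence shown (attribute wins over decorator argument) follows the description above.

```python
PREFIX = "torchcache_"


def resolve_options(instance, decorator_kwargs):
    """Return the decorator kwargs, updated by any prefixed attribute on instance."""
    resolved = dict(decorator_kwargs)
    for name in decorator_kwargs:
        attr = PREFIX + name
        if hasattr(instance, attr):
            # An attribute such as torchcache_persistent_cache_dir replaces
            # the value that was passed to the decorator.
            resolved[name] = getattr(instance, attr)
    return resolved


class PlainModule:
    """Hypothetical stand-in for a decorated module."""

    def __init__(self, cache_dir):
        self.torchcache_persistent_cache_dir = cache_dir


options = resolve_options(
    PlainModule("/tmp/my_cache"),
    {"persistent": True, "persistent_cache_dir": None},
)
```

Here `options` keeps `persistent=True` from the decorator call and picks up the cache directory from the instance attribute.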

Parameters:

enabled (bool): Whether the decorator is enabled, by default True.
subsample_count (int): Number of values to subsample from the tensor when computing its hash, by default 10000. Subsampling speeds up hashing at the cost of a higher probability of hash collisions. The default should be enough for most use cases.
memory_cache_device (str or torch.device, optional): Device to use for the in-memory cache, by default “cpu”. If None, then the original device of the tensor is used.
persistent (bool, optional): Whether to use a file-system-based cache, by default False.
persistent_cache_dir (str or Path, optional): Directory to use for caching, by default None. If None, then a temporary directory is used. Only used if persistent is True.
persistent_module_hash (str, optional): Hash of the module definition, args, and kwargs, by default None. If None, then the module hash is automatically determined. You can explicitly set this if you want to use the same cache for slightly different modules. You can find the module hash in the following locations:
    - In the logs, if you set the logging level to INFO or DEBUG
    - In the cached module’s self.cache_instance.module_hash attribute
    - As the name of the subdirectory in the persistent cache
max_persistent_cache_size (int, optional): Maximum size of the persistent cache in bytes, by default 10e9 (10 GB).
max_memory_cache_size (int, optional): Maximum size of the memory cache in bytes, by default 1e9 (1 GB).
zstd_compression (bool, optional): Whether to use zstd compression, by default False. See https://github.com/sergey-dryabzhinsky/python-zstd for more information on the arguments below.
zstd_compression_level (int, optional): Compression level to use, by default 3. Must be between -100 and 22, where -100 is the fastest compression and 22 is the slowest.
zstd_compression_threads (int, optional): Number of threads to use for compression, by default 1. If 0, then the number of threads is automatically determined.
cache_dtype (torch.dtype, optional): Data type to use for the cache, by default None. If None, then the data type of the first tensor that is processed is used.
use_mmap_on_load (bool, optional): Whether to use mmap when loading the cached embeddings from file, by default False. This can help when each embedding is very large, since memory mapping may speed up loading large files.
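
The trade-off behind subsample_count can be illustrated with a small standard-library sketch. This is not torchcache's hashing algorithm; `subsampled_hash` is a hypothetical helper that hashes evenly spaced values from a list of floats, assumed here only to show how a small subsample can make two different inputs collide.

```python
import hashlib
import struct


def subsampled_hash(values, subsample_count):
    """Hash at most `subsample_count` evenly spaced values from `values`."""
    n = len(values)
    if n <= subsample_count:
        picked = list(values)
    else:
        step = n / subsample_count
        picked = [values[int(i * step)] for i in range(subsample_count)]
    payload = struct.pack(f"{len(picked)}d", *picked)
    # Include the length so inputs of different sizes do not trivially collide.
    return hashlib.sha256(struct.pack("q", n) + payload).hexdigest()


a = [float(i) for i in range(1000)]
b = list(a)
b[1] = -1.0  # differs only at an index that a 10-value subsample skips

# Hashing every value distinguishes the two inputs.
full_differs = subsampled_hash(a, 1000) != subsampled_hash(b, 1000)
# Hashing only 10 values (indices 0, 100, ..., 900) misses the change.
sub_collides = subsampled_hash(a, 10) == subsampled_hash(b, 10)
```

A larger subsample inspects more of the input and so lowers the chance of such a collision, at the price of more work per hash.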