torchcache.torchcache

torchcache(*, enabled: bool = True, memory_cache_device: str = 'cpu', subsample_count: int = 10000, persistent: bool = False, persistent_cache_dir: str | None = None, persistent_module_hash: str | None = None, max_persistent_cache_size: int = 10000000000, max_memory_cache_size: int = 1000000000, zstd_compression: bool = False, zstd_compression_level: int = 3, zstd_compression_threads: int = 1, cache_dtype: dtype | None = None, use_mmap_on_load: bool = False) -> callable

Polymorphic cache decorator for nn.Module subclasses or pure Tensor functions.

As a class decorator: caches Module.forward outputs. As a function decorator: wraps the function in an nn.Module and caches its outputs.

Always invoke the decorator with parentheses, even if no arguments are passed. For example:

@torchcache()
class CachedModule(nn.Module):
    pass
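The two modes described above, and the reason the parentheses matter, can be illustrated with a small stand-alone sketch. This is not torchcache's implementation: `simple_cache`, `Doubler`, and `square` are hypothetical names, plain hashable inputs stand in for tensors, and a dict stands in for the real cache. It only shows the general shape of a keyword-only decorator factory that accepts either a class with a `forward` method or a plain function.

```python
import functools
import inspect


def simple_cache(*, enabled=True):
    """Hypothetical decorator factory; keyword-only, so it must be called with ()."""

    def decorator(target):
        if not enabled:
            return target

        if inspect.isclass(target):
            # Class case: subclass the target and memoize its forward method.
            class Cached(target):
                def __init__(self, *args, **kwargs):
                    super().__init__(*args, **kwargs)
                    self._cache = {}

                def forward(self, x):
                    if x not in self._cache:
                        self._cache[x] = super().forward(x)
                    return self._cache[x]

            Cached.__name__ = target.__name__
            return Cached

        # Function case: wrap the function in a small class, then reuse the class path.
        class FunctionModule:
            def forward(self, x):
                return target(x)

        cached_forward = decorator(FunctionModule)().forward

        @functools.wraps(target)
        def wrapper(x):
            return cached_forward(x)

        return wrapper

    return decorator


calls = []  # records every real (uncached) computation


@simple_cache()
class Doubler:
    def forward(self, x):
        calls.append(x)
        return 2 * x


@simple_cache()
def square(x):
    calls.append(x)
    return x * x


d = Doubler()
d.forward(3)
d.forward(3)  # served from the cache, no new entry in calls
square(4)
square(4)     # served from the cache, no new entry in calls
```

Because every argument of the factory is keyword-only, writing the decorator without parentheses would pass the class or function as a positional argument and fail immediately.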

You can also override the arguments of the underlying cache instance by setting attributes whose names start with “torchcache_”. For example, to set the cache directory for a module (persistent_cache_dir), you can do:

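@torchcache(persistent=True)
class CachedModule(nn.Module):
    def __init__(self, cache_dir: str | Path):
        # Initialize nn.Module before setting attributes on the instance.
        super().__init__()
        # Overrides the decorator's persistent_cache_dir argument.
        self.torchcache_persistent_cache_dir = cache_dir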
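As a rough illustration of the prefix convention, the sketch below merges decorator arguments with any instance attributes named `torchcache_<option>`. The helper `resolve_options` and the class `Dummy` are hypothetical and are not part of torchcache; how the library performs this lookup internally is an assumption here.

```python
PREFIX = "torchcache_"


def resolve_options(instance, decorator_kwargs):
    """Hypothetical helper: instance attributes with the prefix override decorator kwargs."""
    options = dict(decorator_kwargs)
    for name, value in vars(instance).items():
        if name.startswith(PREFIX):
            # Strip the prefix to recover the option name.
            options[name[len(PREFIX):]] = value
    return options


class Dummy:
    def __init__(self, cache_dir):
        self.torchcache_persistent_cache_dir = cache_dir
        self.hidden_size = 16  # unrelated attribute, left out of the options


opts = resolve_options(
    Dummy("/tmp/my_cache"),
    {"persistent": True, "persistent_cache_dir": None},
)
```

Here `opts` keeps `persistent=True` from the decorator call, while `persistent_cache_dir` takes the per-instance value.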

Parameters:
enabled : bool

Whether the decorator is enabled, by default True.

subsample_count : int

Number of values to subsample from the tensor when computing its hash, by default 10000. Subsampling speeds up hashing at the cost of a higher probability of hash collisions. The default should be enough for most use cases.
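The speed-versus-collision trade-off can be seen in a small stand-alone sketch. It assumes evenly strided sampling over a flat list of floats and uses SHA-256. How torchcache actually selects and hashes values is not described here, and `subsampled_hash` is a hypothetical name.

```python
import hashlib
import struct


def subsampled_hash(values, subsample_count=10000):
    """Hash at most `subsample_count` evenly strided values from a flat sequence."""
    n = len(values)
    step = max(1, n // subsample_count)
    sample = values[::step][:subsample_count]
    payload = struct.pack(f"<{len(sample)}d", *sample)
    # Include the length so inputs of different sizes do not trivially collide.
    return hashlib.sha256(struct.pack("<Q", n) + payload).hexdigest()


a = [float(i) for i in range(100)]
b = list(a)
b[1] = -1.0  # differs only at an index the stride skips
c = list(a)
c[0] = -1.0  # differs at a sampled index

# With subsample_count=10, only indices 0, 10, 20, ..., 90 are hashed.
collides = subsampled_hash(a, 10) == subsampled_hash(b, 10)
detected = subsampled_hash(a, 10) != subsampled_hash(c, 10)
# Hashing every value distinguishes a from b again.
full = subsampled_hash(a, 100) != subsampled_hash(b, 100)
```

In this sketch, two inputs that differ only in unsampled positions share a hash, which is the kind of collision a larger `subsample_count` makes less likely.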

memory_cache_device : str or torch.device, optional

Device to use for the cache, by default “cpu”. If None, then the original device of the tensor is used.

persistent : bool, optional

Whether to use a file-system-based cache, by default False.

persistent_cache_dir : str or Path, optional

Directory to use for caching, by default None. If None, then a temporary directory is used. Only used if persistent is True.

persistent_module_hash : str, optional

Hash of the module definition, args, and kwargs, by default None. If None, then the module hash is automatically determined. You can explicitly set this if you want to use the same cache for slightly different modules. You can find the module hash in the following locations:

- In the logs, if you set the logging level to INFO or DEBUG
- In the cached module’s self.cache_instance.module_hash attribute
- As the name of the subdirectory in the persistent cache

max_persistent_cache_size : int, optional

Maximum size of the persistent cache in bytes, by default 10e9 (10 GB).

max_memory_cache_size : int, optional

Maximum size of the memory cache in bytes, by default 1e9 (1 GB).
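To get a feel for these byte budgets, the following back-of-the-envelope sketch estimates how many embeddings fit in each cache. The embedding dimension of 768 is purely illustrative, `max_cached_items` is a hypothetical helper, and the estimate counts raw tensor bytes only, ignoring any per-entry overhead or compression.

```python
BYTES_PER_ELEMENT = {"float32": 4, "float16": 2}  # standard element sizes


def max_cached_items(budget_bytes, embedding_dim, dtype_name):
    """Upper bound on the number of embeddings that fit in a byte budget."""
    item_bytes = embedding_dim * BYTES_PER_ELEMENT[dtype_name]
    return budget_bytes // item_bytes


# Default memory cache budget (1e9 bytes), 768-dimensional embeddings.
fp32_items = max_cached_items(1_000_000_000, 768, "float32")
fp16_items = max_cached_items(1_000_000_000, 768, "float16")

# Default persistent cache budget (10e9 bytes), same embeddings in float32.
persistent_fp32_items = max_cached_items(10_000_000_000, 768, "float32")
```

Under these assumptions the memory cache holds roughly 325 thousand float32 embeddings, and storing them as float16 doubles that count.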

zstd_compression : bool, optional

Whether to use zstd compression, by default False. See https://github.com/sergey-dryabzhinsky/python-zstd for more information on the arguments below.

zstd_compression_level : int, optional

Compression level to use, by default 3. Must be between -100 and 22, where -100 is the fastest compression and 22 is the slowest.

zstd_compression_threads : int, optional

Number of threads to use for compression, by default 1. If 0, then the number of threads is automatically determined.

cache_dtype : torch.dtype, optional

Data type to use for the cache, by default None. If None, then the data type of the first tensor that is processed is used.

use_mmap_on_load : bool, optional

Whether to use mmap when loading the cached embeddings from file, by default False. This may improve loading performance when each embedding is very large.
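The general idea behind memory-mapped loading can be sketched with the standard library alone. The example assumes a raw file of fixed-size float32 rows, which is not torchcache's on-disk format, and all names in it are illustrative. It shows that a memory map lets a reader pull out one slice of a file without first reading the whole file into memory.

```python
import mmap
import os
import struct
import tempfile

DIM = 4                 # values per embedding (illustrative)
NUM_EMBEDDINGS = 1000
ITEM_BYTES = DIM * 4    # float32 takes 4 bytes per value

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "embeddings.bin")

    # Write embeddings back to back; row i is filled with the value i.
    with open(path, "wb") as f:
        for i in range(NUM_EMBEDDINGS):
            f.write(struct.pack(f"<{DIM}f", *([float(i)] * DIM)))

    # Map the file and read a single row by byte offset.
    with open(path, "rb") as f, mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
        offset = 500 * ITEM_BYTES
        row = struct.unpack(f"<{DIM}f", mm[offset:offset + ITEM_BYTES])
```

Only the pages backing the requested slice need to be brought into memory by the operating system, which is why this approach tends to pay off for large files.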