torchcache.torchcache
- torchcache(*, enabled: bool = True, memory_cache_device: str = 'cpu', subsample_count: int = 10000, persistent: bool = False, persistent_cache_dir: str | None = None, persistent_module_hash: str | None = None, max_persistent_cache_size: int = 10000000000, max_memory_cache_size: int = 1000000000, zstd_compression: bool = False, zstd_compression_level: int = 3, zstd_compression_threads: int = 1, cache_dtype: dtype | None = None, use_mmap_on_load: bool = False) -> callable
Polymorphic cache decorator for nn.Module subclasses or pure Tensor functions.
As a class decorator: caches Module.forward outputs. As a function decorator: wraps the function in an nn.Module and caches its outputs.
Always invoke the decorator with parentheses, even if no arguments are passed. For example:
    from torch import nn
    from torchcache import torchcache

    @torchcache()
    class CachedModule(nn.Module):
        pass
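The parentheses are required because every argument in the signature is keyword-only, so the decorator is really a factory that returns the function doing the wrapping. The sketch below illustrates that pattern with a hypothetical `cached` factory built on the standard library only; it is not torchcache's implementation and performs no caching. It shows the class/function dispatch and the `TypeError` raised when the parentheses are omitted.

```python
import inspect


def cached(*, enabled: bool = True):
    """Hypothetical decorator factory; illustrative only, not torchcache's code."""

    def decorate(target):
        if not enabled:
            return target
        # Record which branch was taken so the dispatch is observable.
        target.decorated_as = "class" if inspect.isclass(target) else "function"
        return target

    return decorate


@cached()
class Encoder:
    pass


@cached()
def encode(x):
    return x


def bare_usage_error() -> str:
    """Return the error message raised when the parentheses are omitted."""
    try:
        cached(Encoder)  # this is what a bare `@cached` would call
    except TypeError as exc:
        return str(exc)
    return ""
```

Because the factory accepts keyword arguments only, a bare `@cached` passes the class positionally and fails immediately instead of silently misbehaving.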
You can also override the arguments of the underlying class instance by setting class attributes that start with “torchcache_”. For example, to set the cache directory for a module (persistent_cache_dir):
    from pathlib import Path

    @torchcache(persistent=True)
    class CachedModule(nn.Module):
        def __init__(self, cache_dir: str | Path):
            super().__init__()
            self.torchcache_persistent_cache_dir = cache_dir
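One way to picture the “torchcache_” override is as a merge step: the decorator's keyword arguments act as defaults, and any attribute carrying the prefix replaces the matching argument. The following is a minimal sketch under that assumption. `resolve_options` and `CachedModuleStandIn` are hypothetical names introduced for illustration, and the library's actual lookup logic may differ.

```python
PREFIX = "torchcache_"


def resolve_options(instance, decorator_kwargs: dict) -> dict:
    """Merge decorator kwargs with prefixed attributes (hypothetical helper)."""
    options = dict(decorator_kwargs)
    # dir() sees both class-level and instance-level attributes.
    for name in dir(instance):
        if name.startswith(PREFIX):
            options[name[len(PREFIX):]] = getattr(instance, name)
    return options


class CachedModuleStandIn:
    """Plain-Python stand-in for a decorated nn.Module."""

    def __init__(self, cache_dir: str):
        self.torchcache_persistent_cache_dir = cache_dir
        self.hidden_size = 16  # unrelated attribute, left untouched


options = resolve_options(
    CachedModuleStandIn("/tmp/my_cache"),
    {"persistent": True, "persistent_cache_dir": None},
)
```

Here `options["persistent_cache_dir"]` ends up as the directory passed to the constructor, while `persistent` keeps the value given to the decorator.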
- Parameters:
- enabled : bool
Whether the decorator is enabled, by default True.
- subsample_count : int
Number of values to subsample from the tensor when computing its hash, by default 10000. Subsampling speeds up hashing at the cost of a higher probability of hash collisions. The default should be enough for most use cases.
- memory_cache_device : str or torch.device, optional
Device to use for the in-memory cache, by default “cpu”. If None, then the original device of the tensor is used.
- persistent : bool, optional
Whether to use a file-system-based cache, by default False.
- persistent_cache_dir : str or Path, optional
Directory to use for caching, by default None. If None, then a temporary directory is used. Only used if persistent is True.
- persistent_module_hash : str, optional
Hash of the module definition, args, and kwargs, by default None. If None, then the module hash is automatically determined. You can explicitly set this if you want to use the same cache for slightly different modules. You can find the module hash in the following locations:
  - In the logs, if you set the logging level to INFO or DEBUG
  - In the cached module’s self.cache_instance.module_hash attribute
  - As the name of the subdirectory in the persistent cache
- max_persistent_cache_size : int, optional
Maximum size of the persistent cache in bytes, by default 10e9 (10 GB).
- max_memory_cache_size : int, optional
Maximum size of the memory cache in bytes, by default 1e9 (1 GB).
- zstd_compression : bool, optional
Whether to use zstd compression, by default False. See https://github.com/sergey-dryabzhinsky/python-zstd for more information on the arguments below.
- zstd_compression_level : int, optional
Compression level to use, by default 3. Must be between -100 and 22, where -100 is the fastest compression and 22 is the slowest.
- zstd_compression_threads : int, optional
Number of threads to use for compression, by default 1. If 0, then the number of threads is automatically determined.
- cache_dtype : torch.dtype, optional
Data type to use for the cache, by default None. If None, then the data type of the first tensor that is processed is used.
- use_mmap_on_load : bool, optional
Whether to use mmap when loading the cached embeddings from file, by default False. This can help when each embedding is very large, since memory-mapping may improve load performance for large files.
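The trade-off behind subsample_count can be illustrated with a small standard-library sketch. It assumes evenly spaced sampling and a SHA-256 digest, both chosen here purely for illustration; torchcache's actual sampling and hashing scheme may differ. `subsampled_hash` is a hypothetical helper, not part of the library.

```python
import hashlib
import struct


def subsampled_hash(values: list[float], subsample_count: int = 10000) -> str:
    """Hash at most `subsample_count` evenly spaced values (assumed strategy)."""
    n = len(values)
    if n <= subsample_count:
        sample = values
    else:
        step = n / subsample_count
        sample = [values[int(i * step)] for i in range(subsample_count)]
    payload = struct.pack(f"{len(sample)}d", *sample)
    # Prefix the length so inputs of different sizes hash differently.
    return hashlib.sha256(struct.pack("q", n) + payload).hexdigest()


a = [float(i) for i in range(100)]
b = list(a)
b[1] = -1.0  # index 1 is skipped when only 10 of 100 values are sampled

# Hashing every value tells the two inputs apart.
full_differs = subsampled_hash(a, 100) != subsampled_hash(b, 100)
# Hashing a subsample misses the changed value, so the hashes collide.
sub_collides = subsampled_hash(a, 10) == subsampled_hash(b, 10)
```

Two inputs that differ only at positions outside the sample produce the same hash, which is the collision risk the parameter description refers to. A larger subsample_count lowers that risk at the price of slower hashing.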