Experiment result metrics
- class ablator.modules.metrics.main.Metrics(*args: Any, batch_limit: int | None = 30, memory_limit: int | None = 100000000, evaluation_functions: dict[str, collections.abc.Callable] | None = None, moving_average_limit: int | None = 3000, static_aux_metrics: dict[str, Any] | None = None, moving_aux_metrics: Iterable[str] | None = None)
Stores and manages predictions and calculates metrics given some custom evaluation functions. This class makes batch-updates as metrics are calculated while training/evaluating a model. It takes into account the memory limits, applies evaluation functions, and provides cached or online updates on the metrics.
We can access all the metrics from the Metrics object using its to_dict() method. Refer to the Prototyping Models tutorial for more details.
- Parameters:
- *args: ty.Any
This argument exists only to disable passing arguments positionally; all other parameters must be passed by keyword.
- batch_limit: int | None
Maximum number of batches to keep for every category of data (specified by tags), so only the latest batch_limit batches are stored for each category, by default 30.
- memory_limit: int | None
Maximum memory (in bytes) of batches to keep for every category of data (specified by tags). Every time this limit is exceeded, batch_limit is reduced by 1, by default 1e8.
- evaluation_functions: dict[str, Callable] | None
A dictionary whose keys are evaluation function names and whose values are callable evaluation functions, e.g. mean, sum. Note that the argument names of each callable must match the names of the prediction batches that the model returns. So if the model's prediction over a batch looks like {"preds": <batch of predictions>, "labels": <batch of predicted labels>}, then the callable's arguments should be preds and labels, e.g. evaluation_functions={"mean": lambda preds, labels: np.mean(preds) + np.mean(labels)}, by default None.
- moving_average_limit: int | None
The maximum number of values stored for moving average metrics, by default 3000.
- static_aux_metrics: dict[str, ty.Any] | None
A dictionary of static metrics, i.e. metrics that start from an initial value and are updated manually, such as learning rate, best loss, total steps, etc. Keys of this dictionary are static metric names, and values are their initial values, by default None.
- moving_aux_metrics: Iterable[str] | None
An iterable of metric names that are updated with their moving average, such as loss, by default None.
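The name-matching rule for evaluation_functions can be illustrated without the library. The following is a minimal sketch in plain Python, not ablator's implementation: the helper call_with_matching_names is hypothetical, and it only shows how a batch dictionary keyed by preds and labels lines up with the argument names of an evaluation callable.

```python
import inspect
from statistics import mean


def call_with_matching_names(fn, batch):
    """Hypothetical helper: call fn with the batch entries named by its arguments."""
    arg_names = list(inspect.signature(fn).parameters)
    missing = [name for name in arg_names if name not in batch]
    if missing:
        raise KeyError(f"batch has no entries named {missing}")
    return fn(**{name: batch[name] for name in arg_names})


# A model output over one batch, keyed the same way as in the description above.
batch = {"preds": [1.0, 2.0, 3.0], "labels": [0.0, 1.0, 2.0]}

# The argument names (preds, labels) match the batch keys, so the call succeeds.
evaluation_functions = {
    "mean": lambda preds, labels: mean(preds) + mean(labels),
}
results = {
    name: call_with_matching_names(fn, batch)
    for name, fn in evaluation_functions.items()
}
print(results)
```

In this sketch, a callable whose argument is named x instead of preds would raise a KeyError, which is the kind of mismatch the note above warns about.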
Examples
Initialize an object of Metrics:
>>> import numpy as np
>>> from ablator.modules.metrics.main import Metrics
>>> train_metrics = Metrics(
...     batch_limit=30,
...     memory_limit=None,
...     evaluation_functions={"mean": lambda x: np.mean(x)},
...     moving_average_limit=100,
...     static_aux_metrics={"lr": 1.0},
...     moving_aux_metrics={"loss"},
... )
>>> train_metrics.to_dict()  # metrics are set to np.nan if they have not been updated yet
{'loss': nan, 'lr': 1.0, 'mean': nan}
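The batch_limit argument keeps only the latest batches for each tag. A minimal sketch of that retention behaviour follows, assuming a hypothetical BoundedBatches class built on collections.deque. It is not how ablator stores predictions, and it ignores memory_limit entirely.

```python
from collections import defaultdict, deque


class BoundedBatches:
    """Hypothetical store that keeps only the latest `batch_limit` batches per tag."""

    def __init__(self, batch_limit=30):
        # deque(maxlen=n) drops the oldest item once n items are stored;
        # maxlen=None means the deque is unbounded.
        self._batches = defaultdict(lambda: deque(maxlen=batch_limit))

    def append(self, tag, batch):
        self._batches[tag].append(batch)

    def get(self, tag):
        return list(self._batches[tag])


store = BoundedBatches(batch_limit=3)
for step in range(5):
    store.append("preds", [step] * 2)

# Only the three most recent batches remain for the "preds" tag.
print(store.get("preds"))
```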
- to_dict() -> dict[str, Any]
Get all metrics, i.e. moving auxiliary metrics, moving evaluation metrics, and static auxiliary metrics. Note that moving metrics are reported as an averaged value over all previous batches. A metric is set to np.nan if it has never been updated.
- Returns:
- dict[str, ty.Any]
Contains key-value pairs for the metric’s name and its value.
Examples
>>> import numpy as np
>>> from ablator.modules.metrics.main import Metrics
>>> train_metrics = Metrics(
...     batch_limit=30,
...     memory_limit=None,
...     evaluation_functions={"mean": lambda preds: np.mean(preds)},  # mean of all predictions appended
...     moving_average_limit=100,
...     static_aux_metrics={"lr": 0.75},
...     moving_aux_metrics={"loss"},
... )
>>> train_metrics.append_batch(preds=np.array([[100]*10]))
>>> train_metrics.evaluate(reset=False, update=True)
>>> train_metrics.to_dict()
{'loss': nan, 'lr': 0.75, 'mean': 100.0}
>>> train_metrics.append_batch(preds=np.array([0] * 10))
>>> train_metrics.evaluate(reset=True, update=True)
>>> train_metrics.to_dict()
{'loss': nan, 'lr': 0.75, 'mean': 50.0}
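The description of to_dict() mentions two behaviours: a metric reads as nan until it is first updated, and moving metrics are reported as an average. A minimal sketch of both follows, assuming a hypothetical MovingAverage class and a plain arithmetic mean over a bounded window. The averaging ablator actually performs may differ.

```python
import math
from collections import deque


class MovingAverage:
    """Hypothetical moving metric: nan until updated, then the mean of stored values."""

    def __init__(self, limit=3000):
        # Keep at most `limit` values; older values are discarded first.
        self._values = deque(maxlen=limit)

    def append(self, value):
        self._values.append(float(value))

    @property
    def value(self):
        if not self._values:
            return math.nan
        return sum(self._values) / len(self._values)


loss = MovingAverage(limit=100)
before = loss.value  # nothing appended yet
loss.append(100.0)
loss.append(0.0)
after = loss.value  # mean of the two stored values
print(before, after)
```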