leaspy.models.obs_models

Attributes

Classes

ObservationModel

Base class for valid observation models that may be used in probabilistic models (stateless).

BernoulliObservationModel

Observation model for binary outcomes using a Bernoulli distribution.

ObservationModelNames

Enumeration defining the possible names for observation models.

FullGaussianObservationModel

Specialized GaussianObservationModel when all data share the same observation model, with default naming.

GaussianObservationModel

Specialized ObservationModel for noisy observations with Gaussian residuals assumption.

AbstractWeibullRightCensoredObservationModel

Base class for valid observation models that may be used in probabilistic models (stateless).

WeibullRightCensoredObservationModel

Base class for valid observation models that may be used in probabilistic models (stateless).

WeibullRightCensoredWithSourcesObservationModel

Base class for valid observation models that may be used in probabilistic models (stateless).

Functions

observation_model_factory(model, **kwargs)

Factory for observation models.

Package Contents

class ObservationModel[source]

Base class for valid observation models that may be used in probabilistic models (stateless).

In particular, it provides data & linked variables regarding observations and their attachment to the model (the negative log-likelihood - nll - to be minimized).

Parameters:
name : str

The name of the observed variable (to name the data variable & attachment term related to this observation).

getter : function Dataset -> WeightedTensor

The way to retrieve the observed values from the Dataset (as a WeightedTensor): e.g. all values, subset of values - only x, y, z features, one-hot encoded features, …

dist : SymbolicDistribution

The symbolic distribution, parametrized by model variables, for observed values (used to compute the attachment).

extra_vars : None (default) or Mapping[VarName, VariableInterface]

Some new variables that are needed to fully define the symbolic distribution or the sufficient statistics (e.g. “noise_std” and “y_L2_per_ft” for a Gaussian model).

name: VariableName
getter: Callable[[Dataset], WeightedTensor]
dist: SymbolicDistribution
extra_vars: Mapping[VariableName, VariableInterface] | None = None
get_nll_attach_var_name(named_attach_vars=True)[source]

Return the name of the negative log-likelihood attachment variable.

Parameters:

named_attach_vars (bool)

Return type:

str

get_variables_specs(named_attach_vars=True)[source]

Automatic specifications of variables for this observation model.

Parameters:
named_attach_vars : bool, optional
Returns:
dict[VariableName, VariableInterface]

A dictionary mapping variable names to their corresponding specifications, with:

  • the primary DataVariable

  • any extra_vars defined by the model

  • the nll attachment variables:

      • nll_attach_var_ind: a LinkedVariable representing the individual-level negative log-likelihood contributions

      • nll_attach_var: a LinkedVariable that sums the individual contributions

Parameters:

named_attach_vars (bool)

Return type:

dict[VariableName, VariableInterface]

Notes

The distribution object self.dist should provide a get_func_nll(name) method that returns a callable for computing the nll.
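
The contract described in the note can be pictured with a small, self-contained sketch. It does not use leaspy: ToyNormal, its constructor arguments and the observations dictionary are hypothetical stand-ins, and only the method name get_func_nll(name) comes from the note above. The sketch shows a distribution returning a callable for the nll, individual-level contributions, and their sum.

```python
import math


class ToyNormal:
    """Hypothetical stand-in for a symbolic distribution (not leaspy's class)."""

    def __init__(self, loc, scale):
        self.loc = loc
        self.scale = scale

    def get_func_nll(self, name):
        # Return a callable computing the element-wise nll of observed values.
        def nll(values):
            return [
                0.5 * ((v - self.loc) / self.scale) ** 2
                + math.log(self.scale)
                + 0.5 * math.log(2 * math.pi)
                for v in values
            ]

        nll.__name__ = f"nll_{name}"
        return nll


dist = ToyNormal(loc=0.0, scale=1.0)
nll_func = dist.get_func_nll("y")

# One list of observations per individual (hypothetical data).
observations = {"ind_a": [0.0, 1.0], "ind_b": [-1.0]}

# Individual-level contributions, then their sum (the total attachment).
nll_ind = {ind: sum(nll_func(vals)) for ind, vals in observations.items()}
nll_total = sum(nll_ind.values())
```

In this toy version the two quantities play the roles of nll_attach_var_ind and nll_attach_var; how leaspy wires them as LinkedVariable objects is not shown here.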

serialized()[source]

Returns a JSON-exportable representation of the instance, excluding its name.

Returns:
Any

A representation of the instance, currently based on repr(self.dist), that is intended to be JSON-serializable.

Return type:

Any

to_dict()[source]

To be implemented…

Return type:

dict

to_string()[source]

Returns a string representation of the parameter for saving.

Returns:
str

A string representation of the parameter, as stored in self.string_for_json.

Return type:

str

class BernoulliObservationModel(**extra_vars)[source]

Bases: leaspy.models.obs_models._base.ObservationModel

Observation model for binary outcomes using a Bernoulli distribution.

This model expects binary-valued observations and uses a Bernoulli distribution to define the likelihood. It assumes the response variable is named “y”.

Parameters:
**extra_vars : VariableInterface

Optional extra variables required by the model. These are passed to the parent ObservationModel class and can be used for conditioning the likelihood.

Attributes:
string_for_json : str

A static string identifier used for serialization.

Parameters:

extra_vars (VariableInterface)

string_for_json = 'bernoulli'
static y_getter(dataset)[source]

Extracts and validates the observation values and associated mask from a dataset.

Parameters:
dataset : Dataset

A dataset object containing values and mask attributes.

Returns:
WeightedTensor

A tensor containing the observed binary values along with a boolean mask indicating which entries are valid.

Raises:
ValueError

If either dataset.values or dataset.mask is None, indicating that the dataset is improperly initialized.

Parameters:

dataset (Dataset)

Return type:

WeightedTensor
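
To make the role of the mask concrete, here is a minimal sketch of a masked Bernoulli negative log-likelihood written in plain Python. The function name bernoulli_nll and the list-based inputs are assumptions made for illustration; leaspy itself works on WeightedTensor objects and its exact computation is not reproduced here.

```python
import math


def bernoulli_nll(values, mask, probs):
    """Masked Bernoulli negative log-likelihood, summed over valid entries."""
    total = 0.0
    for y, valid, p in zip(values, mask, probs):
        if not valid:
            continue  # masked entries do not contribute
        total -= y * math.log(p) + (1 - y) * math.log(1 - p)
    return total


values = [1, 0, 1]
mask = [True, True, False]  # the last observation is treated as missing
probs = [0.5, 0.5, 0.9]     # hypothetical model probabilities of y == 1

nll = bernoulli_nll(values, mask, probs)
```

Only the first two entries are counted, so the result equals 2 * log(2) regardless of the probability assigned to the masked entry.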

OBSERVATION_MODELS: Dict[ObservationModelNames, Type[leaspy.models.obs_models._base.ObservationModel]]
ObservationModelFactoryInput
class ObservationModelNames(*args, **kwds)[source]

Bases: enum.Enum

Enumeration defining the possible names for observation models.

GAUSSIAN_DIAGONAL = 'gaussian-diagonal'
GAUSSIAN_SCALAR = 'gaussian-scalar'
BERNOULLI = 'bernoulli'
WEIBULL_RIGHT_CENSORED = 'weibull-right-censored'
WEIBULL_RIGHT_CENSORED_WITH_SOURCES = 'weibull-right-censored-with-sources'
classmethod from_string(model_name)[source]
Parameters:

model_name (str)
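
The behaviour of from_string is not described above. A plausible reading, sketched below under that assumption, is a lookup by enum value that fails for unknown names. ObservationModelNamesSketch is a stand-in that reuses the values listed above; the error type and message are illustrative, not taken from leaspy.

```python
from enum import Enum


class ObservationModelNamesSketch(Enum):
    """Stand-in enum reusing the values listed above."""

    GAUSSIAN_DIAGONAL = "gaussian-diagonal"
    GAUSSIAN_SCALAR = "gaussian-scalar"
    BERNOULLI = "bernoulli"
    WEIBULL_RIGHT_CENSORED = "weibull-right-censored"
    WEIBULL_RIGHT_CENSORED_WITH_SOURCES = "weibull-right-censored-with-sources"

    @classmethod
    def from_string(cls, model_name: str):
        # Assumed behaviour: match on the enum value, fail loudly otherwise.
        try:
            return cls(model_name)
        except ValueError:
            valid = ", ".join(member.value for member in cls)
            raise ValueError(
                f"Unknown observation model '{model_name}'. Valid names: {valid}"
            ) from None


name = ObservationModelNamesSketch.from_string("bernoulli")
```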

observation_model_factory(model, **kwargs)[source]

Factory for observation models.

Parameters:
model : str or ObservationModel or dict[str, …]
  • If an instance of a subclass of ObservationModel, returns the instance.

  • If a string, returns a new instance of the appropriate class (with optional parameters kwargs).

  • If a dictionary, it must contain the ‘name’ key and other initialization parameters.

**kwargs

Optional parameters for initializing the requested observation model when model is a string.

Returns:
ObservationModel

The desired observation model.

Raises:
LeaspyModelInputError

If model is not supported.

Parameters:

model (ObservationModelFactoryInput)

Return type:

leaspy.models.obs_models._base.ObservationModel
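
The three accepted input kinds can be illustrated with a self-contained dispatch sketch. Everything in it is hypothetical: the Toy classes, TOY_REGISTRY (playing the role of OBSERVATION_MODELS) and toy_factory are stand-ins, and ValueError is used where the real factory is documented to raise LeaspyModelInputError.

```python
class ToyObservationModel:
    """Hypothetical base class standing in for ObservationModel."""

    def __init__(self, **kwargs):
        self.kwargs = kwargs


class ToyBernoulli(ToyObservationModel):
    pass


class ToyGaussian(ToyObservationModel):
    pass


# Hypothetical registry, playing the role of OBSERVATION_MODELS.
TOY_REGISTRY = {"bernoulli": ToyBernoulli, "gaussian-scalar": ToyGaussian}


def toy_factory(model, **kwargs):
    if isinstance(model, ToyObservationModel):
        return model  # instances are returned unchanged
    if isinstance(model, dict):
        params = dict(model)  # copy so the caller's dict is not mutated
        if "name" not in params:
            raise ValueError("A dictionary input must contain the 'name' key.")
        name = params.pop("name")
        kwargs = {**params, **kwargs}
        model = name
    if isinstance(model, str):
        if model not in TOY_REGISTRY:
            raise ValueError(f"Unsupported observation model: {model!r}")
        return TOY_REGISTRY[model](**kwargs)
    raise ValueError(f"Unsupported input type: {type(model).__name__}")


from_string = toy_factory("bernoulli")
from_dict = toy_factory({"name": "gaussian-scalar", "dimension": 1})
same_instance = toy_factory(from_string)
```

Normalizing the dictionary case into the string case keeps a single place where the registry is consulted, which is one reasonable way to structure such a factory.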

class FullGaussianObservationModel(noise_std, **extra_vars)[source]

Bases: GaussianObservationModel

Specialized GaussianObservationModel when all data share the same observation model, with default naming.

The default naming is:
  • ‘y’ for observations

  • ‘model’ for model predictions

  • ‘noise_std’ for scale of residuals

We also provide a convenient factory default for the most common case, in which noise_std is directly a ModelParameter (it could also be a PopulationLatentVariable with positive support). Whether the scale of residuals is scalar or diagonal depends on the dimension argument of with_noise_std_as_model_parameter.

tol_noise_variance = 1e-05
static y_getter(dataset)[source]

Extracts the observation values and mask from a dataset.

Parameters:
dataset : Dataset

A dataset object containing ‘values’ and ‘mask’ attributes.

Returns:
WeightedTensor

A tensor containing the observed values and a boolean mask used as weights for likelihood and loss computations.

Raises:
AssertionError

If either dataset.values or dataset.mask is None.

Parameters:

dataset (Dataset)

Return type:

WeightedTensor

classmethod noise_std_suff_stats()[source]

Dictionary of sufficient statistics needed for noise_std (when directly a model parameter).

Returns:
dict[VariableName, LinkedVariable]

A dictionary containing the sufficient statistics:

  • “y_x_model”: Product of the observed values (“y”) and the model predictions (“model”).

  • “model_x_model”: Squared values of the model predictions (“model”).

Return type:

dict[VariableName, LinkedVariable]

classmethod scalar_noise_std_update(*, state, y_x_model, model_x_model)[source]

Update rule for scalar noise_std (when directly a model parameter), from state & sufficient statistics.

Computes a common noise_std for all the features.

Parameters:
state : State

A state dictionary containing precomputed values.

y_x_model : WeightedTensor[float]

The weighted inner product between the observations and the model predictions.

model_x_model : WeightedTensor[float]

The weighted inner product of the model predictions with themselves.

Returns:
torch.Tensor

The updated scalar value of the noise_std.

Return type:

Tensor
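
Why these two sufficient statistics are enough follows from expanding the squared residuals: sum((y - model)^2) = sum(y^2) - 2 * sum(y * model) + sum(model^2). The sketch below checks this identity on toy numbers. The names y_l2 and n_obs are hypothetical placeholders for quantities assumed to be available from the state; the exact leaspy update (including any use of tol_noise_variance) is not reproduced.

```python
import math

y = [1.0, 2.0, 3.0]
model = [1.5, 1.5, 3.5]

# Sufficient statistics named as in noise_std_suff_stats (summed here).
y_x_model = sum(a * b for a, b in zip(y, model))
model_x_model = sum(b * b for b in model)

# Assumed to be precomputed elsewhere (hypothetical names).
y_l2 = sum(a * a for a in y)
n_obs = len(y)

# Residual variance rebuilt from the statistics only.
noise_var = (y_l2 - 2 * y_x_model + model_x_model) / n_obs
noise_std = math.sqrt(noise_var)

# Same value computed directly from the residuals.
direct = math.sqrt(sum((a - b) ** 2 for a, b in zip(y, model)) / n_obs)
```

Both routes give 0.5 on this example, up to floating-point rounding.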

classmethod diagonal_noise_std_update(*, state, y_x_model, model_x_model)[source]

Update rule for feature-wise noise_std (when directly a model parameter), from state & sufficient statistics.

Computes one noise_std per feature.

Parameters:
state : State

A state dictionary containing precomputed values.

y_x_model : WeightedTensor[float]

The weighted inner product between the observations and the model predictions.

model_x_model : WeightedTensor[float]

The weighted inner product of the model predictions with themselves.

Returns:
torch.Tensor

The updated value of the noise_std for each feature.

Return type:

Tensor

classmethod noise_std_specs(dimension)[source]

Default specifications of noise_std variable when directly modelled as a parameter (no latent population variable).

Parameters:
dimension : int

The dimension of the noise_std.

  • If dimension == 1, a scalar noise_std is assumed.

  • If dimension > 1, feature-wise independent noise_std values are assumed (diagonal noise).

Returns:
ModelParameter

The specification of the noise_std, including:

  • shape: tuple defining the parameter shape.

  • suff_stats: collected sufficient statistics needed for updates.

  • update_rule: method to update the parameter based on statistics.

Parameters:

dimension (int)

Return type:

ModelParameter

classmethod with_noise_std_as_model_parameter(dimension)[source]

Default instance of FullGaussianObservationModel with noise_std (scalar or diagonal depending on dimension) being a ModelParameter.

Parameters:
dimension : int

The dimension of the noise_std.

  • If dimension == 1, a scalar noise_std is assumed.

  • If dimension > 1, feature-wise independent noise_std values are assumed (diagonal noise).

Returns:
FullGaussianObservationModel

A configured instance with noise_std as a ModelParameter, along with the necessary sufficient statistics for inference.

Raises:
ValueError

If dimension is not a positive integer.

Parameters:

dimension (int)
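
The documented rules for dimension can be summarized by the following sketch. choose_noise_std_layout is a hypothetical helper, and the returned dictionary (including the shape convention) is an assumption made for illustration; it only mirrors the scalar/diagonal split and the ValueError described above.

```python
def choose_noise_std_layout(dimension):
    """Hypothetical helper mirroring the documented dimension rules."""
    is_int = isinstance(dimension, int) and not isinstance(dimension, bool)
    if not is_int or dimension < 1:
        raise ValueError(f"dimension must be a positive integer, got {dimension!r}")
    if dimension == 1:
        # One noise_std shared by all features.
        return {"shape": (1,), "update_rule": "scalar"}
    # One noise_std per feature (diagonal noise).
    return {"shape": (dimension,), "update_rule": "diagonal"}


scalar_layout = choose_noise_std_layout(1)
diagonal_layout = choose_noise_std_layout(3)
```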

classmethod compute_rmse(*, y, model)[source]

Computes the Root Mean Square Error (RMSE) between predictions and observations.

Parameters:
y : WeightedTensor[float]

The observed target values with associated weights.

model : WeightedTensor[float]

The model predictions with the same shape and weighting scheme as y.

Returns:
torch.Tensor

A scalar tensor representing the RMSE between model and y.

Return type:

Tensor
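
As an illustration of a weighted RMSE, the sketch below uses plain Python lists instead of WeightedTensor objects. The function weighted_rmse and the convention that a zero weight masks a missing observation are assumptions; the precise weighting used by leaspy is not shown here.

```python
import math


def weighted_rmse(y, model, weights):
    """RMSE over entries, each squared residual scaled by its weight."""
    weighted_sq = sum(w * (a - b) ** 2 for a, b, w in zip(y, model, weights))
    return math.sqrt(weighted_sq / sum(weights))


y = [1.0, 2.0, 0.0]
model = [1.0, 4.0, 9.0]
weights = [1.0, 1.0, 0.0]  # a zero weight masks a missing observation

rmse = weighted_rmse(y, model, weights)
```

The masked entry has a large residual but no influence: the result is sqrt((0 + 4) / 2).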

classmethod compute_rmse_per_ft(*, y, model)[source]

Computes the Root Mean Square Error (RMSE) between predictions and observations separately for each feature.

Parameters:
y : WeightedTensor[float]

The observed target values with associated weights.

model : WeightedTensor[float]

The model predictions with the same shape and weighting scheme as y.

Returns:
torch.Tensor

A tensor containing the RMSE between model and y for each feature.

Return type:

Tensor
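
The per-feature variant reduces over observations only, keeping one value per column. A minimal sketch with nested lists follows; the layout (rows as visits, columns as features) and the 0/1 mask are assumptions made for illustration.

```python
import math

# Rows are visits, columns are features; mask marks observed entries.
y = [[1.0, 0.0], [3.0, 2.0]]
model = [[1.0, 1.0], [1.0, 2.0]]
mask = [[1, 1], [1, 0]]

n_rows, n_features = len(y), len(y[0])
rmse_per_ft = []
for j in range(n_features):
    # Squared residuals and observation count for feature j only.
    sq = sum(mask[i][j] * (y[i][j] - model[i][j]) ** 2 for i in range(n_rows))
    n = sum(mask[i][j] for i in range(n_rows))
    rmse_per_ft.append(math.sqrt(sq / n))
```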

to_string()[source]

Method for parameter saving.

Return type:

str

class GaussianObservationModel(name, getter, loc, scale, **extra_vars)[source]

Bases: leaspy.models.obs_models._base.ObservationModel

Specialized ObservationModel for noisy observations with Gaussian residuals assumption.

Parameters:
name : str

The name of the observed variable (to name the data variable & attachment term related to this observation).

getter : function Dataset -> WeightedTensor

The way to retrieve the observed values from the Dataset (as a WeightedTensor): e.g. all values, subset of values - only x, y, z features, one-hot encoded features, …

loc : str

The name of the variable representing the mean (location) of the Gaussian.

scale : str

The name of the variable representing the standard deviation (scale) of the Gaussian (noise_std).

**extra_vars : VariableInterface

Additional variables required by the model.

Notes

  • The model uses leaspy.variables.distributions.Normal internally for computing the log-likelihood and related operations.

class AbstractWeibullRightCensoredObservationModel[source]

Bases: leaspy.models.obs_models._base.ObservationModel

Base class for valid observation models that may be used in probabilistic models (stateless).

In particular, it provides data & linked variables regarding observations and their attachment to the model (the negative log-likelihood - nll - to be minimized).

Parameters:
name : str

The name of the observed variable (to name the data variable & attachment term related to this observation).

getter : function Dataset -> WeightedTensor

The way to retrieve the observed values from the Dataset (as a WeightedTensor): e.g. all values, subset of values - only x, y, z features, one-hot encoded features, …

dist : SymbolicDistribution

The symbolic distribution, parametrized by model variables, for observed values (used to compute the attachment).

extra_vars : None (default) or Mapping[VarName, VariableInterface]

Some new variables that are needed to fully define the symbolic distribution or the sufficient statistics (e.g. “noise_std” and “y_L2_per_ft” for a Gaussian model).

static getter(dataset)[source]
Parameters:

dataset (Dataset)

Return type:

WeightedTensor

get_variables_specs(named_attach_vars=True)[source]

Automatic specifications of variables for this observation model.

Parameters:

named_attach_vars (bool)

Return type:

dict[VariableName, VariableInterface]

class WeibullRightCensoredObservationModel(nu, rho, xi, tau, **extra_vars)[source]

Bases: AbstractWeibullRightCensoredObservationModel

Base class for valid observation models that may be used in probabilistic models (stateless).

In particular, it provides data & linked variables regarding observations and their attachment to the model (the negative log-likelihood - nll - to be minimized).

Parameters:
name : str

The name of the observed variable (to name the data variable & attachment term related to this observation).

getter : function Dataset -> WeightedTensor

The way to retrieve the observed values from the Dataset (as a WeightedTensor): e.g. all values, subset of values - only x, y, z features, one-hot encoded features, …

dist : SymbolicDistribution

The symbolic distribution, parametrized by model variables, for observed values (used to compute the attachment).

extra_vars : None (default) or Mapping[VarName, VariableInterface]

Some new variables that are needed to fully define the symbolic distribution or the sufficient statistics (e.g. “noise_std” and “y_L2_per_ft” for a Gaussian model).

string_for_json = 'weibull-right-censored'
classmethod default_init(**kwargs)[source]
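
For readers unfamiliar with right censoring, the sketch below gives the textbook negative log-likelihood of a single (time, event) pair under a Weibull distribution with generic shape and scale parameters. It is not leaspy's implementation: the function name is hypothetical, and how nu, rho, xi and tau map onto shape and scale is not described on this page, so no mapping is assumed.

```python
import math


def weibull_right_censored_nll(time, event, shape, scale):
    """nll of one (time, event) pair under a Weibull(shape, scale) model."""
    z = (time / scale) ** shape
    log_survival = -z  # log S(t) = -(t / scale) ** shape
    if not event:
        # Censored: we only know that the event happened after `time`.
        return -log_survival
    # Observed event: log f(t) = log h(t) + log S(t).
    log_hazard = math.log(shape / scale) + (shape - 1) * math.log(time / scale)
    return -(log_hazard + log_survival)


# shape == 1 reduces to an exponential distribution with rate 1 / scale.
nll_event = weibull_right_censored_nll(time=2.0, event=True, shape=1.0, scale=2.0)
nll_censored = weibull_right_censored_nll(time=2.0, event=False, shape=1.0, scale=2.0)
```

A censored observation contributes only the survival term, which is why its nll is smaller than that of an event observed at the same time in this example.
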
class WeibullRightCensoredWithSourcesObservationModel(nu, rho, xi, tau, survival_shifts, **extra_vars)[source]

Bases: AbstractWeibullRightCensoredObservationModel

Base class for valid observation models that may be used in probabilistic models (stateless).

In particular, it provides data & linked variables regarding observations and their attachment to the model (the negative log-likelihood - nll - to be minimized).

Parameters:
name : str

The name of the observed variable (to name the data variable & attachment term related to this observation).

getter : function Dataset -> WeightedTensor

The way to retrieve the observed values from the Dataset (as a WeightedTensor): e.g. all values, subset of values - only x, y, z features, one-hot encoded features, …

dist : SymbolicDistribution

The symbolic distribution, parametrized by model variables, for observed values (used to compute the attachment).

extra_vars : None (default) or Mapping[VarName, VariableInterface]

Some new variables that are needed to fully define the symbolic distribution or the sufficient statistics (e.g. “noise_std” and “y_L2_per_ft” for a Gaussian model).

string_for_json = 'weibull-right-censored-with-sources'
classmethod default_init(**kwargs)[source]