leaspy.models.obs_models
Attributes
OBSERVATION_MODELS
ObservationModelFactoryInput
Classes
ObservationModel | Base class for valid observation models that may be used in probabilistic models (stateless).
BernoulliObservationModel | Observation model for binary outcomes using a Bernoulli distribution.
ObservationModelNames | Enumeration defining the possible names for observation models.
FullGaussianObservationModel | Specialized GaussianObservationModel when all data share the same observation model, with default naming.
GaussianObservationModel | Specialized ObservationModel for noisy observations with Gaussian residuals assumption.
AbstractWeibullRightCensoredObservationModel | Base class for valid observation models that may be used in probabilistic models (stateless).
WeibullRightCensoredObservationModel | Base class for valid observation models that may be used in probabilistic models (stateless).
WeibullRightCensoredWithSourcesObservationModel | Base class for valid observation models that may be used in probabilistic models (stateless).
Functions
observation_model_factory(model, **kwargs) | Factory for observation models.
Package Contents
- class ObservationModel
Base class for valid observation models that may be used in probabilistic models (stateless).
In particular, it provides data & linked variables regarding observations and their attachment to the model (the negative log-likelihood - nll - to be minimized).
- Parameters:
  - name (str): The name of the observed variable (to name the data variable & attachment term related to this observation).
  - getter (function Dataset -> WeightedTensor): The way to retrieve the observed values from the Dataset (as a WeightedTensor): e.g. all values, subset of values - only x, y, z features, one-hot encoded features, …
  - dist (SymbolicDistribution): The symbolic distribution, parametrized by model variables, for observed values (so to compute attachment).
  - extra_vars (None (default) or Mapping[VarName, VariableInterface]): Some new variables that are needed to fully define the symbolic distribution or the sufficient statistics (e.g. “noise_std” and “y_L2_per_ft” for a Gaussian model).
- name: VariableName
- getter: Callable[[Dataset], WeightedTensor]
- dist: SymbolicDistribution
- extra_vars: Mapping[VariableName, VariableInterface] | None = None
- get_nll_attach_var_name(named_attach_vars=True)
Return the name of the negative log-likelihood attachment variable.
- get_variables_specs(named_attach_vars=True)
Automatic specifications of variables for this observation model.
- Parameters:
  - named_attach_vars (bool, optional)
- Returns:
  dict[VariableName, VariableInterface]: A dictionary mapping each variable name to its corresponding specification, with:
  - the primary DataVariable
  - any extra_vars defined by the model
  - the nll attachment variables:
    - nll_attach_var_ind: a LinkedVariable representing the individual-level negative log-likelihood contributions
    - nll_attach_var: a LinkedVariable that sums the individual contributions
Notes
The distribution object self.dist should provide a get_func_nll(name) method that returns a callable for computing the nll.
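Example (illustrative sketch, not leaspy code): the two attachment variables described above can be pictured as a masked per-individual sum of pointwise nll terms, followed by a sum over individuals. The function names, the nested-list data layout and the fixed Gaussian scale are assumptions made for this sketch only.

```python
import math


def gaussian_nll(y, loc, scale):
    """Negative log-density of a Normal(loc, scale) evaluated at y."""
    return 0.5 * math.log(2 * math.pi * scale**2) + (y - loc) ** 2 / (2 * scale**2)


def nll_attach_ind(values, mask, predictions, scale):
    """Individual-level attachment: masked sum of pointwise nll terms (one value per individual)."""
    return [
        sum(gaussian_nll(y, m, scale) for y, m, keep in zip(ys, ms, ks) if keep)
        for ys, ms, ks in zip(values, predictions, mask)
    ]


def nll_attach(values, mask, predictions, scale):
    """Total attachment: sum of the individual contributions."""
    return sum(nll_attach_ind(values, mask, predictions, scale))


# Two individuals, two observations each; the last observation is masked out.
values = [[1.0, 2.0], [0.5, 0.0]]
mask = [[True, True], [True, False]]
predictions = [[1.0, 2.0], [0.5, 9.9]]

per_individual = nll_attach_ind(values, mask, predictions, scale=1.0)
total = nll_attach(values, mask, predictions, scale=1.0)
```

The masked entry does not contribute, even though its prediction is far from the stored value.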
- class BernoulliObservationModel(**extra_vars)
Bases: leaspy.models.obs_models._base.ObservationModel
Observation model for binary outcomes using a Bernoulli distribution.
This model expects binary-valued observations and uses a Bernoulli distribution to define the likelihood. It assumes the response variable is named “y”.
- Parameters:
  - **extra_vars (VariableInterface): Optional extra variables required by the model. These are passed to the parent ObservationModel class and can be used for conditioning the likelihood.
- Attributes:
  - string_for_json (str): A static string identifier used for serialization.
- string_for_json = 'bernoulli'
- static y_getter(dataset)
Extracts and validates the observation values and associated mask from a dataset.
- Parameters:
  - dataset (Dataset): A dataset object containing values and mask attributes.
- Returns:
  WeightedTensor: A tensor containing the observed binary values along with a boolean mask indicating which entries are valid.
- Raises:
  - ValueError: If either dataset.values or dataset.mask is None, indicating that the dataset is improperly initialized.
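Example (illustrative sketch, not leaspy code): a masked Bernoulli negative log-likelihood, with the same kind of validation as described for y_getter. Here y_getter takes plain lists instead of a Dataset, and masked_bernoulli_nll and probs are hypothetical names introduced for the sketch.

```python
import math


def y_getter(values, mask):
    """Return (values, mask) after checking that both are set (stand-in for the Dataset-based getter)."""
    if values is None or mask is None:
        raise ValueError("Both values and mask must be set")
    return values, mask


def bernoulli_nll(y, p):
    """Negative log-probability of a binary outcome y under Bernoulli(p), with 0 < p < 1."""
    return -(y * math.log(p) + (1 - y) * math.log(1 - p))


def masked_bernoulli_nll(values, mask, probs):
    """Sum the nll over valid (unmasked) entries only."""
    values, mask = y_getter(values, mask)
    return sum(bernoulli_nll(y, p) for y, keep, p in zip(values, mask, probs) if keep)


# Third observation is masked out, so only the first two contribute.
nll = masked_bernoulli_nll([1, 0, 1], [True, True, False], [0.5, 0.5, 0.9])
```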
- OBSERVATION_MODELS: Dict[ObservationModelNames, Type[leaspy.models.obs_models._base.ObservationModel]]
- ObservationModelFactoryInput
- class ObservationModelNames(*args, **kwds)
Bases: enum.Enum
Enumeration defining the possible names for observation models.
- GAUSSIAN_DIAGONAL = 'gaussian-diagonal'
- GAUSSIAN_SCALAR = 'gaussian-scalar'
- BERNOULLI = 'bernoulli'
- WEIBULL_RIGHT_CENSORED = 'weibull-right-censored'
- WEIBULL_RIGHT_CENSORED_WITH_SOURCES = 'weibull-right-censored-with-sources'
- observation_model_factory(model, **kwargs)
Factory for observation models.
- Parameters:
  - model (ObservationModelFactoryInput: str or ObservationModel or dict[str, …]):
    - If an instance of a subclass of ObservationModel, returns the instance.
    - If a string, returns a new instance of the appropriate class (with optional parameters kwargs).
    - If a dictionary, it must contain the ‘name’ key and other initialization parameters.
  - **kwargs: Optional parameters for initializing the requested observation model when model is a string.
- Returns:
  ObservationModel: The desired observation model.
- Raises:
  LeaspyModelInputError: If model is not supported.
- Return type:
  leaspy.models.obs_models._base.ObservationModel
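Example (illustrative sketch, not leaspy code): the three accepted input forms can be handled with a small dispatch over a name-to-class registry. Every name below (ModelNames, ToyObservationModel, REGISTRY, toy_factory) is a hypothetical stand-in, ValueError stands in for LeaspyModelInputError, and the handling of dictionary inputs is an assumption.

```python
from enum import Enum


class ModelNames(Enum):  # hypothetical stand-in for ObservationModelNames
    GAUSSIAN_SCALAR = "gaussian-scalar"
    BERNOULLI = "bernoulli"


class ToyObservationModel:  # hypothetical stand-in for ObservationModel
    def __init__(self, **kwargs):
        self.kwargs = kwargs


class ToyGaussian(ToyObservationModel):
    pass


class ToyBernoulli(ToyObservationModel):
    pass


# Registry in the spirit of OBSERVATION_MODELS: enum member -> class.
REGISTRY = {
    ModelNames.GAUSSIAN_SCALAR: ToyGaussian,
    ModelNames.BERNOULLI: ToyBernoulli,
}


def toy_factory(model, **kwargs):
    """Return an observation model from an instance, a name, or a dict with a 'name' key."""
    if isinstance(model, ToyObservationModel):
        return model  # already an instance: returned unchanged
    if isinstance(model, dict):
        params = dict(model)  # copy so the caller's dict is not mutated
        if "name" not in params:
            raise ValueError("The dictionary must contain a 'name' key")
        return toy_factory(params.pop("name"), **params)
    if isinstance(model, str):
        try:
            key = ModelNames(model)  # look the enum member up by value
        except ValueError:
            raise ValueError(f"Unsupported observation model: {model!r}") from None
        return REGISTRY[key](**kwargs)
    raise ValueError(f"Unsupported observation model input: {model!r}")


from_name = toy_factory("bernoulli")
from_dict = toy_factory({"name": "gaussian-scalar", "dimension": 1})
same_instance = toy_factory(from_name)
```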
- class FullGaussianObservationModel(noise_std, **extra_vars)
Bases: GaussianObservationModel
Specialized GaussianObservationModel when all data share the same observation model, with default naming.
- The default naming is:
  - ‘y’ for observations
  - ‘model’ for model predictions
  - ‘noise_std’ for scale of residuals
We also provide a convenient factory default for the most common case, in which noise_std is directly a ModelParameter (it could also be a PopulationLatentVariable with positive support). Whether the scale of residuals is scalar or diagonal depends on the dimension argument of that factory (with_noise_std_as_model_parameter).
- Parameters:
  - noise_std (VariableInterface)
  - extra_vars (VariableInterface)
- tol_noise_variance = 1e-05
- static y_getter(dataset)
Extracts the observation values and mask from a dataset.
- Parameters:
  - dataset (Dataset): A dataset object containing ‘values’ and ‘mask’ attributes.
- Returns:
  WeightedTensor: A tensor containing the observed values and a boolean mask used as weights for likelihood and loss computations.
- Raises:
  - AssertionError: If either dataset.values or dataset.mask is None.
- classmethod noise_std_suff_stats()
Dictionary of sufficient statistics needed for noise_std (when directly a model parameter).
- Returns:
  dict[VariableName, LinkedVariable]: A dictionary containing the sufficient statistics:
  - “y_x_model”: Product of the observed values (“y”) and the model predictions (“model”).
  - “model_x_model”: Squared values of the model predictions (“model”).
- classmethod scalar_noise_std_update(*, state, y_x_model, model_x_model)
Update rule for scalar noise_std (when directly a model parameter), from state & sufficient statistics.
Computes a common noise_std for all the features.
- Parameters:
  - state (State): A state dictionary containing precomputed values.
  - y_x_model (WeightedTensor[float]): The weighted inner product between the observations and the model predictions.
  - model_x_model (WeightedTensor[float]): The weighted inner product of the model predictions with themselves.
- Returns:
  torch.Tensor: The updated scalar value of the noise_std.
- classmethod diagonal_noise_std_update(*, state, y_x_model, model_x_model)
Update rule for feature-wise noise_std (when directly a model parameter), from state & sufficient statistics.
Computes one noise_std per feature.
- Parameters:
  - state (State): A state dictionary containing precomputed values.
  - y_x_model (WeightedTensor[float]): The weighted inner product between the observations and the model predictions.
  - model_x_model (WeightedTensor[float]): The weighted inner product of the model predictions with themselves.
- Returns:
  torch.Tensor: The updated value of the noise_std for each feature.
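Example (illustrative sketch, not leaspy code): under a Gaussian residual assumption, a natural update is the maximum-likelihood estimate, which only needs the sums of y², y·model and model² over valid observations, since Σ(y − model)² = Σy² − 2Σy·model + Σmodel². Whether leaspy uses exactly this rule is an assumption, as are the helper names and the idea that the squared norm of y and the observation counts come from the state.

```python
import math


def masked_sums(y, model, mask):
    """Per-feature sums over valid entries: sum(y^2), sum(y*model), sum(model^2) and counts."""
    n_ft = len(y[0])
    y_l2 = [0.0] * n_ft
    y_x_model = [0.0] * n_ft
    model_x_model = [0.0] * n_ft
    n_obs = [0] * n_ft
    for row_y, row_m, row_keep in zip(y, model, mask):
        for j, (y_ij, m_ij, keep) in enumerate(zip(row_y, row_m, row_keep)):
            if keep:
                y_l2[j] += y_ij * y_ij
                y_x_model[j] += y_ij * m_ij
                model_x_model[j] += m_ij * m_ij
                n_obs[j] += 1
    return y_l2, y_x_model, model_x_model, n_obs


def diagonal_noise_std(y_l2, y_x_model, model_x_model, n_obs):
    """One noise std per feature; max() guards against tiny negative rounding errors."""
    return [
        math.sqrt(max(0.0, a - 2.0 * b + c) / n)
        for a, b, c, n in zip(y_l2, y_x_model, model_x_model, n_obs)
    ]


def scalar_noise_std(y_l2, y_x_model, model_x_model, n_obs):
    """A single noise std shared by all features."""
    residual_ss = sum(y_l2) - 2.0 * sum(y_x_model) + sum(model_x_model)
    return math.sqrt(max(0.0, residual_ss) / sum(n_obs))


# Two visits, two features; the last entry is masked out.
y = [[1.0, 2.0], [3.0, 4.0]]
model = [[0.0, 2.0], [2.0, 0.0]]
mask = [[True, True], [True, False]]

stats = masked_sums(y, model, mask)
per_feature = diagonal_noise_std(*stats)
shared = scalar_noise_std(*stats)
```

The valid residuals are 1 and 1 for the first feature and 0 for the second, so the per-feature values are 1 and 0, and the shared value is the square root of 2/3.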
- classmethod noise_std_specs(dimension)
Default specifications of the noise_std variable when directly modelled as a parameter (no latent population variable).
- Parameters:
  - dimension (int): The dimension of the noise_std.
    - If dimension == 1, a scalar noise_std is assumed.
    - If dimension > 1, feature-wise independent noise_std values are assumed (diagonal noise).
- Returns:
  ModelParameter: The specification of the noise_std, including:
  - shape: tuple defining the parameter shape.
  - suff_stats: collected sufficient statistics needed for updates.
  - update_rule: method to update the parameter based on statistics.
- classmethod with_noise_std_as_model_parameter(dimension)
Default instance of FullGaussianObservationModel with noise_std (scalar or diagonal depending on dimension) being a ModelParameter.
- Parameters:
  - dimension (int): The dimension of the noise_std.
    - If dimension == 1, a scalar noise_std is assumed.
    - If dimension > 1, feature-wise independent noise_std values are assumed (diagonal noise).
- Returns:
  FullGaussianObservationModel: A configured instance with noise_std as a ModelParameter, along with the necessary sufficient statistics for inference.
- Raises:
  - ValueError: If dimension is not a positive integer.
- classmethod compute_rmse(*, y, model)
Computes the Root Mean Square Error (RMSE) between predictions and observations.
- Parameters:
  - y (WeightedTensor[float]): The observed target values with associated weights.
  - model (WeightedTensor[float]): The model predictions with the same shape and weighting scheme as y.
- Returns:
  torch.Tensor: A scalar tensor representing the RMSE between model and y.
- classmethod compute_rmse_per_ft(*, y, model)
Computes the Root Mean Square Error (RMSE) between predictions and observations separately for each feature.
- Parameters:
  - y (WeightedTensor[float]): The observed target values with associated weights.
  - model (WeightedTensor[float]): The model predictions with the same shape and weighting scheme as y.
- Returns:
  torch.Tensor: A tensor containing the RMSE between model and y for each feature.
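For reference, a weighted RMSE matching the two descriptions above can be written as follows. This assumes the weights w_{ij} are the 0/1 mask entries carried by the WeightedTensor, with i indexing observations and j indexing features; the exact weighting used by the library is not spelled out here.

```latex
\mathrm{RMSE} = \sqrt{\frac{\sum_{i,j} w_{ij}\,\bigl(y_{ij} - \mathrm{model}_{ij}\bigr)^2}{\sum_{i,j} w_{ij}}},
\qquad
\mathrm{RMSE}_j = \sqrt{\frac{\sum_{i} w_{ij}\,\bigl(y_{ij} - \mathrm{model}_{ij}\bigr)^2}{\sum_{i} w_{ij}}}
```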
- class GaussianObservationModel(name, getter, loc, scale, **extra_vars)
Bases: leaspy.models.obs_models._base.ObservationModel
Specialized ObservationModel for noisy observations with Gaussian residuals assumption.
- Parameters:
  - name (VariableName: str): The name of the observed variable (to name the data variable & attachment term related to this observation).
  - getter (Callable[[Dataset], WeightedTensor]): The way to retrieve the observed values from the Dataset (as a WeightedTensor): e.g. all values, subset of values - only x, y, z features, one-hot encoded features, …
  - loc (VariableName: str): The name of the variable representing the mean (location) of the Gaussian.
  - scale (VariableName: str): The name of the variable representing the standard deviation (scale) of the Gaussian (noise_std).
  - **extra_vars (VariableInterface): Additional variables required by the model.
Notes
The model uses leaspy.variables.distributions.Normal internally for computing the log-likelihood and related operations.
- class AbstractWeibullRightCensoredObservationModel
Bases: leaspy.models.obs_models._base.ObservationModel
Base class for valid observation models that may be used in probabilistic models (stateless).
In particular, it provides data & linked variables regarding observations and their attachment to the model (the negative log-likelihood - nll - to be minimized).
- class WeibullRightCensoredObservationModel(nu, rho, xi, tau, **extra_vars)
Bases: AbstractWeibullRightCensoredObservationModel
Base class for valid observation models that may be used in probabilistic models (stateless).
In particular, it provides data & linked variables regarding observations and their attachment to the model (the negative log-likelihood - nll - to be minimized).
- Parameters:
  - nu (VariableName)
  - rho (VariableName)
  - xi (VariableName)
  - tau (VariableName)
  - extra_vars (VariableInterface)
- string_for_json = 'weibull-right-censored'
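Example (illustrative sketch, not leaspy code): for right-censored event times, the textbook Weibull negative log-likelihood uses the density when the event is observed and the survival function when it is censored. The scale and shape arguments below are generic, hypothetical parameters; how they relate to nu, rho, xi and tau is not described on this page.

```python
import math


def weibull_log_survival(t, scale, shape):
    """log S(t) for a Weibull distribution: -(t / scale) ** shape."""
    return -((t / scale) ** shape)


def weibull_log_hazard(t, scale, shape):
    """log h(t) for a Weibull distribution (t > 0)."""
    return math.log(shape / scale) + (shape - 1.0) * math.log(t / scale)


def right_censored_nll(t, event_observed, scale, shape):
    """-log f(t) if the event was observed at t, -log S(t) if censored at t."""
    nll = -weibull_log_survival(t, scale, shape)
    if event_observed:
        # log f = log h + log S, so an observed event adds the log-hazard term.
        nll -= weibull_log_hazard(t, scale, shape)
    return nll


# shape == 1 reduces to an exponential distribution with rate 1 / scale.
nll_censored = right_censored_nll(2.0, False, scale=2.0, shape=1.0)
nll_event = right_censored_nll(2.0, True, scale=2.0, shape=1.0)
```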
- class WeibullRightCensoredWithSourcesObservationModel(nu, rho, xi, tau, survival_shifts, **extra_vars)
Bases: AbstractWeibullRightCensoredObservationModel
Base class for valid observation models that may be used in probabilistic models (stateless).
In particular, it provides data & linked variables regarding observations and their attachment to the model (the negative log-likelihood - nll - to be minimized).
- Parameters:
  - nu (VariableName)
  - rho (VariableName)
  - xi (VariableName)
  - tau (VariableName)
  - survival_shifts (VariableName)
  - extra_vars (VariableInterface)
- string_for_json = 'weibull-right-censored-with-sources'