leaspy.models.obs_models
========================

.. py:module:: leaspy.models.obs_models


Attributes
----------

.. autoapisummary::

   leaspy.models.obs_models.OBSERVATION_MODELS
   leaspy.models.obs_models.ObservationModelFactoryInput


Classes
-------

.. autoapisummary::

   leaspy.models.obs_models.ObservationModel
   leaspy.models.obs_models.BernoulliObservationModel
   leaspy.models.obs_models.ObservationModelNames
   leaspy.models.obs_models.FullGaussianObservationModel
   leaspy.models.obs_models.GaussianObservationModel
   leaspy.models.obs_models.AbstractWeibullRightCensoredObservationModel
   leaspy.models.obs_models.WeibullRightCensoredObservationModel
   leaspy.models.obs_models.WeibullRightCensoredWithSourcesObservationModel


Functions
---------

.. autoapisummary::

   leaspy.models.obs_models.observation_model_factory


Package Contents
----------------

.. py:class:: ObservationModel

   
   Base class for valid observation models that may be used in probabilistic models (stateless).

   In particular, it provides data & linked variables regarding observations and their attachment to the model
   (the negative log-likelihood - nll - to be minimized).

   :Parameters:

       **name** : :obj:`str`
           The name of observed variable (to name the data variable & attachment term related to this observation).

       **getter** : function :class:`.Dataset` -> :class:`.WeightedTensor`
           The way to retrieve the observed values from the :class:`.Dataset` (as a :class:`.WeightedTensor`):
           e.g. all values, subset of values - only x, y, z features, one-hot encoded features, ...

       **dist** : :class:`.SymbolicDistribution`
           The symbolic distribution, parametrized by model variables, for observed values (so to compute attachment).

       **extra_vars** : None (default) or Mapping[VarName, :class:`.VariableInterface`]
           Some new variables that are needed to fully define the symbolic distribution or the sufficient statistics.
           (e.g. "noise_std", and "y_L2_per_ft" for instance for a Gaussian model)


   ..
       !! processed by numpydoc !!

   .. py:attribute:: name
      :type:  leaspy.variables.specs.VariableName


   .. py:attribute:: getter
      :type:  Callable[[leaspy.io.data.dataset.Dataset], leaspy.utils.weighted_tensor.WeightedTensor]


   .. py:attribute:: dist
      :type:  leaspy.variables.distributions.SymbolicDistribution


   .. py:attribute:: extra_vars
      :type:  Optional[Mapping[leaspy.variables.specs.VariableName, leaspy.variables.specs.VariableInterface]]
      :value: None


   .. py:method:: get_nll_attach_var_name(named_attach_vars = True)

      
      Return the name of the negative log-likelihood attachment
      variable.


      ..
          !! processed by numpydoc !!


   .. py:method:: get_variables_specs(named_attach_vars = True)

      
      Automatic specifications of variables for this observation model.


      :Parameters:

          **named_attach_vars** : :obj:`bool`, optional
              ..


      :Returns:

          :obj:`dict` [ :class:`~leaspy.variables.specs.VariableName`, :class:`~leaspy.variables.specs.VariableInterface`]
              A dictionary mapping variable names to their corresponding specifications, with:

              - the primary `DataVariable`
              - any `extra_vars` defined by the model
              - the nll attachment variables:

                - `nll_attach_var_ind`: a :class:`~leaspy.variables.specs.LinkedVariable` representing the
                  individual-level negative log-likelihood contributions
                - `nll_attach_var`: a :class:`~leaspy.variables.specs.LinkedVariable` that sums the
                  individual contributions


      .. rubric:: Notes

      The distribution object `self.dist` should provide a `get_func_nll(name)` method that
      returns a callable for computing the nll.


      ..
          !! processed by numpydoc !!
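
      The relationship between the two attachment variables described above can be illustrated with a minimal, library-independent sketch. The helper `attachment_terms` and the toy numbers are hypothetical; the sketch only mirrors the description (individual-level contributions, then their sum) and is not Leaspy's implementation.

      .. code-block:: python

         from typing import List, Tuple


         def attachment_terms(
             nll_per_obs: List[List[float]], mask: List[List[bool]]
         ) -> Tuple[List[float], float]:
             """Return (per-individual nll, total nll), ignoring masked-out entries."""
             nll_ind = [
                 sum(v for v, keep in zip(row, row_mask) if keep)
                 for row, row_mask in zip(nll_per_obs, mask)
             ]
             return nll_ind, sum(nll_ind)


         # Two individuals, three visits each; the last visit of individual 2 is missing.
         nll_per_obs = [[0.5, 1.0, 0.25], [2.0, 0.5, 9.0]]
         mask = [[True, True, True], [True, True, False]]

         nll_attach_ind, nll_attach = attachment_terms(nll_per_obs, mask)

      Here `nll_attach_ind` plays the role of the individual-level variable and `nll_attach` the role of the summed one.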


   .. py:method:: serialized()

      
      Returns a JSON-exportable representation of the instance, excluding its name.


      :Returns:

          Any
              A representation of the instance, currently based on `repr(self.dist)`, 
              that is intended to be JSON-serializable.


      ..
          !! processed by numpydoc !!


   .. py:method:: to_dict()

      
      To be implemented...


      ..
          !! processed by numpydoc !!


   .. py:method:: to_string()

      
      Returns a string representation of the parameter for saving


      :Returns:

          :obj:`str`
              A string representation of the parameter, as stored in `self.string_for_json`.


      ..
          !! processed by numpydoc !!


.. py:class:: BernoulliObservationModel(**extra_vars)

   Bases: :py:obj:`leaspy.models.obs_models._base.ObservationModel`


   Observation model for binary outcomes using a Bernoulli distribution.

   This model expects binary-valued observations and uses a Bernoulli distribution
   to define the likelihood. It assumes the response variable is named `"y"`.

   :Parameters:

       **\*\*extra_vars** : VariableInterface
           Optional extra variables required by the model. These are passed to the
           parent `ObservationModel` class and can be used for conditioning the likelihood.

   :Attributes:

       **string_for_json** : :obj:`str`
           A static string identifier used for serialization.


   ..
       !! processed by numpydoc !!
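
   As a reminder of what a Bernoulli likelihood computes, here is a minimal sketch in plain Python. The function `bernoulli_nll` is hypothetical and stands apart from Leaspy's symbolic distributions; it evaluates the textbook negative log-likelihood of a binary value `y` given a predicted probability `p`.

   .. code-block:: python

      import math


      def bernoulli_nll(y: int, p: float) -> float:
          """Negative log-likelihood of a binary observation y under Bernoulli(p)."""
          if y not in (0, 1):
              raise ValueError("y must be 0 or 1")
          if not 0.0 < p < 1.0:
              raise ValueError("p must lie strictly between 0 and 1")
          return -(y * math.log(p) + (1 - y) * math.log(1.0 - p))


      observations = [1, 0, 1]
      predicted = [0.9, 0.2, 0.5]
      # Independent observations: the total nll is the sum of the individual terms.
      total = sum(bernoulli_nll(y, p) for y, p in zip(observations, predicted))

   Confident and correct predictions (such as `p = 0.9` for `y = 1`) contribute little to the total, while an uninformative `p = 0.5` contributes `log(2)`.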

   .. py:attribute:: string_for_json
      :value: 'bernoulli'


   .. py:method:: y_getter(dataset)
      :staticmethod:


      Extracts and validates the observation values and associated mask from a dataset.


      :Parameters:

          **dataset** : :class:`.Dataset`
              A dataset object containing `values` and `mask` attributes.


      :Returns:

          :class:`.WeightedTensor`
              A tensor containing the observed binary values along with a boolean mask
              indicating which entries are valid.


      :Raises:

          ValueError
              If either `dataset.values` or `dataset.mask` is `None`, indicating that
              the dataset is improperly initialized.


      ..
          !! processed by numpydoc !!


.. py:data:: OBSERVATION_MODELS
   :type:  Dict[ObservationModelNames, Type[leaspy.models.obs_models._base.ObservationModel]]

.. py:data:: ObservationModelFactoryInput

.. py:class:: ObservationModelNames(*args, **kwds)

   Bases: :py:obj:`enum.Enum`


   Enumeration defining the possible names for observation models.


   ..
       !! processed by numpydoc !!

   .. py:attribute:: GAUSSIAN_DIAGONAL
      :value: 'gaussian-diagonal'


   .. py:attribute:: GAUSSIAN_SCALAR
      :value: 'gaussian-scalar'


   .. py:attribute:: BERNOULLI
      :value: 'bernoulli'


   .. py:attribute:: WEIBULL_RIGHT_CENSORED
      :value: 'weibull-right-censored'


   .. py:attribute:: WEIBULL_RIGHT_CENSORED_WITH_SOURCES
      :value: 'weibull-right-censored-with-sources'


   .. py:method:: from_string(model_name)
      :classmethod:
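
      This method is undocumented in the source. The sketch below shows one plausible behavior, assuming a plain lookup by member value; the stand-in class `ObservationModelNamesSketch` and the raised exception type are assumptions, and the actual method may normalize its input or raise a Leaspy-specific error.

      .. code-block:: python

         from enum import Enum


         class ObservationModelNamesSketch(Enum):
             """Stand-in mirroring the member values documented above."""

             GAUSSIAN_DIAGONAL = "gaussian-diagonal"
             GAUSSIAN_SCALAR = "gaussian-scalar"
             BERNOULLI = "bernoulli"
             WEIBULL_RIGHT_CENSORED = "weibull-right-censored"
             WEIBULL_RIGHT_CENSORED_WITH_SOURCES = "weibull-right-censored-with-sources"

             @classmethod
             def from_string(cls, model_name: str) -> "ObservationModelNamesSketch":
                 # Assumption: lookup by value, with a readable error on failure.
                 try:
                     return cls(model_name)
                 except ValueError:
                     allowed = ", ".join(member.value for member in cls)
                     raise ValueError(
                         f"Unknown observation model {model_name!r}; expected one of: {allowed}"
                     ) from None


         name = ObservationModelNamesSketch.from_string("gaussian-scalar")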


.. py:function:: observation_model_factory(model, **kwargs)

   
   Factory for observation models.


   :Parameters:

       **model** : :obj:`str` or :class:`.ObservationModel` or :obj:`dict` [ :obj:`str`, ...]
           - If an instance of a subclass of :class:`.ObservationModel`, returns the instance.
           - If a string, then returns a new instance of the appropriate class (with optional parameters `kwargs`).
           - If a dictionary, it must contain the 'name' key and other initialization parameters.

       **\*\*kwargs**
           Optional parameters for initializing the requested observation model when `model` is a string.


   :Returns:

       :class:`.ObservationModel`
           The desired observation model.


   :Raises:

       :exc:`.LeaspyModelInputError`
           If `model` is not supported.


   ..
       !! processed by numpydoc !!
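
   The three accepted input forms can be sketched with a self-contained toy factory. Everything in this block (`ToyObservationModel`, `REGISTRY`, `toy_factory`) is a hypothetical stand-in for illustration; in particular it raises `ValueError`, whereas the documented factory raises :exc:`.LeaspyModelInputError`.

   .. code-block:: python

      from typing import Any, Dict, Type, Union


      class ToyObservationModel:
          """Hypothetical stand-in for an observation model class."""

          def __init__(self, **params: Any) -> None:
              self.params = params


      class ToyBernoulli(ToyObservationModel):
          pass


      class ToyGaussian(ToyObservationModel):
          pass


      REGISTRY: Dict[str, Type[ToyObservationModel]] = {
          "bernoulli": ToyBernoulli,
          "gaussian-scalar": ToyGaussian,
      }


      def toy_factory(
          model: Union[str, ToyObservationModel, Dict[str, Any]], **kwargs: Any
      ) -> ToyObservationModel:
          if isinstance(model, ToyObservationModel):
              return model  # already an instance: returned unchanged
          if isinstance(model, dict):
              params = dict(model)  # copy so the caller's dict is not mutated
              if "name" not in params:
                  raise ValueError("A dictionary input must contain the 'name' key")
              name = params.pop("name")
              kwargs = {**params, **kwargs}
          elif isinstance(model, str):
              name = model
          else:
              raise ValueError(f"Unsupported input type: {type(model).__name__}")
          if name not in REGISTRY:
              raise ValueError(f"Unknown observation model: {name!r}")
          return REGISTRY[name](**kwargs)


      from_string = toy_factory("gaussian-scalar", dimension=1)
      from_dict = toy_factory({"name": "bernoulli"})
      same = toy_factory(from_dict)  # an instance passes through untouched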

.. py:class:: FullGaussianObservationModel(noise_std, **extra_vars)

   Bases: :py:obj:`GaussianObservationModel`


   Specialized `GaussianObservationModel` when all data share the same observation model, with default naming.

   The default naming is:

   - 'y' for observations
   - 'model' for model predictions
   - 'noise_std' for scale of residuals

   We also provide a convenient factory, `with_noise_std_as_model_parameter`, for the most common case,
   which corresponds to `noise_std` directly being a `ModelParameter` (it could also be a
   `PopulationLatentVariable` with positive support). Whether the scale of residuals is scalar or
   diagonal depends on the `dimension` argument of this method.


   ..
       !! processed by numpydoc !!

   .. py:attribute:: tol_noise_variance
      :value: 1e-05


   .. py:method:: y_getter(dataset)
      :staticmethod:


      Extracts the observation values and mask from a dataset.


      :Parameters:

          **dataset** : :class:`.Dataset`
              A dataset object containing 'values' and 'mask' attributes


      :Returns:

          :class:`.WeightedTensor`
              A tensor containing the observed values and a boolean mask used as weights
              for likelihood and loss computations.


      :Raises:

          AssertionError
              If either `dataset.values` or `dataset.mask` is `None`.


      ..
          !! processed by numpydoc !!


   .. py:method:: noise_std_suff_stats()
      :classmethod:


      Dictionary of sufficient statistics needed for `noise_std` (when directly a model parameter).


      :Returns:

          :obj:`dict` [ :class:`~leaspy.variables.specs.VariableName`, :class:`~leaspy.variables.specs.LinkedVariable`]
              A dictionary containing the sufficient statistics:
              
              - `"y_x_model"`: Product of the observed values (`"y"`) and the model predictions (`"model"`).
              - `"model_x_model"`: Squared values of the model predictions (`"model"`).


      ..
          !! processed by numpydoc !!


   .. py:method:: scalar_noise_std_update(*, state, y_x_model, model_x_model)
      :classmethod:


      Update rule for scalar `noise_std` (when directly a model parameter),
      from state & sufficient statistics.

      Computes a common `noise_std` for all the features.

      :Parameters:

          **state** : :class:`State`
              A state dictionary containing precomputed values.

          **y_x_model** : :class:`.WeightedTensor`[:obj:`float`]
              The weighted inner product between the observations and the model predictions.

          **model_x_model** : :class:`.WeightedTensor`[:obj:`float`]
              The weighted inner product of the model predictions with themselves.


      :Returns:

          :class:`torch.Tensor`
              The updated scalar value of the `noise_std`.


      ..
          !! processed by numpydoc !!
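
      The reason these two sufficient statistics are enough follows from expanding the squared residuals: the sum of `(y - model)**2` equals the sum of `y**2`, minus twice the sum of `y * model`, plus the sum of `model**2`. The sketch below applies this identity in plain Python. The function `scalar_noise_std` is hypothetical, assumes every observation is present (no mask), and is not the library's update rule.

      .. code-block:: python

         import math
         from typing import List


         def scalar_noise_std(y: List[float], model: List[float]) -> float:
             """Common noise_std recovered from sums of products (no masking)."""
             y_l2 = sum(v * v for v in y)                      # sum of y^2
             y_x_model = sum(v * m for v, m in zip(y, model))  # sum of y * model
             model_x_model = sum(m * m for m in model)         # sum of model^2
             noise_var = (y_l2 - 2.0 * y_x_model + model_x_model) / len(y)
             return math.sqrt(noise_var)


         y = [1.0, 2.0, 4.0]
         model = [1.5, 2.0, 3.0]
         noise_std = scalar_noise_std(y, model)

      With residuals of -0.5, 0 and 1, the sum of squares is 1.25, so the result equals `sqrt(1.25 / 3)`.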


   .. py:method:: diagonal_noise_std_update(*, state, y_x_model, model_x_model)
      :classmethod:


      Update rule for feature-wise `noise_std` (when directly a model parameter),
      from state & sufficient statistics.

      Computes one `noise_std` per feature.

      :Parameters:

          **state** : :class:`State`
              A state dictionary containing precomputed values.

          **y_x_model** : :class:`.WeightedTensor`[:obj:`float`]
              The weighted inner product between the observations and the model predictions.

          **model_x_model** : :class:`.WeightedTensor`[:obj:`float`]
              The weighted inner product of the model predictions with themselves.


      :Returns:

          :class:`torch.Tensor`
              The updated value of the `noise_std` for each feature.


      ..
          !! processed by numpydoc !!
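
      A feature-wise estimate keeps the sums separate for each feature instead of pooling them. The following sketch is a hypothetical plain-Python illustration (function name, list-of-rows layout and the absence of masking are all assumptions), not the library's update rule.

      .. code-block:: python

         import math
         from typing import List


         def diagonal_noise_std(y: List[List[float]], model: List[List[float]]) -> List[float]:
             """One noise_std per feature (column); rows are observations."""
             n_obs, n_ft = len(y), len(y[0])
             stds = []
             for k in range(n_ft):
                 y_l2 = sum(y[i][k] ** 2 for i in range(n_obs))
                 y_x_model = sum(y[i][k] * model[i][k] for i in range(n_obs))
                 model_x_model = sum(model[i][k] ** 2 for i in range(n_obs))
                 # Same expansion of the squared residuals, restricted to feature k.
                 stds.append(math.sqrt((y_l2 - 2.0 * y_x_model + model_x_model) / n_obs))
             return stds


         y = [[1.0, 0.0], [3.0, 2.0]]
         model = [[2.0, 0.0], [2.0, 4.0]]
         noise_std_per_ft = diagonal_noise_std(y, model)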


   .. py:method:: noise_std_specs(dimension)
      :classmethod:


      Default specifications of `noise_std` variable when directly
      modelled as a parameter (no latent population variable).


      :Parameters:

          **dimension** : :obj:`int`
              The dimension of the `noise_std`.

              - If `dimension == 1`, a scalar `noise_std` is assumed.
              - If `dimension > 1`, feature-wise independent `noise_std` values
                are assumed (diagonal noise).


      :Returns:

          ModelParameter
              The specification of the `noise_std`, including:

              - `shape`: tuple defining the parameter shape.
              - `suff_stats`: collected sufficient statistics needed for updates.
              - `update_rule`: method to update the parameter based on statistics.


      ..
          !! processed by numpydoc !!


   .. py:method:: with_noise_std_as_model_parameter(dimension)
      :classmethod:


      Default instance of `FullGaussianObservationModel` with `noise_std`
      (scalar or diagonal depending on `dimension`) being a `ModelParameter`.


      :Parameters:

          **dimension** : :obj:`int`
              The dimension of the `noise_std`.

              - If `dimension == 1`, a scalar `noise_std` is assumed.
              - If `dimension > 1`, feature-wise independent `noise_std` values
                are assumed (diagonal noise).


      :Returns:

          FullGaussianObservationModel
              A configured instance with `noise_std` as a `ModelParameter`, along with the
              necessary sufficient statistics for inference.


      :Raises:

          ValueError
              If `dimension` is not a positive integer.


      ..
          !! processed by numpydoc !!


   .. py:method:: compute_rmse(*, y, model)
      :classmethod:


      Computes the Root Mean Square Error (RMSE) between predictions and observations.


      :Parameters:

          **y** : :class:`.WeightedTensor`[:obj:`float`]
              The observed target values with associated weights.

          **model** : :class:`.WeightedTensor`[:obj:`float`]
              The model predictions with the same shape and weighting scheme as `y`.


      :Returns:

          :class:`torch.Tensor`
              A scalar tensor representing the RMSE between `model` and `y`.


      ..
          !! processed by numpydoc !!
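
      A minimal sketch of a mask-aware RMSE, written in plain Python rather than with tensors. The function `masked_rmse` and its explicit `mask` argument are hypothetical; in the documented method the weights travel with the :class:`.WeightedTensor` inputs.

      .. code-block:: python

         import math
         from typing import List


         def masked_rmse(
             y: List[List[float]], model: List[List[float]], mask: List[List[bool]]
         ) -> float:
             """RMSE over all observed entries (rows: observations, columns: features)."""
             sq_errors = [
                 (y[i][k] - model[i][k]) ** 2
                 for i in range(len(y))
                 for k in range(len(y[0]))
                 if mask[i][k]
             ]
             return math.sqrt(sum(sq_errors) / len(sq_errors))


         y = [[1.0, 5.0], [4.0, 0.0]]
         model = [[1.0, 2.0], [2.0, 7.0]]
         mask = [[True, True], [True, False]]  # the last entry is missing and ignored

         overall = masked_rmse(y, model, mask)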


   .. py:method:: compute_rmse_per_ft(*, y, model)
      :classmethod:


      Computes the Root Mean Square Error (RMSE) between predictions and observations
      separately for each feature.


      :Parameters:

          **y** : :class:`.WeightedTensor`[:obj:`float`]
              The observed target values with associated weights.

          **model** : :class:`.WeightedTensor`[:obj:`float`]
              The model predictions with the same shape and weighting scheme as `y`.


      :Returns:

          :class:`torch.Tensor`
              A tensor containing one RMSE value per feature, computed between `model` and `y`.


      ..
          !! processed by numpydoc !!
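
      The per-feature variant can be sketched the same way, averaging squared errors within each column. The function `masked_rmse_per_ft`, the list-of-rows layout and the explicit `mask` argument are assumptions made for this illustration.

      .. code-block:: python

         import math
         from typing import List


         def masked_rmse_per_ft(
             y: List[List[float]], model: List[List[float]], mask: List[List[bool]]
         ) -> List[float]:
             """One RMSE per feature, each computed on that feature's observed entries."""
             result = []
             for k in range(len(y[0])):
                 sq_errors = [
                     (y[i][k] - model[i][k]) ** 2 for i in range(len(y)) if mask[i][k]
                 ]
                 result.append(math.sqrt(sum(sq_errors) / len(sq_errors)))
             return result


         y = [[1.0, 5.0], [4.0, 0.0]]
         model = [[1.0, 2.0], [2.0, 7.0]]
         mask = [[True, True], [True, False]]  # feature 2 has a single observed entry

         per_ft = masked_rmse_per_ft(y, model, mask)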


   .. py:method:: to_string()

      
      Return a string representation of the observation model, used when saving parameters.


      ..
          !! processed by numpydoc !!


.. py:class:: GaussianObservationModel(name, getter, loc, scale, **extra_vars)

   Bases: :py:obj:`leaspy.models.obs_models._base.ObservationModel`


   Specialized `ObservationModel` for noisy observations with Gaussian residuals assumption.


   :Parameters:

       **name** : :obj:`str`
           The name of observed variable (to name the data variable & attachment term related to this observation).

       **getter** : function :class:`.Dataset` -> :class:`.WeightedTensor`
           The way to retrieve the observed values from the :class:`.Dataset` (as a :class:`.WeightedTensor`):
           e.g. all values, subset of values - only x, y, z features, one-hot encoded features, ...

       **loc** : :obj:`str`
           The name of the variable representing the mean (location) of the Gaussian

       **scale** : :obj:`str`
           The name of the variable representing the standard deviation (scale) of the Gaussian (`noise_std`)

       **\*\*extra_vars** : VariableInterface
           Additional variables required by the model


   .. rubric:: Notes

   - The model uses `leaspy.variables.distributions.Normal` internally for computing
     the log-likelihood and related operations.


   ..
       !! processed by numpydoc !!
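
   For reference, the negative log-likelihood of a single value under a Gaussian with mean `loc` and standard deviation `scale` can be sketched as follows. The function `gaussian_nll` is hypothetical and independent of Leaspy; the library's symbolic distribution may, for instance, handle constant terms or weights differently.

   .. code-block:: python

      import math


      def gaussian_nll(y: float, loc: float, scale: float) -> float:
          """Negative log-density of y under Normal(loc, scale)."""
          if scale <= 0.0:
              raise ValueError("scale must be positive")
          z = (y - loc) / scale  # standardized residual
          return 0.5 * z * z + math.log(scale) + 0.5 * math.log(2.0 * math.pi)


      nll_at_mean = gaussian_nll(y=1.0, loc=1.0, scale=1.0)
      nll_one_sd = gaussian_nll(y=2.0, loc=1.0, scale=1.0)

   Moving one standard deviation away from the mean adds exactly 0.5 to the nll, which is the quadratic penalty on the standardized residual.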

.. py:class:: AbstractWeibullRightCensoredObservationModel

   Bases: :py:obj:`leaspy.models.obs_models._base.ObservationModel`


   Base class for valid observation models that may be used in probabilistic models (stateless).

   In particular, it provides data & linked variables regarding observations and their attachment to the model
   (the negative log-likelihood - nll - to be minimized).

   :Parameters:

       **name** : :obj:`str`
           The name of observed variable (to name the data variable & attachment term related to this observation).

       **getter** : function :class:`.Dataset` -> :class:`.WeightedTensor`
           The way to retrieve the observed values from the :class:`.Dataset` (as a :class:`.WeightedTensor`):
           e.g. all values, subset of values - only x, y, z features, one-hot encoded features, ...

       **dist** : :class:`.SymbolicDistribution`
           The symbolic distribution, parametrized by model variables, for observed values (so to compute attachment).

       **extra_vars** : None (default) or Mapping[VarName, :class:`.VariableInterface`]
           Some new variables that are needed to fully define the symbolic distribution or the sufficient statistics.
           (e.g. "noise_std", and "y_L2_per_ft" for instance for a Gaussian model)


   ..
       !! processed by numpydoc !!

   .. py:method:: getter(dataset)
      :staticmethod:


   .. py:method:: get_variables_specs(named_attach_vars = True)

      
      Automatic specifications of variables for this observation model.


      ..
          !! processed by numpydoc !!


.. py:class:: WeibullRightCensoredObservationModel(nu, rho, xi, tau, **extra_vars)

   Bases: :py:obj:`AbstractWeibullRightCensoredObservationModel`


   Base class for valid observation models that may be used in probabilistic models (stateless).

   In particular, it provides data & linked variables regarding observations and their attachment to the model
   (the negative log-likelihood - nll - to be minimized).

   :Parameters:

       **name** : :obj:`str`
           The name of observed variable (to name the data variable & attachment term related to this observation).

       **getter** : function :class:`.Dataset` -> :class:`.WeightedTensor`
           The way to retrieve the observed values from the :class:`.Dataset` (as a :class:`.WeightedTensor`):
           e.g. all values, subset of values - only x, y, z features, one-hot encoded features, ...

       **dist** : :class:`.SymbolicDistribution`
           The symbolic distribution, parametrized by model variables, for observed values (so to compute attachment).

       **extra_vars** : None (default) or Mapping[VarName, :class:`.VariableInterface`]
           Some new variables that are needed to fully define the symbolic distribution or the sufficient statistics.
           (e.g. "noise_std", and "y_L2_per_ft" for instance for a Gaussian model)


   ..
       !! processed by numpydoc !!
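
   The class name points to a Weibull likelihood with right censoring. As a rough sketch only, the textbook version is shown below, assuming `nu` acts as the Weibull scale and `rho` as the shape; that reading of the parameters is an assumption, the roles of `xi` and `tau` are left out entirely, and the function `weibull_right_censored_nll` is hypothetical.

   .. code-block:: python

      import math


      def weibull_right_censored_nll(t: float, event: bool, nu: float, rho: float) -> float:
          """nll of a follow-up time t; event=False means right-censored at t."""
          if t <= 0.0 or nu <= 0.0 or rho <= 0.0:
              raise ValueError("t, nu and rho must be positive")
          cumulative_hazard = (t / nu) ** rho  # equals -log S(t)
          if not event:
              return cumulative_hazard  # only survival up to t is known
          log_hazard = math.log(rho / nu) + (rho - 1.0) * math.log(t / nu)
          return cumulative_hazard - log_hazard  # -log f(t) = -log h(t) - log S(t)


      nll_event = weibull_right_censored_nll(2.0, True, nu=2.0, rho=1.0)
      nll_censored = weibull_right_censored_nll(2.0, False, nu=2.0, rho=1.0)

   A censored individual contributes only the survival term, while an observed event additionally contributes the log-hazard at the event time.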

   .. py:attribute:: string_for_json
      :value: 'weibull-right-censored'


   .. py:method:: default_init(**kwargs)
      :classmethod:


.. py:class:: WeibullRightCensoredWithSourcesObservationModel(nu, rho, xi, tau, survival_shifts, **extra_vars)

   Bases: :py:obj:`AbstractWeibullRightCensoredObservationModel`


   Base class for valid observation models that may be used in probabilistic models (stateless).

   In particular, it provides data & linked variables regarding observations and their attachment to the model
   (the negative log-likelihood - nll - to be minimized).

   :Parameters:

       **name** : :obj:`str`
           The name of observed variable (to name the data variable & attachment term related to this observation).

       **getter** : function :class:`.Dataset` -> :class:`.WeightedTensor`
           The way to retrieve the observed values from the :class:`.Dataset` (as a :class:`.WeightedTensor`):
           e.g. all values, subset of values - only x, y, z features, one-hot encoded features, ...

       **dist** : :class:`.SymbolicDistribution`
           The symbolic distribution, parametrized by model variables, for observed values (so to compute attachment).

       **extra_vars** : None (default) or Mapping[VarName, :class:`.VariableInterface`]
           Some new variables that are needed to fully define the symbolic distribution or the sufficient statistics.
           (e.g. "noise_std", and "y_L2_per_ft" for instance for a Gaussian model)


   ..
       !! processed by numpydoc !!

   .. py:attribute:: string_for_json
      :value: 'weibull-right-censored-with-sources'


   .. py:method:: default_init(**kwargs)
      :classmethod: