Posteriors¶

`DirectPosterior` ¶

Bases: NeuralPosterior

Posterior based on neural networks that directly estimate the posterior (NPE).

NPE trains a neural network to directly approximate the posterior distribution. However, for bounded priors, the neural network can have leakage: it puts non-zero mass in regions where the prior is zero. The DirectPosterior class wraps the trained network to deal with these cases.

Specifically, this class offers the following functionality:

correct the calculation of the log probability such that it compensates for the leakage.
reject samples that lie outside of the prior bounds.

This class can not be used in combination with NLE or NRE.

Source code in sbi/inference/posteriors/direct_posterior.py

class DirectPosterior(NeuralPosterior):
    r"""Posterior based on neural networks that directly estimate the posterior (NPE).

    NPE trains a neural network to directly approximate the posterior distribution.
    However, for bounded priors, the neural network can have leakage: it puts non-zero
    mass in regions where the prior is zero. The `DirectPosterior` class wraps the
    trained network to deal with these cases.

    Specifically, this class offers the following functionality:

    - correct the calculation of the log probability such that it compensates for the
      leakage.
    - reject samples that lie outside of the prior bounds.

    This class can not be used in combination with NLE or NRE.
    """

    def __init__(
        self,
        posterior_estimator: ConditionalDensityEstimator,
        prior: Distribution,
        max_sampling_batch_size: int = 10_000,
        device: Optional[Union[str, torch.device]] = None,
        x_shape: Optional[torch.Size] = None,
        enable_transform: bool = True,
    ):
        """
        Args:
            prior: Prior distribution with `.log_prob()` and `.sample()`.
            posterior_estimator: The trained neural posterior.
            max_sampling_batch_size: Batchsize of samples being drawn from
                the proposal at every iteration.
            device: Training device, e.g., "cpu", "cuda" or "cuda:0". If None,
                `potential_fn.device` is used.
            x_shape: Deprecated, should not be passed.
            enable_transform: Whether to transform parameters to unconstrained space
                during MAP optimization. When False, an identity transform will be
                returned for `theta_transform`.
        """
        # Because `DirectPosterior` does not take the `potential_fn` as input, it
        # builds it itself. The `potential_fn` and `theta_transform` are used only for
        # obtaining the MAP.
        check_prior(prior)
        self.enable_transform = enable_transform
        self.x_shape = x_shape
        potential_fn, theta_transform = posterior_estimator_based_potential(
            posterior_estimator,
            prior,
            x_o=None,
            enable_transform=enable_transform,
        )

        super().__init__(
            potential_fn=potential_fn,
            theta_transform=theta_transform,
            device=device,
            x_shape=x_shape,
        )

        self.device = device
        self.prior = prior
        self.posterior_estimator = posterior_estimator

        self.max_sampling_batch_size = max_sampling_batch_size
        self._leakage_density_correction_factor = None

        self._purpose = """It samples the posterior network and rejects samples that
            lie outside of the prior bounds."""

    def to(self, device: Union[str, torch.device]) -> None:
        """Move posterior_estimator, prior and x_o to device.

        Changes the device attribute, reinstanciates the
        posterior, and resets the default x.

        Args:
            device: device where to move the posterior to.
        """
        self.device = device
        if hasattr(self.prior, "to"):
            self.prior.to(device)  # type: ignore
        else:
            raise ValueError("""Prior has no attribute to(device).""")
        if hasattr(self.posterior_estimator, "to"):
            self.posterior_estimator.to(device)
        else:
            raise ValueError("""Posterior estimator has no attribute to(device).""")

        potential_fn, theta_transform = posterior_estimator_based_potential(
            self.posterior_estimator,
            self.prior,
            x_o=None,
            enable_transform=self.enable_transform,
        )
        x_o = None
        if hasattr(self, "_x") and (self._x is not None):
            x_o = self._x.to(device)

        super().__init__(
            potential_fn=potential_fn,
            theta_transform=theta_transform,
            device=device,
            x_shape=self.x_shape,
        )
        # super().__init__ erases the self._x, so we need to set it again
        if x_o is not None:
            self.set_default_x(x_o)

    def sample(
        self,
        sample_shape: Shape = torch.Size(),
        x: Optional[Tensor] = None,
        max_sampling_batch_size: int = 10_000,
        sample_with: Optional[str] = None,
        show_progress_bars: bool = True,
    ) -> Tensor:
        r"""Return samples from posterior distribution $p(\theta|x)$.

        Args:
            sample_shape: Desired shape of samples that are drawn from posterior. If
                sample_shape is multidimensional we simply draw `sample_shape.numel()`
                samples and then reshape into the desired shape.
            sample_with: This argument only exists to keep backward-compatibility with
                `sbi` v0.17.2 or older. If it is set, we instantly raise an error.
            show_progress_bars: Whether to show sampling progress monitor.
        """
        num_samples = torch.Size(sample_shape).numel()
        x = self._x_else_default_x(x)
        x = reshape_to_batch_event(
            x, event_shape=self.posterior_estimator.condition_shape
        )
        if x.shape[0] > 1:
            raise ValueError(
                ".sample() supports only `batchsize == 1`. If you intend "
                "to sample multiple observations, use `.sample_batched()`. "
                "If you intend to sample i.i.d. observations, set up the "
                "posterior density estimator with an appropriate permutation "
                "invariant embedding net."
            )

        max_sampling_batch_size = (
            self.max_sampling_batch_size
            if max_sampling_batch_size is None
            else max_sampling_batch_size
        )

        if sample_with is not None:
            raise ValueError(
                f"You set `sample_with={sample_with}`. As of sbi v0.18.0, setting "
                f"`sample_with` is no longer supported. You have to rerun "
                f"`.build_posterior(sample_with={sample_with}).`"
            )

        samples = rejection.accept_reject_sample(
            proposal=self.posterior_estimator.sample,
            accept_reject_fn=lambda theta: within_support(self.prior, theta),
            num_samples=num_samples,
            show_progress_bars=show_progress_bars,
            max_sampling_batch_size=max_sampling_batch_size,
            proposal_sampling_kwargs={"condition": x},
            alternative_method="build_posterior(..., sample_with='mcmc')",
        )[0]  # [0] to return only samples, not acceptance probabilities.

        return samples[:, 0]  # Remove batch dimension.

    def sample_batched(
        self,
        sample_shape: Shape,
        x: Tensor,
        max_sampling_batch_size: int = 10_000,
        show_progress_bars: bool = True,
    ) -> Tensor:
        r"""Given a batch of observations [x_1, ..., x_B] this function samples from
        posteriors $p(\theta|x_1)$, ... ,$p(\theta|x_B)$, in a batched (i.e. vectorized)
        manner.

        Args:
            sample_shape: Desired shape of samples that are drawn from the posterior
                given every observation.
            x: A batch of observations, of shape `(batch_dim, event_shape_x)`.
                `batch_dim` corresponds to the number of observations to be drawn.
            max_sampling_batch_size: Maximum batch size for rejection sampling.
            show_progress_bars: Whether to show sampling progress monitor.

        Returns:
            Samples from the posteriors of shape (*sample_shape, B, *input_shape)
        """
        num_samples = torch.Size(sample_shape).numel()
        condition_shape = self.posterior_estimator.condition_shape
        x = reshape_to_batch_event(x, event_shape=condition_shape)
        num_xos = x.shape[0]

        # throw warning if num_x * num_samples is too large
        if num_xos * num_samples > 2**21:  # 2 million-ish
            warnings.warn(
                f"Note that for batched sampling, the direct posterior sampling "
                f"generates {num_xos} * {num_samples} = {num_xos * num_samples} "
                "samples. This can be slow and memory-intensive. Consider "
                "reducing the number of samples or batch size.",
                stacklevel=2,
            )

        max_sampling_batch_size = (
            self.max_sampling_batch_size
            if max_sampling_batch_size is None
            else max_sampling_batch_size
        )

        # Adjust max_sampling_batch_size to avoid excessive memory usage
        if max_sampling_batch_size * num_xos > 100_000:
            capped = max(1, 100_000 // num_xos)
            warnings.warn(
                f"Capping max_sampling_batch_size from {max_sampling_batch_size} "
                f"to {capped} to avoid excessive memory usage.",
                stacklevel=2,
            )
            max_sampling_batch_size = capped

        samples = rejection.accept_reject_sample(
            proposal=self.posterior_estimator.sample,
            accept_reject_fn=lambda theta: within_support(self.prior, theta),
            num_samples=num_samples,
            show_progress_bars=show_progress_bars,
            max_sampling_batch_size=max_sampling_batch_size,
            proposal_sampling_kwargs={"condition": x},
            alternative_method="build_posterior(..., sample_with='mcmc')",
        )[0]

        return samples

    def log_prob(
        self,
        theta: Tensor,
        x: Optional[Tensor] = None,
        norm_posterior: bool = True,
        track_gradients: bool = False,
        leakage_correction_params: Optional[dict] = None,
    ) -> Tensor:
        r"""Returns the log-probability of the posterior $p(\theta|x)$.

        Args:
            theta: Parameters $\theta$.
            norm_posterior: Whether to enforce a normalized posterior density.
                Renormalization of the posterior is useful when some
                probability falls out or leaks out of the prescribed prior support.
                The normalizing factor is calculated via rejection sampling, so if you
                need speedier but unnormalized log posterior estimates set here
                `norm_posterior=False`. The returned log posterior is set to
                -∞ outside of the prior support regardless of this setting.
            track_gradients: Whether the returned tensor supports tracking gradients.
                This can be helpful for e.g. sensitivity analysis, but increases memory
                consumption.
            leakage_correction_params: A `dict` of keyword arguments to override the
                default values of `leakage_correction()`. Possible options are:
                `num_rejection_samples`, `force_update`, `show_progress_bars`, and
                `rejection_sampling_batch_size`.
                These parameters only have an effect if `norm_posterior=True`.

        Returns:
            `(len(θ),)`-shaped log posterior probability $\log p(\theta|x)$ for θ in the
            support of the prior, -∞ (corresponding to 0 probability) outside.
        """
        x = self._x_else_default_x(x)

        theta = ensure_theta_batched(torch.as_tensor(theta))
        theta_density_estimator = reshape_to_sample_batch_event(
            theta, theta.shape[1:], leading_is_sample=True
        )
        x_density_estimator = reshape_to_batch_event(
            x, event_shape=self.posterior_estimator.condition_shape
        )
        if x_density_estimator.shape[0] > 1:
            raise ValueError(
                ".log_prob() supports only `batchsize == 1`. If you intend "
                "to evaluate given multiple observations, use `.log_prob_batched()`. "
                "If you intend to evaluate given i.i.d. observations, set up the "
                "posterior density estimator with an appropriate permutation "
                "invariant embedding net."
            )

        self.posterior_estimator.eval()

        with torch.set_grad_enabled(track_gradients):
            # Evaluate on device, move back to cpu for comparison with prior.
            unnorm_log_prob = self.posterior_estimator.log_prob(
                theta_density_estimator, condition=x_density_estimator
            )
            # `log_prob` supports only a single observation (i.e. `batchsize==1`).
            # We now remove this additional dimension.
            unnorm_log_prob = unnorm_log_prob.squeeze(dim=1)

            # Force probability to be zero outside prior support.
            in_prior_support = within_support(self.prior, theta)

            masked_log_prob = torch.where(
                in_prior_support,
                unnorm_log_prob,
                torch.tensor(float("-inf"), dtype=torch.float32, device=self._device),
            )

            if leakage_correction_params is None:
                leakage_correction_params = dict()  # use defaults
            log_factor = (
                log(self.leakage_correction(x=x, **leakage_correction_params))
                if norm_posterior
                else 0
            )

            return masked_log_prob - log_factor

    def log_prob_batched(
        self,
        theta: Tensor,
        x: Tensor,
        norm_posterior: bool = True,
        track_gradients: bool = False,
        leakage_correction_params: Optional[dict] = None,
    ) -> Tensor:
        """Given a batch of observations [x_1, ..., x_B] and a batch of parameters \
            [$\theta_1$,..., $\theta_B$] this function evalautes the log-probabilities \
            of the posteriors $p(\theta_1|x_1)$, ..., $p(\theta_B|x_B)$ in a batched \
            (i.e. vectorized) manner.

        Args:
            theta: Batch of parameters $\theta$ of shape \
                `(*sample_shape, batch_dim, *theta_shape)`.
            x: Batch of observations $x$ of shape \
                `(batch_dim, *condition_shape)`.
            norm_posterior: Whether to enforce a normalized posterior density.
                Renormalization of the posterior is useful when some
                probability falls out or leaks out of the prescribed prior support.
                The normalizing factor is calculated via rejection sampling, so if you
                need speedier but unnormalized log posterior estimates set here
                `norm_posterior=False`. The returned log posterior is set to
                -∞ outside of the prior support regardless of this setting.
            track_gradients: Whether the returned tensor supports tracking gradients.
                This can be helpful for e.g. sensitivity analysis, but increases memory
                consumption.
            leakage_correction_params: A `dict` of keyword arguments to override the
                default values of `leakage_correction()`. Possible options are:
                `num_rejection_samples`, `force_update`, `show_progress_bars`, and
                `rejection_sampling_batch_size`.
                These parameters only have an effect if `norm_posterior=True`.

        Returns:
            `(len(θ), B)`-shaped log posterior probability $\\log p(\theta|x)$\\ for θ \
            in the support of the prior, -∞ (corresponding to 0 probability) outside.
        """

        theta = ensure_theta_batched(torch.as_tensor(theta))
        event_shape = self.posterior_estimator.input_shape
        theta_density_estimator = reshape_to_sample_batch_event(
            theta, event_shape, leading_is_sample=True
        )
        x_density_estimator = reshape_to_batch_event(
            x, event_shape=self.posterior_estimator.condition_shape
        )

        self.posterior_estimator.eval()

        with torch.set_grad_enabled(track_gradients):
            # Evaluate on device, move back to cpu for comparison with prior.
            unnorm_log_prob = self.posterior_estimator.log_prob(
                theta_density_estimator, condition=x_density_estimator
            )

            # Force probability to be zero outside prior support.
            in_prior_support = within_support(self.prior, theta)

            masked_log_prob = torch.where(
                in_prior_support,
                unnorm_log_prob,
                torch.tensor(float("-inf"), dtype=torch.float32, device=self._device),
            )

            if leakage_correction_params is None:
                leakage_correction_params = dict()  # use defaults
            log_factor = (
                log(self.leakage_correction(x=x, **leakage_correction_params))
                if norm_posterior
                else 0
            )

            return masked_log_prob - log_factor

    @torch.no_grad()
    def leakage_correction(
        self,
        x: Tensor,
        num_rejection_samples: int = 10_000,
        force_update: bool = False,
        show_progress_bars: bool = False,
        rejection_sampling_batch_size: int = 10_000,
    ) -> Tensor:
        r"""Return leakage correction factor for a leaky posterior density estimate.

        The factor is estimated from the acceptance probability during rejection
        sampling from the posterior.

        This is to avoid re-estimating the acceptance probability from scratch
        whenever `log_prob` is called and `norm_posterior=True`. Here, it
        is estimated only once for `self.default_x` and saved for later. We
        re-evaluate only whenever a new `x` is passed.

        Arguments:
            num_rejection_samples: Number of samples used to estimate correction factor.
            show_progress_bars: Whether to show a progress bar during sampling.
            rejection_sampling_batch_size: Batch size for rejection sampling.

        Returns:
            Saved or newly-estimated correction factor (as a scalar `Tensor`).
        """

        def acceptance_at(x: Tensor) -> Tensor:
            # [1:] to remove batch-dimension for `reshape_to_batch_event`.
            return rejection.accept_reject_sample(
                proposal=self.posterior_estimator.sample,
                accept_reject_fn=lambda theta: within_support(self.prior, theta),
                num_samples=num_rejection_samples,
                show_progress_bars=show_progress_bars,
                sample_for_correction_factor=True,
                max_sampling_batch_size=rejection_sampling_batch_size,
                proposal_sampling_kwargs={
                    "condition": reshape_to_batch_event(
                        x, event_shape=self.posterior_estimator.condition_shape
                    )
                },
            )[1]

        # Check if the provided x matches the default x (short-circuit on identity).
        is_new_x = self.default_x is None or (
            x is not self.default_x and (x != self.default_x).any()
        )

        not_saved_at_default_x = self._leakage_density_correction_factor is None

        if is_new_x:  # Calculate at x; don't save.
            return acceptance_at(x)
        elif not_saved_at_default_x or force_update:  # Calculate at default_x; save.
            assert self.default_x is not None
            self._leakage_density_correction_factor = acceptance_at(self.default_x)

        return self._leakage_density_correction_factor  # type: ignore

    def map(
        self,
        x: Optional[Tensor] = None,
        num_iter: int = 1_000,
        num_to_optimize: int = 100,
        learning_rate: float = 0.01,
        init_method: Union[str, Tensor] = "posterior",
        num_init_samples: int = 1_000,
        save_best_every: int = 10,
        show_progress_bars: bool = False,
        force_update: bool = False,
    ) -> Tensor:
        r"""Returns the maximum-a-posteriori estimate (MAP).

        The method can be interrupted (Ctrl-C) when the user sees that the
        log-probability converges. The best estimate will be saved in `self._map` and
        can be accessed with `self.map()`. The MAP is obtained by running gradient
        ascent from a given number of starting positions (samples from the posterior
        with the highest log-probability). After the optimization is done, we select the
        parameter set that has the highest log-probability after the optimization.

        Warning: The default values used by this function are not well-tested. They
        might require hand-tuning for the problem at hand.

        For developers: if the prior is a `BoxUniform`, we carry out the optimization
        in unbounded space and transform the result back into bounded space.

        Args:
            x: Deprecated - use `.set_default_x()` prior to `.map()`.
            num_iter: Number of optimization steps that the algorithm takes
                to find the MAP.
            learning_rate: Learning rate of the optimizer.
            init_method: How to select the starting parameters for the optimization. If
                it is a string, it can be either [`posterior`, `prior`], which samples
                the respective distribution `num_init_samples` times. If it is a
                tensor, the tensor will be used as init locations.
            num_init_samples: Draw this number of samples from the posterior and
                evaluate the log-probability of all of them.
            num_to_optimize: From the drawn `num_init_samples`, use the
                `num_to_optimize` with highest log-probability as the initial points
                for the optimization.
            save_best_every: The best log-probability is computed, saved in the
                `map`-attribute, and printed every `save_best_every`-th iteration.
                Computing the best log-probability creates a significant overhead
                (thus, the default is `10`.)
            show_progress_bars: Whether to show a progressbar during sampling from the
                posterior.
            force_update: Whether to re-calculate the MAP when x is unchanged and
                have a cached value.
            log_prob_kwargs: Will be empty for SNLE and SNRE. Will contain
                {'norm_posterior': True} for SNPE.

        Returns:
            The MAP estimate.
        """
        return super().map(
            x=x,
            num_iter=num_iter,
            num_to_optimize=num_to_optimize,
            learning_rate=learning_rate,
            init_method=init_method,
            num_init_samples=num_init_samples,
            save_best_every=save_best_every,
            show_progress_bars=show_progress_bars,
            force_update=force_update,
        )

`init(posterior_estimator, prior, max_sampling_batch_size=10000, device=None, x_shape=None, enable_transform=True)` ¶

Parameters:

Name	Type	Description	Default
`prior`	`Distribution`	Prior distribution with `.log_prob()` and `.sample()`.	required
`posterior_estimator`	`ConditionalDensityEstimator`	The trained neural posterior.	required
`max_sampling_batch_size`	`int`	Batchsize of samples being drawn from the proposal at every iteration.	`10000`
`device`	`Optional[Union[str, device]]`	Training device, e.g., “cpu”, “cuda” or “cuda:0”. If None, `potential_fn.device` is used.	`None`
`x_shape`	`Optional[Size]`	Deprecated, should not be passed.	`None`
`enable_transform`	`bool`	Whether to transform parameters to unconstrained space during MAP optimization. When False, an identity transform will be returned for `theta_transform`.	`True`

Source code in sbi/inference/posteriors/direct_posterior.py

def __init__(
    self,
    posterior_estimator: ConditionalDensityEstimator,
    prior: Distribution,
    max_sampling_batch_size: int = 10_000,
    device: Optional[Union[str, torch.device]] = None,
    x_shape: Optional[torch.Size] = None,
    enable_transform: bool = True,
):
    """
    Args:
        prior: Prior distribution with `.log_prob()` and `.sample()`.
        posterior_estimator: The trained neural posterior.
        max_sampling_batch_size: Batchsize of samples being drawn from
            the proposal at every iteration.
        device: Training device, e.g., "cpu", "cuda" or "cuda:0". If None,
            `potential_fn.device` is used.
        x_shape: Deprecated, should not be passed.
        enable_transform: Whether to transform parameters to unconstrained space
            during MAP optimization. When False, an identity transform will be
            returned for `theta_transform`.
    """
    # Because `DirectPosterior` does not take the `potential_fn` as input, it
    # builds it itself. The `potential_fn` and `theta_transform` are used only for
    # obtaining the MAP.
    check_prior(prior)
    self.enable_transform = enable_transform
    self.x_shape = x_shape
    potential_fn, theta_transform = posterior_estimator_based_potential(
        posterior_estimator,
        prior,
        x_o=None,
        enable_transform=enable_transform,
    )

    super().__init__(
        potential_fn=potential_fn,
        theta_transform=theta_transform,
        device=device,
        x_shape=x_shape,
    )

    self.device = device
    self.prior = prior
    self.posterior_estimator = posterior_estimator

    self.max_sampling_batch_size = max_sampling_batch_size
    self._leakage_density_correction_factor = None

    self._purpose = """It samples the posterior network and rejects samples that
        lie outside of the prior bounds."""

`leakage_correction(x, num_rejection_samples=10000, force_update=False, show_progress_bars=False, rejection_sampling_batch_size=10000)` ¶

Return leakage correction factor for a leaky posterior density estimate.

The factor is estimated from the acceptance probability during rejection sampling from the posterior.

This is to avoid re-estimating the acceptance probability from scratch whenever log_prob is called and norm_posterior=True. Here, it is estimated only once for self.default_x and saved for later. We re-evaluate only whenever a new x is passed.

Parameters:

Name	Type	Description	Default
`num_rejection_samples`	`int`	Number of samples used to estimate correction factor.	`10000`
`show_progress_bars`	`bool`	Whether to show a progress bar during sampling.	`False`
`rejection_sampling_batch_size`	`int`	Batch size for rejection sampling.	`10000`

Returns:

Type	Description
`Tensor`	Saved or newly-estimated correction factor (as a scalar `Tensor`).

Source code in sbi/inference/posteriors/direct_posterior.py

@torch.no_grad()
def leakage_correction(
    self,
    x: Tensor,
    num_rejection_samples: int = 10_000,
    force_update: bool = False,
    show_progress_bars: bool = False,
    rejection_sampling_batch_size: int = 10_000,
) -> Tensor:
    r"""Return leakage correction factor for a leaky posterior density estimate.

    The factor is estimated from the acceptance probability during rejection
    sampling from the posterior.

    This is to avoid re-estimating the acceptance probability from scratch
    whenever `log_prob` is called and `norm_posterior=True`. Here, it
    is estimated only once for `self.default_x` and saved for later. We
    re-evaluate only whenever a new `x` is passed.

    Arguments:
        num_rejection_samples: Number of samples used to estimate correction factor.
        show_progress_bars: Whether to show a progress bar during sampling.
        rejection_sampling_batch_size: Batch size for rejection sampling.

    Returns:
        Saved or newly-estimated correction factor (as a scalar `Tensor`).
    """

    def acceptance_at(x: Tensor) -> Tensor:
        # [1:] to remove batch-dimension for `reshape_to_batch_event`.
        return rejection.accept_reject_sample(
            proposal=self.posterior_estimator.sample,
            accept_reject_fn=lambda theta: within_support(self.prior, theta),
            num_samples=num_rejection_samples,
            show_progress_bars=show_progress_bars,
            sample_for_correction_factor=True,
            max_sampling_batch_size=rejection_sampling_batch_size,
            proposal_sampling_kwargs={
                "condition": reshape_to_batch_event(
                    x, event_shape=self.posterior_estimator.condition_shape
                )
            },
        )[1]

    # Check if the provided x matches the default x (short-circuit on identity).
    is_new_x = self.default_x is None or (
        x is not self.default_x and (x != self.default_x).any()
    )

    not_saved_at_default_x = self._leakage_density_correction_factor is None

    if is_new_x:  # Calculate at x; don't save.
        return acceptance_at(x)
    elif not_saved_at_default_x or force_update:  # Calculate at default_x; save.
        assert self.default_x is not None
        self._leakage_density_correction_factor = acceptance_at(self.default_x)

    return self._leakage_density_correction_factor  # type: ignore

`log_prob(theta, x=None, norm_posterior=True, track_gradients=False, leakage_correction_params=None)` ¶

Returns the log-probability of the posterior $p(\theta|x)$.

Parameters:

Name	Type	Description	Default
`theta`	`Tensor`	Parameters $\theta$.	required
`norm_posterior`	`bool`	Whether to enforce a normalized posterior density. Renormalization of the posterior is useful when some probability falls out or leaks out of the prescribed prior support. The normalizing factor is calculated via rejection sampling, so if you need speedier but unnormalized log posterior estimates set here `norm_posterior=False`. The returned log posterior is set to -∞ outside of the prior support regardless of this setting.	`True`
`track_gradients`	`bool`	Whether the returned tensor supports tracking gradients. This can be helpful for e.g. sensitivity analysis, but increases memory consumption.	`False`
`leakage_correction_params`	`Optional[dict]`	A `dict` of keyword arguments to override the default values of `leakage_correction()`. Possible options are: `num_rejection_samples`, `force_update`, `show_progress_bars`, and `rejection_sampling_batch_size`. These parameters only have an effect if `norm_posterior=True`.	`None`

Returns:

Type	Description
`Tensor`	`(len(θ),)`-shaped log posterior probability $\log p(\theta\|x)$ for θ in the
`Tensor`	support of the prior, -∞ (corresponding to 0 probability) outside.

Source code in sbi/inference/posteriors/direct_posterior.py

def log_prob(
    self,
    theta: Tensor,
    x: Optional[Tensor] = None,
    norm_posterior: bool = True,
    track_gradients: bool = False,
    leakage_correction_params: Optional[dict] = None,
) -> Tensor:
    r"""Returns the log-probability of the posterior $p(\theta|x)$.

    Args:
        theta: Parameters $\theta$.
        norm_posterior: Whether to enforce a normalized posterior density.
            Renormalization of the posterior is useful when some
            probability falls out or leaks out of the prescribed prior support.
            The normalizing factor is calculated via rejection sampling, so if you
            need speedier but unnormalized log posterior estimates set here
            `norm_posterior=False`. The returned log posterior is set to
            -∞ outside of the prior support regardless of this setting.
        track_gradients: Whether the returned tensor supports tracking gradients.
            This can be helpful for e.g. sensitivity analysis, but increases memory
            consumption.
        leakage_correction_params: A `dict` of keyword arguments to override the
            default values of `leakage_correction()`. Possible options are:
            `num_rejection_samples`, `force_update`, `show_progress_bars`, and
            `rejection_sampling_batch_size`.
            These parameters only have an effect if `norm_posterior=True`.

    Returns:
        `(len(θ),)`-shaped log posterior probability $\log p(\theta|x)$ for θ in the
        support of the prior, -∞ (corresponding to 0 probability) outside.
    """
    x = self._x_else_default_x(x)

    theta = ensure_theta_batched(torch.as_tensor(theta))
    theta_density_estimator = reshape_to_sample_batch_event(
        theta, theta.shape[1:], leading_is_sample=True
    )
    x_density_estimator = reshape_to_batch_event(
        x, event_shape=self.posterior_estimator.condition_shape
    )
    if x_density_estimator.shape[0] > 1:
        raise ValueError(
            ".log_prob() supports only `batchsize == 1`. If you intend "
            "to evaluate given multiple observations, use `.log_prob_batched()`. "
            "If you intend to evaluate given i.i.d. observations, set up the "
            "posterior density estimator with an appropriate permutation "
            "invariant embedding net."
        )

    self.posterior_estimator.eval()

    with torch.set_grad_enabled(track_gradients):
        # Evaluate on device, move back to cpu for comparison with prior.
        unnorm_log_prob = self.posterior_estimator.log_prob(
            theta_density_estimator, condition=x_density_estimator
        )
        # `log_prob` supports only a single observation (i.e. `batchsize==1`).
        # We now remove this additional dimension.
        unnorm_log_prob = unnorm_log_prob.squeeze(dim=1)

        # Force probability to be zero outside prior support.
        in_prior_support = within_support(self.prior, theta)

        masked_log_prob = torch.where(
            in_prior_support,
            unnorm_log_prob,
            torch.tensor(float("-inf"), dtype=torch.float32, device=self._device),
        )

        if leakage_correction_params is None:
            leakage_correction_params = dict()  # use defaults
        log_factor = (
            log(self.leakage_correction(x=x, **leakage_correction_params))
            if norm_posterior
            else 0
        )

        return masked_log_prob - log_factor

`log_prob_batched(theta, x, norm_posterior=True, track_gradients=False, leakage_correction_params=None)` ¶

Given a batch of observations [x_1, …, x_B] and a batch of parameters [$ heta_1$,…, $ heta_B$] this function evalautes the log-probabilities of the posteriors $p( heta_1|x_1)$, …, $p( heta_B|x_B)$ in a batched (i.e. vectorized) manner.

Parameters:

Name	Type	Description	Default
`theta`	`Tensor`	Batch of parameters $ heta$ of shape `(sample_shape, batch_dim, theta_shape)`.	required
`x`	`Tensor`	Batch of observations $x$ of shape `(batch_dim, *condition_shape)`.	required
`norm_posterior`	`bool`	Whether to enforce a normalized posterior density. Renormalization of the posterior is useful when some probability falls out or leaks out of the prescribed prior support. The normalizing factor is calculated via rejection sampling, so if you need speedier but unnormalized log posterior estimates set here `norm_posterior=False`. The returned log posterior is set to -∞ outside of the prior support regardless of this setting.	`True`
`track_gradients`	`bool`	Whether the returned tensor supports tracking gradients. This can be helpful for e.g. sensitivity analysis, but increases memory consumption.	`False`
`leakage_correction_params`	`Optional[dict]`	A `dict` of keyword arguments to override the default values of `leakage_correction()`. Possible options are: `num_rejection_samples`, `force_update`, `show_progress_bars`, and `rejection_sampling_batch_size`. These parameters only have an effect if `norm_posterior=True`.	`None`

Returns:

Type	Description
`Tensor`	`(len(θ), B)`-shaped log posterior probability $\log p( heta\|x)$ for θ in the support of the prior, -∞ (corresponding to 0 probability) outside.

Source code in sbi/inference/posteriors/direct_posterior.py

def log_prob_batched(
    self,
    theta: Tensor,
    x: Tensor,
    norm_posterior: bool = True,
    track_gradients: bool = False,
    leakage_correction_params: Optional[dict] = None,
) -> Tensor:
    """Given a batch of observations [x_1, ..., x_B] and a batch of parameters \
        [$\theta_1$,..., $\theta_B$] this function evalautes the log-probabilities \
        of the posteriors $p(\theta_1|x_1)$, ..., $p(\theta_B|x_B)$ in a batched \
        (i.e. vectorized) manner.

    Args:
        theta: Batch of parameters $\theta$ of shape \
            `(*sample_shape, batch_dim, *theta_shape)`.
        x: Batch of observations $x$ of shape \
            `(batch_dim, *condition_shape)`.
        norm_posterior: Whether to enforce a normalized posterior density.
            Renormalization of the posterior is useful when some
            probability falls out or leaks out of the prescribed prior support.
            The normalizing factor is calculated via rejection sampling, so if you
            need speedier but unnormalized log posterior estimates set here
            `norm_posterior=False`. The returned log posterior is set to
            -∞ outside of the prior support regardless of this setting.
        track_gradients: Whether the returned tensor supports tracking gradients.
            This can be helpful for e.g. sensitivity analysis, but increases memory
            consumption.
        leakage_correction_params: A `dict` of keyword arguments to override the
            default values of `leakage_correction()`. Possible options are:
            `num_rejection_samples`, `force_update`, `show_progress_bars`, and
            `rejection_sampling_batch_size`.
            These parameters only have an effect if `norm_posterior=True`.

    Returns:
        `(len(θ), B)`-shaped log posterior probability $\\log p(\theta|x)$\\ for θ \
        in the support of the prior, -∞ (corresponding to 0 probability) outside.
    """

    theta = ensure_theta_batched(torch.as_tensor(theta))
    event_shape = self.posterior_estimator.input_shape
    theta_density_estimator = reshape_to_sample_batch_event(
        theta, event_shape, leading_is_sample=True
    )
    x_density_estimator = reshape_to_batch_event(
        x, event_shape=self.posterior_estimator.condition_shape
    )

    self.posterior_estimator.eval()

    with torch.set_grad_enabled(track_gradients):
        # Evaluate on device, move back to cpu for comparison with prior.
        unnorm_log_prob = self.posterior_estimator.log_prob(
            theta_density_estimator, condition=x_density_estimator
        )

        # Force probability to be zero outside prior support.
        in_prior_support = within_support(self.prior, theta)

        masked_log_prob = torch.where(
            in_prior_support,
            unnorm_log_prob,
            torch.tensor(float("-inf"), dtype=torch.float32, device=self._device),
        )

        if leakage_correction_params is None:
            leakage_correction_params = dict()  # use defaults
        log_factor = (
            log(self.leakage_correction(x=x, **leakage_correction_params))
            if norm_posterior
            else 0
        )

        return masked_log_prob - log_factor

`map(x=None, num_iter=1000, num_to_optimize=100, learning_rate=0.01, init_method='posterior', num_init_samples=1000, save_best_every=10, show_progress_bars=False, force_update=False)` ¶

Returns the maximum-a-posteriori estimate (MAP).

The method can be interrupted (Ctrl-C) when the user sees that the log-probability converges. The best estimate will be saved in self._map and can be accessed with self.map(). The MAP is obtained by running gradient ascent from a given number of starting positions (samples from the posterior with the highest log-probability). After the optimization is done, we select the parameter set that has the highest log-probability after the optimization.

Warning: The default values used by this function are not well-tested. They might require hand-tuning for the problem at hand.

For developers: if the prior is a BoxUniform, we carry out the optimization in unbounded space and transform the result back into bounded space.

Parameters:

Name	Type	Description	Default
`x`	`Optional[Tensor]`	Deprecated - use `.set_default_x()` prior to `.map()`.	`None`
`num_iter`	`int`	Number of optimization steps that the algorithm takes to find the MAP.	`1000`
`learning_rate`	`float`	Learning rate of the optimizer.	`0.01`
`init_method`	`Union[str, Tensor]`	How to select the starting parameters for the optimization. If it is a string, it can be either [`posterior`, `prior`], which samples the respective distribution `num_init_samples` times. If it is a tensor, the tensor will be used as init locations.	`'posterior'`
`num_init_samples`	`int`	Draw this number of samples from the posterior and evaluate the log-probability of all of them.	`1000`
`num_to_optimize`	`int`	From the drawn `num_init_samples`, use the `num_to_optimize` with highest log-probability as the initial points for the optimization.	`100`
`save_best_every`	`int`	The best log-probability is computed, saved in the `map`-attribute, and printed every `save_best_every`-th iteration. Computing the best log-probability creates a significant overhead (thus, the default is `10`.)	`10`
`show_progress_bars`	`bool`	Whether to show a progressbar during sampling from the posterior.	`False`
`force_update`	`bool`	Whether to re-calculate the MAP when x is unchanged and have a cached value.	`False`
`log_prob_kwargs`		Will be empty for SNLE and SNRE. Will contain {‘norm_posterior’: True} for SNPE.	required

Returns:

Type	Description
`Tensor`	The MAP estimate.

Source code in sbi/inference/posteriors/direct_posterior.py

def map(
    self,
    x: Optional[Tensor] = None,
    num_iter: int = 1_000,
    num_to_optimize: int = 100,
    learning_rate: float = 0.01,
    init_method: Union[str, Tensor] = "posterior",
    num_init_samples: int = 1_000,
    save_best_every: int = 10,
    show_progress_bars: bool = False,
    force_update: bool = False,
) -> Tensor:
    r"""Returns the maximum-a-posteriori estimate (MAP).

    The method can be interrupted (Ctrl-C) when the user sees that the
    log-probability converges. The best estimate will be saved in `self._map` and
    can be accessed with `self.map()`. The MAP is obtained by running gradient
    ascent from a given number of starting positions (samples from the posterior
    with the highest log-probability). After the optimization is done, we select the
    parameter set that has the highest log-probability after the optimization.

    Warning: The default values used by this function are not well-tested. They
    might require hand-tuning for the problem at hand.

    For developers: if the prior is a `BoxUniform`, we carry out the optimization
    in unbounded space and transform the result back into bounded space.

    Args:
        x: Deprecated - use `.set_default_x()` prior to `.map()`.
        num_iter: Number of optimization steps that the algorithm takes
            to find the MAP.
        learning_rate: Learning rate of the optimizer.
        init_method: How to select the starting parameters for the optimization. If
            it is a string, it can be either [`posterior`, `prior`], which samples
            the respective distribution `num_init_samples` times. If it is a
            tensor, the tensor will be used as init locations.
        num_init_samples: Draw this number of samples from the posterior and
            evaluate the log-probability of all of them.
        num_to_optimize: From the drawn `num_init_samples`, use the
            `num_to_optimize` with highest log-probability as the initial points
            for the optimization.
        save_best_every: The best log-probability is computed, saved in the
            `map`-attribute, and printed every `save_best_every`-th iteration.
            Computing the best log-probability creates a significant overhead
            (thus, the default is `10`.)
        show_progress_bars: Whether to show a progressbar during sampling from the
            posterior.
        force_update: Whether to re-calculate the MAP when x is unchanged and
            have a cached value.
        log_prob_kwargs: Will be empty for SNLE and SNRE. Will contain
            {'norm_posterior': True} for SNPE.

    Returns:
        The MAP estimate.
    """
    return super().map(
        x=x,
        num_iter=num_iter,
        num_to_optimize=num_to_optimize,
        learning_rate=learning_rate,
        init_method=init_method,
        num_init_samples=num_init_samples,
        save_best_every=save_best_every,
        show_progress_bars=show_progress_bars,
        force_update=force_update,
    )

`sample(sample_shape=torch.Size(), x=None, max_sampling_batch_size=10000, sample_with=None, show_progress_bars=True)` ¶

Return samples from posterior distribution $p(\theta|x)$.

Parameters:

Name	Type	Description	Default
`sample_shape`	`Shape`	Desired shape of samples that are drawn from posterior. If sample_shape is multidimensional we simply draw `sample_shape.numel()` samples and then reshape into the desired shape.	`Size()`
`sample_with`	`Optional[str]`	This argument only exists to keep backward-compatibility with `sbi` v0.17.2 or older. If it is set, we instantly raise an error.	`None`
`show_progress_bars`	`bool`	Whether to show sampling progress monitor.	`True`

Source code in sbi/inference/posteriors/direct_posterior.py

def sample(
    self,
    sample_shape: Shape = torch.Size(),
    x: Optional[Tensor] = None,
    max_sampling_batch_size: int = 10_000,
    sample_with: Optional[str] = None,
    show_progress_bars: bool = True,
) -> Tensor:
    r"""Return samples from posterior distribution $p(\theta|x)$.

    Args:
        sample_shape: Desired shape of samples that are drawn from posterior. If
            sample_shape is multidimensional we simply draw `sample_shape.numel()`
            samples and then reshape into the desired shape.
        sample_with: This argument only exists to keep backward-compatibility with
            `sbi` v0.17.2 or older. If it is set, we instantly raise an error.
        show_progress_bars: Whether to show sampling progress monitor.
    """
    num_samples = torch.Size(sample_shape).numel()
    x = self._x_else_default_x(x)
    x = reshape_to_batch_event(
        x, event_shape=self.posterior_estimator.condition_shape
    )
    if x.shape[0] > 1:
        raise ValueError(
            ".sample() supports only `batchsize == 1`. If you intend "
            "to sample multiple observations, use `.sample_batched()`. "
            "If you intend to sample i.i.d. observations, set up the "
            "posterior density estimator with an appropriate permutation "
            "invariant embedding net."
        )

    max_sampling_batch_size = (
        self.max_sampling_batch_size
        if max_sampling_batch_size is None
        else max_sampling_batch_size
    )

    if sample_with is not None:
        raise ValueError(
            f"You set `sample_with={sample_with}`. As of sbi v0.18.0, setting "
            f"`sample_with` is no longer supported. You have to rerun "
            f"`.build_posterior(sample_with={sample_with}).`"
        )

    samples = rejection.accept_reject_sample(
        proposal=self.posterior_estimator.sample,
        accept_reject_fn=lambda theta: within_support(self.prior, theta),
        num_samples=num_samples,
        show_progress_bars=show_progress_bars,
        max_sampling_batch_size=max_sampling_batch_size,
        proposal_sampling_kwargs={"condition": x},
        alternative_method="build_posterior(..., sample_with='mcmc')",
    )[0]  # [0] to return only samples, not acceptance probabilities.

    return samples[:, 0]  # Remove batch dimension.

`sample_batched(sample_shape, x, max_sampling_batch_size=10000, show_progress_bars=True)` ¶

Given a batch of observations [x_1, …, x_B] this function samples from posteriors $p(\theta|x_1)$, … ,$p(\theta|x_B)$, in a batched (i.e. vectorized) manner.

Parameters:

Name	Type	Description	Default
`sample_shape`	`Shape`	Desired shape of samples that are drawn from the posterior given every observation.	required
`x`	`Tensor`	A batch of observations, of shape `(batch_dim, event_shape_x)`. `batch_dim` corresponds to the number of observations to be drawn.	required
`max_sampling_batch_size`	`int`	Maximum batch size for rejection sampling.	`10000`
`show_progress_bars`	`bool`	Whether to show sampling progress monitor.	`True`

Returns:

Type	Description
`Tensor`	Samples from the posteriors of shape (sample_shape, B, input_shape)

Source code in sbi/inference/posteriors/direct_posterior.py

def sample_batched(
    self,
    sample_shape: Shape,
    x: Tensor,
    max_sampling_batch_size: int = 10_000,
    show_progress_bars: bool = True,
) -> Tensor:
    r"""Given a batch of observations [x_1, ..., x_B] this function samples from
    posteriors $p(\theta|x_1)$, ... ,$p(\theta|x_B)$, in a batched (i.e. vectorized)
    manner.

    Args:
        sample_shape: Desired shape of samples that are drawn from the posterior
            given every observation.
        x: A batch of observations, of shape `(batch_dim, event_shape_x)`.
            `batch_dim` corresponds to the number of observations to be drawn.
        max_sampling_batch_size: Maximum batch size for rejection sampling.
        show_progress_bars: Whether to show sampling progress monitor.

    Returns:
        Samples from the posteriors of shape (*sample_shape, B, *input_shape)
    """
    num_samples = torch.Size(sample_shape).numel()
    condition_shape = self.posterior_estimator.condition_shape
    x = reshape_to_batch_event(x, event_shape=condition_shape)
    num_xos = x.shape[0]

    # throw warning if num_x * num_samples is too large
    if num_xos * num_samples > 2**21:  # 2 million-ish
        warnings.warn(
            f"Note that for batched sampling, the direct posterior sampling "
            f"generates {num_xos} * {num_samples} = {num_xos * num_samples} "
            "samples. This can be slow and memory-intensive. Consider "
            "reducing the number of samples or batch size.",
            stacklevel=2,
        )

    max_sampling_batch_size = (
        self.max_sampling_batch_size
        if max_sampling_batch_size is None
        else max_sampling_batch_size
    )

    # Adjust max_sampling_batch_size to avoid excessive memory usage
    if max_sampling_batch_size * num_xos > 100_000:
        capped = max(1, 100_000 // num_xos)
        warnings.warn(
            f"Capping max_sampling_batch_size from {max_sampling_batch_size} "
            f"to {capped} to avoid excessive memory usage.",
            stacklevel=2,
        )
        max_sampling_batch_size = capped

    samples = rejection.accept_reject_sample(
        proposal=self.posterior_estimator.sample,
        accept_reject_fn=lambda theta: within_support(self.prior, theta),
        num_samples=num_samples,
        show_progress_bars=show_progress_bars,
        max_sampling_batch_size=max_sampling_batch_size,
        proposal_sampling_kwargs={"condition": x},
        alternative_method="build_posterior(..., sample_with='mcmc')",
    )[0]

    return samples

`to(device)` ¶

Move posterior_estimator, prior and x_o to device.

Changes the device attribute, reinstanciates the posterior, and resets the default x.

Parameters:

Name	Type	Description	Default
`device`	`Union[str, device]`	device where to move the posterior to.	required

Source code in sbi/inference/posteriors/direct_posterior.py

def to(self, device: Union[str, torch.device]) -> None:
    """Move posterior_estimator, prior and x_o to device.

    Changes the device attribute, reinstanciates the
    posterior, and resets the default x.

    Args:
        device: device where to move the posterior to.
    """
    self.device = device
    if hasattr(self.prior, "to"):
        self.prior.to(device)  # type: ignore
    else:
        raise ValueError("""Prior has no attribute to(device).""")
    if hasattr(self.posterior_estimator, "to"):
        self.posterior_estimator.to(device)
    else:
        raise ValueError("""Posterior estimator has no attribute to(device).""")

    potential_fn, theta_transform = posterior_estimator_based_potential(
        self.posterior_estimator,
        self.prior,
        x_o=None,
        enable_transform=self.enable_transform,
    )
    x_o = None
    if hasattr(self, "_x") and (self._x is not None):
        x_o = self._x.to(device)

    super().__init__(
        potential_fn=potential_fn,
        theta_transform=theta_transform,
        device=device,
        x_shape=self.x_shape,
    )
    # super().__init__ erases the self._x, so we need to set it again
    if x_o is not None:
        self.set_default_x(x_o)

`ImportanceSamplingPosterior` ¶

Bases: NeuralPosterior

Provides importance sampling to sample from the posterior.

SNLE or SNRE train neural networks to approximate the likelihood(-ratios). ImportanceSamplingPosterior allows to estimate the posterior log-probability by estimating the normlalization constant with importance sampling. It also allows to perform importance sampling (with .sample()) and to draw approximate samples with sampling-importance-resampling (SIR) (with .sir_sample())

Source code in sbi/inference/posteriors/importance_posterior.py

class ImportanceSamplingPosterior(NeuralPosterior):
    r"""Provides importance sampling to sample from the posterior.

    SNLE or SNRE train neural networks to approximate the likelihood(-ratios).
    `ImportanceSamplingPosterior` allows to estimate the posterior log-probability by
    estimating the normlalization constant with importance sampling. It also allows to
    perform importance sampling (with `.sample()`) and to draw approximate samples with
    sampling-importance-resampling (SIR) (with `.sir_sample()`)
    """

    def __init__(
        self,
        potential_fn: Union[Callable, BasePotential],
        proposal: Any,
        theta_transform: Optional[TorchTransform] = None,
        method: Literal["sir", "importance"] = "sir",
        oversampling_factor: int = 32,
        max_sampling_batch_size: int = 10_000,
        device: Optional[Union[str, torch.device]] = None,
        x_shape: Optional[torch.Size] = None,
    ):
        """
        Args:
            potential_fn: The potential function from which to draw samples. Must be a
                `BasePotential` or a `Callable` which takes `theta` and `x_o` as inputs.
            proposal: The proposal distribution.
            theta_transform: Transformation that is applied to parameters. Is not used
                during but only when calling `.map()`.
            method: Either of [`sir`|`importance`]. This sets the behavior of the
                `.sample()` method. With `sir`, approximate posterior samples are
                generated with sampling importance resampling (SIR). With
                `importance`, the `.sample()` method returns a tuple of samples and
                corresponding importance weights.
            oversampling_factor: Number of proposed samples from which only one is
                selected based on its importance weight.
            max_sampling_batch_size: The batch size of samples being drawn from the
                proposal at every iteration.
            device: Device on which to sample, e.g., "cpu", "cuda" or "cuda:0". If
                None, `potential_fn.device` is used.
            x_shape: Deprecated, should not be passed.
        """
        super().__init__(
            potential_fn,
            theta_transform=theta_transform,
            device=device,
            x_shape=x_shape,
        )

        self.proposal = proposal
        self._normalization_constant = None
        self.method = method
        self.theta_transform = theta_transform

        self.oversampling_factor = oversampling_factor
        self.max_sampling_batch_size = max_sampling_batch_size

        self._purpose = (
            "It provides sampling-importance resampling (SIR) to .sample() from the "
            "posterior and can evaluate the _unnormalized_ posterior density with "
            ".log_prob()."
        )
        self.x_shape = x_shape

    def to(self, device: Union[str, torch.device]) -> None:
        """
        Move the potential, the proposal and x_o to a new device.

        It also reinstantiates the posterior with the new device.

        Args:
            device: Device on which to move the posterior to.
        """
        self.device = device
        self.potential_fn.to(device)  # type: ignore
        self.proposal.to(device)
        x_o = None
        if hasattr(self, "_x") and (self._x is not None):
            x_o = self._x.to(device)

        self.theta_transform = mcmc_transform(self.proposal, device=device)
        super().__init__(
            self.potential_fn,
            theta_transform=self.theta_transform,
            device=device,
            x_shape=self.x_shape,
        )
        # super().__init__ erases the self._x, so we need to set it again
        if x_o is not None:
            self.set_default_x(x_o)

    def log_prob(
        self,
        theta: Tensor,
        x: Optional[Tensor] = None,
        track_gradients: bool = False,
        normalization_constant_params: Optional[dict] = None,
    ) -> Tensor:
        r"""Returns the log-probability of theta under the posterior.

        The normalization constant is estimated with importance sampling.

        Args:
            theta: Parameters $\theta$.
            track_gradients: Whether the returned tensor supports tracking gradients.
                This can be helpful for e.g. sensitivity analysis, but increases memory
                consumption.
            normalization_constant_params: Parameters passed on to
                `estimate_normalization_constant()`.

        Returns:
            `len($\theta$)`-shaped log-probability.
        """
        x = self._x_else_default_x(x)
        self.potential_fn.set_x(x)

        theta = ensure_theta_batched(torch.as_tensor(theta))

        with torch.set_grad_enabled(track_gradients):
            potential_values = self.potential_fn(
                theta.to(self._device), track_gradients=track_gradients
            )

            if normalization_constant_params is None:
                normalization_constant_params = dict()  # use defaults
            normalization_constant = self.estimate_normalization_constant(
                x, **normalization_constant_params
            )

            return (potential_values - torch.log(normalization_constant)).to(
                self._device
            )

    @torch.no_grad()
    def estimate_normalization_constant(
        self, x: Tensor, num_samples: int = 10_000, force_update: bool = False
    ) -> Tensor:
        """Returns the normalization constant via importance sampling.

        Args:
            num_samples: Number of importance samples used for the estimate.
            force_update: Whether to re-calculate the normlization constant when x is
                unchanged and have a cached value.
        """
        # Check if the provided x matches the default x (short-circuit on identity).
        is_new_x = self.default_x is None or (
            x is not self.default_x and (x != self.default_x).any()
        )

        not_saved_at_default_x = self._normalization_constant is None

        if is_new_x:  # Calculate at x; don't save.
            _, log_importance_weights = importance_sample(
                self.potential_fn,
                proposal=self.proposal,
                num_samples=num_samples,
            )
            return torch.mean(torch.exp(log_importance_weights))
        elif not_saved_at_default_x or force_update:  # Calculate at default_x; save.
            assert self.default_x is not None
            _, log_importance_weights = importance_sample(
                self.potential_fn,
                proposal=self.proposal,
                num_samples=num_samples,
            )
            self._normalization_constant = torch.mean(torch.exp(log_importance_weights))

        return self._normalization_constant.to(self._device)  # type: ignore

    def sample(
        self,
        sample_shape: Shape = torch.Size(),
        x: Optional[Tensor] = None,
        method: Optional[str] = None,
        oversampling_factor: int = 32,
        max_sampling_batch_size: int = 10_000,
        sample_with: Optional[str] = None,
        show_progress_bars: bool = False,
    ) -> Union[Tensor, Tuple[Tensor, Tensor]]:
        """Return samples from the approximate posterior distribution.

        Args:
            sample_shape: Shape of samples that are drawn from posterior.
            x: Observed data.
            method: Either of [`sir`|`importance`]. This sets the behavior of the
                `.sample()` method. With `sir`, approximate posterior samples are
                generated with sampling importance resampling (SIR). With
                `importance`, the `.sample()` method returns a tuple of samples and
                corresponding importance weights.
            oversampling_factor: Number of proposed samples from which only one is
                selected based on its importance weight.
            max_sampling_batch_size: The batch size of samples being drawn from the
                proposal at every iteration.
            show_progress_bars: Whether to show a progressbar during sampling.
        """

        method = self.method if method is None else method

        if sample_with is not None:
            raise ValueError(
                f"You set `sample_with={sample_with}`. As of sbi v0.18.0, setting "
                f"`sample_with` is no longer supported. You have to rerun "
                f"`.build_posterior(sample_with={sample_with}).`"
            )

        self.potential_fn.set_x(self._x_else_default_x(x))

        if method == "sir":
            return self._sir_sample(
                sample_shape,
                oversampling_factor=oversampling_factor,
                max_sampling_batch_size=max_sampling_batch_size,
                show_progress_bars=show_progress_bars,
            )
        elif method == "importance":
            return self._importance_sample(sample_shape)
        else:
            raise NameError

    def sample_batched(
        self,
        sample_shape: Shape,
        x: Tensor,
        max_sampling_batch_size: int = 10000,
        show_progress_bars: bool = True,
    ) -> Tensor:
        raise NotImplementedError(
            "Batched sampling is not implemented for ImportanceSamplingPosterior. \
           Alternatively you can use `sample` in a loop \
           [posterior.sample(theta, x_o) for x_o in x]."
        )

    def _importance_sample(
        self,
        sample_shape: Shape = torch.Size(),
        show_progress_bars: bool = False,
    ) -> Tuple[Tensor, Tensor]:
        """Returns samples from the proposal and log of their importance weights.

        Args:
            sample_shape: Desired shape of samples that are drawn from posterior.
            sample_with: This argument only exists to keep backward-compatibility with
                `sbi` v0.17.2 or older. If it is set, we instantly raise an error.
            show_progress_bars: Whether to show sampling progress monitor.

        Returns:
            Samples and logarithm of corresponding importance weights.
        """
        num_samples = torch.Size(sample_shape).numel()
        samples, log_importance_weights = importance_sample(
            self.potential_fn,
            proposal=self.proposal,
            num_samples=num_samples,
            show_progress_bars=show_progress_bars,
        )

        samples = samples.reshape((*sample_shape, -1)).to(self._device)
        return samples, log_importance_weights.to(self._device)

    def _sir_sample(
        self,
        sample_shape: Shape = torch.Size(),
        oversampling_factor: int = 32,
        max_sampling_batch_size: int = 10_000,
        show_progress_bars: bool = False,
    ):
        r"""Returns approximate samples from posterior $p(\theta|x)$ via SIR.

        Args:
            sample_shape: Desired shape of samples that are drawn from posterior. If
                sample_shape is multidimensional we simply draw `sample_shape.numel()`
                samples and then reshape into the desired shape.
            x: Observed data.
            sample_with: This argument only exists to keep backward-compatibility with
                `sbi` v0.17.2 or older. If it is set, we instantly raise an error.
            oversampling_factor: Number of proposed samples form which only one is
                selected based on its importance weight.
            max_sampling_batch_size: The batchsize of samples being drawn from
                the proposal at every iteration. Used only in `sir_sample()`.
            show_progress_bars: Whether to show sampling progress monitor.

        Returns:
            Samples from posterior.
        """
        # Replace arguments that were not passed with their default.
        oversampling_factor = (
            self.oversampling_factor
            if oversampling_factor is None
            else oversampling_factor
        )
        max_sampling_batch_size = (
            self.max_sampling_batch_size
            if max_sampling_batch_size is None
            else max_sampling_batch_size
        )

        num_samples = torch.Size(sample_shape).numel()
        samples = sampling_importance_resampling(
            self.potential_fn,
            proposal=self.proposal,
            num_samples=num_samples,
            num_candidate_samples=oversampling_factor,
            show_progress_bars=show_progress_bars,
            max_sampling_batch_size=max_sampling_batch_size,
            device=self._device,
        )

        return samples.reshape((*sample_shape, -1)).to(self._device)

    def map(
        self,
        x: Optional[Tensor] = None,
        num_iter: int = 1_000,
        num_to_optimize: int = 100,
        learning_rate: float = 0.01,
        init_method: Union[str, Tensor] = "proposal",
        num_init_samples: int = 1_000,
        save_best_every: int = 10,
        show_progress_bars: bool = False,
        force_update: bool = False,
    ) -> Tensor:
        r"""Returns the maximum-a-posteriori estimate (MAP).

        The method can be interrupted (Ctrl-C) when the user sees that the
        log-probability converges. The best estimate will be saved in `self._map` and
        can be accessed with `self.map()`. The MAP is obtained by running gradient
        ascent from a given number of starting positions (samples from the posterior
        with the highest log-probability). After the optimization is done, we select the
        parameter set that has the highest log-probability after the optimization.

        Warning: The default values used by this function are not well-tested. They
        might require hand-tuning for the problem at hand.

        For developers: if the prior is a `BoxUniform`, we carry out the optimization
        in unbounded space and transform the result back into bounded space.

        Args:
            x: Deprecated - use `.set_default_x()` prior to `.map()`.
            num_iter: Number of optimization steps that the algorithm takes
                to find the MAP.
            learning_rate: Learning rate of the optimizer.
            init_method: How to select the starting parameters for the optimization. If
                it is a string, it can be either [`posterior`, `prior`], which samples
                the respective distribution `num_init_samples` times. If it is a
                tensor, the tensor will be used as init locations.
            num_init_samples: Draw this number of samples from the posterior and
                evaluate the log-probability of all of them.
            num_to_optimize: From the drawn `num_init_samples`, use the
                `num_to_optimize` with highest log-probability as the initial points
                for the optimization.
            save_best_every: The best log-probability is computed, saved in the
                `map`-attribute, and printed every `save_best_every`-th iteration.
                Computing the best log-probability creates a significant overhead
                (thus, the default is `10`.)
            show_progress_bars: Whether to show a progressbar during sampling from the
                posterior.
            force_update: Whether to re-calculate the MAP when x is unchanged and
                have a cached value.
            log_prob_kwargs: Will be empty for SNLE and SNRE. Will contain
                {'norm_posterior': True} for SNPE.

        Returns:
            The MAP estimate.
        """
        return super().map(
            x=x,
            num_iter=num_iter,
            num_to_optimize=num_to_optimize,
            learning_rate=learning_rate,
            init_method=init_method,
            num_init_samples=num_init_samples,
            save_best_every=save_best_every,
            show_progress_bars=show_progress_bars,
            force_update=force_update,
        )

`init(potential_fn, proposal, theta_transform=None, method='sir', oversampling_factor=32, max_sampling_batch_size=10000, device=None, x_shape=None)` ¶

Parameters:

Name	Type	Description	Default
`potential_fn`	`Union[Callable, BasePotential]`	The potential function from which to draw samples. Must be a `BasePotential` or a `Callable` which takes `theta` and `x_o` as inputs.	required
`proposal`	`Any`	The proposal distribution.	required
`theta_transform`	`Optional[TorchTransform]`	Transformation that is applied to parameters. Is not used during but only when calling `.map()`.	`None`
`method`	`Literal['sir', 'importance']`	Either of [`sir`\|`importance`]. This sets the behavior of the `.sample()` method. With `sir`, approximate posterior samples are generated with sampling importance resampling (SIR). With `importance`, the `.sample()` method returns a tuple of samples and corresponding importance weights.	`'sir'`
`oversampling_factor`	`int`	Number of proposed samples from which only one is selected based on its importance weight.	`32`
`max_sampling_batch_size`	`int`	The batch size of samples being drawn from the proposal at every iteration.	`10000`
`device`	`Optional[Union[str, device]]`	Device on which to sample, e.g., “cpu”, “cuda” or “cuda:0”. If None, `potential_fn.device` is used.	`None`
`x_shape`	`Optional[Size]`	Deprecated, should not be passed.	`None`

Source code in sbi/inference/posteriors/importance_posterior.py

def __init__(
    self,
    potential_fn: Union[Callable, BasePotential],
    proposal: Any,
    theta_transform: Optional[TorchTransform] = None,
    method: Literal["sir", "importance"] = "sir",
    oversampling_factor: int = 32,
    max_sampling_batch_size: int = 10_000,
    device: Optional[Union[str, torch.device]] = None,
    x_shape: Optional[torch.Size] = None,
):
    """
    Args:
        potential_fn: The potential function from which to draw samples. Must be a
            `BasePotential` or a `Callable` which takes `theta` and `x_o` as inputs.
        proposal: The proposal distribution.
        theta_transform: Transformation that is applied to parameters. Is not used
            during but only when calling `.map()`.
        method: Either of [`sir`|`importance`]. This sets the behavior of the
            `.sample()` method. With `sir`, approximate posterior samples are
            generated with sampling importance resampling (SIR). With
            `importance`, the `.sample()` method returns a tuple of samples and
            corresponding importance weights.
        oversampling_factor: Number of proposed samples from which only one is
            selected based on its importance weight.
        max_sampling_batch_size: The batch size of samples being drawn from the
            proposal at every iteration.
        device: Device on which to sample, e.g., "cpu", "cuda" or "cuda:0". If
            None, `potential_fn.device` is used.
        x_shape: Deprecated, should not be passed.
    """
    super().__init__(
        potential_fn,
        theta_transform=theta_transform,
        device=device,
        x_shape=x_shape,
    )

    self.proposal = proposal
    self._normalization_constant = None
    self.method = method
    self.theta_transform = theta_transform

    self.oversampling_factor = oversampling_factor
    self.max_sampling_batch_size = max_sampling_batch_size

    self._purpose = (
        "It provides sampling-importance resampling (SIR) to .sample() from the "
        "posterior and can evaluate the _unnormalized_ posterior density with "
        ".log_prob()."
    )
    self.x_shape = x_shape

`estimate_normalization_constant(x, num_samples=10000, force_update=False)` ¶

Returns the normalization constant via importance sampling.

Parameters:

Name	Type	Description	Default
`num_samples`	`int`	Number of importance samples used for the estimate.	`10000`
`force_update`	`bool`	Whether to re-calculate the normlization constant when x is unchanged and have a cached value.	`False`

Source code in sbi/inference/posteriors/importance_posterior.py

@torch.no_grad()
def estimate_normalization_constant(
    self, x: Tensor, num_samples: int = 10_000, force_update: bool = False
) -> Tensor:
    """Returns the normalization constant via importance sampling.

    Args:
        num_samples: Number of importance samples used for the estimate.
        force_update: Whether to re-calculate the normlization constant when x is
            unchanged and have a cached value.
    """
    # Check if the provided x matches the default x (short-circuit on identity).
    is_new_x = self.default_x is None or (
        x is not self.default_x and (x != self.default_x).any()
    )

    not_saved_at_default_x = self._normalization_constant is None

    if is_new_x:  # Calculate at x; don't save.
        _, log_importance_weights = importance_sample(
            self.potential_fn,
            proposal=self.proposal,
            num_samples=num_samples,
        )
        return torch.mean(torch.exp(log_importance_weights))
    elif not_saved_at_default_x or force_update:  # Calculate at default_x; save.
        assert self.default_x is not None
        _, log_importance_weights = importance_sample(
            self.potential_fn,
            proposal=self.proposal,
            num_samples=num_samples,
        )
        self._normalization_constant = torch.mean(torch.exp(log_importance_weights))

    return self._normalization_constant.to(self._device)  # type: ignore

`log_prob(theta, x=None, track_gradients=False, normalization_constant_params=None)` ¶

Returns the log-probability of theta under the posterior.

The normalization constant is estimated with importance sampling.

Parameters:

Name	Type	Description	Default
`theta`	`Tensor`	Parameters $\theta$.	required
`track_gradients`	`bool`	Whether the returned tensor supports tracking gradients. This can be helpful for e.g. sensitivity analysis, but increases memory consumption.	`False`
`normalization_constant_params`	`Optional[dict]`	Parameters passed on to `estimate_normalization_constant()`.	`None`

Returns:

Type	Description
`Tensor`	`len($\theta$)`-shaped log-probability.

Source code in sbi/inference/posteriors/importance_posterior.py

def log_prob(
    self,
    theta: Tensor,
    x: Optional[Tensor] = None,
    track_gradients: bool = False,
    normalization_constant_params: Optional[dict] = None,
) -> Tensor:
    r"""Returns the log-probability of theta under the posterior.

    The normalization constant is estimated with importance sampling.

    Args:
        theta: Parameters $\theta$.
        track_gradients: Whether the returned tensor supports tracking gradients.
            This can be helpful for e.g. sensitivity analysis, but increases memory
            consumption.
        normalization_constant_params: Parameters passed on to
            `estimate_normalization_constant()`.

    Returns:
        `len($\theta$)`-shaped log-probability.
    """
    x = self._x_else_default_x(x)
    self.potential_fn.set_x(x)

    theta = ensure_theta_batched(torch.as_tensor(theta))

    with torch.set_grad_enabled(track_gradients):
        potential_values = self.potential_fn(
            theta.to(self._device), track_gradients=track_gradients
        )

        if normalization_constant_params is None:
            normalization_constant_params = dict()  # use defaults
        normalization_constant = self.estimate_normalization_constant(
            x, **normalization_constant_params
        )

        return (potential_values - torch.log(normalization_constant)).to(
            self._device
        )

`map(x=None, num_iter=1000, num_to_optimize=100, learning_rate=0.01, init_method='proposal', num_init_samples=1000, save_best_every=10, show_progress_bars=False, force_update=False)` ¶

Returns the maximum-a-posteriori estimate (MAP).

The method can be interrupted (Ctrl-C) when the user sees that the log-probability converges. The best estimate will be saved in self._map and can be accessed with self.map(). The MAP is obtained by running gradient ascent from a given number of starting positions (samples from the posterior with the highest log-probability). After the optimization is done, we select the parameter set that has the highest log-probability after the optimization.

Warning: The default values used by this function are not well-tested. They might require hand-tuning for the problem at hand.

For developers: if the prior is a BoxUniform, we carry out the optimization in unbounded space and transform the result back into bounded space.

Parameters:

Name	Type	Description	Default
`x`	`Optional[Tensor]`	Deprecated - use `.set_default_x()` prior to `.map()`.	`None`
`num_iter`	`int`	Number of optimization steps that the algorithm takes to find the MAP.	`1000`
`learning_rate`	`float`	Learning rate of the optimizer.	`0.01`
`init_method`	`Union[str, Tensor]`	How to select the starting parameters for the optimization. If it is a string, it can be either [`posterior`, `prior`], which samples the respective distribution `num_init_samples` times. If it is a tensor, the tensor will be used as init locations.	`'proposal'`
`num_init_samples`	`int`	Draw this number of samples from the posterior and evaluate the log-probability of all of them.	`1000`
`num_to_optimize`	`int`	From the drawn `num_init_samples`, use the `num_to_optimize` with highest log-probability as the initial points for the optimization.	`100`
`save_best_every`	`int`	The best log-probability is computed, saved in the `map`-attribute, and printed every `save_best_every`-th iteration. Computing the best log-probability creates a significant overhead (thus, the default is `10`.)	`10`
`show_progress_bars`	`bool`	Whether to show a progressbar during sampling from the posterior.	`False`
`force_update`	`bool`	Whether to re-calculate the MAP when x is unchanged and have a cached value.	`False`
`log_prob_kwargs`		Will be empty for SNLE and SNRE. Will contain {‘norm_posterior’: True} for SNPE.	required

Returns:

Type	Description
`Tensor`	The MAP estimate.

Source code in sbi/inference/posteriors/importance_posterior.py

def map(
    self,
    x: Optional[Tensor] = None,
    num_iter: int = 1_000,
    num_to_optimize: int = 100,
    learning_rate: float = 0.01,
    init_method: Union[str, Tensor] = "proposal",
    num_init_samples: int = 1_000,
    save_best_every: int = 10,
    show_progress_bars: bool = False,
    force_update: bool = False,
) -> Tensor:
    r"""Returns the maximum-a-posteriori estimate (MAP).

    The method can be interrupted (Ctrl-C) when the user sees that the
    log-probability converges. The best estimate will be saved in `self._map` and
    can be accessed with `self.map()`. The MAP is obtained by running gradient
    ascent from a given number of starting positions (samples from the posterior
    with the highest log-probability). After the optimization is done, we select the
    parameter set that has the highest log-probability after the optimization.

    Warning: The default values used by this function are not well-tested. They
    might require hand-tuning for the problem at hand.

    For developers: if the prior is a `BoxUniform`, we carry out the optimization
    in unbounded space and transform the result back into bounded space.

    Args:
        x: Deprecated - use `.set_default_x()` prior to `.map()`.
        num_iter: Number of optimization steps that the algorithm takes
            to find the MAP.
        learning_rate: Learning rate of the optimizer.
        init_method: How to select the starting parameters for the optimization. If
            it is a string, it can be either [`posterior`, `prior`], which samples
            the respective distribution `num_init_samples` times. If it is a
            tensor, the tensor will be used as init locations.
        num_init_samples: Draw this number of samples from the posterior and
            evaluate the log-probability of all of them.
        num_to_optimize: From the drawn `num_init_samples`, use the
            `num_to_optimize` with highest log-probability as the initial points
            for the optimization.
        save_best_every: The best log-probability is computed, saved in the
            `map`-attribute, and printed every `save_best_every`-th iteration.
            Computing the best log-probability creates a significant overhead
            (thus, the default is `10`.)
        show_progress_bars: Whether to show a progressbar during sampling from the
            posterior.
        force_update: Whether to re-calculate the MAP when x is unchanged and
            have a cached value.
        log_prob_kwargs: Will be empty for SNLE and SNRE. Will contain
            {'norm_posterior': True} for SNPE.

    Returns:
        The MAP estimate.
    """
    return super().map(
        x=x,
        num_iter=num_iter,
        num_to_optimize=num_to_optimize,
        learning_rate=learning_rate,
        init_method=init_method,
        num_init_samples=num_init_samples,
        save_best_every=save_best_every,
        show_progress_bars=show_progress_bars,
        force_update=force_update,
    )

`sample(sample_shape=torch.Size(), x=None, method=None, oversampling_factor=32, max_sampling_batch_size=10000, sample_with=None, show_progress_bars=False)` ¶

Return samples from the approximate posterior distribution.

Parameters:

Name	Type	Description	Default
`sample_shape`	`Shape`	Shape of samples that are drawn from posterior.	`Size()`
`x`	`Optional[Tensor]`	Observed data.	`None`
`method`	`Optional[str]`	Either of [`sir`\|`importance`]. This sets the behavior of the `.sample()` method. With `sir`, approximate posterior samples are generated with sampling importance resampling (SIR). With `importance`, the `.sample()` method returns a tuple of samples and corresponding importance weights.	`None`
`oversampling_factor`	`int`	Number of proposed samples from which only one is selected based on its importance weight.	`32`
`max_sampling_batch_size`	`int`	The batch size of samples being drawn from the proposal at every iteration.	`10000`
`show_progress_bars`	`bool`	Whether to show a progressbar during sampling.	`False`

Source code in sbi/inference/posteriors/importance_posterior.py

def sample(
    self,
    sample_shape: Shape = torch.Size(),
    x: Optional[Tensor] = None,
    method: Optional[str] = None,
    oversampling_factor: int = 32,
    max_sampling_batch_size: int = 10_000,
    sample_with: Optional[str] = None,
    show_progress_bars: bool = False,
) -> Union[Tensor, Tuple[Tensor, Tensor]]:
    """Return samples from the approximate posterior distribution.

    Args:
        sample_shape: Shape of samples that are drawn from posterior.
        x: Observed data.
        method: Either of [`sir`|`importance`]. This sets the behavior of the
            `.sample()` method. With `sir`, approximate posterior samples are
            generated with sampling importance resampling (SIR). With
            `importance`, the `.sample()` method returns a tuple of samples and
            corresponding importance weights.
        oversampling_factor: Number of proposed samples from which only one is
            selected based on its importance weight.
        max_sampling_batch_size: The batch size of samples being drawn from the
            proposal at every iteration.
        show_progress_bars: Whether to show a progressbar during sampling.
    """

    method = self.method if method is None else method

    if sample_with is not None:
        raise ValueError(
            f"You set `sample_with={sample_with}`. As of sbi v0.18.0, setting "
            f"`sample_with` is no longer supported. You have to rerun "
            f"`.build_posterior(sample_with={sample_with}).`"
        )

    self.potential_fn.set_x(self._x_else_default_x(x))

    if method == "sir":
        return self._sir_sample(
            sample_shape,
            oversampling_factor=oversampling_factor,
            max_sampling_batch_size=max_sampling_batch_size,
            show_progress_bars=show_progress_bars,
        )
    elif method == "importance":
        return self._importance_sample(sample_shape)
    else:
        raise NameError

`to(device)` ¶

Move the potential, the proposal and x_o to a new device.

It also reinstantiates the posterior with the new device.

Parameters:

Name	Type	Description	Default
`device`	`Union[str, device]`	Device on which to move the posterior to.	required

Source code in sbi/inference/posteriors/importance_posterior.py

def to(self, device: Union[str, torch.device]) -> None:
    """
    Move the potential, the proposal and x_o to a new device.

    It also reinstantiates the posterior with the new device.

    Args:
        device: Device on which to move the posterior to.
    """
    self.device = device
    self.potential_fn.to(device)  # type: ignore
    self.proposal.to(device)
    x_o = None
    if hasattr(self, "_x") and (self._x is not None):
        x_o = self._x.to(device)

    self.theta_transform = mcmc_transform(self.proposal, device=device)
    super().__init__(
        self.potential_fn,
        theta_transform=self.theta_transform,
        device=device,
        x_shape=self.x_shape,
    )
    # super().__init__ erases the self._x, so we need to set it again
    if x_o is not None:
        self.set_default_x(x_o)

`MCMCPosterior` ¶

Bases: NeuralPosterior

Provides MCMC to sample from the posterior.

SNLE or SNRE train neural networks to approximate the likelihood(-ratios). MCMCPosterior allows to sample from the posterior with MCMC.

Source code in sbi/inference/posteriors/mcmc_posterior.py

class MCMCPosterior(NeuralPosterior):
    r"""Provides MCMC to sample from the posterior.

    SNLE or SNRE train neural networks to approximate the likelihood(-ratios).
    `MCMCPosterior` allows to sample from the posterior with MCMC.
    """

    def __init__(
        self,
        potential_fn: Union[Callable, BasePotential],
        proposal: Any,
        theta_transform: Optional[TorchTransform] = None,
        method: Literal[
            "slice_np",
            "slice_np_vectorized",
            "hmc_pyro",
            "nuts_pyro",
            "slice_pymc",
            "hmc_pymc",
            "nuts_pymc",
        ] = "slice_np_vectorized",
        thin: int = -1,
        warmup_steps: int = 200,
        num_chains: int = 20,
        init_strategy: Literal["proposal", "sir", "resample"] = "resample",
        init_strategy_parameters: Optional[Dict[str, Any]] = None,
        init_strategy_num_candidates: Optional[int] = None,
        num_workers: int = 1,
        mp_context: Literal["fork", "spawn"] = "spawn",
        device: Optional[Union[str, torch.device]] = None,
        x_shape: Optional[torch.Size] = None,
    ):
        """
        Args:
            potential_fn: The potential function from which to draw samples. Must be a
                `BasePotential` or a `Callable` which takes `theta` and `x_o` as inputs.
            proposal: Proposal distribution that is used to initialize the MCMC chain.
            theta_transform: Transformation that will be applied during sampling.
                Allows to perform MCMC in unconstrained space.
            method: Method used for MCMC sampling, one of `slice_np`,
                `slice_np_vectorized`, `hmc_pyro`, `nuts_pyro`, `slice_pymc`,
                `hmc_pymc`, `nuts_pymc`. `slice_np` is a custom
                numpy implementation of slice sampling. `slice_np_vectorized` is
                identical to `slice_np`, but if `num_chains>1`, the chains are
                vectorized for `slice_np_vectorized` whereas they are run sequentially
                for `slice_np`. The samplers ending on `_pyro` are using Pyro, and
                likewise the samplers ending on `_pymc` are using PyMC.
            thin: The thinning factor for the chain, default 1 (no thinning).
            warmup_steps: The initial number of samples to discard.
            num_chains: The number of chains. Should generally be at most
                `num_workers - 1`.
            init_strategy: The initialisation strategy for chains; `proposal` will draw
                init locations from `proposal`, whereas `sir` will use Sequential-
                Importance-Resampling (SIR). SIR initially samples
                `init_strategy_num_candidates` from the `proposal`, evaluates all of
                them under the `potential_fn` and `proposal`, and then resamples the
                initial locations with weights proportional to `exp(potential_fn -
                proposal.log_prob`. `resample` is the same as `sir` but
                uses `exp(potential_fn)` as weights.
            init_strategy_parameters: Dictionary of keyword arguments passed to the
                init strategy, e.g., for `init_strategy=sir` this could be
                `num_candidate_samples`, i.e., the number of candidates to find init
                locations (internal default is `1000`), or `device`.
            init_strategy_num_candidates: Number of candidates to find init
                 locations in `init_strategy=sir` (deprecated, use
                 init_strategy_parameters instead).
            num_workers: number of cpu cores used to parallelize mcmc
            mp_context: Multiprocessing start method, either `"fork"` or `"spawn"`
                (default), used by Pyro and PyMC samplers. `"fork"` can be significantly
                faster than `"spawn"` but is only supported on POSIX-based systems
                (e.g. Linux and macOS, not Windows).
            device: Training device, e.g., "cpu", "cuda" or "cuda:0". If None,
                `potential_fn.device` is used.
            x_shape: Deprecated, should not be passed.
        """
        if method == "slice":
            warn(
                "The Pyro-based slice sampler is deprecated, and the method `slice` "
                "has been changed to `slice_np`, i.e., the custom "
                "numpy-based slice sampler.",
                DeprecationWarning,
                stacklevel=2,
            )
            method = "slice_np"

        thin = _process_thin_default(thin)

        super().__init__(
            potential_fn,
            theta_transform=theta_transform,
            device=device,
            x_shape=x_shape,
        )

        self.proposal = proposal
        self.method = method
        self.thin = thin
        self.warmup_steps = warmup_steps
        self.num_chains = num_chains
        self.init_strategy = init_strategy
        self.init_strategy_parameters = init_strategy_parameters or {}
        self.num_workers = num_workers
        self.mp_context = mp_context
        self._posterior_sampler = None
        # Hardcode parameter name to reduce clutter kwargs.
        self.param_name = "theta"
        self.x_shape = x_shape

        if init_strategy_num_candidates is not None:
            warn(
                "Passing `init_strategy_num_candidates` is deprecated as of sbi "
                "v0.19.0. Instead, use e.g., `init_strategy_parameters "
                f"={'num_candidate_samples': 1000}`",
                stacklevel=2,
            )
            self.init_strategy_parameters["num_candidate_samples"] = (
                init_strategy_num_candidates
            )

        self.potential_ = self._prepare_potential(method)

        self._purpose = (
            "It provides MCMC to .sample() from the posterior and "
            "can evaluate the _unnormalized_ posterior density with .log_prob()."
        )

    def to(self, device: Union[str, torch.device]) -> None:
        """Moves potential_fn, proposal, x_o and theta_transform to the

        specified device. Reinstantiates the posterior and resets the default x_o.

        Args:
            device: Device to move the posterior to.
        """
        self.device = device
        self.potential_fn.to(device)  # type: ignore
        self.proposal.to(device)
        x_o = None
        if hasattr(self, "_x") and (self._x is not None):
            x_o = self._x.to(device)

        self.theta_transform = mcmc_transform(self.proposal, device=device)

        super().__init__(
            self.potential_fn,
            theta_transform=self.theta_transform,
            device=device,
            x_shape=self.x_shape,
        )
        # super().__init__ erases the self._x, so we need to set it again
        if x_o is not None:
            self.set_default_x(x_o)
        self.potential_ = self._prepare_potential(self.method)

    @property
    def mcmc_method(self) -> str:
        """Returns MCMC method."""
        return self._mcmc_method

    @mcmc_method.setter
    def mcmc_method(self, method: str) -> None:
        """See `set_mcmc_method`."""
        self.set_mcmc_method(method)

    @property
    def posterior_sampler(self):
        """Returns sampler created by `sample`."""
        return self._posterior_sampler

    def set_mcmc_method(self, method: str) -> "NeuralPosterior":
        """Sets sampling method to for MCMC and returns `NeuralPosterior`.

        Args:
            method: Method to use.

        Returns:
            `NeuralPosterior` for chainable calls.
        """
        self._mcmc_method = method
        return self

    def log_prob(
        self, theta: Tensor, x: Optional[Tensor] = None, track_gradients: bool = False
    ) -> Tensor:
        r"""Returns the log-probability of theta under the posterior.

        Args:
            theta: Parameters $\theta$.
            track_gradients: Whether the returned tensor supports tracking gradients.
                This can be helpful for e.g. sensitivity analysis, but increases memory
                consumption.

        Returns:
            `len($\theta$)`-shaped log-probability.
        """
        warn(
            "`.log_prob()` is deprecated for methods that can only evaluate the "
            "log-probability up to a normalizing constant. Use `.potential()` instead.",
            stacklevel=2,
        )
        warn("The log-probability is unnormalized!", stacklevel=2)

        self.potential_fn.set_x(self._x_else_default_x(x))

        theta = ensure_theta_batched(torch.as_tensor(theta))
        return self.potential_fn(
            theta.to(self._device), track_gradients=track_gradients
        )

    def sample(
        self,
        sample_shape: Shape = torch.Size(),
        x: Optional[Tensor] = None,
        method: Optional[str] = None,
        thin: Optional[int] = None,
        warmup_steps: Optional[int] = None,
        num_chains: Optional[int] = None,
        init_strategy: Optional[str] = None,
        init_strategy_parameters: Optional[Dict[str, Any]] = None,
        init_strategy_num_candidates: Optional[int] = None,
        mcmc_parameters: Optional[Dict] = None,
        mcmc_method: Optional[str] = None,
        sample_with: Optional[str] = None,
        num_workers: Optional[int] = None,
        mp_context: Optional[str] = None,
        show_progress_bars: bool = True,
    ) -> Tensor:
        r"""Return samples from posterior distribution $p(\theta|x)$ with MCMC.

        Check the `__init__()` method for a description of all arguments as well as
        their default values.

        Args:
            sample_shape: Desired shape of samples that are drawn from posterior. If
                sample_shape is multidimensional we simply draw `sample_shape.numel()`
                samples and then reshape into the desired shape.
            mcmc_parameters: Dictionary that is passed only to support the API of
                `sbi` v0.17.2 or older.
            mcmc_method: This argument only exists to keep backward-compatibility with
                `sbi` v0.17.2 or older. Please use `method` instead.
            sample_with: This argument only exists to keep backward-compatibility with
                `sbi` v0.17.2 or older. If it is set, we instantly raise an error.
            show_progress_bars: Whether to show sampling progress monitor.

        Returns:
            Samples from posterior.
        """

        self.potential_fn.set_x(self._x_else_default_x(x))

        # Replace arguments that were not passed with their default.
        method = self.method if method is None else method
        thin = self.thin if thin is None else thin
        warmup_steps = self.warmup_steps if warmup_steps is None else warmup_steps
        num_chains = self.num_chains if num_chains is None else num_chains
        init_strategy = self.init_strategy if init_strategy is None else init_strategy
        num_workers = self.num_workers if num_workers is None else num_workers
        mp_context = self.mp_context if mp_context is None else mp_context
        init_strategy_parameters = (
            self.init_strategy_parameters
            if init_strategy_parameters is None
            else init_strategy_parameters
        )
        if init_strategy_num_candidates is not None:
            warn(
                f"Passing `init_strategy_num_candidates` is deprecated as of sbi \
                v0.19.0. Instead, use e.g., \
                `init_strategy_parameters={'num_candidate_samples': 1000}`",
                stacklevel=2,
            )
            self.init_strategy_parameters["num_candidate_samples"] = (
                init_strategy_num_candidates
            )
        if sample_with is not None:
            raise ValueError(
                f"You set `sample_with={sample_with}`. As of sbi v0.18.0, setting "
                "`sample_with` is no longer supported. You have to rerun "
                f"`.build_posterior(sample_with={sample_with}).`"
            )
        if mcmc_method is not None:
            warn(
                "You passed `mcmc_method` to `.sample()`. As of sbi v0.18.0, this "
                "is deprecated and will be removed in a future release. Use `method` "
                "instead of `mcmc_method`.",
                stacklevel=2,
            )
            method = mcmc_method
        if mcmc_parameters:
            warn(
                "You passed `mcmc_parameters` to `.sample()`. As of sbi v0.18.0, this "
                "is deprecated and will be removed in a future release. Instead, pass "
                "the variable to `.sample()` directly, e.g. "
                "`posterior.sample((1,), num_chains=5)`.",
                stacklevel=2,
            )
        # The following lines are only for backwards compatibility with sbi v0.17.2 or
        # older.
        m_p = mcmc_parameters or {}  # define to shorten the variable name
        method = _maybe_use_dict_entry(method, "mcmc_method", m_p)
        thin = _maybe_use_dict_entry(thin, "thin", m_p)
        warmup_steps = _maybe_use_dict_entry(warmup_steps, "warmup_steps", m_p)
        num_chains = _maybe_use_dict_entry(num_chains, "num_chains", m_p)
        init_strategy = _maybe_use_dict_entry(init_strategy, "init_strategy", m_p)
        self.potential_ = self._prepare_potential(method)  # type: ignore

        initial_params = self._get_initial_params(
            init_strategy,  # type: ignore
            num_chains,  # type: ignore
            num_workers,
            show_progress_bars,
            **init_strategy_parameters,
        )
        num_samples = torch.Size(sample_shape).numel()

        track_gradients = method in ("hmc_pyro", "nuts_pyro", "hmc_pymc", "nuts_pymc")
        with torch.set_grad_enabled(track_gradients):
            if method in ("slice_np", "slice_np_vectorized"):
                transformed_samples = self._slice_np_mcmc(
                    num_samples=num_samples,
                    potential_function=self.potential_,
                    initial_params=initial_params,
                    thin=thin,  # type: ignore
                    warmup_steps=warmup_steps,  # type: ignore
                    vectorized=(method == "slice_np_vectorized"),
                    interchangeable_chains=True,
                    num_workers=num_workers,
                    show_progress_bars=show_progress_bars,
                )
            elif method in ("hmc_pyro", "nuts_pyro"):
                transformed_samples = self._pyro_mcmc(
                    num_samples=num_samples,
                    potential_function=self.potential_,
                    initial_params=initial_params,
                    mcmc_method=method,  # type: ignore
                    thin=thin,  # type: ignore
                    warmup_steps=warmup_steps,  # type: ignore
                    num_chains=num_chains,
                    show_progress_bars=show_progress_bars,
                    mp_context=mp_context,
                )
            elif method in ("hmc_pymc", "nuts_pymc", "slice_pymc"):
                transformed_samples = self._pymc_mcmc(
                    num_samples=num_samples,
                    potential_function=self.potential_,
                    initial_params=initial_params,
                    mcmc_method=method,  # type: ignore
                    thin=thin,  # type: ignore
                    warmup_steps=warmup_steps,  # type: ignore
                    num_chains=num_chains,
                    show_progress_bars=show_progress_bars,
                    mp_context=mp_context,
                )
            else:
                raise NameError(f"The sampling method {method} is not implemented!")

        samples = self.theta_transform.inv(transformed_samples)
        # NOTE: Currently MCMCPosteriors will require a single dimension for the
        # parameter dimension. With recent ConditionalDensity(Ratio) estimators, we
        # can have multiple dimensions for the parameter dimension.
        samples = samples.reshape((*sample_shape, -1))  # type: ignore

        return samples

    def sample_batched(
        self,
        sample_shape: Shape,
        x: Tensor,
        method: Optional[str] = None,
        thin: Optional[int] = None,
        warmup_steps: Optional[int] = None,
        num_chains: Optional[int] = None,
        init_strategy: Optional[str] = None,
        init_strategy_parameters: Optional[Dict[str, Any]] = None,
        num_workers: Optional[int] = None,
        mp_context: Optional[str] = None,
        show_progress_bars: bool = True,
    ) -> Tensor:
        r"""Given a batch of observations [x_1, ..., x_B] this function samples from
        posteriors $p(\theta|x_1)$, ... ,$p(\theta|x_B)$, in a batched (i.e. vectorized)
        manner.

        Check the `__init__()` method for a description of all arguments as well as
        their default values.

        Args:
            sample_shape: Desired shape of samples that are drawn from the posterior
                given every observation.
            x: A batch of observations, of shape `(batch_dim, event_shape_x)`.
                `batch_dim` corresponds to the number of observations to be
                drawn.
            method: Method used for MCMC sampling, e.g., "slice_np_vectorized".
            thin: The thinning factor for the chain, default 1 (no thinning).
            warmup_steps: The initial number of samples to discard.
            num_chains: The number of chains used for each `x` passed in the batch.
            init_strategy: The initialisation strategy for chains.
            init_strategy_parameters: Dictionary of keyword arguments passed to
                the init strategy.
            num_workers: number of cpu cores used to parallelize initial
                parameter generation and mcmc sampling.
            mp_context: Multiprocessing start method, either `"fork"` or `"spawn"`
            show_progress_bars: Whether to show sampling progress monitor.

        Returns:
            Samples from the posteriors of shape (*sample_shape, B, *input_shape)
        """

        # Replace arguments that were not passed with their default.
        method = self.method if method is None else method
        thin = self.thin if thin is None else thin
        warmup_steps = self.warmup_steps if warmup_steps is None else warmup_steps
        num_chains = self.num_chains if num_chains is None else num_chains
        init_strategy = self.init_strategy if init_strategy is None else init_strategy
        num_workers = self.num_workers if num_workers is None else num_workers
        mp_context = self.mp_context if mp_context is None else mp_context
        init_strategy_parameters = (
            self.init_strategy_parameters
            if init_strategy_parameters is None
            else init_strategy_parameters
        )

        assert method == "slice_np_vectorized", (
            "Batched sampling only supported for vectorized samplers!"
        )

        # warn if num_chains is larger than num requested samples
        if num_chains > torch.Size(sample_shape).numel():
            warnings.warn(
                "The passed number of MCMC chains is larger than the number of "
                f"requested samples: {num_chains} > {torch.Size(sample_shape).numel()},"
                f" resetting it to {torch.Size(sample_shape).numel()}.",
                stacklevel=2,
            )
            num_chains = torch.Size(sample_shape).numel()

        # custom shape handling to make sure to match the batch size of x and theta
        # without unnecessary combinations.
        if len(x.shape) == 1:
            x = x.unsqueeze(0)
        batch_size = x.shape[0]

        x = reshape_to_batch_event(x, event_shape=x.shape[1:])

        # For batched sampling, we want `num_chains` for each observation in the batch.
        # Here we repeat the observations ABC -> AAABBBCCC, so that the chains are
        # in the order of the observations.
        x_ = x.repeat_interleave(num_chains, dim=0)

        self.potential_fn.set_x(x_, x_is_iid=False)
        self.potential_ = self._prepare_potential(method)  # type: ignore

        # For each observation in the batch, we have num_chains independent chains.
        num_chains_extended = batch_size * num_chains
        if num_chains_extended > 100:
            warnings.warn(
                "Note that for batched sampling, we use num_chains many chains "
                "for each x in the batch. With the given settings, this results "
                f"in a large number of chains ({num_chains_extended}), which can "
                "be slow and memory-intensive for vectorized MCMC. Consider "
                "reducing the number of chains or batch size.",
                stacklevel=2,
            )
        init_strategy_parameters["num_return_samples"] = num_chains_extended
        initial_params = self._get_initial_params_batched(
            x,
            init_strategy,  # type: ignore
            num_chains,  # type: ignore
            num_workers,
            show_progress_bars,
            **init_strategy_parameters,
        )
        # We need num_samples from each posterior in the batch
        num_samples = torch.Size(sample_shape).numel() * batch_size

        with torch.set_grad_enabled(False):
            transformed_samples = self._slice_np_mcmc(
                num_samples=num_samples,
                potential_function=self.potential_,
                initial_params=initial_params,
                thin=thin,  # type: ignore
                warmup_steps=warmup_steps,  # type: ignore
                vectorized=(method == "slice_np_vectorized"),
                interchangeable_chains=False,
                num_workers=num_workers,
                show_progress_bars=show_progress_bars,
            )

        # (num_chains_extended, samples_per_chain, *input_shape)
        samples_per_chain: Tensor = self.theta_transform.inv(transformed_samples)  # type: ignore
        dim_theta = samples_per_chain.shape[-1]
        # We need to collect samples for each x from the respective chains.
        # However, using samples.reshape(*sample_shape, batch_size, dim_theta)
        # does not combine the samples in the right order, since this mixes
        # samples that belong to different `x`. The following permute is a
        # workaround to reshape the samples in the right order.
        samples_per_x = samples_per_chain.reshape((
            batch_size,
            # We are flattening the sample shape here using -1 because we might have
            # generated more samples than requested (more chains, or multiple of
            # chains not matching sample_shape)
            -1,
            dim_theta,
        )).permute(1, 0, -1)

        # Shape is now (-1, batch_size, dim_theta)
        # We can now select the number of requested samples
        samples = samples_per_x[: torch.Size(sample_shape).numel()]
        # and reshape into (*sample_shape, batch_size, dim_theta)
        samples = samples.reshape((*sample_shape, batch_size, dim_theta))
        return samples

    def _build_mcmc_init_fn(
        self,
        proposal: Any,
        potential_fn: Callable,
        transform: torch_tf.Transform,
        init_strategy: str,
        **kwargs,
    ) -> Callable:
        """Return function that, when called, creates an initial parameter set for MCMC.

        Args:
            proposal: Proposal distribution.
            potential_fn: Potential function that the candidate samples are weighted
                with.
            init_strategy: Specifies the initialization method. Either of
                [`proposal`|`sir`|`resample`|`latest_sample`].
            kwargs: Passed on to init function. This way, init specific keywords can
                be set through `mcmc_parameters`. Unused arguments will be absorbed by
                the intitialization method.

        Returns: Initialization function.
        """
        if init_strategy == "proposal" or init_strategy == "prior":
            if init_strategy == "prior":
                warn(
                    "You set `init_strategy=prior`. As of sbi v0.18.0, this is "
                    "deprecated and it will be removed in a future release. Use "
                    "`init_strategy=proposal` instead.",
                    stacklevel=2,
                )
            return lambda: proposal_init(proposal, transform=transform, **kwargs)
        elif init_strategy == "sir":
            warn(
                "As of sbi v0.19.0, the behavior of the SIR initialization for MCMC "
                "has changed. If you wish to restore the behavior of sbi v0.18.0, set "
                "`init_strategy='resample'.`",
                stacklevel=2,
            )
            return lambda: sir_init(
                proposal, potential_fn, transform=transform, **kwargs
            )
        elif init_strategy == "resample":
            return lambda: resample_given_potential_fn(
                proposal, potential_fn, transform=transform, **kwargs
            )
        elif init_strategy == "latest_sample":
            latest_sample = IterateParameters(self._mcmc_init_params, **kwargs)
            return latest_sample
        else:
            raise NotImplementedError

    def _get_initial_params(
        self,
        init_strategy: str,
        num_chains: int,
        num_workers: int,
        show_progress_bars: bool,
        **kwargs,
    ) -> Tensor:
        """Return initial parameters for MCMC obtained with given init strategy.

        Parallelizes across CPU cores only for resample and SIR.

        Args:
            init_strategy: Specifies the initialization method. Either of
                [`proposal`|`sir`|`resample`|`latest_sample`].
            num_chains: number of MCMC chains, generates initial params for each
            num_workers: number of CPU cores for parallization
            show_progress_bars: whether to show progress bars for SIR init
            kwargs: Passed on to `_build_mcmc_init_fn`.

        Returns:
            Tensor: initial parameters, one for each chain
        """
        # Build init function
        init_fn = self._build_mcmc_init_fn(
            self.proposal,
            self.potential_fn,
            transform=self.theta_transform,
            init_strategy=init_strategy,  # type: ignore
            **kwargs,
        )

        # Parallelize inits for resampling only.
        if num_workers > 1 and (init_strategy == "resample" or init_strategy == "sir"):

            def seeded_init_fn(seed):
                torch.manual_seed(seed)
                return init_fn()

            seeds = torch.randint(high=2**31, size=(num_chains,))

            # Generate initial params parallelized over num_workers.
            initial_params = list(
                tqdm(
                    Parallel(return_as="generator", n_jobs=num_workers)(
                        delayed(seeded_init_fn)(seed) for seed in seeds
                    ),
                    total=len(seeds),
                    desc=f"Generating {num_chains} MCMC inits via {init_strategy} "
                    "strategy",
                    disable=not show_progress_bars,
                )
            )
            initial_params = torch.cat(initial_params)  # type: ignore
        else:
            initial_params = torch.cat(
                [
                    init_fn()
                    for _ in tqdm(
                        range(num_chains),
                        desc=f"Generating {num_chains} MCMC inits via {init_strategy} "
                        "strategy",
                        disable=not show_progress_bars,
                    )
                ]  # type: ignore
            )
        assert initial_params.shape[0] == num_chains, "Initial params shape mismatch."
        return initial_params

    def _get_initial_params_batched(
        self,
        x: torch.Tensor,
        init_strategy: str,
        num_chains_per_x: int,
        num_workers: int,
        show_progress_bars: bool,
        **kwargs,
    ) -> Tensor:
        """Return initial parameters for MCMC for a batch of `x`, obtained with given
           init strategy.

        Parallelizes across CPU cores only for resample and SIR.

        Args:
            x: Batch of observations to create different initial parameters for.
            init_strategy: Specifies the initialization method. Either of
                [`proposal`|`sir`|`resample`|`latest_sample`].
            num_chains_per_x: number of MCMC chains for each x, generates initial params
                for each x
            num_workers: number of CPU cores for parallization
            show_progress_bars: whether to show progress bars for SIR init
            kwargs: Passed on to `_build_mcmc_init_fn`.

        Returns:
            Tensor: initial parameters, one for each chain
        """

        potential_ = deepcopy(self.potential_fn)
        initial_params = []
        init_fn = self._build_mcmc_init_fn(
            self.proposal,
            potential_fn=potential_,
            transform=self.theta_transform,
            init_strategy=init_strategy,  # type: ignore
            **kwargs,
        )
        for xi in x:
            # Build init function
            potential_.set_x(xi)

            # Parallelize inits for resampling or sir.
            if num_workers > 1 and (
                init_strategy == "resample" or init_strategy == "sir"
            ):

                def seeded_init_fn(seed):
                    torch.manual_seed(seed)
                    return init_fn()

                seeds = torch.randint(high=2**31, size=(num_chains_per_x,))

                # Generate initial params parallelized over num_workers.
                initial_params = initial_params + list(
                    tqdm(
                        Parallel(return_as="generator", n_jobs=num_workers)(
                            delayed(seeded_init_fn)(seed) for seed in seeds
                        ),
                        total=len(seeds),
                        desc=f"""Generating {num_chains_per_x} MCMC inits with
                                {num_workers} workers.""",
                        disable=not show_progress_bars,
                    )
                )

            else:
                initial_params = initial_params + [
                    init_fn() for _ in range(num_chains_per_x)
                ]  # type: ignore

        initial_params = torch.cat(initial_params)
        return initial_params

    def _slice_np_mcmc(
        self,
        num_samples: int,
        potential_function: Callable,
        initial_params: Tensor,
        thin: int,
        warmup_steps: int,
        vectorized: bool = False,
        interchangeable_chains=True,
        num_workers: int = 1,
        init_width: Union[float, ndarray] = 0.01,
        show_progress_bars: bool = True,
    ) -> Tensor:
        """Custom implementation of slice sampling using Numpy.

        Args:
            num_samples: Desired number of samples.
            potential_function: A callable **class**.
            initial_params: Initial parameters for MCMC chain.
            thin: Thinning (subsampling) factor, default 1 (no thinning).
            warmup_steps: Initial number of samples to discard.
            vectorized: Whether to use a vectorized implementation of the
                `SliceSampler`.
            interchangeable_chains: Whether chains are interchangeable, i.e., whether
                we can mix samples between chains.
            num_workers: Number of CPU cores to use.
            init_width: Inital width of brackets.
            show_progress_bars: Whether to show a progressbar during sampling;
                can only be turned off for vectorized sampler.

        Returns:
            Tensor of shape (num_samples, shape_of_single_theta).
        """

        num_chains, dim_samples = initial_params.shape

        if not vectorized:
            SliceSamplerMultiChain = SliceSamplerSerial
        else:
            SliceSamplerMultiChain = SliceSamplerVectorized

        def multi_obs_potential(params):
            # Params are of shape (num_chains * num_obs, event).
            all_potentials = potential_function(params)  # Shape: (num_chains, num_obs)
            return all_potentials.flatten()

        posterior_sampler = SliceSamplerMultiChain(
            init_params=tensor2numpy(initial_params),
            log_prob_fn=multi_obs_potential,
            num_chains=num_chains,
            thin=thin,
            verbose=show_progress_bars,
            num_workers=num_workers,
            init_width=init_width,
        )
        warmup_ = warmup_steps * thin
        num_samples_ = ceil((num_samples * thin) / num_chains)
        # Run mcmc including warmup
        samples = posterior_sampler.run(warmup_ + num_samples_)
        samples = samples[:, warmup_steps:, :]  # discard warmup steps
        samples = torch.from_numpy(samples)  # chains x samples x dim

        # Save posterior sampler.
        self._posterior_sampler = posterior_sampler

        # Save sample as potential next init (if init_strategy == 'latest_sample').
        self._mcmc_init_params = samples[:, -1, :].reshape(num_chains, dim_samples)

        # Update: If chains are interchangeable, return concatenated samples. Otherwise
        # return samples per chain.
        if interchangeable_chains:
            # Collect samples from all chains.
            samples = samples.reshape(-1, dim_samples)[:num_samples]

        return samples.type(torch.float32).to(self._device)

    def _pyro_mcmc(
        self,
        num_samples: int,
        potential_function: Callable,
        initial_params: Tensor,
        mcmc_method: str = "nuts_pyro",
        thin: int = -1,
        warmup_steps: int = 200,
        num_chains: Optional[int] = 1,
        show_progress_bars: bool = True,
        mp_context: str = "spawn",
    ) -> Tensor:
        r"""Return samples obtained using Pyro's HMC or NUTS sampler.

        Args:
            num_samples: Desired number of samples.
            potential_function: A callable **class**. A class, but not a function,
                is picklable for Pyro MCMC to use it across chains in parallel,
                even when the potential function requires evaluating a neural network.
            initial_params: Initial parameters for MCMC chain.
            mcmc_method: Pyro MCMC method to use, either `"hmc_pyro"` or
                `"nuts_pyro"` (default).
            thin: Thinning (subsampling) factor, default 1 (no thinning).
            warmup_steps: Initial number of samples to discard.
            num_chains: Whether to sample in parallel. If None, use all but one CPU.
            show_progress_bars: Whether to show a progressbar during sampling.

        Returns:
            Tensor of shape (num_samples, shape_of_single_theta).
        """
        thin = _process_thin_default(thin)
        num_chains = mp.cpu_count() - 1 if num_chains is None else num_chains
        kernels = dict(hmc_pyro=HMC, nuts_pyro=NUTS)

        sampler = MCMC(
            kernel=kernels[mcmc_method](potential_fn=potential_function),
            num_samples=ceil((thin * num_samples) / num_chains),
            warmup_steps=warmup_steps,
            initial_params={self.param_name: initial_params},
            num_chains=num_chains,
            mp_context=mp_context,
            disable_progbar=not show_progress_bars,
            transforms={},
        )
        sampler.run()
        samples = next(iter(sampler.get_samples().values())).reshape(
            -1,
            initial_params.shape[1],  # .shape[1] = dim of theta
        )

        # Save posterior sampler.
        self._posterior_sampler = sampler

        samples = samples[::thin][:num_samples]

        return samples.detach()

    def _pymc_mcmc(
        self,
        num_samples: int,
        potential_function: Callable,
        initial_params: Tensor,
        mcmc_method: str = "nuts_pymc",
        thin: int = -1,
        warmup_steps: int = 200,
        num_chains: Optional[int] = 1,
        show_progress_bars: bool = True,
        mp_context: str = "spawn",
    ) -> Tensor:
        r"""Return samples obtained using PyMC's HMC, NUTS or slice samplers.

        Args:
            num_samples: Desired number of samples.
            potential_function: A callable **class**. A class, but not a function,
                is picklable for PyMC MCMC to use it across chains in parallel,
                even when the potential function requires evaluating a neural network.
            initial_params: Initial parameters for MCMC chain.
            mcmc_method: mcmc_method: Pyro MCMC method to use, either `"hmc_pymc"` or
                `"slice_pymc"`, or `"nuts_pymc"` (default).
            thin: Thinning (subsampling) factor, default 1 (no thinning).
            warmup_steps: Initial number of samples to discard.
            num_chains: Whether to sample in parallel. If None, use all but one CPU.
            show_progress_bars: Whether to show a progressbar during sampling.

        Returns:
            Tensor of shape (num_samples, shape_of_single_theta).
        """
        thin = _process_thin_default(thin)
        num_chains = mp.cpu_count() - 1 if num_chains is None else num_chains
        steps = dict(slice_pymc="slice", hmc_pymc="hmc", nuts_pymc="nuts")

        sampler = PyMCSampler(
            potential_fn=potential_function,
            step=steps[mcmc_method],
            initvals=tensor2numpy(initial_params),
            draws=ceil((thin * num_samples) / num_chains),
            tune=warmup_steps,
            chains=num_chains,
            mp_ctx=mp_context,
            progressbar=show_progress_bars,
            param_name=self.param_name,
            device=self._device,
        )
        samples = sampler.run()
        samples = torch.from_numpy(samples).to(dtype=torch.float32, device=self._device)
        samples = samples.reshape(-1, initial_params.shape[1])

        # Save posterior sampler.
        self._posterior_sampler = sampler

        samples = samples[::thin][:num_samples]

        return samples

    def _prepare_potential(self, method: str) -> Callable:
        """Combines potential and transform and takes care of gradients and pyro.

        Args:
            method: Which MCMC method to use.

        Returns:
            A potential function that is ready to be used in MCMC.
        """
        if method in ("hmc_pyro", "nuts_pyro"):
            track_gradients = True
            pyro = True
        elif method in ("hmc_pymc", "nuts_pymc"):
            track_gradients = True
            pyro = False
        elif method in ("slice_np", "slice_np_vectorized", "slice_pymc"):
            track_gradients = False
            pyro = False
        else:
            if "hmc" in method or "nuts" in method:
                warn(
                    "The kwargs 'hmc' and 'nuts' are deprecated. Use 'hmc_pyro', "
                    "'nuts_pyro', 'hmc_pymc', or 'nuts_pymc' instead.",
                    DeprecationWarning,
                    stacklevel=2,
                )
            raise NotImplementedError(f"MCMC method {method} is not implemented.")

        prepared_potential = partial(
            transformed_potential,
            potential_fn=self.potential_fn,
            theta_transform=self.theta_transform,
            device=self._device,
            track_gradients=track_gradients,
        )
        if pyro:
            prepared_potential = partial(
                pyro_potential_wrapper, potential=prepared_potential
            )

        return prepared_potential

    def map(
        self,
        x: Optional[Tensor] = None,
        num_iter: int = 1_000,
        num_to_optimize: int = 100,
        learning_rate: float = 0.01,
        init_method: Union[str, Tensor] = "proposal",
        num_init_samples: int = 1_000,
        save_best_every: int = 10,
        show_progress_bars: bool = False,
        force_update: bool = False,
    ) -> Tensor:
        r"""Returns the maximum-a-posteriori estimate (MAP).

        The method can be interrupted (Ctrl-C) when the user sees that the
        log-probability converges. The best estimate will be saved in `self._map` and
        can be accessed with `self.map()`. The MAP is obtained by running gradient
        ascent from a given number of starting positions (samples from the posterior
        with the highest log-probability). After the optimization is done, we select the
        parameter set that has the highest log-probability after the optimization.

        Warning: The default values used by this function are not well-tested. They
        might require hand-tuning for the problem at hand.

        For developers: if the prior is a `BoxUniform`, we carry out the optimization
        in unbounded space and transform the result back into bounded space.

        Args:
            x: Deprecated - use `.set_default_x()` prior to `.map()`.
            num_iter: Number of optimization steps that the algorithm takes
                to find the MAP.
            learning_rate: Learning rate of the optimizer.
            init_method: How to select the starting parameters for the optimization. If
                it is a string, it can be either [`posterior`, `prior`], which samples
                the respective distribution `num_init_samples` times. If it is a
                tensor, the tensor will be used as init locations.
            num_init_samples: Draw this number of samples from the posterior and
                evaluate the log-probability of all of them.
            num_to_optimize: From the drawn `num_init_samples`, use the
                `num_to_optimize` with highest log-probability as the initial points
                for the optimization.
            save_best_every: The best log-probability is computed, saved in the
                `map`-attribute, and printed every `save_best_every`-th iteration.
                Computing the best log-probability creates a significant overhead
                (thus, the default is `10`.)
            show_progress_bars: Whether to show a progressbar during sampling from
                the posterior.
            force_update: Whether to re-calculate the MAP when x is unchanged and
                have a cached value.
            log_prob_kwargs: Will be empty for SNLE and SNRE. Will contain
                {'norm_posterior': True} for SNPE.

        Returns:
            The MAP estimate.
        """
        return super().map(
            x=x,
            num_iter=num_iter,
            num_to_optimize=num_to_optimize,
            learning_rate=learning_rate,
            init_method=init_method,
            num_init_samples=num_init_samples,
            save_best_every=save_best_every,
            show_progress_bars=show_progress_bars,
            force_update=force_update,
        )

    def get_arviz_inference_data(self) -> InferenceData:
        """Returns arviz InferenceData object constructed most recent samples.

        Note: the InferenceData is constructed using the posterior samples generated in
        most recent call to `.sample(...)`.

        For Pyro and PyMC samplers, InferenceData will contain diagnostics, but for
        sbi slice samplers, only the samples are added.

        Returns:
            inference_data: Arviz InferenceData object.
        """
        assert self._posterior_sampler is not None, (
            """No samples have been generated, call .sample() first."""
        )

        sampler: Union[
            MCMC, SliceSamplerSerial, SliceSamplerVectorized, PyMCSampler
        ] = self._posterior_sampler

        # If Pyro sampler and samples not transformed, use arviz' from_pyro.
        if isinstance(sampler, (HMC, NUTS)) and isinstance(
            self.theta_transform, torch_tf.IndependentTransform
        ):
            inference_data = az.from_pyro(sampler)
        # If PyMC sampler and samples not transformed, get cached InferenceData.
        elif isinstance(sampler, PyMCSampler) and isinstance(
            self.theta_transform, torch_tf.IndependentTransform
        ):
            inference_data = sampler.get_inference_data()

        # otherwise get samples from sampler and transform to original space.
        else:
            transformed_samples = sampler.get_samples(group_by_chain=True)
            # Pyro samplers returns dicts, get values.
            if isinstance(transformed_samples, Dict):
                # popitem gets last items, [1] get the values as tensor.
                transformed_samples = transformed_samples.popitem()[1]
            # Our slice samplers return numpy arrays.
            elif isinstance(transformed_samples, ndarray):
                transformed_samples = torch.from_numpy(transformed_samples).type(
                    torch.float32
                )
            # For MultipleIndependent priors transforms first dim must be batch dim.
            # thus, reshape back and forth to have batch dim in front.
            samples_shape = transformed_samples.shape
            samples = self.theta_transform.inv(  # type: ignore
                transformed_samples.reshape(-1, samples_shape[-1])
            ).reshape(  # type: ignore
                *samples_shape
            )

            inference_data = az.convert_to_inference_data({
                f"{self.param_name}": samples
            })

        return inference_data

    def __getstate__(self) -> Dict:
        """Get state of MCMCPosterior.

        Removes the posterior sampler from the state, as it may not be picklable.

        Returns:
            Dict: State of MCMCPosterior.
        """
        state = self.__dict__.copy()
        state["_posterior_sampler"] = None

        return state

`mcmc_method` `property` `writable` ¶

Returns MCMC method.

`posterior_sampler` `property` ¶

Returns sampler created by sample.

`getstate()` ¶

Get state of MCMCPosterior.

Removes the posterior sampler from the state, as it may not be picklable.

Returns:

Name	Type	Description
`Dict`	`Dict`	State of MCMCPosterior.

Source code in sbi/inference/posteriors/mcmc_posterior.py

def __getstate__(self) -> Dict:
    """Get state of MCMCPosterior.

    Removes the posterior sampler from the state, as it may not be picklable.

    Returns:
        Dict: State of MCMCPosterior.
    """
    state = self.__dict__.copy()
    state["_posterior_sampler"] = None

    return state

`init(potential_fn, proposal, theta_transform=None, method='slice_np_vectorized', thin=-1, warmup_steps=200, num_chains=20, init_strategy='resample', init_strategy_parameters=None, init_strategy_num_candidates=None, num_workers=1, mp_context='spawn', device=None, x_shape=None)` ¶

Parameters:

Name	Type	Description	Default
`potential_fn`	`Union[Callable, BasePotential]`	The potential function from which to draw samples. Must be a `BasePotential` or a `Callable` which takes `theta` and `x_o` as inputs.	required
`proposal`	`Any`	Proposal distribution that is used to initialize the MCMC chain.	required
`theta_transform`	`Optional[TorchTransform]`	Transformation that will be applied during sampling. Allows to perform MCMC in unconstrained space.	`None`
`method`	`Literal['slice_np', 'slice_np_vectorized', 'hmc_pyro', 'nuts_pyro', 'slice_pymc', 'hmc_pymc', 'nuts_pymc']`	Method used for MCMC sampling, one of `slice_np`, `slice_np_vectorized`, `hmc_pyro`, `nuts_pyro`, `slice_pymc`, `hmc_pymc`, `nuts_pymc`. `slice_np` is a custom numpy implementation of slice sampling. `slice_np_vectorized` is identical to `slice_np`, but if `num_chains>1`, the chains are vectorized for `slice_np_vectorized` whereas they are run sequentially for `slice_np`. The samplers ending on `_pyro` are using Pyro, and likewise the samplers ending on `_pymc` are using PyMC.	`'slice_np_vectorized'`
`thin`	`int`	The thinning factor for the chain, default 1 (no thinning).	`-1`
`warmup_steps`	`int`	The initial number of samples to discard.	`200`
`num_chains`	`int`	The number of chains. Should generally be at most `num_workers - 1`.	`20`
`init_strategy`	`Literal['proposal', 'sir', 'resample']`	The initialisation strategy for chains; `proposal` will draw init locations from `proposal`, whereas `sir` will use Sequential- Importance-Resampling (SIR). SIR initially samples `init_strategy_num_candidates` from the `proposal`, evaluates all of them under the `potential_fn` and `proposal`, and then resamples the initial locations with weights proportional to `exp(potential_fn - proposal.log_prob`. `resample` is the same as `sir` but uses `exp(potential_fn)` as weights.	`'resample'`
`init_strategy_parameters`	`Optional[Dict[str, Any]]`	Dictionary of keyword arguments passed to the init strategy, e.g., for `init_strategy=sir` this could be `num_candidate_samples`, i.e., the number of candidates to find init locations (internal default is `1000`), or `device`.	`None`
`init_strategy_num_candidates`	`Optional[int]`	Number of candidates to find init locations in `init_strategy=sir` (deprecated, use init_strategy_parameters instead).	`None`
`num_workers`	`int`	number of cpu cores used to parallelize mcmc	`1`
`mp_context`	`Literal['fork', 'spawn']`	Multiprocessing start method, either `"fork"` or `"spawn"` (default), used by Pyro and PyMC samplers. `"fork"` can be significantly faster than `"spawn"` but is only supported on POSIX-based systems (e.g. Linux and macOS, not Windows).	`'spawn'`
`device`	`Optional[Union[str, device]]`	Training device, e.g., “cpu”, “cuda” or “cuda:0”. If None, `potential_fn.device` is used.	`None`
`x_shape`	`Optional[Size]`	Deprecated, should not be passed.	`None`

Source code in sbi/inference/posteriors/mcmc_posterior.py

def __init__(
    self,
    potential_fn: Union[Callable, BasePotential],
    proposal: Any,
    theta_transform: Optional[TorchTransform] = None,
    method: Literal[
        "slice_np",
        "slice_np_vectorized",
        "hmc_pyro",
        "nuts_pyro",
        "slice_pymc",
        "hmc_pymc",
        "nuts_pymc",
    ] = "slice_np_vectorized",
    thin: int = -1,
    warmup_steps: int = 200,
    num_chains: int = 20,
    init_strategy: Literal["proposal", "sir", "resample"] = "resample",
    init_strategy_parameters: Optional[Dict[str, Any]] = None,
    init_strategy_num_candidates: Optional[int] = None,
    num_workers: int = 1,
    mp_context: Literal["fork", "spawn"] = "spawn",
    device: Optional[Union[str, torch.device]] = None,
    x_shape: Optional[torch.Size] = None,
):
    """
    Args:
        potential_fn: The potential function from which to draw samples. Must be a
            `BasePotential` or a `Callable` which takes `theta` and `x_o` as inputs.
        proposal: Proposal distribution that is used to initialize the MCMC chain.
        theta_transform: Transformation that will be applied during sampling.
            Allows to perform MCMC in unconstrained space.
        method: Method used for MCMC sampling, one of `slice_np`,
            `slice_np_vectorized`, `hmc_pyro`, `nuts_pyro`, `slice_pymc`,
            `hmc_pymc`, `nuts_pymc`. `slice_np` is a custom
            numpy implementation of slice sampling. `slice_np_vectorized` is
            identical to `slice_np`, but if `num_chains>1`, the chains are
            vectorized for `slice_np_vectorized` whereas they are run sequentially
            for `slice_np`. The samplers ending on `_pyro` are using Pyro, and
            likewise the samplers ending on `_pymc` are using PyMC.
        thin: The thinning factor for the chain, default 1 (no thinning).
        warmup_steps: The initial number of samples to discard.
        num_chains: The number of chains. Should generally be at most
            `num_workers - 1`.
        init_strategy: The initialisation strategy for chains; `proposal` will draw
            init locations from `proposal`, whereas `sir` will use Sequential-
            Importance-Resampling (SIR). SIR initially samples
            `init_strategy_num_candidates` from the `proposal`, evaluates all of
            them under the `potential_fn` and `proposal`, and then resamples the
            initial locations with weights proportional to `exp(potential_fn -
            proposal.log_prob`. `resample` is the same as `sir` but
            uses `exp(potential_fn)` as weights.
        init_strategy_parameters: Dictionary of keyword arguments passed to the
            init strategy, e.g., for `init_strategy=sir` this could be
            `num_candidate_samples`, i.e., the number of candidates to find init
            locations (internal default is `1000`), or `device`.
        init_strategy_num_candidates: Number of candidates to find init
             locations in `init_strategy=sir` (deprecated, use
             init_strategy_parameters instead).
        num_workers: number of cpu cores used to parallelize mcmc
        mp_context: Multiprocessing start method, either `"fork"` or `"spawn"`
            (default), used by Pyro and PyMC samplers. `"fork"` can be significantly
            faster than `"spawn"` but is only supported on POSIX-based systems
            (e.g. Linux and macOS, not Windows).
        device: Training device, e.g., "cpu", "cuda" or "cuda:0". If None,
            `potential_fn.device` is used.
        x_shape: Deprecated, should not be passed.
    """
    if method == "slice":
        warn(
            "The Pyro-based slice sampler is deprecated, and the method `slice` "
            "has been changed to `slice_np`, i.e., the custom "
            "numpy-based slice sampler.",
            DeprecationWarning,
            stacklevel=2,
        )
        method = "slice_np"

    thin = _process_thin_default(thin)

    super().__init__(
        potential_fn,
        theta_transform=theta_transform,
        device=device,
        x_shape=x_shape,
    )

    self.proposal = proposal
    self.method = method
    self.thin = thin
    self.warmup_steps = warmup_steps
    self.num_chains = num_chains
    self.init_strategy = init_strategy
    self.init_strategy_parameters = init_strategy_parameters or {}
    self.num_workers = num_workers
    self.mp_context = mp_context
    self._posterior_sampler = None
    # Hardcode parameter name to reduce clutter kwargs.
    self.param_name = "theta"
    self.x_shape = x_shape

    if init_strategy_num_candidates is not None:
        warn(
            "Passing `init_strategy_num_candidates` is deprecated as of sbi "
            "v0.19.0. Instead, use e.g., `init_strategy_parameters "
            f"={'num_candidate_samples': 1000}`",
            stacklevel=2,
        )
        self.init_strategy_parameters["num_candidate_samples"] = (
            init_strategy_num_candidates
        )

    self.potential_ = self._prepare_potential(method)

    self._purpose = (
        "It provides MCMC to .sample() from the posterior and "
        "can evaluate the _unnormalized_ posterior density with .log_prob()."
    )

`get_arviz_inference_data()` ¶

Returns arviz InferenceData object constructed most recent samples.

Note: the InferenceData is constructed using the posterior samples generated in most recent call to .sample(...).

For Pyro and PyMC samplers, InferenceData will contain diagnostics, but for sbi slice samplers, only the samples are added.

Returns:

Name	Type	Description
`inference_data`	`InferenceData`	Arviz InferenceData object.

Source code in sbi/inference/posteriors/mcmc_posterior.py

def get_arviz_inference_data(self) -> InferenceData:
    """Returns arviz InferenceData object constructed most recent samples.

    Note: the InferenceData is constructed using the posterior samples generated in
    most recent call to `.sample(...)`.

    For Pyro and PyMC samplers, InferenceData will contain diagnostics, but for
    sbi slice samplers, only the samples are added.

    Returns:
        inference_data: Arviz InferenceData object.
    """
    assert self._posterior_sampler is not None, (
        """No samples have been generated, call .sample() first."""
    )

    sampler: Union[
        MCMC, SliceSamplerSerial, SliceSamplerVectorized, PyMCSampler
    ] = self._posterior_sampler

    # If Pyro sampler and samples not transformed, use arviz' from_pyro.
    if isinstance(sampler, (HMC, NUTS)) and isinstance(
        self.theta_transform, torch_tf.IndependentTransform
    ):
        inference_data = az.from_pyro(sampler)
    # If PyMC sampler and samples not transformed, get cached InferenceData.
    elif isinstance(sampler, PyMCSampler) and isinstance(
        self.theta_transform, torch_tf.IndependentTransform
    ):
        inference_data = sampler.get_inference_data()

    # otherwise get samples from sampler and transform to original space.
    else:
        transformed_samples = sampler.get_samples(group_by_chain=True)
        # Pyro samplers returns dicts, get values.
        if isinstance(transformed_samples, Dict):
            # popitem gets last items, [1] get the values as tensor.
            transformed_samples = transformed_samples.popitem()[1]
        # Our slice samplers return numpy arrays.
        elif isinstance(transformed_samples, ndarray):
            transformed_samples = torch.from_numpy(transformed_samples).type(
                torch.float32
            )
        # For MultipleIndependent priors transforms first dim must be batch dim.
        # thus, reshape back and forth to have batch dim in front.
        samples_shape = transformed_samples.shape
        samples = self.theta_transform.inv(  # type: ignore
            transformed_samples.reshape(-1, samples_shape[-1])
        ).reshape(  # type: ignore
            *samples_shape
        )

        inference_data = az.convert_to_inference_data({
            f"{self.param_name}": samples
        })

    return inference_data

`log_prob(theta, x=None, track_gradients=False)` ¶

Returns the log-probability of theta under the posterior.

Parameters:

Name	Type	Description	Default
`theta`	`Tensor`	Parameters $\theta$.	required
`track_gradients`	`bool`	Whether the returned tensor supports tracking gradients. This can be helpful for e.g. sensitivity analysis, but increases memory consumption.	`False`

Returns:

Type	Description
`Tensor`	`len($\theta$)`-shaped log-probability.

Source code in sbi/inference/posteriors/mcmc_posterior.py

def log_prob(
    self, theta: Tensor, x: Optional[Tensor] = None, track_gradients: bool = False
) -> Tensor:
    r"""Returns the log-probability of theta under the posterior.

    Args:
        theta: Parameters $\theta$.
        track_gradients: Whether the returned tensor supports tracking gradients.
            This can be helpful for e.g. sensitivity analysis, but increases memory
            consumption.

    Returns:
        `len($\theta$)`-shaped log-probability.
    """
    warn(
        "`.log_prob()` is deprecated for methods that can only evaluate the "
        "log-probability up to a normalizing constant. Use `.potential()` instead.",
        stacklevel=2,
    )
    warn("The log-probability is unnormalized!", stacklevel=2)

    self.potential_fn.set_x(self._x_else_default_x(x))

    theta = ensure_theta_batched(torch.as_tensor(theta))
    return self.potential_fn(
        theta.to(self._device), track_gradients=track_gradients
    )

`map(x=None, num_iter=1000, num_to_optimize=100, learning_rate=0.01, init_method='proposal', num_init_samples=1000, save_best_every=10, show_progress_bars=False, force_update=False)` ¶

Returns the maximum-a-posteriori estimate (MAP).

The method can be interrupted (Ctrl-C) when the user sees that the log-probability converges. The best estimate will be saved in self._map and can be accessed with self.map(). The MAP is obtained by running gradient ascent from a given number of starting positions (samples from the posterior with the highest log-probability). After the optimization is done, we select the parameter set that has the highest log-probability after the optimization.

Warning: The default values used by this function are not well-tested. They might require hand-tuning for the problem at hand.

For developers: if the prior is a BoxUniform, we carry out the optimization in unbounded space and transform the result back into bounded space.

Parameters:

Name	Type	Description	Default
`x`	`Optional[Tensor]`	Deprecated - use `.set_default_x()` prior to `.map()`.	`None`
`num_iter`	`int`	Number of optimization steps that the algorithm takes to find the MAP.	`1000`
`learning_rate`	`float`	Learning rate of the optimizer.	`0.01`
`init_method`	`Union[str, Tensor]`	How to select the starting parameters for the optimization. If it is a string, it can be either [`posterior`, `prior`], which samples the respective distribution `num_init_samples` times. If it is a tensor, the tensor will be used as init locations.	`'proposal'`
`num_init_samples`	`int`	Draw this number of samples from the posterior and evaluate the log-probability of all of them.	`1000`
`num_to_optimize`	`int`	From the drawn `num_init_samples`, use the `num_to_optimize` with highest log-probability as the initial points for the optimization.	`100`
`save_best_every`	`int`	The best log-probability is computed, saved in the `map`-attribute, and printed every `save_best_every`-th iteration. Computing the best log-probability creates a significant overhead (thus, the default is `10`.)	`10`
`show_progress_bars`	`bool`	Whether to show a progressbar during sampling from the posterior.	`False`
`force_update`	`bool`	Whether to re-calculate the MAP when x is unchanged and have a cached value.	`False`
`log_prob_kwargs`		Will be empty for SNLE and SNRE. Will contain {‘norm_posterior’: True} for SNPE.	required

Returns:

Type	Description
`Tensor`	The MAP estimate.

Source code in sbi/inference/posteriors/mcmc_posterior.py

def map(
    self,
    x: Optional[Tensor] = None,
    num_iter: int = 1_000,
    num_to_optimize: int = 100,
    learning_rate: float = 0.01,
    init_method: Union[str, Tensor] = "proposal",
    num_init_samples: int = 1_000,
    save_best_every: int = 10,
    show_progress_bars: bool = False,
    force_update: bool = False,
) -> Tensor:
    r"""Returns the maximum-a-posteriori estimate (MAP).

    The method can be interrupted (Ctrl-C) when the user sees that the
    log-probability converges. The best estimate will be saved in `self._map` and
    can be accessed with `self.map()`. The MAP is obtained by running gradient
    ascent from a given number of starting positions (samples from the posterior
    with the highest log-probability). After the optimization is done, we select the
    parameter set that has the highest log-probability after the optimization.

    Warning: The default values used by this function are not well-tested. They
    might require hand-tuning for the problem at hand.

    For developers: if the prior is a `BoxUniform`, we carry out the optimization
    in unbounded space and transform the result back into bounded space.

    Args:
        x: Deprecated - use `.set_default_x()` prior to `.map()`.
        num_iter: Number of optimization steps that the algorithm takes
            to find the MAP.
        learning_rate: Learning rate of the optimizer.
        init_method: How to select the starting parameters for the optimization. If
            it is a string, it can be either [`posterior`, `prior`], which samples
            the respective distribution `num_init_samples` times. If it is a
            tensor, the tensor will be used as init locations.
        num_init_samples: Draw this number of samples from the posterior and
            evaluate the log-probability of all of them.
        num_to_optimize: From the drawn `num_init_samples`, use the
            `num_to_optimize` with highest log-probability as the initial points
            for the optimization.
        save_best_every: The best log-probability is computed, saved in the
            `map`-attribute, and printed every `save_best_every`-th iteration.
            Computing the best log-probability creates a significant overhead
            (thus, the default is `10`.)
        show_progress_bars: Whether to show a progressbar during sampling from
            the posterior.
        force_update: Whether to re-calculate the MAP when x is unchanged and
            have a cached value.
        log_prob_kwargs: Will be empty for SNLE and SNRE. Will contain
            {'norm_posterior': True} for SNPE.

    Returns:
        The MAP estimate.
    """
    return super().map(
        x=x,
        num_iter=num_iter,
        num_to_optimize=num_to_optimize,
        learning_rate=learning_rate,
        init_method=init_method,
        num_init_samples=num_init_samples,
        save_best_every=save_best_every,
        show_progress_bars=show_progress_bars,
        force_update=force_update,
    )

`sample(sample_shape=torch.Size(), x=None, method=None, thin=None, warmup_steps=None, num_chains=None, init_strategy=None, init_strategy_parameters=None, init_strategy_num_candidates=None, mcmc_parameters=None, mcmc_method=None, sample_with=None, num_workers=None, mp_context=None, show_progress_bars=True)` ¶

Return samples from posterior distribution $p(\theta|x)$ with MCMC.

Check the __init__() method for a description of all arguments as well as their default values.

Parameters:

Name	Type	Description	Default
`sample_shape`	`Shape`	Desired shape of samples that are drawn from posterior. If sample_shape is multidimensional we simply draw `sample_shape.numel()` samples and then reshape into the desired shape.	`Size()`
`mcmc_parameters`	`Optional[Dict]`	Dictionary that is passed only to support the API of `sbi` v0.17.2 or older.	`None`
`mcmc_method`	`Optional[str]`	This argument only exists to keep backward-compatibility with `sbi` v0.17.2 or older. Please use `method` instead.	`None`
`sample_with`	`Optional[str]`	This argument only exists to keep backward-compatibility with `sbi` v0.17.2 or older. If it is set, we instantly raise an error.	`None`
`show_progress_bars`	`bool`	Whether to show sampling progress monitor.	`True`

Returns:

Type	Description
`Tensor`	Samples from posterior.

Source code in sbi/inference/posteriors/mcmc_posterior.py

def sample(
    self,
    sample_shape: Shape = torch.Size(),
    x: Optional[Tensor] = None,
    method: Optional[str] = None,
    thin: Optional[int] = None,
    warmup_steps: Optional[int] = None,
    num_chains: Optional[int] = None,
    init_strategy: Optional[str] = None,
    init_strategy_parameters: Optional[Dict[str, Any]] = None,
    init_strategy_num_candidates: Optional[int] = None,
    mcmc_parameters: Optional[Dict] = None,
    mcmc_method: Optional[str] = None,
    sample_with: Optional[str] = None,
    num_workers: Optional[int] = None,
    mp_context: Optional[str] = None,
    show_progress_bars: bool = True,
) -> Tensor:
    r"""Return samples from posterior distribution $p(\theta|x)$ with MCMC.

    Check the `__init__()` method for a description of all arguments as well as
    their default values.

    Args:
        sample_shape: Desired shape of samples that are drawn from posterior. If
            sample_shape is multidimensional we simply draw `sample_shape.numel()`
            samples and then reshape into the desired shape.
        mcmc_parameters: Dictionary that is passed only to support the API of
            `sbi` v0.17.2 or older.
        mcmc_method: This argument only exists to keep backward-compatibility with
            `sbi` v0.17.2 or older. Please use `method` instead.
        sample_with: This argument only exists to keep backward-compatibility with
            `sbi` v0.17.2 or older. If it is set, we instantly raise an error.
        show_progress_bars: Whether to show sampling progress monitor.

    Returns:
        Samples from posterior.
    """

    self.potential_fn.set_x(self._x_else_default_x(x))

    # Replace arguments that were not passed with their default.
    method = self.method if method is None else method
    thin = self.thin if thin is None else thin
    warmup_steps = self.warmup_steps if warmup_steps is None else warmup_steps
    num_chains = self.num_chains if num_chains is None else num_chains
    init_strategy = self.init_strategy if init_strategy is None else init_strategy
    num_workers = self.num_workers if num_workers is None else num_workers
    mp_context = self.mp_context if mp_context is None else mp_context
    init_strategy_parameters = (
        self.init_strategy_parameters
        if init_strategy_parameters is None
        else init_strategy_parameters
    )
    if init_strategy_num_candidates is not None:
        warn(
            f"Passing `init_strategy_num_candidates` is deprecated as of sbi \
            v0.19.0. Instead, use e.g., \
            `init_strategy_parameters={'num_candidate_samples': 1000}`",
            stacklevel=2,
        )
        self.init_strategy_parameters["num_candidate_samples"] = (
            init_strategy_num_candidates
        )
    if sample_with is not None:
        raise ValueError(
            f"You set `sample_with={sample_with}`. As of sbi v0.18.0, setting "
            "`sample_with` is no longer supported. You have to rerun "
            f"`.build_posterior(sample_with={sample_with}).`"
        )
    if mcmc_method is not None:
        warn(
            "You passed `mcmc_method` to `.sample()`. As of sbi v0.18.0, this "
            "is deprecated and will be removed in a future release. Use `method` "
            "instead of `mcmc_method`.",
            stacklevel=2,
        )
        method = mcmc_method
    if mcmc_parameters:
        warn(
            "You passed `mcmc_parameters` to `.sample()`. As of sbi v0.18.0, this "
            "is deprecated and will be removed in a future release. Instead, pass "
            "the variable to `.sample()` directly, e.g. "
            "`posterior.sample((1,), num_chains=5)`.",
            stacklevel=2,
        )
    # The following lines are only for backwards compatibility with sbi v0.17.2 or
    # older.
    m_p = mcmc_parameters or {}  # define to shorten the variable name
    method = _maybe_use_dict_entry(method, "mcmc_method", m_p)
    thin = _maybe_use_dict_entry(thin, "thin", m_p)
    warmup_steps = _maybe_use_dict_entry(warmup_steps, "warmup_steps", m_p)
    num_chains = _maybe_use_dict_entry(num_chains, "num_chains", m_p)
    init_strategy = _maybe_use_dict_entry(init_strategy, "init_strategy", m_p)
    self.potential_ = self._prepare_potential(method)  # type: ignore

    initial_params = self._get_initial_params(
        init_strategy,  # type: ignore
        num_chains,  # type: ignore
        num_workers,
        show_progress_bars,
        **init_strategy_parameters,
    )
    num_samples = torch.Size(sample_shape).numel()

    track_gradients = method in ("hmc_pyro", "nuts_pyro", "hmc_pymc", "nuts_pymc")
    with torch.set_grad_enabled(track_gradients):
        if method in ("slice_np", "slice_np_vectorized"):
            transformed_samples = self._slice_np_mcmc(
                num_samples=num_samples,
                potential_function=self.potential_,
                initial_params=initial_params,
                thin=thin,  # type: ignore
                warmup_steps=warmup_steps,  # type: ignore
                vectorized=(method == "slice_np_vectorized"),
                interchangeable_chains=True,
                num_workers=num_workers,
                show_progress_bars=show_progress_bars,
            )
        elif method in ("hmc_pyro", "nuts_pyro"):
            transformed_samples = self._pyro_mcmc(
                num_samples=num_samples,
                potential_function=self.potential_,
                initial_params=initial_params,
                mcmc_method=method,  # type: ignore
                thin=thin,  # type: ignore
                warmup_steps=warmup_steps,  # type: ignore
                num_chains=num_chains,
                show_progress_bars=show_progress_bars,
                mp_context=mp_context,
            )
        elif method in ("hmc_pymc", "nuts_pymc", "slice_pymc"):
            transformed_samples = self._pymc_mcmc(
                num_samples=num_samples,
                potential_function=self.potential_,
                initial_params=initial_params,
                mcmc_method=method,  # type: ignore
                thin=thin,  # type: ignore
                warmup_steps=warmup_steps,  # type: ignore
                num_chains=num_chains,
                show_progress_bars=show_progress_bars,
                mp_context=mp_context,
            )
        else:
            raise NameError(f"The sampling method {method} is not implemented!")

    samples = self.theta_transform.inv(transformed_samples)
    # NOTE: Currently MCMCPosteriors will require a single dimension for the
    # parameter dimension. With recent ConditionalDensity(Ratio) estimators, we
    # can have multiple dimensions for the parameter dimension.
    samples = samples.reshape((*sample_shape, -1))  # type: ignore

    return samples

`sample_batched(sample_shape, x, method=None, thin=None, warmup_steps=None, num_chains=None, init_strategy=None, init_strategy_parameters=None, num_workers=None, mp_context=None, show_progress_bars=True)` ¶

Given a batch of observations [x_1, …, x_B] this function samples from posteriors $p(\theta|x_1)$, … ,$p(\theta|x_B)$, in a batched (i.e. vectorized) manner.

Check the __init__() method for a description of all arguments as well as their default values.

Parameters:

Name	Type	Description	Default
`sample_shape`	`Shape`	Desired shape of samples that are drawn from the posterior given every observation.	required
`x`	`Tensor`	A batch of observations, of shape `(batch_dim, event_shape_x)`. `batch_dim` corresponds to the number of observations to be drawn.	required
`method`	`Optional[str]`	Method used for MCMC sampling, e.g., “slice_np_vectorized”.	`None`
`thin`	`Optional[int]`	The thinning factor for the chain, default 1 (no thinning).	`None`
`warmup_steps`	`Optional[int]`	The initial number of samples to discard.	`None`
`num_chains`	`Optional[int]`	The number of chains used for each `x` passed in the batch.	`None`
`init_strategy`	`Optional[str]`	The initialisation strategy for chains.	`None`
`init_strategy_parameters`	`Optional[Dict[str, Any]]`	Dictionary of keyword arguments passed to the init strategy.	`None`
`num_workers`	`Optional[int]`	number of cpu cores used to parallelize initial parameter generation and mcmc sampling.	`None`
`mp_context`	`Optional[str]`	Multiprocessing start method, either `"fork"` or `"spawn"`	`None`
`show_progress_bars`	`bool`	Whether to show sampling progress monitor.	`True`

Returns:

Type	Description
`Tensor`	Samples from the posteriors of shape (sample_shape, B, input_shape)

Source code in sbi/inference/posteriors/mcmc_posterior.py

def sample_batched(
    self,
    sample_shape: Shape,
    x: Tensor,
    method: Optional[str] = None,
    thin: Optional[int] = None,
    warmup_steps: Optional[int] = None,
    num_chains: Optional[int] = None,
    init_strategy: Optional[str] = None,
    init_strategy_parameters: Optional[Dict[str, Any]] = None,
    num_workers: Optional[int] = None,
    mp_context: Optional[str] = None,
    show_progress_bars: bool = True,
) -> Tensor:
    r"""Given a batch of observations [x_1, ..., x_B] this function samples from
    posteriors $p(\theta|x_1)$, ... ,$p(\theta|x_B)$, in a batched (i.e. vectorized)
    manner.

    Check the `__init__()` method for a description of all arguments as well as
    their default values.

    Args:
        sample_shape: Desired shape of samples that are drawn from the posterior
            given every observation.
        x: A batch of observations, of shape `(batch_dim, event_shape_x)`.
            `batch_dim` corresponds to the number of observations to be
            drawn.
        method: Method used for MCMC sampling, e.g., "slice_np_vectorized".
        thin: The thinning factor for the chain, default 1 (no thinning).
        warmup_steps: The initial number of samples to discard.
        num_chains: The number of chains used for each `x` passed in the batch.
        init_strategy: The initialisation strategy for chains.
        init_strategy_parameters: Dictionary of keyword arguments passed to
            the init strategy.
        num_workers: number of cpu cores used to parallelize initial
            parameter generation and mcmc sampling.
        mp_context: Multiprocessing start method, either `"fork"` or `"spawn"`
        show_progress_bars: Whether to show sampling progress monitor.

    Returns:
        Samples from the posteriors of shape (*sample_shape, B, *input_shape)
    """

    # Replace arguments that were not passed with their default.
    method = self.method if method is None else method
    thin = self.thin if thin is None else thin
    warmup_steps = self.warmup_steps if warmup_steps is None else warmup_steps
    num_chains = self.num_chains if num_chains is None else num_chains
    init_strategy = self.init_strategy if init_strategy is None else init_strategy
    num_workers = self.num_workers if num_workers is None else num_workers
    mp_context = self.mp_context if mp_context is None else mp_context
    init_strategy_parameters = (
        self.init_strategy_parameters
        if init_strategy_parameters is None
        else init_strategy_parameters
    )

    assert method == "slice_np_vectorized", (
        "Batched sampling only supported for vectorized samplers!"
    )

    # warn if num_chains is larger than num requested samples
    if num_chains > torch.Size(sample_shape).numel():
        warnings.warn(
            "The passed number of MCMC chains is larger than the number of "
            f"requested samples: {num_chains} > {torch.Size(sample_shape).numel()},"
            f" resetting it to {torch.Size(sample_shape).numel()}.",
            stacklevel=2,
        )
        num_chains = torch.Size(sample_shape).numel()

    # custom shape handling to make sure to match the batch size of x and theta
    # without unnecessary combinations.
    if len(x.shape) == 1:
        x = x.unsqueeze(0)
    batch_size = x.shape[0]

    x = reshape_to_batch_event(x, event_shape=x.shape[1:])

    # For batched sampling, we want `num_chains` for each observation in the batch.
    # Here we repeat the observations ABC -> AAABBBCCC, so that the chains are
    # in the order of the observations.
    x_ = x.repeat_interleave(num_chains, dim=0)

    self.potential_fn.set_x(x_, x_is_iid=False)
    self.potential_ = self._prepare_potential(method)  # type: ignore

    # For each observation in the batch, we have num_chains independent chains.
    num_chains_extended = batch_size * num_chains
    if num_chains_extended > 100:
        warnings.warn(
            "Note that for batched sampling, we use num_chains many chains "
            "for each x in the batch. With the given settings, this results "
            f"in a large number of chains ({num_chains_extended}), which can "
            "be slow and memory-intensive for vectorized MCMC. Consider "
            "reducing the number of chains or batch size.",
            stacklevel=2,
        )
    init_strategy_parameters["num_return_samples"] = num_chains_extended
    initial_params = self._get_initial_params_batched(
        x,
        init_strategy,  # type: ignore
        num_chains,  # type: ignore
        num_workers,
        show_progress_bars,
        **init_strategy_parameters,
    )
    # We need num_samples from each posterior in the batch
    num_samples = torch.Size(sample_shape).numel() * batch_size

    with torch.set_grad_enabled(False):
        transformed_samples = self._slice_np_mcmc(
            num_samples=num_samples,
            potential_function=self.potential_,
            initial_params=initial_params,
            thin=thin,  # type: ignore
            warmup_steps=warmup_steps,  # type: ignore
            vectorized=(method == "slice_np_vectorized"),
            interchangeable_chains=False,
            num_workers=num_workers,
            show_progress_bars=show_progress_bars,
        )

    # (num_chains_extended, samples_per_chain, *input_shape)
    samples_per_chain: Tensor = self.theta_transform.inv(transformed_samples)  # type: ignore
    dim_theta = samples_per_chain.shape[-1]
    # We need to collect samples for each x from the respective chains.
    # However, using samples.reshape(*sample_shape, batch_size, dim_theta)
    # does not combine the samples in the right order, since this mixes
    # samples that belong to different `x`. The following permute is a
    # workaround to reshape the samples in the right order.
    samples_per_x = samples_per_chain.reshape((
        batch_size,
        # We are flattening the sample shape here using -1 because we might have
        # generated more samples than requested (more chains, or multiple of
        # chains not matching sample_shape)
        -1,
        dim_theta,
    )).permute(1, 0, -1)

    # Shape is now (-1, batch_size, dim_theta)
    # We can now select the number of requested samples
    samples = samples_per_x[: torch.Size(sample_shape).numel()]
    # and reshape into (*sample_shape, batch_size, dim_theta)
    samples = samples.reshape((*sample_shape, batch_size, dim_theta))
    return samples

`set_mcmc_method(method)` ¶

Sets sampling method to for MCMC and returns NeuralPosterior.

Parameters:

Name	Type	Description	Default
`method`	`str`	Method to use.	required

Returns:

Type	Description
`NeuralPosterior`	`NeuralPosterior` for chainable calls.

Source code in sbi/inference/posteriors/mcmc_posterior.py

def set_mcmc_method(self, method: str) -> "NeuralPosterior":
    """Sets sampling method to for MCMC and returns `NeuralPosterior`.

    Args:
        method: Method to use.

    Returns:
        `NeuralPosterior` for chainable calls.
    """
    self._mcmc_method = method
    return self

`to(device)` ¶

Moves potential_fn, proposal, x_o and theta_transform to the

specified device. Reinstantiates the posterior and resets the default x_o.

Parameters:

Name	Type	Description	Default
`device`	`Union[str, device]`	Device to move the posterior to.	required

Source code in sbi/inference/posteriors/mcmc_posterior.py

def to(self, device: Union[str, torch.device]) -> None:
    """Moves potential_fn, proposal, x_o and theta_transform to the

    specified device. Reinstantiates the posterior and resets the default x_o.

    Args:
        device: Device to move the posterior to.
    """
    self.device = device
    self.potential_fn.to(device)  # type: ignore
    self.proposal.to(device)
    x_o = None
    if hasattr(self, "_x") and (self._x is not None):
        x_o = self._x.to(device)

    self.theta_transform = mcmc_transform(self.proposal, device=device)

    super().__init__(
        self.potential_fn,
        theta_transform=self.theta_transform,
        device=device,
        x_shape=self.x_shape,
    )
    # super().__init__ erases the self._x, so we need to set it again
    if x_o is not None:
        self.set_default_x(x_o)
    self.potential_ = self._prepare_potential(self.method)

`RejectionPosterior` ¶

Bases: NeuralPosterior

Provides rejection sampling to sample from the posterior.

SNLE or SNRE train neural networks to approximate the likelihood(-ratios). RejectionPosterior allows to sample from the posterior with rejection sampling.

Source code in sbi/inference/posteriors/rejection_posterior.py

class RejectionPosterior(NeuralPosterior):
    r"""Provides rejection sampling to sample from the posterior.

    SNLE or SNRE train neural networks to approximate the likelihood(-ratios).
    `RejectionPosterior` allows to sample from the posterior with rejection sampling.
    """

    def __init__(
        self,
        potential_fn: Union[BasePotential, CustomPotential],
        proposal: Any,
        theta_transform: Optional[TorchTransform] = None,
        max_sampling_batch_size: int = 10_000,
        num_samples_to_find_max: int = 10_000,
        num_iter_to_find_max: int = 100,
        m: float = 1.2,
        device: Optional[Union[str, torch.device]] = None,
        x_shape: Optional[torch.Size] = None,
    ):
        """
        Args:
            potential_fn: The potential function from which to draw samples. Must be a
                `BasePotential` or a `CustomPotential`.
            proposal: The proposal distribution.
            theta_transform: Transformation that is applied to parameters. Is not used
                during but only when calling `.map()`.
            max_sampling_batch_size: The batchsize of samples being drawn from
                the proposal at every iteration.
            num_samples_to_find_max: The number of samples that are used to find the
                maximum of the `potential_fn / proposal` ratio.
            num_iter_to_find_max: The number of gradient ascent iterations to find the
                maximum of the `potential_fn / proposal` ratio.
            m: Multiplier to the `potential_fn / proposal` ratio.
            device: Training device, e.g., "cpu", "cuda" or "cuda:0". If None,
                `potential_fn.device` is used.
            x_shape: Deprecated, should not be passed.
        """
        super().__init__(
            potential_fn,
            theta_transform=theta_transform,
            device=device,
            x_shape=x_shape,
        )

        self.proposal = proposal
        self.max_sampling_batch_size = max_sampling_batch_size
        self.num_samples_to_find_max = num_samples_to_find_max
        self.num_iter_to_find_max = num_iter_to_find_max
        self.m = m
        self.x_shape = x_shape

        self._purpose = (
            "It provides rejection sampling to .sample() from the posterior and "
            "can evaluate the _unnormalized_ posterior density with .log_prob()."
        )

    def to(self, device: Union[str, torch.device]) -> None:
        """
        Move potential fucntion, proposal and x_o to the device.

        This method reinstantiates the posterior and resets the default x_o

        Args:
            device: The device to move the posterior to.
        """
        self.device = device
        self.potential_fn.to(device)  # type: ignore
        self.proposal.to(device)
        x_o = None
        if hasattr(self, "_x") and (self._x is not None):
            x_o = self._x.to(device)

        self.theta_transform = mcmc_transform(self.proposal, device=device)
        super().__init__(
            self.potential_fn,
            theta_transform=self.theta_transform,
            device=device,
            x_shape=self.x_shape,
        )
        # super().__init__ erases the self._x, so we need to set it again
        if x_o is not None:
            self.set_default_x(x_o)

    def log_prob(
        self, theta: Tensor, x: Optional[Tensor] = None, track_gradients: bool = False
    ) -> Tensor:
        r"""Returns the log-probability of theta under the posterior.

        Args:
            theta: Parameters $\theta$.
            track_gradients: Whether the returned tensor supports tracking gradients.
                This can be helpful for e.g. sensitivity analysis, but increases memory
                consumption.

        Returns:
            `len($\theta$)`-shaped log-probability.
        """
        warn(
            "`.log_prob()` is deprecated for methods that can only evaluate the "
            "log-probability up to a normalizing constant. Use `.potential()` instead.",
            stacklevel=2,
        )
        warn("The log-probability is unnormalized!", stacklevel=2)

        self.potential_fn.set_x(self._x_else_default_x(x))

        theta = ensure_theta_batched(torch.as_tensor(theta))
        return self.potential_fn(
            theta.to(self._device), track_gradients=track_gradients
        )

    def sample(
        self,
        sample_shape: Shape = torch.Size(),
        x: Optional[Tensor] = None,
        max_sampling_batch_size: Optional[int] = None,
        num_samples_to_find_max: Optional[int] = None,
        num_iter_to_find_max: Optional[int] = None,
        m: Optional[float] = None,
        sample_with: Optional[str] = None,
        show_progress_bars: bool = True,
    ):
        r"""Return samples from posterior $p(\theta|x)$ via rejection sampling.

        Args:
            sample_shape: Desired shape of samples that are drawn from posterior. If
                sample_shape is multidimensional we simply draw `sample_shape.numel()`
                samples and then reshape into the desired shape.
            sample_with: This argument only exists to keep backward-compatibility with
                `sbi` v0.17.2 or older. If it is set, we instantly raise an error.
            show_progress_bars: Whether to show sampling progress monitor.

        Returns:
            Samples from posterior.
        """
        num_samples = torch.Size(sample_shape).numel()
        self.potential_fn.set_x(self._x_else_default_x(x))

        potential = partial(self.potential_fn, track_gradients=True)

        if sample_with is not None:
            raise ValueError(
                f"You set `sample_with={sample_with}`. As of sbi v0.18.0, setting "
                f"`sample_with` is no longer supported. You have to rerun "
                f"`.build_posterior(sample_with={sample_with}).`"
            )
        # Replace arguments that were not passed with their default.
        max_sampling_batch_size = (
            self.max_sampling_batch_size
            if max_sampling_batch_size is None
            else max_sampling_batch_size
        )
        num_samples_to_find_max = (
            self.num_samples_to_find_max
            if num_samples_to_find_max is None
            else num_samples_to_find_max
        )
        num_iter_to_find_max = (
            self.num_iter_to_find_max
            if num_iter_to_find_max is None
            else num_iter_to_find_max
        )
        m = self.m if m is None else m

        samples, _ = rejection_sample(
            potential,
            proposal=self.proposal,
            num_samples=num_samples,
            show_progress_bars=show_progress_bars,
            warn_acceptance=0.01,
            max_sampling_batch_size=max_sampling_batch_size,
            num_samples_to_find_max=num_samples_to_find_max,
            num_iter_to_find_max=num_iter_to_find_max,
            m=m,
            device=self._device,
        )

        return samples.reshape((*sample_shape, -1))

    def sample_batched(
        self,
        sample_shape: Shape,
        x: Tensor,
        max_sampling_batch_size: int = 10000,
        show_progress_bars: bool = True,
    ) -> Tensor:
        raise NotImplementedError(
            "Batched sampling is not implemented for RejectionPosterior. \
            Alternatively you can use `sample` in a loop \
            [posterior.sample(theta, x_o) for x_o in x]."
        )

    def map(
        self,
        x: Optional[Tensor] = None,
        num_iter: int = 1_000,
        num_to_optimize: int = 100,
        learning_rate: float = 0.01,
        init_method: Union[str, Tensor] = "proposal",
        num_init_samples: int = 1_000,
        save_best_every: int = 10,
        show_progress_bars: bool = False,
        force_update: bool = False,
    ) -> Tensor:
        r"""Returns the maximum-a-posteriori estimate (MAP).

        The method can be interrupted (Ctrl-C) when the user sees that the
        log-probability converges. The best estimate will be saved in `self._map` and
        can be accessed with `self.map()`. The MAP is obtained by running gradient
        ascent from a given number of starting positions (samples from the posterior
        with the highest log-probability). After the optimization is done, we select the
        parameter set that has the highest log-probability after the optimization.

        Warning: The default values used by this function are not well-tested. They
        might require hand-tuning for the problem at hand.

        For developers: if the prior is a `BoxUniform`, we carry out the optimization
        in unbounded space and transform the result back into bounded space.

        Args:
            x: Deprecated - use `.set_default_x()` prior to `.map()`.
            num_iter: Number of optimization steps that the algorithm takes
                to find the MAP.
            learning_rate: Learning rate of the optimizer.
            init_method: How to select the starting parameters for the optimization. If
                it is a string, it can be either [`posterior`, `prior`], which samples
                the respective distribution `num_init_samples` times. If it is a
                tensor, the tensor will be used as init locations.
            num_init_samples: Draw this number of samples from the posterior and
                evaluate the log-probability of all of them.
            num_to_optimize: From the drawn `num_init_samples`, use the
                `num_to_optimize` with highest log-probability as the initial points
                for the optimization.
            save_best_every: The best log-probability is computed, saved in the
                `map`-attribute, and printed every `save_best_every`-th iteration.
                Computing the best log-probability creates a significant overhead
                (thus, the default is `10`.)
            show_progress_bars: Whether to show a progressbar during sampling from
                the posterior.
            force_update: Whether to re-calculate the MAP when x is unchanged and
                have a cached value.
            log_prob_kwargs: Will be empty for SNLE and SNRE. Will contain
                {'norm_posterior': True} for SNPE.

        Returns:
            The MAP estimate.
        """
        return super().map(
            x=x,
            num_iter=num_iter,
            num_to_optimize=num_to_optimize,
            learning_rate=learning_rate,
            init_method=init_method,
            num_init_samples=num_init_samples,
            save_best_every=save_best_every,
            show_progress_bars=show_progress_bars,
            force_update=force_update,
        )

`init(potential_fn, proposal, theta_transform=None, max_sampling_batch_size=10000, num_samples_to_find_max=10000, num_iter_to_find_max=100, m=1.2, device=None, x_shape=None)` ¶

Parameters:

Name	Type	Description	Default
`potential_fn`	`Union[BasePotential, CustomPotential]`	The potential function from which to draw samples. Must be a `BasePotential` or a `CustomPotential`.	required
`proposal`	`Any`	The proposal distribution.	required
`theta_transform`	`Optional[TorchTransform]`	Transformation that is applied to parameters. Is not used during but only when calling `.map()`.	`None`
`max_sampling_batch_size`	`int`	The batchsize of samples being drawn from the proposal at every iteration.	`10000`
`num_samples_to_find_max`	`int`	The number of samples that are used to find the maximum of the `potential_fn / proposal` ratio.	`10000`
`num_iter_to_find_max`	`int`	The number of gradient ascent iterations to find the maximum of the `potential_fn / proposal` ratio.	`100`
`m`	`float`	Multiplier to the `potential_fn / proposal` ratio.	`1.2`
`device`	`Optional[Union[str, device]]`	Training device, e.g., “cpu”, “cuda” or “cuda:0”. If None, `potential_fn.device` is used.	`None`
`x_shape`	`Optional[Size]`	Deprecated, should not be passed.	`None`

Source code in sbi/inference/posteriors/rejection_posterior.py

def __init__(
    self,
    potential_fn: Union[BasePotential, CustomPotential],
    proposal: Any,
    theta_transform: Optional[TorchTransform] = None,
    max_sampling_batch_size: int = 10_000,
    num_samples_to_find_max: int = 10_000,
    num_iter_to_find_max: int = 100,
    m: float = 1.2,
    device: Optional[Union[str, torch.device]] = None,
    x_shape: Optional[torch.Size] = None,
):
    """
    Args:
        potential_fn: The potential function from which to draw samples. Must be a
            `BasePotential` or a `CustomPotential`.
        proposal: The proposal distribution.
        theta_transform: Transformation that is applied to parameters. Is not used
            during but only when calling `.map()`.
        max_sampling_batch_size: The batchsize of samples being drawn from
            the proposal at every iteration.
        num_samples_to_find_max: The number of samples that are used to find the
            maximum of the `potential_fn / proposal` ratio.
        num_iter_to_find_max: The number of gradient ascent iterations to find the
            maximum of the `potential_fn / proposal` ratio.
        m: Multiplier to the `potential_fn / proposal` ratio.
        device: Training device, e.g., "cpu", "cuda" or "cuda:0". If None,
            `potential_fn.device` is used.
        x_shape: Deprecated, should not be passed.
    """
    super().__init__(
        potential_fn,
        theta_transform=theta_transform,
        device=device,
        x_shape=x_shape,
    )

    self.proposal = proposal
    self.max_sampling_batch_size = max_sampling_batch_size
    self.num_samples_to_find_max = num_samples_to_find_max
    self.num_iter_to_find_max = num_iter_to_find_max
    self.m = m
    self.x_shape = x_shape

    self._purpose = (
        "It provides rejection sampling to .sample() from the posterior and "
        "can evaluate the _unnormalized_ posterior density with .log_prob()."
    )

`log_prob(theta, x=None, track_gradients=False)` ¶

Returns the log-probability of theta under the posterior.

Parameters:

Name	Type	Description	Default
`theta`	`Tensor`	Parameters $\theta$.	required
`track_gradients`	`bool`	Whether the returned tensor supports tracking gradients. This can be helpful for e.g. sensitivity analysis, but increases memory consumption.	`False`

Returns:

Type	Description
`Tensor`	`len($\theta$)`-shaped log-probability.

Source code in sbi/inference/posteriors/rejection_posterior.py

def log_prob(
    self, theta: Tensor, x: Optional[Tensor] = None, track_gradients: bool = False
) -> Tensor:
    r"""Returns the log-probability of theta under the posterior.

    Args:
        theta: Parameters $\theta$.
        track_gradients: Whether the returned tensor supports tracking gradients.
            This can be helpful for e.g. sensitivity analysis, but increases memory
            consumption.

    Returns:
        `len($\theta$)`-shaped log-probability.
    """
    warn(
        "`.log_prob()` is deprecated for methods that can only evaluate the "
        "log-probability up to a normalizing constant. Use `.potential()` instead.",
        stacklevel=2,
    )
    warn("The log-probability is unnormalized!", stacklevel=2)

    self.potential_fn.set_x(self._x_else_default_x(x))

    theta = ensure_theta_batched(torch.as_tensor(theta))
    return self.potential_fn(
        theta.to(self._device), track_gradients=track_gradients
    )

`map(x=None, num_iter=1000, num_to_optimize=100, learning_rate=0.01, init_method='proposal', num_init_samples=1000, save_best_every=10, show_progress_bars=False, force_update=False)` ¶

Returns the maximum-a-posteriori estimate (MAP).

The method can be interrupted (Ctrl-C) when the user sees that the log-probability converges. The best estimate will be saved in self._map and can be accessed with self.map(). The MAP is obtained by running gradient ascent from a given number of starting positions (samples from the posterior with the highest log-probability). After the optimization is done, we select the parameter set that has the highest log-probability after the optimization.

Warning: The default values used by this function are not well-tested. They might require hand-tuning for the problem at hand.

For developers: if the prior is a BoxUniform, we carry out the optimization in unbounded space and transform the result back into bounded space.

Parameters:

Name	Type	Description	Default
`x`	`Optional[Tensor]`	Deprecated - use `.set_default_x()` prior to `.map()`.	`None`
`num_iter`	`int`	Number of optimization steps that the algorithm takes to find the MAP.	`1000`
`learning_rate`	`float`	Learning rate of the optimizer.	`0.01`
`init_method`	`Union[str, Tensor]`	How to select the starting parameters for the optimization. If it is a string, it can be either [`posterior`, `prior`], which samples the respective distribution `num_init_samples` times. If it is a tensor, the tensor will be used as init locations.	`'proposal'`
`num_init_samples`	`int`	Draw this number of samples from the posterior and evaluate the log-probability of all of them.	`1000`
`num_to_optimize`	`int`	From the drawn `num_init_samples`, use the `num_to_optimize` with highest log-probability as the initial points for the optimization.	`100`
`save_best_every`	`int`	The best log-probability is computed, saved in the `map`-attribute, and printed every `save_best_every`-th iteration. Computing the best log-probability creates a significant overhead (thus, the default is `10`.)	`10`
`show_progress_bars`	`bool`	Whether to show a progressbar during sampling from the posterior.	`False`
`force_update`	`bool`	Whether to re-calculate the MAP when x is unchanged and have a cached value.	`False`
`log_prob_kwargs`		Will be empty for SNLE and SNRE. Will contain {‘norm_posterior’: True} for SNPE.	required

Returns:

Type	Description
`Tensor`	The MAP estimate.

Source code in sbi/inference/posteriors/rejection_posterior.py

def map(
    self,
    x: Optional[Tensor] = None,
    num_iter: int = 1_000,
    num_to_optimize: int = 100,
    learning_rate: float = 0.01,
    init_method: Union[str, Tensor] = "proposal",
    num_init_samples: int = 1_000,
    save_best_every: int = 10,
    show_progress_bars: bool = False,
    force_update: bool = False,
) -> Tensor:
    r"""Returns the maximum-a-posteriori estimate (MAP).

    The method can be interrupted (Ctrl-C) when the user sees that the
    log-probability converges. The best estimate will be saved in `self._map` and
    can be accessed with `self.map()`. The MAP is obtained by running gradient
    ascent from a given number of starting positions (samples from the posterior
    with the highest log-probability). After the optimization is done, we select the
    parameter set that has the highest log-probability after the optimization.

    Warning: The default values used by this function are not well-tested. They
    might require hand-tuning for the problem at hand.

    For developers: if the prior is a `BoxUniform`, we carry out the optimization
    in unbounded space and transform the result back into bounded space.

    Args:
        x: Deprecated - use `.set_default_x()` prior to `.map()`.
        num_iter: Number of optimization steps that the algorithm takes
            to find the MAP.
        learning_rate: Learning rate of the optimizer.
        init_method: How to select the starting parameters for the optimization. If
            it is a string, it can be either [`posterior`, `prior`], which samples
            the respective distribution `num_init_samples` times. If it is a
            tensor, the tensor will be used as init locations.
        num_init_samples: Draw this number of samples from the posterior and
            evaluate the log-probability of all of them.
        num_to_optimize: From the drawn `num_init_samples`, use the
            `num_to_optimize` with highest log-probability as the initial points
            for the optimization.
        save_best_every: The best log-probability is computed, saved in the
            `map`-attribute, and printed every `save_best_every`-th iteration.
            Computing the best log-probability creates a significant overhead
            (thus, the default is `10`.)
        show_progress_bars: Whether to show a progressbar during sampling from
            the posterior.
        force_update: Whether to re-calculate the MAP when x is unchanged and
            have a cached value.
        log_prob_kwargs: Will be empty for SNLE and SNRE. Will contain
            {'norm_posterior': True} for SNPE.

    Returns:
        The MAP estimate.
    """
    return super().map(
        x=x,
        num_iter=num_iter,
        num_to_optimize=num_to_optimize,
        learning_rate=learning_rate,
        init_method=init_method,
        num_init_samples=num_init_samples,
        save_best_every=save_best_every,
        show_progress_bars=show_progress_bars,
        force_update=force_update,
    )

`sample(sample_shape=torch.Size(), x=None, max_sampling_batch_size=None, num_samples_to_find_max=None, num_iter_to_find_max=None, m=None, sample_with=None, show_progress_bars=True)` ¶

Return samples from posterior $p(\theta|x)$ via rejection sampling.

Parameters:

Name	Type	Description	Default
`sample_shape`	`Shape`	Desired shape of samples that are drawn from posterior. If sample_shape is multidimensional we simply draw `sample_shape.numel()` samples and then reshape into the desired shape.	`Size()`
`sample_with`	`Optional[str]`	This argument only exists to keep backward-compatibility with `sbi` v0.17.2 or older. If it is set, we instantly raise an error.	`None`
`show_progress_bars`	`bool`	Whether to show sampling progress monitor.	`True`

Returns:

Type	Description
	Samples from posterior.

Source code in sbi/inference/posteriors/rejection_posterior.py

def sample(
    self,
    sample_shape: Shape = torch.Size(),
    x: Optional[Tensor] = None,
    max_sampling_batch_size: Optional[int] = None,
    num_samples_to_find_max: Optional[int] = None,
    num_iter_to_find_max: Optional[int] = None,
    m: Optional[float] = None,
    sample_with: Optional[str] = None,
    show_progress_bars: bool = True,
):
    r"""Return samples from posterior $p(\theta|x)$ via rejection sampling.

    Args:
        sample_shape: Desired shape of samples that are drawn from posterior. If
            sample_shape is multidimensional we simply draw `sample_shape.numel()`
            samples and then reshape into the desired shape.
        sample_with: This argument only exists to keep backward-compatibility with
            `sbi` v0.17.2 or older. If it is set, we instantly raise an error.
        show_progress_bars: Whether to show sampling progress monitor.

    Returns:
        Samples from posterior.
    """
    num_samples = torch.Size(sample_shape).numel()
    self.potential_fn.set_x(self._x_else_default_x(x))

    potential = partial(self.potential_fn, track_gradients=True)

    if sample_with is not None:
        raise ValueError(
            f"You set `sample_with={sample_with}`. As of sbi v0.18.0, setting "
            f"`sample_with` is no longer supported. You have to rerun "
            f"`.build_posterior(sample_with={sample_with}).`"
        )
    # Replace arguments that were not passed with their default.
    max_sampling_batch_size = (
        self.max_sampling_batch_size
        if max_sampling_batch_size is None
        else max_sampling_batch_size
    )
    num_samples_to_find_max = (
        self.num_samples_to_find_max
        if num_samples_to_find_max is None
        else num_samples_to_find_max
    )
    num_iter_to_find_max = (
        self.num_iter_to_find_max
        if num_iter_to_find_max is None
        else num_iter_to_find_max
    )
    m = self.m if m is None else m

    samples, _ = rejection_sample(
        potential,
        proposal=self.proposal,
        num_samples=num_samples,
        show_progress_bars=show_progress_bars,
        warn_acceptance=0.01,
        max_sampling_batch_size=max_sampling_batch_size,
        num_samples_to_find_max=num_samples_to_find_max,
        num_iter_to_find_max=num_iter_to_find_max,
        m=m,
        device=self._device,
    )

    return samples.reshape((*sample_shape, -1))

`to(device)` ¶

Move potential fucntion, proposal and x_o to the device.

This method reinstantiates the posterior and resets the default x_o

Parameters:

Name	Type	Description	Default
`device`	`Union[str, device]`	The device to move the posterior to.	required

Source code in sbi/inference/posteriors/rejection_posterior.py

def to(self, device: Union[str, torch.device]) -> None:
    """
    Move potential fucntion, proposal and x_o to the device.

    This method reinstantiates the posterior and resets the default x_o

    Args:
        device: The device to move the posterior to.
    """
    self.device = device
    self.potential_fn.to(device)  # type: ignore
    self.proposal.to(device)
    x_o = None
    if hasattr(self, "_x") and (self._x is not None):
        x_o = self._x.to(device)

    self.theta_transform = mcmc_transform(self.proposal, device=device)
    super().__init__(
        self.potential_fn,
        theta_transform=self.theta_transform,
        device=device,
        x_shape=self.x_shape,
    )
    # super().__init__ erases the self._x, so we need to set it again
    if x_o is not None:
        self.set_default_x(x_o)

`VectorFieldPosterior` ¶

Bases: NeuralPosterior

Posterior based on flow- or score-matching estimators.

This posterior samples from the vector field model - typically a score-based or a flow matching model - given the vector_field_estimator and rejects samples that lie outside of the prior bounds.

The posterior is defined by a vector field estimator and a prior. The vector field estimator defines a continuous transformation from a base distribution to the approximated posterior distribution. Sampling is done by running either an ordinary differential equation (ODE) or a stochastic differential equation (SDE) defined by the vector field estimator with the starting points sampled from the base distribution.

Log probabilities are obtained by calling the potential function, which in turn uses the ODE to compute the log-probability.

Source code in sbi/inference/posteriors/vector_field_posterior.py

class VectorFieldPosterior(NeuralPosterior):
    r"""Posterior based on flow- or score-matching estimators.

    This posterior samples from the vector field model - typically a score-based or a
    flow matching model - given the `vector_field_estimator` and rejects samples that
    lie outside of the prior bounds.

    The posterior is defined by a vector field estimator and a prior. The vector field
    estimator defines a continuous transformation from a base distribution to the
    approximated posterior distribution. Sampling is done by running either
    an ordinary differential equation (ODE) or a stochastic differential equation
    (SDE) defined by the vector field estimator with the starting points sampled from
    the base distribution.

    Log probabilities are obtained by calling the potential function, which in turn uses
    the ODE to compute the log-probability.
    """

    def __init__(
        self,
        vector_field_estimator: ConditionalVectorFieldEstimator,
        prior: Distribution,  # type: ignore
        max_sampling_batch_size: int = 10_000,
        device: Optional[Union[str, torch.device]] = None,
        enable_transform: bool = True,
        sample_with: Literal["ode", "sde"] = "sde",
        **kwargs,
    ):
        """
        Args:
            prior: Prior distribution with `.log_prob()` and `.sample()`.
            vector_field_estimator: The trained vector field estimator.
            max_sampling_batch_size: Batchsize of samples being drawn from
                the proposal at every iteration.
            device: Training device, e.g., "cpu", "cuda" or "cuda:0". If None,
                `potential_fn.device` is used.
            enable_transform: Whether to transform parameters to unconstrained space
                during MAP optimization. When False, an identity transform will be
                returned for `theta_transform`. True is not supported yet.
            sample_with: Whether to sample from the posterior using the ODE-based
                sampler or the SDE-based sampler.
            **kwargs: Additional keyword arguments passed to
                `VectorFieldBasedPotential`.
        """

        check_prior(prior)
        potential_fn, theta_transform = vector_field_estimator_based_potential(
            vector_field_estimator,
            prior,
            x_o=None,
            enable_transform=enable_transform,
            **kwargs,
        )
        super().__init__(
            potential_fn=potential_fn,
            theta_transform=theta_transform,
            device=device,
        )
        # Set the potential function type.
        self.potential_fn: VectorFieldBasedPotential = potential_fn

        self.prior = prior
        self.enable_transform = enable_transform
        self.vector_field_estimator = vector_field_estimator
        self.device = device

        self.sample_with = sample_with
        assert self.sample_with in [
            "ode",
            "sde",
        ], f"sample_with must be 'ode' or 'sde', but is {self.sample_with}."
        self.max_sampling_batch_size = max_sampling_batch_size

        self._purpose = """It samples from the vector field model given the \
            vector_field_estimator."""

    def to(self, device: Union[str, torch.device]) -> None:
        """Move posterior to device.

        Args:
            device: device where to move the posterior to.
        """
        self.device = device
        if hasattr(self.prior, "to"):
            self.prior.to(device)  # type: ignore
        else:
            raise ValueError("""Prior has no attribute to(device).""")
        if hasattr(self.vector_field_estimator, "to"):
            self.vector_field_estimator.to(device)
        else:
            raise ValueError("""Posterior estimator has no attribute to(device).""")

        potential_fn, theta_transform = vector_field_estimator_based_potential(
            self.vector_field_estimator,
            self.prior,
            x_o=None,
            enable_transform=self.enable_transform,
        )
        x_o = None
        if hasattr(self, "_x") and (self._x is not None):
            x_o = self._x.to(device)
        super().__init__(
            potential_fn=potential_fn,
            theta_transform=theta_transform,
            device=device,
        )
        # super().__init__ erases the self._x, so we need to set it again
        if x_o is not None:
            self.set_default_x(x_o)

        self.potential_fn: VectorFieldBasedPotential = potential_fn

    def sample(
        self,
        sample_shape: Shape = torch.Size(),
        x: Optional[Tensor] = None,
        predictor: Union[str, Predictor] = "euler_maruyama",
        corrector: Optional[Union[str, Corrector]] = None,
        predictor_params: Optional[Dict] = None,
        corrector_params: Optional[Dict] = None,
        steps: int = 500,
        ts: Optional[Tensor] = None,
        iid_method: Optional[
            Literal["fnpe", "gauss", "auto_gauss", "jac_gauss"]
        ] = None,
        iid_params: Optional[Dict] = None,
        max_sampling_batch_size: int = 10_000,
        sample_with: Optional[str] = None,
        show_progress_bars: bool = True,
    ) -> Tensor:
        r"""Return samples from posterior distribution $p(\theta|x)$.

        Args:
            sample_shape: Shape of the samples to be drawn.
            predictor: The predictor for the vector field sampler. Can be a string or
                a custom predictor following the API in `sbi.samplers.score.predictors`.
                Currently, only `euler_maruyama` is implemented.
            corrector: The corrector for the vector field sampler. Either of
                [None].
            predictor_params: Additional parameters passed to predictor.
            corrector_params: Additional parameters passed to corrector.
            steps: Number of steps to take for the Euler-Maruyama method.
                If `sample_with` is "ode", this is ignored.
            ts: Time points at which to evaluate the vector field process. If None, a
                linear grid between t_max and t_min is used. If `sample_with` is "ode",
                this is ignored.
            iid_method: Which method to use for computing the score in the iid setting.
                We currently support "fnpe", "gauss", "auto_gauss", "jac_gauss". The
                fnpe method is simple and generally applicable. However, it can become
                inaccurate already for quite a few iid samples (as it based on heuristic
                approximations), and should be used at best only with a `corrector`. The
                "gauss" methods are more accurate, by aiming for an efficient
                approximation of the correct marginal score in the iid case. This
                however requires estimating some hyperparamters, which is done in a
                systematic way in the "auto_gauss" (initial overhead) and "jac_gauss"
                (iterative jacobian computations are expensive). We default to
                "auto_gauss" for these reasons. Note that in order to use the iid
                method, the vector field estimator must support it and have
                SCORE_DEFINED and MARGINALS_DEFINED class attributes set to True.
            iid_params: Additional parameters passed to the iid method. See the specific
                `IIDScoreFunction` child class for details.
            max_sampling_batch_size: Maximum batch size for sampling.
            sample_with: Sampling method to use - 'ode' or 'sde'. Note that in order to
                use the 'sde' sampling method, the vector field estimator must support
                it and have the SCORE_DEFINED class attribute set to True.
            show_progress_bars: Whether to show a progress bar during sampling.
        """

        if sample_with is None:
            sample_with = self.sample_with

        x = self._x_else_default_x(x)
        x = reshape_to_batch_event(x, self.vector_field_estimator.condition_shape)
        is_iid = x.shape[0] > 1
        self.potential_fn.set_x(
            x,
            x_is_iid=is_iid,
            iid_method=iid_method or self.potential_fn.iid_method,
            iid_params=iid_params,
        )

        num_samples = torch.Size(sample_shape).numel()

        if sample_with == "ode":
            samples, _ = rejection.accept_reject_sample(
                proposal=self.sample_via_ode,
                accept_reject_fn=lambda theta: within_support(self.prior, theta),
                num_samples=num_samples,
                show_progress_bars=show_progress_bars,
                max_sampling_batch_size=max_sampling_batch_size,
            )
        elif sample_with == "sde":
            proposal_sampling_kwargs = {
                "predictor": predictor,
                "corrector": corrector,
                "predictor_params": predictor_params,
                "corrector_params": corrector_params,
                "steps": steps,
                "ts": ts,
                "max_sampling_batch_size": max_sampling_batch_size,
                "show_progress_bars": show_progress_bars,
            }
            samples, _ = rejection.accept_reject_sample(
                proposal=self._sample_via_diffusion,
                accept_reject_fn=lambda theta: within_support(self.prior, theta),
                num_samples=num_samples,
                show_progress_bars=show_progress_bars,
                max_sampling_batch_size=max_sampling_batch_size,
                proposal_sampling_kwargs=proposal_sampling_kwargs,
            )
        else:
            raise ValueError(
                f"Expected sample_with to be 'ode' or 'sde', but got {sample_with}."
            )

        samples = samples.reshape(
            sample_shape + self.vector_field_estimator.input_shape
        )
        return samples

    def _sample_via_diffusion(
        self,
        sample_shape: Shape = torch.Size(),
        predictor: Union[str, Predictor] = "euler_maruyama",
        corrector: Optional[Union[str, Corrector]] = None,
        predictor_params: Optional[Dict] = None,
        corrector_params: Optional[Dict] = None,
        steps: int = 500,
        ts: Optional[Tensor] = None,
        max_sampling_batch_size: int = 10_000,
        show_progress_bars: bool = True,
        save_intermediate: bool = False,
    ) -> Tensor:
        r"""Return samples from posterior distribution $p(\theta|x)$.

        NOTE: this method can be unsupported for some vector field estimators, e.g.,
        if the vector field estimator was trained with a custom flow matching routine
        for which the corresponding score is not defined.

        Args:
            sample_shape: Shape of the samples to be drawn.
            predictor: The predictor for the diffusion-based sampler. Can be a string or
                a custom predictor following the API in `sbi.samplers.score.predictors`.
                Currently, only `euler_maruyama` is implemented.
            corrector: The corrector for the diffusion-based sampler. Either of
                [None].
            steps: Number of steps to take for the Euler-Maruyama method.
            ts: Time points at which to evaluate the diffusion process. If None, a
                linear grid between t_max and t_min is used.
            max_sampling_batch_size: Maximum batch size for sampling.
            sample_with: Deprecated - use `.build_posterior(sample_with=...)` prior to
                `.sample()`.
            show_progress_bars: Whether to show a progress bar during sampling.
            save_intermediate: Whether to save intermediate results of the diffusion
                process. If True, the returned tensor has shape
                `(*sample_shape, steps, *input_shape)`.
        """

        if not self.vector_field_estimator.SCORE_DEFINED:
            raise ValueError(
                "The vector field estimator does not support the 'sde' sampling method."
            )

        total_samples_needed = torch.Size(sample_shape).numel()

        # Determine effective batch size for sampling
        effective_batch_size = (
            self.max_sampling_batch_size
            if max_sampling_batch_size is None
            else max_sampling_batch_size
        )
        # Ensure we don't use larger batches than total samples needed
        effective_batch_size = min(effective_batch_size, total_samples_needed)

        # TODO: the time schedule should be provided by the estimator, see issue #1437
        if ts is None:
            t_max = self.vector_field_estimator.t_max
            t_min = self.vector_field_estimator.t_min
            ts = torch.linspace(t_max, t_min, steps)
        ts = ts.to(self.device)

        # Initialize the diffusion sampler
        diffuser = Diffuser(
            self.potential_fn,
            predictor=predictor,
            corrector=corrector,
            predictor_params=predictor_params,
            corrector_params=corrector_params,
        )

        # Calculate how many batches we need
        num_batches = math.ceil(total_samples_needed / effective_batch_size)

        # Generate samples in batches
        all_samples = []
        samples_generated = 0

        for _ in range(num_batches):
            # Calculate how many samples to generate in this batch
            remaining_samples = total_samples_needed - samples_generated
            current_batch_size = min(effective_batch_size, remaining_samples)

            # Generate samples for this batch
            batch_samples = diffuser.run(
                num_samples=current_batch_size,
                ts=ts,
                show_progress_bars=show_progress_bars,
                save_intermediate=save_intermediate,
            )

            all_samples.append(batch_samples)
            samples_generated += current_batch_size

        # Concatenate all batches and ensure we return exactly the requested number
        samples = torch.cat(all_samples, dim=0)[:total_samples_needed]

        if torch.isnan(samples).all():
            raise RuntimeError(
                "All samples NaN after diffusion sampling. "
                "This may indicate numerical instability in the vector field."
            )

        return samples

    def sample_via_ode(
        self,
        sample_shape: Shape = torch.Size(),
        **kwargs,
    ) -> Tensor:
        r"""
        Return samples from posterior distribution with probability flow ODE.

        This builds the probability flow ODE and then samples from the corresponding
        flow.

        Args:
            sample_shape: The shape of the samples to be returned.
            **kwargs: Additional keyword arguments for the ODE solver that
                depend on the used ODE backend.

        Returns:
            Samples from the approximated posterior distribution
                :math:`\theta \sim p(\theta|x)`.
        """
        num_samples = torch.Size(sample_shape).numel()

        samples = self.potential_fn.neural_ode(self.potential_fn.x_o, **kwargs).sample(
            torch.Size((num_samples,))
        )

        return samples

    def log_prob(
        self,
        theta: Tensor,
        x: Optional[Tensor] = None,
        track_gradients: bool = False,
        ode_kwargs: Optional[Dict] = None,
    ) -> Tensor:
        r"""Returns the log-probability of the posterior $p(\theta|x)$.

        This requires building and evaluating the probability flow ODE.

        Args:
            theta: Parameters $\theta$.
            x: Observed data $x_o$. If None, the default $x_o$ is used.
            track_gradients: Whether the returned tensor supports tracking gradients.
                This can be helpful for e.g. sensitivity analysis, but increases memory
                consumption.
            ode_kwargs: Additional keyword arguments for the ODE solver.

        Returns:
            `(len(θ),)`-shaped log posterior probability $\log p(\theta|x)$ for θ in the
            support of the prior, -∞ (corresponding to 0 probability) outside.
        """
        x = self._x_else_default_x(x)
        x = reshape_to_batch_event(x, self.vector_field_estimator.condition_shape)
        is_iid = x.shape[0] > 1
        self.potential_fn.set_x(x, x_is_iid=is_iid, **(ode_kwargs or {}))

        theta = ensure_theta_batched(torch.as_tensor(theta))
        return self.potential_fn(
            theta.to(self._device),
            track_gradients=track_gradients,
        )

    def sample_batched(
        self,
        sample_shape: torch.Size,
        x: Tensor,
        predictor: Union[str, Predictor] = "euler_maruyama",
        corrector: Optional[Union[str, Corrector]] = None,
        predictor_params: Optional[Dict] = None,
        corrector_params: Optional[Dict] = None,
        steps: int = 500,
        ts: Optional[Tensor] = None,
        max_sampling_batch_size: int = 10000,
        show_progress_bars: bool = True,
    ) -> Tensor:
        r"""Given a batch of observations [x_1, ..., x_B] this function samples from
        posteriors $p(\theta|x_1)$, ... ,$p(\theta|x_B)$, in a batched (i.e. vectorized)
        manner.

        Args:
            sample_shape: Desired shape of samples that are drawn from the posterior
                given every observation.
            x: A batch of observations, of shape `(batch_dim, event_shape_x)`.
                `batch_dim` corresponds to the number of observations to be
                drawn.
            predictor: The predictor for the diffusion-based sampler. Can be a string or
                a custom predictor following the API in `sbi.samplers.score.predictors`.
                Currently, only `euler_maruyama` is implemented.
            corrector: The corrector for the diffusion-based sampler.
            predictor_params: Additional parameters passed to predictor.
            corrector_params: Additional parameters passed to corrector.
            steps: Number of steps to take for the Euler-Maruyama method.
            ts: Time points at which to evaluate the diffusion process. If None, a
                linear grid between t_max and t_min is used.
            max_sampling_batch_size: Maximum batch size for sampling.
            show_progress_bars: Whether to show sampling progress monitor.

        Returns:
            Samples from the posteriors of shape (*sample_shape, B, *input_shape)
        """
        num_samples = torch.Size(sample_shape).numel()
        x = reshape_to_batch_event(x, self.vector_field_estimator.condition_shape)
        condition_dim = len(self.vector_field_estimator.condition_shape)
        batch_shape = x.shape[:-condition_dim]
        batch_size = batch_shape.numel()
        self.potential_fn.set_x(x)

        max_sampling_batch_size = (
            self.max_sampling_batch_size
            if max_sampling_batch_size is None
            else max_sampling_batch_size
        )

        # Adjust max_sampling_batch_size to avoid excessive memory usage
        if max_sampling_batch_size * batch_size > 100_000:
            capped = max(1, 100_000 // batch_size)
            warnings.warn(
                f"Capping max_sampling_batch_size from {max_sampling_batch_size} "
                f"to {capped} to avoid excessive memory usage.",
                stacklevel=2,
            )
            max_sampling_batch_size = capped

        if self.sample_with == "ode":
            samples, _ = rejection.accept_reject_sample(
                proposal=self.sample_via_ode,
                accept_reject_fn=lambda theta: within_support(self.prior, theta),
                num_samples=num_samples,
                num_xos=batch_size,
                show_progress_bars=show_progress_bars,
                max_sampling_batch_size=max_sampling_batch_size,
            )
            samples = samples.reshape(
                sample_shape + batch_shape + self.vector_field_estimator.input_shape
            )
        elif self.sample_with == "sde":
            proposal_sampling_kwargs = {
                "predictor": predictor,
                "corrector": corrector,
                "predictor_params": predictor_params,
                "corrector_params": corrector_params,
                "steps": steps,
                "ts": ts,
                "max_sampling_batch_size": max_sampling_batch_size,
                "show_progress_bars": show_progress_bars,
            }
            samples, _ = rejection.accept_reject_sample(
                proposal=self._sample_via_diffusion,
                accept_reject_fn=lambda theta: within_support(self.prior, theta),
                num_samples=num_samples,
                num_xos=batch_size,
                show_progress_bars=show_progress_bars,
                max_sampling_batch_size=max_sampling_batch_size,
                proposal_sampling_kwargs=proposal_sampling_kwargs,
            )
            samples = samples.reshape(
                sample_shape + batch_shape + self.vector_field_estimator.input_shape
            )

        return samples

    def map(
        self,
        x: Optional[Tensor] = None,
        num_iter: int = 1000,
        num_to_optimize: int = 1000,
        learning_rate: float = 0.01,
        init_method: Union[str, Tensor] = "posterior",
        num_init_samples: int = 1000,
        save_best_every: int = 1000,
        show_progress_bars: bool = False,
        force_update: bool = False,
    ) -> Tensor:
        r"""Returns the maximum-a-posteriori estimate (MAP).

        The method can be interrupted (Ctrl-C) when the user sees that the
        log-probability converges. The best estimate will be saved in `self._map` and
        can be accessed with `self.map()`. The MAP is obtained by running gradient
        ascent from a given number of starting positions (samples from the posterior
        with the highest log-probability). After the optimization is done, we select the
        parameter set that has the highest log-probability after the optimization.

        Warning: The default values used by this function are not well-tested. They
        might require hand-tuning for the problem at hand.

        For developers: if the prior is a `BoxUniform`, we carry out the optimization
        in unbounded space and transform the result back into bounded space.

        Args:
            x: Deprecated - use `.set_default_x()` prior to `.map()`.
            num_iter: Number of optimization steps that the algorithm takes
                to find the MAP.
            num_to_optimize: From the drawn `num_init_samples`, use the
                `num_to_optimize` with highest log-probability as the initial points
                for the optimization.
            learning_rate: Learning rate of the optimizer.
            init_method: How to select the starting parameters for the optimization. If
                it is a string, it can be either [`posterior`, `prior`], which samples
                the respective distribution `num_init_samples` times. If it is a
                tensor, the tensor will be used as init locations.
            num_init_samples: Draw this number of samples from the posterior and
                evaluate the log-probability of all of them.
            save_best_every: The best log-probability is computed, saved in the
                `map`-attribute, and printed every `save_best_every`-th iteration.
                Computing the best log-probability creates a significant overhead
                (thus, the default is `10`.)
            show_progress_bars: Whether to show a progressbar during sampling from
                the posterior.
            force_update: Whether to re-calculate the MAP when x is unchanged and
                have a cached value.

        Returns:
            The MAP estimate.
        """
        if x is not None:
            raise ValueError(
                "Passing `x` directly to `.map()` has been deprecated."
                "Use `.self_default_x()` to set `x`, and then run `.map()` "
            )

        if self.default_x is None:
            raise ValueError(
                "Default `x` has not been set."
                "To set the default, use the `.set_default_x()` method."
            )

        if self._map is None or force_update:
            # rebuild coarse flow fast for MAP optimization.
            self.potential_fn.set_x(self.default_x, atol=1e-2, rtol=1e-3, exact=True)
            callable_potential_fn = CallableDifferentiablePotentialFunction(
                self.potential_fn
            )
            if init_method == "posterior":
                inits = self.sample((num_init_samples,))
            elif init_method == "proposal":
                inits = self.proposal.sample((num_init_samples,))  # type: ignore
            elif isinstance(init_method, Tensor):
                inits = init_method
            else:
                raise ValueError

            self._map = gradient_ascent(
                potential_fn=callable_potential_fn,
                inits=inits,
                theta_transform=self.theta_transform,
                num_iter=num_iter,
                num_to_optimize=num_to_optimize,
                learning_rate=learning_rate,
                save_best_every=save_best_every,
                show_progress_bars=show_progress_bars,
            )[0]

        return self._map

`init(vector_field_estimator, prior, max_sampling_batch_size=10000, device=None, enable_transform=True, sample_with='sde', **kwargs)` ¶

Parameters:

Name	Type	Description	Default
`prior`	`Distribution`	Prior distribution with `.log_prob()` and `.sample()`.	required
`vector_field_estimator`	`ConditionalVectorFieldEstimator`	The trained vector field estimator.	required
`max_sampling_batch_size`	`int`	Batchsize of samples being drawn from the proposal at every iteration.	`10000`
`device`	`Optional[Union[str, device]]`	Training device, e.g., “cpu”, “cuda” or “cuda:0”. If None, `potential_fn.device` is used.	`None`
`enable_transform`	`bool`	Whether to transform parameters to unconstrained space during MAP optimization. When False, an identity transform will be returned for `theta_transform`. True is not supported yet.	`True`
`sample_with`	`Literal['ode', 'sde']`	Whether to sample from the posterior using the ODE-based sampler or the SDE-based sampler.	`'sde'`
`**kwargs`		Additional keyword arguments passed to `VectorFieldBasedPotential`.	`{}`

Source code in sbi/inference/posteriors/vector_field_posterior.py

def __init__(
    self,
    vector_field_estimator: ConditionalVectorFieldEstimator,
    prior: Distribution,  # type: ignore
    max_sampling_batch_size: int = 10_000,
    device: Optional[Union[str, torch.device]] = None,
    enable_transform: bool = True,
    sample_with: Literal["ode", "sde"] = "sde",
    **kwargs,
):
    """
    Args:
        prior: Prior distribution with `.log_prob()` and `.sample()`.
        vector_field_estimator: The trained vector field estimator.
        max_sampling_batch_size: Batchsize of samples being drawn from
            the proposal at every iteration.
        device: Training device, e.g., "cpu", "cuda" or "cuda:0". If None,
            `potential_fn.device` is used.
        enable_transform: Whether to transform parameters to unconstrained space
            during MAP optimization. When False, an identity transform will be
            returned for `theta_transform`. True is not supported yet.
        sample_with: Whether to sample from the posterior using the ODE-based
            sampler or the SDE-based sampler.
        **kwargs: Additional keyword arguments passed to
            `VectorFieldBasedPotential`.
    """

    check_prior(prior)
    potential_fn, theta_transform = vector_field_estimator_based_potential(
        vector_field_estimator,
        prior,
        x_o=None,
        enable_transform=enable_transform,
        **kwargs,
    )
    super().__init__(
        potential_fn=potential_fn,
        theta_transform=theta_transform,
        device=device,
    )
    # Set the potential function type.
    self.potential_fn: VectorFieldBasedPotential = potential_fn

    self.prior = prior
    self.enable_transform = enable_transform
    self.vector_field_estimator = vector_field_estimator
    self.device = device

    self.sample_with = sample_with
    assert self.sample_with in [
        "ode",
        "sde",
    ], f"sample_with must be 'ode' or 'sde', but is {self.sample_with}."
    self.max_sampling_batch_size = max_sampling_batch_size

    self._purpose = """It samples from the vector field model given the \
        vector_field_estimator."""

`log_prob(theta, x=None, track_gradients=False, ode_kwargs=None)` ¶

Returns the log-probability of the posterior $p(\theta|x)$.

This requires building and evaluating the probability flow ODE.

Parameters:

Name	Type	Description	Default
`theta`	`Tensor`	Parameters $\theta$.	required
`x`	`Optional[Tensor]`	Observed data $x_o$. If None, the default $x_o$ is used.	`None`
`track_gradients`	`bool`	Whether the returned tensor supports tracking gradients. This can be helpful for e.g. sensitivity analysis, but increases memory consumption.	`False`
`ode_kwargs`	`Optional[Dict]`	Additional keyword arguments for the ODE solver.	`None`

Returns:

Type	Description
`Tensor`	`(len(θ),)`-shaped log posterior probability $\log p(\theta\|x)$ for θ in the
`Tensor`	support of the prior, -∞ (corresponding to 0 probability) outside.

Source code in sbi/inference/posteriors/vector_field_posterior.py

def log_prob(
    self,
    theta: Tensor,
    x: Optional[Tensor] = None,
    track_gradients: bool = False,
    ode_kwargs: Optional[Dict] = None,
) -> Tensor:
    r"""Returns the log-probability of the posterior $p(\theta|x)$.

    This requires building and evaluating the probability flow ODE.

    Args:
        theta: Parameters $\theta$.
        x: Observed data $x_o$. If None, the default $x_o$ is used.
        track_gradients: Whether the returned tensor supports tracking gradients.
            This can be helpful for e.g. sensitivity analysis, but increases memory
            consumption.
        ode_kwargs: Additional keyword arguments for the ODE solver.

    Returns:
        `(len(θ),)`-shaped log posterior probability $\log p(\theta|x)$ for θ in the
        support of the prior, -∞ (corresponding to 0 probability) outside.
    """
    x = self._x_else_default_x(x)
    x = reshape_to_batch_event(x, self.vector_field_estimator.condition_shape)
    is_iid = x.shape[0] > 1
    self.potential_fn.set_x(x, x_is_iid=is_iid, **(ode_kwargs or {}))

    theta = ensure_theta_batched(torch.as_tensor(theta))
    return self.potential_fn(
        theta.to(self._device),
        track_gradients=track_gradients,
    )

`map(x=None, num_iter=1000, num_to_optimize=1000, learning_rate=0.01, init_method='posterior', num_init_samples=1000, save_best_every=1000, show_progress_bars=False, force_update=False)` ¶

Returns the maximum-a-posteriori estimate (MAP).

The method can be interrupted (Ctrl-C) when the user sees that the log-probability converges. The best estimate will be saved in self._map and can be accessed with self.map(). The MAP is obtained by running gradient ascent from a given number of starting positions (samples from the posterior with the highest log-probability). After the optimization is done, we select the parameter set that has the highest log-probability after the optimization.

Warning: The default values used by this function are not well-tested. They might require hand-tuning for the problem at hand.

For developers: if the prior is a BoxUniform, we carry out the optimization in unbounded space and transform the result back into bounded space.

Parameters:

Name	Type	Description	Default
`x`	`Optional[Tensor]`	Deprecated - use `.set_default_x()` prior to `.map()`.	`None`
`num_iter`	`int`	Number of optimization steps that the algorithm takes to find the MAP.	`1000`
`num_to_optimize`	`int`	From the drawn `num_init_samples`, use the `num_to_optimize` with highest log-probability as the initial points for the optimization.	`1000`
`learning_rate`	`float`	Learning rate of the optimizer.	`0.01`
`init_method`	`Union[str, Tensor]`	How to select the starting parameters for the optimization. If it is a string, it can be either [`posterior`, `prior`], which samples the respective distribution `num_init_samples` times. If it is a tensor, the tensor will be used as init locations.	`'posterior'`
`num_init_samples`	`int`	Draw this number of samples from the posterior and evaluate the log-probability of all of them.	`1000`
`save_best_every`	`int`	The best log-probability is computed, saved in the `map`-attribute, and printed every `save_best_every`-th iteration. Computing the best log-probability creates a significant overhead (thus, the default is `10`.)	`1000`
`show_progress_bars`	`bool`	Whether to show a progressbar during sampling from the posterior.	`False`
`force_update`	`bool`	Whether to re-calculate the MAP when x is unchanged and have a cached value.	`False`

Returns:

Type	Description
`Tensor`	The MAP estimate.

Source code in sbi/inference/posteriors/vector_field_posterior.py

def map(
    self,
    x: Optional[Tensor] = None,
    num_iter: int = 1000,
    num_to_optimize: int = 1000,
    learning_rate: float = 0.01,
    init_method: Union[str, Tensor] = "posterior",
    num_init_samples: int = 1000,
    save_best_every: int = 1000,
    show_progress_bars: bool = False,
    force_update: bool = False,
) -> Tensor:
    r"""Returns the maximum-a-posteriori estimate (MAP).

    The method can be interrupted (Ctrl-C) when the user sees that the
    log-probability converges. The best estimate will be saved in `self._map` and
    can be accessed with `self.map()`. The MAP is obtained by running gradient
    ascent from a given number of starting positions (samples from the posterior
    with the highest log-probability). After the optimization is done, we select the
    parameter set that has the highest log-probability after the optimization.

    Warning: The default values used by this function are not well-tested. They
    might require hand-tuning for the problem at hand.

    For developers: if the prior is a `BoxUniform`, we carry out the optimization
    in unbounded space and transform the result back into bounded space.

    Args:
        x: Deprecated - use `.set_default_x()` prior to `.map()`.
        num_iter: Number of optimization steps that the algorithm takes
            to find the MAP.
        num_to_optimize: From the drawn `num_init_samples`, use the
            `num_to_optimize` with highest log-probability as the initial points
            for the optimization.
        learning_rate: Learning rate of the optimizer.
        init_method: How to select the starting parameters for the optimization. If
            it is a string, it can be either [`posterior`, `prior`], which samples
            the respective distribution `num_init_samples` times. If it is a
            tensor, the tensor will be used as init locations.
        num_init_samples: Draw this number of samples from the posterior and
            evaluate the log-probability of all of them.
        save_best_every: The best log-probability is computed, saved in the
            `map`-attribute, and printed every `save_best_every`-th iteration.
            Computing the best log-probability creates a significant overhead
            (thus, the default is `10`.)
        show_progress_bars: Whether to show a progressbar during sampling from
            the posterior.
        force_update: Whether to re-calculate the MAP when x is unchanged and
            have a cached value.

    Returns:
        The MAP estimate.
    """
    if x is not None:
        raise ValueError(
            "Passing `x` directly to `.map()` has been deprecated."
            "Use `.self_default_x()` to set `x`, and then run `.map()` "
        )

    if self.default_x is None:
        raise ValueError(
            "Default `x` has not been set."
            "To set the default, use the `.set_default_x()` method."
        )

    if self._map is None or force_update:
        # rebuild coarse flow fast for MAP optimization.
        self.potential_fn.set_x(self.default_x, atol=1e-2, rtol=1e-3, exact=True)
        callable_potential_fn = CallableDifferentiablePotentialFunction(
            self.potential_fn
        )
        if init_method == "posterior":
            inits = self.sample((num_init_samples,))
        elif init_method == "proposal":
            inits = self.proposal.sample((num_init_samples,))  # type: ignore
        elif isinstance(init_method, Tensor):
            inits = init_method
        else:
            raise ValueError

        self._map = gradient_ascent(
            potential_fn=callable_potential_fn,
            inits=inits,
            theta_transform=self.theta_transform,
            num_iter=num_iter,
            num_to_optimize=num_to_optimize,
            learning_rate=learning_rate,
            save_best_every=save_best_every,
            show_progress_bars=show_progress_bars,
        )[0]

    return self._map

`sample(sample_shape=torch.Size(), x=None, predictor='euler_maruyama', corrector=None, predictor_params=None, corrector_params=None, steps=500, ts=None, iid_method=None, iid_params=None, max_sampling_batch_size=10000, sample_with=None, show_progress_bars=True)` ¶

Return samples from posterior distribution $p(\theta|x)$.

Parameters:

Name	Type	Description	Default
`sample_shape`	`Shape`	Shape of the samples to be drawn.	`Size()`
`predictor`	`Union[str, Predictor]`	The predictor for the vector field sampler. Can be a string or a custom predictor following the API in `sbi.samplers.score.predictors`. Currently, only `euler_maruyama` is implemented.	`'euler_maruyama'`
`corrector`	`Optional[Union[str, Corrector]]`	The corrector for the vector field sampler. Either of [None].	`None`
`predictor_params`	`Optional[Dict]`	Additional parameters passed to predictor.	`None`
`corrector_params`	`Optional[Dict]`	Additional parameters passed to corrector.	`None`
`steps`	`int`	Number of steps to take for the Euler-Maruyama method. If `sample_with` is “ode”, this is ignored.	`500`
`ts`	`Optional[Tensor]`	Time points at which to evaluate the vector field process. If None, a linear grid between t_max and t_min is used. If `sample_with` is “ode”, this is ignored.	`None`
`iid_method`	`Optional[Literal['fnpe', 'gauss', 'auto_gauss', 'jac_gauss']]`	Which method to use for computing the score in the iid setting. We currently support “fnpe”, “gauss”, “auto_gauss”, “jac_gauss”. The fnpe method is simple and generally applicable. However, it can become inaccurate already for quite a few iid samples (as it based on heuristic approximations), and should be used at best only with a `corrector`. The “gauss” methods are more accurate, by aiming for an efficient approximation of the correct marginal score in the iid case. This however requires estimating some hyperparamters, which is done in a systematic way in the “auto_gauss” (initial overhead) and “jac_gauss” (iterative jacobian computations are expensive). We default to “auto_gauss” for these reasons. Note that in order to use the iid method, the vector field estimator must support it and have SCORE_DEFINED and MARGINALS_DEFINED class attributes set to True.	`None`
`iid_params`	`Optional[Dict]`	Additional parameters passed to the iid method. See the specific `IIDScoreFunction` child class for details.	`None`
`max_sampling_batch_size`	`int`	Maximum batch size for sampling.	`10000`
`sample_with`	`Optional[str]`	Sampling method to use - ‘ode’ or ‘sde’. Note that in order to use the ‘sde’ sampling method, the vector field estimator must support it and have the SCORE_DEFINED class attribute set to True.	`None`
`show_progress_bars`	`bool`	Whether to show a progress bar during sampling.	`True`

Source code in sbi/inference/posteriors/vector_field_posterior.py

def sample(
    self,
    sample_shape: Shape = torch.Size(),
    x: Optional[Tensor] = None,
    predictor: Union[str, Predictor] = "euler_maruyama",
    corrector: Optional[Union[str, Corrector]] = None,
    predictor_params: Optional[Dict] = None,
    corrector_params: Optional[Dict] = None,
    steps: int = 500,
    ts: Optional[Tensor] = None,
    iid_method: Optional[
        Literal["fnpe", "gauss", "auto_gauss", "jac_gauss"]
    ] = None,
    iid_params: Optional[Dict] = None,
    max_sampling_batch_size: int = 10_000,
    sample_with: Optional[str] = None,
    show_progress_bars: bool = True,
) -> Tensor:
    r"""Return samples from posterior distribution $p(\theta|x)$.

    Args:
        sample_shape: Shape of the samples to be drawn.
        predictor: The predictor for the vector field sampler. Can be a string or
            a custom predictor following the API in `sbi.samplers.score.predictors`.
            Currently, only `euler_maruyama` is implemented.
        corrector: The corrector for the vector field sampler. Either of
            [None].
        predictor_params: Additional parameters passed to predictor.
        corrector_params: Additional parameters passed to corrector.
        steps: Number of steps to take for the Euler-Maruyama method.
            If `sample_with` is "ode", this is ignored.
        ts: Time points at which to evaluate the vector field process. If None, a
            linear grid between t_max and t_min is used. If `sample_with` is "ode",
            this is ignored.
        iid_method: Which method to use for computing the score in the iid setting.
            We currently support "fnpe", "gauss", "auto_gauss", "jac_gauss". The
            fnpe method is simple and generally applicable. However, it can become
            inaccurate already for quite a few iid samples (as it based on heuristic
            approximations), and should be used at best only with a `corrector`. The
            "gauss" methods are more accurate, by aiming for an efficient
            approximation of the correct marginal score in the iid case. This
            however requires estimating some hyperparamters, which is done in a
            systematic way in the "auto_gauss" (initial overhead) and "jac_gauss"
            (iterative jacobian computations are expensive). We default to
            "auto_gauss" for these reasons. Note that in order to use the iid
            method, the vector field estimator must support it and have
            SCORE_DEFINED and MARGINALS_DEFINED class attributes set to True.
        iid_params: Additional parameters passed to the iid method. See the specific
            `IIDScoreFunction` child class for details.
        max_sampling_batch_size: Maximum batch size for sampling.
        sample_with: Sampling method to use - 'ode' or 'sde'. Note that in order to
            use the 'sde' sampling method, the vector field estimator must support
            it and have the SCORE_DEFINED class attribute set to True.
        show_progress_bars: Whether to show a progress bar during sampling.
    """

    if sample_with is None:
        sample_with = self.sample_with

    x = self._x_else_default_x(x)
    x = reshape_to_batch_event(x, self.vector_field_estimator.condition_shape)
    is_iid = x.shape[0] > 1
    self.potential_fn.set_x(
        x,
        x_is_iid=is_iid,
        iid_method=iid_method or self.potential_fn.iid_method,
        iid_params=iid_params,
    )

    num_samples = torch.Size(sample_shape).numel()

    if sample_with == "ode":
        samples, _ = rejection.accept_reject_sample(
            proposal=self.sample_via_ode,
            accept_reject_fn=lambda theta: within_support(self.prior, theta),
            num_samples=num_samples,
            show_progress_bars=show_progress_bars,
            max_sampling_batch_size=max_sampling_batch_size,
        )
    elif sample_with == "sde":
        proposal_sampling_kwargs = {
            "predictor": predictor,
            "corrector": corrector,
            "predictor_params": predictor_params,
            "corrector_params": corrector_params,
            "steps": steps,
            "ts": ts,
            "max_sampling_batch_size": max_sampling_batch_size,
            "show_progress_bars": show_progress_bars,
        }
        samples, _ = rejection.accept_reject_sample(
            proposal=self._sample_via_diffusion,
            accept_reject_fn=lambda theta: within_support(self.prior, theta),
            num_samples=num_samples,
            show_progress_bars=show_progress_bars,
            max_sampling_batch_size=max_sampling_batch_size,
            proposal_sampling_kwargs=proposal_sampling_kwargs,
        )
    else:
        raise ValueError(
            f"Expected sample_with to be 'ode' or 'sde', but got {sample_with}."
        )

    samples = samples.reshape(
        sample_shape + self.vector_field_estimator.input_shape
    )
    return samples

`sample_batched(sample_shape, x, predictor='euler_maruyama', corrector=None, predictor_params=None, corrector_params=None, steps=500, ts=None, max_sampling_batch_size=10000, show_progress_bars=True)` ¶

Given a batch of observations [x_1, …, x_B] this function samples from posteriors $p(\theta|x_1)$, … ,$p(\theta|x_B)$, in a batched (i.e. vectorized) manner.

Parameters:

Name	Type	Description	Default
`sample_shape`	`Size`	Desired shape of samples that are drawn from the posterior given every observation.	required
`x`	`Tensor`	A batch of observations, of shape `(batch_dim, event_shape_x)`. `batch_dim` corresponds to the number of observations to be drawn.	required
`predictor`	`Union[str, Predictor]`	The predictor for the diffusion-based sampler. Can be a string or a custom predictor following the API in `sbi.samplers.score.predictors`. Currently, only `euler_maruyama` is implemented.	`'euler_maruyama'`
`corrector`	`Optional[Union[str, Corrector]]`	The corrector for the diffusion-based sampler.	`None`
`predictor_params`	`Optional[Dict]`	Additional parameters passed to predictor.	`None`
`corrector_params`	`Optional[Dict]`	Additional parameters passed to corrector.	`None`
`steps`	`int`	Number of steps to take for the Euler-Maruyama method.	`500`
`ts`	`Optional[Tensor]`	Time points at which to evaluate the diffusion process. If None, a linear grid between t_max and t_min is used.	`None`
`max_sampling_batch_size`	`int`	Maximum batch size for sampling.	`10000`
`show_progress_bars`	`bool`	Whether to show sampling progress monitor.	`True`

Returns:

Type	Description
`Tensor`	Samples from the posteriors of shape (sample_shape, B, input_shape)

Source code in sbi/inference/posteriors/vector_field_posterior.py

def sample_batched(
    self,
    sample_shape: torch.Size,
    x: Tensor,
    predictor: Union[str, Predictor] = "euler_maruyama",
    corrector: Optional[Union[str, Corrector]] = None,
    predictor_params: Optional[Dict] = None,
    corrector_params: Optional[Dict] = None,
    steps: int = 500,
    ts: Optional[Tensor] = None,
    max_sampling_batch_size: int = 10000,
    show_progress_bars: bool = True,
) -> Tensor:
    r"""Given a batch of observations [x_1, ..., x_B] this function samples from
    posteriors $p(\theta|x_1)$, ... ,$p(\theta|x_B)$, in a batched (i.e. vectorized)
    manner.

    Args:
        sample_shape: Desired shape of samples that are drawn from the posterior
            given every observation.
        x: A batch of observations, of shape `(batch_dim, event_shape_x)`.
            `batch_dim` corresponds to the number of observations to be
            drawn.
        predictor: The predictor for the diffusion-based sampler. Can be a string or
            a custom predictor following the API in `sbi.samplers.score.predictors`.
            Currently, only `euler_maruyama` is implemented.
        corrector: The corrector for the diffusion-based sampler.
        predictor_params: Additional parameters passed to predictor.
        corrector_params: Additional parameters passed to corrector.
        steps: Number of steps to take for the Euler-Maruyama method.
        ts: Time points at which to evaluate the diffusion process. If None, a
            linear grid between t_max and t_min is used.
        max_sampling_batch_size: Maximum batch size for sampling.
        show_progress_bars: Whether to show sampling progress monitor.

    Returns:
        Samples from the posteriors of shape (*sample_shape, B, *input_shape)
    """
    num_samples = torch.Size(sample_shape).numel()
    x = reshape_to_batch_event(x, self.vector_field_estimator.condition_shape)
    condition_dim = len(self.vector_field_estimator.condition_shape)
    batch_shape = x.shape[:-condition_dim]
    batch_size = batch_shape.numel()
    self.potential_fn.set_x(x)

    max_sampling_batch_size = (
        self.max_sampling_batch_size
        if max_sampling_batch_size is None
        else max_sampling_batch_size
    )

    # Adjust max_sampling_batch_size to avoid excessive memory usage
    if max_sampling_batch_size * batch_size > 100_000:
        capped = max(1, 100_000 // batch_size)
        warnings.warn(
            f"Capping max_sampling_batch_size from {max_sampling_batch_size} "
            f"to {capped} to avoid excessive memory usage.",
            stacklevel=2,
        )
        max_sampling_batch_size = capped

    if self.sample_with == "ode":
        samples, _ = rejection.accept_reject_sample(
            proposal=self.sample_via_ode,
            accept_reject_fn=lambda theta: within_support(self.prior, theta),
            num_samples=num_samples,
            num_xos=batch_size,
            show_progress_bars=show_progress_bars,
            max_sampling_batch_size=max_sampling_batch_size,
        )
        samples = samples.reshape(
            sample_shape + batch_shape + self.vector_field_estimator.input_shape
        )
    elif self.sample_with == "sde":
        proposal_sampling_kwargs = {
            "predictor": predictor,
            "corrector": corrector,
            "predictor_params": predictor_params,
            "corrector_params": corrector_params,
            "steps": steps,
            "ts": ts,
            "max_sampling_batch_size": max_sampling_batch_size,
            "show_progress_bars": show_progress_bars,
        }
        samples, _ = rejection.accept_reject_sample(
            proposal=self._sample_via_diffusion,
            accept_reject_fn=lambda theta: within_support(self.prior, theta),
            num_samples=num_samples,
            num_xos=batch_size,
            show_progress_bars=show_progress_bars,
            max_sampling_batch_size=max_sampling_batch_size,
            proposal_sampling_kwargs=proposal_sampling_kwargs,
        )
        samples = samples.reshape(
            sample_shape + batch_shape + self.vector_field_estimator.input_shape
        )

    return samples

`sample_via_ode(sample_shape=torch.Size(), **kwargs)` ¶

Return samples from posterior distribution with probability flow ODE.

This builds the probability flow ODE and then samples from the corresponding flow.

Parameters:

Name	Type	Description	Default
`sample_shape`	`Shape`	The shape of the samples to be returned.	`Size()`
`**kwargs`		Additional keyword arguments for the ODE solver that depend on the used ODE backend.	`{}`

Returns:

Type	Description
`Tensor`	Samples from the approximated posterior distribution :math:`\theta \sim p(\theta\|x)`.

Source code in sbi/inference/posteriors/vector_field_posterior.py

def sample_via_ode(
    self,
    sample_shape: Shape = torch.Size(),
    **kwargs,
) -> Tensor:
    r"""
    Return samples from posterior distribution with probability flow ODE.

    This builds the probability flow ODE and then samples from the corresponding
    flow.

    Args:
        sample_shape: The shape of the samples to be returned.
        **kwargs: Additional keyword arguments for the ODE solver that
            depend on the used ODE backend.

    Returns:
        Samples from the approximated posterior distribution
            :math:`\theta \sim p(\theta|x)`.
    """
    num_samples = torch.Size(sample_shape).numel()

    samples = self.potential_fn.neural_ode(self.potential_fn.x_o, **kwargs).sample(
        torch.Size((num_samples,))
    )

    return samples

`to(device)` ¶

Move posterior to device.

Parameters:

Name	Type	Description	Default
`device`	`Union[str, device]`	device where to move the posterior to.	required

Source code in sbi/inference/posteriors/vector_field_posterior.py

def to(self, device: Union[str, torch.device]) -> None:
    """Move posterior to device.

    Args:
        device: device where to move the posterior to.
    """
    self.device = device
    if hasattr(self.prior, "to"):
        self.prior.to(device)  # type: ignore
    else:
        raise ValueError("""Prior has no attribute to(device).""")
    if hasattr(self.vector_field_estimator, "to"):
        self.vector_field_estimator.to(device)
    else:
        raise ValueError("""Posterior estimator has no attribute to(device).""")

    potential_fn, theta_transform = vector_field_estimator_based_potential(
        self.vector_field_estimator,
        self.prior,
        x_o=None,
        enable_transform=self.enable_transform,
    )
    x_o = None
    if hasattr(self, "_x") and (self._x is not None):
        x_o = self._x.to(device)
    super().__init__(
        potential_fn=potential_fn,
        theta_transform=theta_transform,
        device=device,
    )
    # super().__init__ erases the self._x, so we need to set it again
    if x_o is not None:
        self.set_default_x(x_o)

    self.potential_fn: VectorFieldBasedPotential = potential_fn

`VIPosterior` ¶

Bases: NeuralPosterior

Provides VI (Variational Inference) to sample from the posterior.

SNLE or SNRE train neural networks to approximate the likelihood (or likelihood ratios). VIPosterior allows learning a tractable variational posterior :math:q(\theta) which approximates the true posterior :math:p(\theta|x_o). After this second training stage, we can produce approximate posterior samples by sampling from :math:q at no additional cost.

For additional information, see [1]_ and [2]_.

References¶

.. [1] Glöckler, M., Deistler, M., & Macke, J. (2022). Variational methods for simulation-based inference. https://openreview.net/forum?id=kZ0UYdhqkNY

.. [2] Wiqvist, S., Frellsen, J., & Picchini, U. (2021). Sequential Neural Posterior and Likelihood Approximation. https://arxiv.org/abs/2102.06522

Source code in sbi/inference/posteriors/vi_posterior.py

class VIPosterior(NeuralPosterior):
    r"""Provides VI (Variational Inference) to sample from the posterior.

    SNLE or SNRE train neural networks to approximate the likelihood (or likelihood
    ratios). ``VIPosterior`` allows learning a tractable variational posterior
    :math:`q(\theta)` which approximates the true posterior
    :math:`p(\theta|x_o)`. After this second training stage, we can produce
    approximate posterior samples by sampling from :math:`q` at no additional cost.

    For additional information, see [1]_ and [2]_.

    References
    ----------

    .. [1] Glöckler, M., Deistler, M., & Macke, J. (2022).
        Variational methods for simulation-based inference.
        https://openreview.net/forum?id=kZ0UYdhqkNY

    .. [2] Wiqvist, S., Frellsen, J., & Picchini, U. (2021).
        Sequential Neural Posterior and Likelihood Approximation.
        https://arxiv.org/abs/2102.06522
    """

    def __init__(
        self,
        potential_fn: Union[BasePotential, CustomPotential],
        prior: Optional[TorchDistribution] = None,  # type: ignore
        q: Union[
            Literal["nsf", "scf", "maf", "mcf", "gaussian", "gaussian_diag"],
            PyroTransformedDistribution,
            "VIPosterior",
            Callable,
        ] = "maf",
        theta_transform: Optional[TorchTransform] = None,
        vi_method: Literal["rKL", "fKL", "IW", "alpha"] = "rKL",
        device: Union[str, torch.device] = "cpu",
        x_shape: Optional[torch.Size] = None,
        parameters: Optional[Iterable] = None,
        modules: Optional[Iterable] = None,
    ):
        """
        Args:
            potential_fn: The potential function from which to draw samples. Must be a
                `BasePotential` or a `CustomPotential`.
            prior: This is the prior distribution. Note that this is only
                used to check/construct the variational distribution or within some
                quality metrics. Please make sure that this matches with the prior
                within the potential_fn. If `None` is given, we will try to infer it
                from potential_fn or q, if this fails we raise an Error.
            q: Variational distribution, either string, `TransformedDistribution`, or a
                `VIPosterior` object. This specifies a parametric class of distribution
                over which the best possible posterior approximation is searched. For
                string input, we currently support [nsf, scf, maf, mcf, gaussian,
                gaussian_diag]. You can also specify your own variational family by
                passing a pyro `TransformedDistribution`.
                Additionally, we allow a `Callable`, which allows you the pass a
                `builder` function, which if called returns a distribution. This may be
                useful for setting the hyperparameters e.g. `num_transfroms` within the
                `get_flow_builder` method specifying the number of transformations
                within a normalizing flow. If q is already a `VIPosterior`, then the
                arguments will be copied from it (relevant for multi-round training).
            theta_transform: Maps form prior support to unconstrained space. The
                inverse is used here to ensure that the posterior support is equal to
                that of the prior.
            vi_method: This specifies the variational methods which are used to fit q to
                the posterior. We currently support [rKL, fKL, IW, alpha]. Note that
                some of the divergences are `mode seeking` i.e. they underestimate
                variance and collapse on multimodal targets (`rKL`, `alpha` for alpha >
                1) and some are `mass covering` i.e. they overestimate variance but
                typically cover all modes (`fKL`, `IW`, `alpha` for alpha < 1).
            device: Training device, e.g., `cpu`, `cuda` or `cuda:0`. We will ensure
                that all other objects are also on this device.
            x_shape: Deprecated, should not be passed.
            parameters: List of parameters of the variational posterior. This is only
                required for user-defined q i.e. if q does not have a `parameters`
                attribute.
            modules: List of modules of the variational posterior. This is only
                required for user-defined q i.e. if q does not have a `modules`
                attribute.
        """
        super().__init__(potential_fn, theta_transform, device, x_shape=x_shape)

        # Especially the prior may be on another device -> move it...
        self._device = device
        self.theta_transform = theta_transform
        self.x_shape = x_shape
        self.potential_fn.device = device
        move_all_tensor_to_device(self.potential_fn, device)

        # Get prior and previous builds
        if prior is not None:
            self._prior = prior
        elif hasattr(self.potential_fn, "prior") and isinstance(
            self.potential_fn.prior, Distribution
        ):
            self._prior = self.potential_fn.prior
        elif isinstance(q, VIPosterior) and isinstance(q._prior, Distribution):
            self._prior = q._prior
        else:
            raise ValueError(
                "We could not find a suitable prior distribution within `potential_fn` "
                "or `q` (if a VIPosterior is given). Please explicitly specify a prior."
            )
        move_all_tensor_to_device(self._prior, device)
        self._optimizer = None

        # In contrast to MCMC we want to project into constrained space.
        if theta_transform is None:
            self.link_transform = mcmc_transform(self._prior).inv
        else:
            self.link_transform = theta_transform.inv

        if parameters is None:
            parameters = []
        if modules is None:
            modules = []
        # This will set the variational distribution and VI method
        self.set_q(
            q,
            parameters=parameters,
            modules=modules,
        )
        self.set_vi_method(vi_method)

        self._purpose = (
            "It provides Variational inference to .sample() from the posterior and "
            "can evaluate the _normalized_ posterior density with .log_prob()."
        )

    def to(self, device: Union[str, torch.device]) -> None:
        """
        Move potential_fn, _prior and x_o to device, and change the device attribute.

        Reinstantiates the posterior and re sets the default x.

        Args:
            device: The device to move the posterior to.
        """
        self.device = device
        self.potential_fn.to(device)  # type: ignore
        self._prior.to(device)  # type: ignore
        if self._x is not None:
            x_o = self._x.to(device)
        self.theta_transform = mcmc_transform(self._prior, device=device)
        super().__init__(
            self.potential_fn, self.theta_transform, device, x_shape=self.x_shape
        )
        # super().__init__ erases the self._x, so we need to set it again
        if self._x is not None:
            self.set_default_x(x_o)

        if self.theta_transform is None:
            self.link_transform = mcmc_transform(self._prior).inv
        else:
            self.link_transform = self.theta_transform.inv

    @property
    def q(self) -> Distribution:
        """Returns the variational posterior."""
        return self._q

    @q.setter
    def q(
        self,
        q: Union[str, Distribution, "VIPosterior", Callable],
    ) -> None:
        """Sets the variational distribution. If the distribution does not admit access
        through `parameters` and `modules` function, please use `set_q` if you want to
        explicitly specify the parameters and modules.


        Args:
            q: Variational distribution, either string, distribution, or a VIPosterior
                object. This specifies a parametric class of distribution over which
                the best possible posterior approximation is searched. For string input,
                we currently support [nsf, scf, maf, mcf, gaussian, gaussian_diag]. Of
                course, you can also specify your own variational family by passing a
                `parameterized` distribution object i.e. a torch.distributions
                Distribution with methods `parameters` returning an iterable of all
                parameters (you can pass them within the paramters/modules attribute).
                Additionally, we allow a `Callable`, which allows you the pass a
                `builder` function, which if called returns an distribution. This may be
                useful for setting the hyperparameters e.g. `num_transfroms:int` by
                using the `get_flow_builder` method specifying the hyperparameters. If q
                is already a `VIPosterior`, then the arguments will be copied from it
                (relevant for multi-round training).


        """
        self.set_q(q)

    def set_q(
        self,
        q: Union[str, PyroTransformedDistribution, "VIPosterior", Callable],
        parameters: Optional[Iterable] = None,
        modules: Optional[Iterable] = None,
    ) -> None:
        """Defines the variational family.

        You can specify over which parameters/modules we optimize. This is required for
        custom distributions which e.g. do not inherit nn.Modules or has the function
        `parameters` or `modules` to give direct access to trainable parameters.
        Further, you can pass a function, which constructs a variational distribution
        if called.

        Args:
            q: Variational distribution, either string, distribution, or a VIPosterior
                object. This specifies a parametric class of distribution over which
                the best possible posterior approximation is searched. For string input,
                we currently support [nsf, scf, maf, mcf, gaussian, gaussian_diag]. Of
                course, you can also specify your own variational family by passing a
                `parameterized` distribution object i.e. a torch.distributions
                Distribution with methods `parameters` returning an iterable of all
                parameters (you can pass them within the paramters/modules attribute).
                Additionally, we allow a `Callable`, which allows you the pass a
                `builder` function, which if called returns an distribution. This may be
                useful for setting the hyperparameters e.g. `num_transfroms:int` by
                using the `get_flow_builder` method specifying the hyperparameters. If q
                is already a `VIPosterior`, then the arguments will be copied from it
                (relevant for multi-round training).
            parameters: List of parameters associated with the distribution object.
            modules: List of modules associated with the distribution object.

        """
        if parameters is None:
            parameters = []
        if modules is None:
            modules = []
        self._q_arg = (q, parameters, modules)
        if isinstance(q, Distribution):
            q = adapt_variational_distribution(
                q,
                self._prior,
                self.link_transform,
                parameters=parameters,
                modules=modules,
            )
            make_object_deepcopy_compatible(q)
            self_custom_q_init_cache = deepcopy(q)
            self._q_build_fn = lambda *args, **kwargs: self_custom_q_init_cache
            self._trained_on = None
        elif isinstance(q, (str, Callable)):
            if isinstance(q, str):
                self._q_build_fn = get_flow_builder(q)
            else:
                self._q_build_fn = q

            q = self._q_build_fn(
                self._prior.event_shape,
                self.link_transform,
                device=self._device,
            )
            make_object_deepcopy_compatible(q)
            self._trained_on = None
        elif isinstance(q, VIPosterior):
            self._q_build_fn = q._q_build_fn
            self._trained_on = q._trained_on
            self.vi_method = q.vi_method  # type: ignore
            self._device = q._device
            self._prior = q._prior
            self._x = q._x
            self._q_arg = q._q_arg
            make_object_deepcopy_compatible(q.q)
            q = deepcopy(q.q)
        move_all_tensor_to_device(q, self._device)
        assert isinstance(
            q, Distribution
        ), """Something went wrong when initializing the variational distribution.
            Please create an issue on github https://github.com/mackelab/sbi/issues"""
        check_variational_distribution(q, self._prior)
        self._q = q

    @property
    def vi_method(self) -> str:
        """Variational inference method e.g. one of [rKL, fKL, IW, alpha]."""
        return self._vi_method

    @vi_method.setter
    def vi_method(self, method: str) -> None:
        """See `set_vi_method`."""
        self.set_vi_method(method)

    def set_vi_method(self, method: str) -> "VIPosterior":
        """Sets variational inference method.

        Args:
            method: One of [rKL, fKL, IW, alpha].

        Returns:
            `VIPosterior` for chainable calls.
        """
        self._vi_method = method
        self._optimizer_builder = get_VI_method(method)
        return self

    def sample(
        self,
        sample_shape: Shape = torch.Size(),
        x: Optional[Tensor] = None,
        **kwargs,
    ) -> Tensor:
        """Samples from the variational posterior distribution.

        Args:
            sample_shape: Shape of samples

        Returns:
            Samples from posterior.
        """
        x = self._x_else_default_x(x)
        if self._trained_on is None or (x != self._trained_on).all():
            raise AttributeError(
                f"The variational posterior was not fit on the specified `default_x` "
                f"{x}. Please train using `posterior.train()`."
            )
        samples = self.q.sample(torch.Size(sample_shape))
        return samples.reshape((*sample_shape, samples.shape[-1]))

    def sample_batched(
        self,
        sample_shape: Shape,
        x: Tensor,
        max_sampling_batch_size: int = 10000,
        show_progress_bars: bool = True,
    ) -> Tensor:
        raise NotImplementedError(
            "Batched sampling is not implemented for VIPosterior. "
            "Alternatively you can use `sample` in a loop "
            "[posterior.sample(theta, x_o) for x_o in x]."
        )

    def log_prob(
        self,
        theta: Tensor,
        x: Optional[Tensor] = None,
        track_gradients: bool = False,
    ) -> Tensor:
        r"""Returns the log-probability of theta under the variational posterior.

        Args:
            theta: Parameters
            track_gradients: Whether the returned tensor supports tracking gradients.
                This can be helpful for e.g. sensitivity analysis but increases memory
                consumption.

        Returns:
            `len($\theta$)`-shaped log-probability.
        """
        x = self._x_else_default_x(x)
        if self._trained_on is None or (x != self._trained_on).all():
            raise AttributeError(
                f"The variational posterior was not fit using observation {x}.\
                     Please train."
            )
        with torch.set_grad_enabled(track_gradients):
            theta = ensure_theta_batched(torch.as_tensor(theta))
            return self.q.log_prob(theta)

    def train(
        self,
        x: Optional[TorchTensor] = None,
        n_particles: int = 256,
        learning_rate: float = 1e-3,
        gamma: float = 0.999,
        max_num_iters: int = 2000,
        min_num_iters: int = 10,
        clip_value: float = 10.0,
        warm_up_rounds: int = 100,
        retrain_from_scratch: bool = False,
        reset_optimizer: bool = False,
        show_progress_bar: bool = True,
        check_for_convergence: bool = True,
        quality_control: bool = True,
        quality_control_metric: str = "psis",
        **kwargs,
    ) -> "VIPosterior":
        """This method trains the variational posterior.

        Args:
            x: The observation.
            n_particles: Number of samples to approximate expectations within the
                variational bounds. The larger the more accurate are gradient
                estimates, but the computational cost per iteration increases.
            learning_rate: Learning rate of the optimizer.
            gamma: Learning rate decay per iteration. We use an exponential decay
                scheduler.
            max_num_iters: Maximum number of iterations.
            min_num_iters: Minimum number of iterations.
            clip_value: Gradient clipping value, decreasing may help if you see invalid
                values.
            warm_up_rounds: Initialize the posterior as the prior.
            retrain_from_scratch: Retrain the variational distributions from scratch.
            reset_optimizer: Reset the divergence optimizer
            show_progress_bar: If any progress report should be displayed.
            quality_control: If False quality control is skipped.
            quality_control_metric: Which metric to use for evaluating the quality.
            kwargs: Hyperparameters check corresponding `DivergenceOptimizer` for detail
                eps: Determines sensitivity of convergence check.
                retain_graph: Boolean which decides whether to retain the computation
                    graph. This may be required for some `exotic` user-specified q's.
                optimizer: A PyTorch Optimizer class e.g. Adam or SGD. See
                    `DivergenceOptimizer` for details.
                scheduler: A PyTorch learning rate scheduler. See
                    `DivergenceOptimizer` for details.
                alpha: Only used if vi_method=`alpha`. Determines the alpha divergence.
                K: Only used if vi_method=`IW`. Determines the number of importance
                    weighted particles.
                stick_the_landing: If one should use the STL estimator (only for rKL,
                    IW, alpha).
                dreg: If one should use the DREG estimator (only for rKL, IW, alpha).
                weight_transform: Callable applied to importance weights (only for fKL)
        Returns:
            VIPosterior: `VIPosterior` (can be used to chain calls).
        """
        # Update optimizer with current arguments.
        if self._optimizer is not None:
            self._optimizer.update({**locals(), **kwargs})

        # Init q and the optimizer if necessary
        if retrain_from_scratch:
            self.q = self._q_build_fn()  # type: ignore
            self._optimizer = self._optimizer_builder(
                self.potential_fn,
                self.q,
                lr=learning_rate,
                clip_value=clip_value,
                gamma=gamma,
                n_particles=n_particles,
                prior=self._prior,
                **kwargs,
            )

        if (
            reset_optimizer
            or self._optimizer is None
            or not isinstance(self._optimizer, self._optimizer_builder)
        ):
            self._optimizer = self._optimizer_builder(
                self.potential_fn,
                self.q,
                lr=learning_rate,
                clip_value=clip_value,
                gamma=gamma,
                n_particles=n_particles,
                prior=self._prior,
                **kwargs,
            )

        # Check context
        x = atleast_2d_float32_tensor(self._x_else_default_x(x)).to(  # type: ignore
            self._device
        )

        already_trained = self._trained_on is not None and (x == self._trained_on).all()

        # Optimize
        optimizer = self._optimizer
        optimizer.to(self._device)
        optimizer.reset_loss_stats()

        if show_progress_bar:
            iters = tqdm(range(max_num_iters))
        else:
            iters = range(max_num_iters)

        # Warmup before training
        if reset_optimizer or (not optimizer.warm_up_was_done and not already_trained):
            if show_progress_bar:
                iters.set_description(  # type: ignore
                    "Warmup phase, this may take a few seconds..."
                )
            optimizer.warm_up(warm_up_rounds)

        for i in iters:
            optimizer.step(x)
            mean_loss, std_loss = optimizer.get_loss_stats()
            # Update progress bar
            if show_progress_bar:
                assert isinstance(iters, tqdm)
                iters.set_description(  # type: ignore
                    f"Loss: {np.round(float(mean_loss), 2)}"
                    f"Std: {np.round(float(std_loss), 2)}"
                )
            # Check for convergence
            if check_for_convergence and i > min_num_iters and optimizer.converged():
                if show_progress_bar:
                    print(f"\nConverged with loss: {np.round(float(mean_loss), 2)}")
                break
        # Training finished:
        self._trained_on = x

        # Evaluate quality
        if quality_control:
            try:
                self.evaluate(quality_control_metric=quality_control_metric)
            except Exception as e:
                print(
                    f"Quality control showed a low quality of the variational "
                    f"posterior. We are automatically retraining the variational "
                    f"posterior from scratch with a smaller learning rate. "
                    f"Alternatively, if you want to skip quality control, please "
                    f"retrain with `VIPosterior.train(..., quality_control=False)`. "
                    f"\nThe error that occured is: {e}"
                )
                self.train(
                    learning_rate=learning_rate * 0.1,
                    retrain_from_scratch=True,
                    reset_optimizer=True,
                )

        return self

    def evaluate(self, quality_control_metric: str = "psis", N: int = int(5e4)) -> None:
        """This function will evaluate the quality of the variational posterior
        distribution. We currently support two different metrics of type `psis`, which
        checks the quality based on the tails of importance weights (there should not be
        much with a large one), or `prop` which checks the proportionality between q
        and potential_fn.

        NOTE: In our experience `prop` is sensitive to distinguish ``good`` from ``ok``
        whereas `psis` is more sensitive in distinguishing `very bad` from `ok`.

        Args:
            quality_control_metric: The metric of choice, we currently support [psis,
                prop, prop_prior].
            N: Number of samples which is used to evaluate the metric.
        """
        quality_control_fn, quality_control_msg = get_quality_metric(
            quality_control_metric
        )
        metric = round(float(quality_control_fn(self, N=N)), 3)
        print(f"Quality Score: {metric} " + quality_control_msg)

    def map(
        self,
        x: Optional[TorchTensor] = None,
        num_iter: int = 1_000,
        num_to_optimize: int = 100,
        learning_rate: float = 0.01,
        init_method: Union[str, TorchTensor] = "proposal",
        num_init_samples: int = 10_000,
        save_best_every: int = 10,
        show_progress_bars: bool = False,
        force_update: bool = False,
    ) -> Tensor:
        r"""Returns the maximum-a-posteriori estimate (MAP).

        The method can be interrupted (Ctrl-C) when the user sees that the
        log-probability converges. The best estimate will be saved in `self._map` and
        can be accessed with `self.map()`. The MAP is obtained by running gradient
        ascent from a given number of starting positions (samples from the posterior
        with the highest log-probability). After the optimization is done, we select the
        parameter set that has the highest log-probability after the optimization.

        Warning: The default values used by this function are not well-tested. They
        might require hand-tuning for the problem at hand.

        For developers: if the prior is a `BoxUniform`, we carry out the optimization
        in unbounded space and transform the result back into bounded space.

        Args:
            x: Deprecated - use `.set_default_x()` prior to `.map()`.
            num_iter: Number of optimization steps that the algorithm takes
                to find the MAP.
            learning_rate: Learning rate of the optimizer.
            init_method: How to select the starting parameters for the optimization. If
                it is a string, it can be either [`posterior`, `prior`], which samples
                the respective distribution `num_init_samples` times. If it is a
                tensor, the tensor will be used as init locations.
            num_init_samples: Draw this number of samples from the posterior and
                evaluate the log-probability of all of them.
            num_to_optimize: From the drawn `num_init_samples`, use the
                `num_to_optimize` with highest log-probability as the initial points
                for the optimization.
            save_best_every: The best log-probability is computed, saved in the
                `map`-attribute, and printed every `save_best_every`-th iteration.
                Computing the best log-probability creates a significant overhead
                (thus, the default is `10`.)
            show_progress_bars: Whether to show a progressbar during sampling from
                the posterior.
            force_update: Whether to re-calculate the MAP when x is unchanged and
                have a cached value.
            log_prob_kwargs: Will be empty for SNLE and SNRE. Will contain
                {'norm_posterior': True} for SNPE.

        Returns:
            The MAP estimate.
        """
        self.proposal = self.q
        return super().map(
            x=x,
            num_iter=num_iter,
            num_to_optimize=num_to_optimize,
            learning_rate=learning_rate,
            init_method=init_method,
            num_init_samples=num_init_samples,
            save_best_every=save_best_every,
            show_progress_bars=show_progress_bars,
            force_update=force_update,
        )

    def __deepcopy__(self, memo: Optional[Dict] = None) -> "VIPosterior":
        """This method is called when using `copy.deepcopy` on the object.

        It defines how the object is copied. We need to overwrite this method, since the
        default implementation does use __getstate__ and __setstate__ which we overwrite
        to enable pickling (and in particular the necessary modifications are
        incompatible deep copying).

        Args:
            memo (Optional[Dict], optional): Deep copy internal memo. Defaults to None.

        Returns:
            VIPosterior: Deep copy of the VIPosterior.
        """
        if memo is None:
            memo = {}
        # Create a new instance of the class
        cls = self.__class__
        result = cls.__new__(cls)
        # Add to memo
        memo[id(self)] = result
        # Copy attributes
        for k, v in self.__dict__.items():
            setattr(result, k, copy.deepcopy(v, memo))
        return result

    def __getstate__(self) -> Dict:
        """This method is called when pickling the object.

        It defines what is pickled. We need to overwrite this method, since some parts
        due not support pickle protocols (e.g. due to local functions, etc.).

        Returns:
            Dict: All attributes of the VIPosterior.
        """
        self._optimizer = None
        self.__deepcopy__ = None  # type: ignore
        self._q_build_fn = None
        self._q.__deepcopy__ = None  # type: ignore
        return self.__dict__

    def __setstate__(self, state_dict: Dict):
        """This method is called when unpickling the object.

        Especially, we need to restore the removed attributes and ensure that the object
        e.g. remains deep copy compatible.

        Args:
            state_dict: Given state dictionary, we will restore the object from it.
        """
        self.__dict__ = state_dict
        q = deepcopy(self._q)
        # Restore removed attributes
        self.set_q(*self._q_arg)
        self._q = q
        make_object_deepcopy_compatible(self)
        make_object_deepcopy_compatible(self.q)

`q` `property` `writable` ¶

Returns the variational posterior.

`vi_method` `property` `writable` ¶

Variational inference method e.g. one of [rKL, fKL, IW, alpha].

`deepcopy(memo=None)` ¶

This method is called when using copy.deepcopy on the object.

It defines how the object is copied. We need to overwrite this method, since the default implementation does use getstate and setstate which we overwrite to enable pickling (and in particular the necessary modifications are incompatible deep copying).

Parameters:

Name	Type	Description	Default
`memo`	`Optional[Dict]`	Deep copy internal memo. Defaults to None.	`None`

Returns:

Name	Type	Description
`VIPosterior`	`VIPosterior`	Deep copy of the VIPosterior.

Source code in sbi/inference/posteriors/vi_posterior.py

def __deepcopy__(self, memo: Optional[Dict] = None) -> "VIPosterior":
    """This method is called when using `copy.deepcopy` on the object.

    It defines how the object is copied. We need to overwrite this method, since the
    default implementation does use __getstate__ and __setstate__ which we overwrite
    to enable pickling (and in particular the necessary modifications are
    incompatible deep copying).

    Args:
        memo (Optional[Dict], optional): Deep copy internal memo. Defaults to None.

    Returns:
        VIPosterior: Deep copy of the VIPosterior.
    """
    if memo is None:
        memo = {}
    # Create a new instance of the class
    cls = self.__class__
    result = cls.__new__(cls)
    # Add to memo
    memo[id(self)] = result
    # Copy attributes
    for k, v in self.__dict__.items():
        setattr(result, k, copy.deepcopy(v, memo))
    return result

`getstate()` ¶

This method is called when pickling the object.

It defines what is pickled. We need to overwrite this method, since some parts due not support pickle protocols (e.g. due to local functions, etc.).

Returns:

Name	Type	Description
`Dict`	`Dict`	All attributes of the VIPosterior.

Source code in sbi/inference/posteriors/vi_posterior.py

def __getstate__(self) -> Dict:
    """This method is called when pickling the object.

    It defines what is pickled. We need to overwrite this method, since some parts
    due not support pickle protocols (e.g. due to local functions, etc.).

    Returns:
        Dict: All attributes of the VIPosterior.
    """
    self._optimizer = None
    self.__deepcopy__ = None  # type: ignore
    self._q_build_fn = None
    self._q.__deepcopy__ = None  # type: ignore
    return self.__dict__

`init(potential_fn, prior=None, q='maf', theta_transform=None, vi_method='rKL', device='cpu', x_shape=None, parameters=None, modules=None)` ¶

Parameters:

Name	Type	Description	Default
`potential_fn`	`Union[BasePotential, CustomPotential]`	The potential function from which to draw samples. Must be a `BasePotential` or a `CustomPotential`.	required
`prior`	`Optional[TorchDistribution]`	This is the prior distribution. Note that this is only used to check/construct the variational distribution or within some quality metrics. Please make sure that this matches with the prior within the potential_fn. If `None` is given, we will try to infer it from potential_fn or q, if this fails we raise an Error.	`None`
`q`	`Union[Literal['nsf', 'scf', 'maf', 'mcf', 'gaussian', 'gaussian_diag'], PyroTransformedDistribution, VIPosterior, Callable]`	Variational distribution, either string, `TransformedDistribution`, or a `VIPosterior` object. This specifies a parametric class of distribution over which the best possible posterior approximation is searched. For string input, we currently support [nsf, scf, maf, mcf, gaussian, gaussian_diag]. You can also specify your own variational family by passing a pyro `TransformedDistribution`. Additionally, we allow a `Callable`, which allows you the pass a `builder` function, which if called returns a distribution. This may be useful for setting the hyperparameters e.g. `num_transfroms` within the `get_flow_builder` method specifying the number of transformations within a normalizing flow. If q is already a `VIPosterior`, then the arguments will be copied from it (relevant for multi-round training).	`'maf'`
`theta_transform`	`Optional[TorchTransform]`	Maps form prior support to unconstrained space. The inverse is used here to ensure that the posterior support is equal to that of the prior.	`None`
`vi_method`	`Literal['rKL', 'fKL', 'IW', 'alpha']`	This specifies the variational methods which are used to fit q to the posterior. We currently support [rKL, fKL, IW, alpha]. Note that some of the divergences are `mode seeking` i.e. they underestimate variance and collapse on multimodal targets (`rKL`, `alpha` for alpha > 1) and some are `mass covering` i.e. they overestimate variance but typically cover all modes (`fKL`, `IW`, `alpha` for alpha < 1).	`'rKL'`
`device`	`Union[str, device]`	Training device, e.g., `cpu`, `cuda` or `cuda:0`. We will ensure that all other objects are also on this device.	`'cpu'`
`x_shape`	`Optional[Size]`	Deprecated, should not be passed.	`None`
`parameters`	`Optional[Iterable]`	List of parameters of the variational posterior. This is only required for user-defined q i.e. if q does not have a `parameters` attribute.	`None`
`modules`	`Optional[Iterable]`	List of modules of the variational posterior. This is only required for user-defined q i.e. if q does not have a `modules` attribute.	`None`

Source code in sbi/inference/posteriors/vi_posterior.py

def __init__(
    self,
    potential_fn: Union[BasePotential, CustomPotential],
    prior: Optional[TorchDistribution] = None,  # type: ignore
    q: Union[
        Literal["nsf", "scf", "maf", "mcf", "gaussian", "gaussian_diag"],
        PyroTransformedDistribution,
        "VIPosterior",
        Callable,
    ] = "maf",
    theta_transform: Optional[TorchTransform] = None,
    vi_method: Literal["rKL", "fKL", "IW", "alpha"] = "rKL",
    device: Union[str, torch.device] = "cpu",
    x_shape: Optional[torch.Size] = None,
    parameters: Optional[Iterable] = None,
    modules: Optional[Iterable] = None,
):
    """
    Args:
        potential_fn: The potential function from which to draw samples. Must be a
            `BasePotential` or a `CustomPotential`.
        prior: This is the prior distribution. Note that this is only
            used to check/construct the variational distribution or within some
            quality metrics. Please make sure that this matches with the prior
            within the potential_fn. If `None` is given, we will try to infer it
            from potential_fn or q, if this fails we raise an Error.
        q: Variational distribution, either string, `TransformedDistribution`, or a
            `VIPosterior` object. This specifies a parametric class of distribution
            over which the best possible posterior approximation is searched. For
            string input, we currently support [nsf, scf, maf, mcf, gaussian,
            gaussian_diag]. You can also specify your own variational family by
            passing a pyro `TransformedDistribution`.
            Additionally, we allow a `Callable`, which allows you the pass a
            `builder` function, which if called returns a distribution. This may be
            useful for setting the hyperparameters e.g. `num_transfroms` within the
            `get_flow_builder` method specifying the number of transformations
            within a normalizing flow. If q is already a `VIPosterior`, then the
            arguments will be copied from it (relevant for multi-round training).
        theta_transform: Maps form prior support to unconstrained space. The
            inverse is used here to ensure that the posterior support is equal to
            that of the prior.
        vi_method: This specifies the variational methods which are used to fit q to
            the posterior. We currently support [rKL, fKL, IW, alpha]. Note that
            some of the divergences are `mode seeking` i.e. they underestimate
            variance and collapse on multimodal targets (`rKL`, `alpha` for alpha >
            1) and some are `mass covering` i.e. they overestimate variance but
            typically cover all modes (`fKL`, `IW`, `alpha` for alpha < 1).
        device: Training device, e.g., `cpu`, `cuda` or `cuda:0`. We will ensure
            that all other objects are also on this device.
        x_shape: Deprecated, should not be passed.
        parameters: List of parameters of the variational posterior. This is only
            required for user-defined q i.e. if q does not have a `parameters`
            attribute.
        modules: List of modules of the variational posterior. This is only
            required for user-defined q i.e. if q does not have a `modules`
            attribute.
    """
    super().__init__(potential_fn, theta_transform, device, x_shape=x_shape)

    # Especially the prior may be on another device -> move it...
    self._device = device
    self.theta_transform = theta_transform
    self.x_shape = x_shape
    self.potential_fn.device = device
    move_all_tensor_to_device(self.potential_fn, device)

    # Get prior and previous builds
    if prior is not None:
        self._prior = prior
    elif hasattr(self.potential_fn, "prior") and isinstance(
        self.potential_fn.prior, Distribution
    ):
        self._prior = self.potential_fn.prior
    elif isinstance(q, VIPosterior) and isinstance(q._prior, Distribution):
        self._prior = q._prior
    else:
        raise ValueError(
            "We could not find a suitable prior distribution within `potential_fn` "
            "or `q` (if a VIPosterior is given). Please explicitly specify a prior."
        )
    move_all_tensor_to_device(self._prior, device)
    self._optimizer = None

    # In contrast to MCMC we want to project into constrained space.
    if theta_transform is None:
        self.link_transform = mcmc_transform(self._prior).inv
    else:
        self.link_transform = theta_transform.inv

    if parameters is None:
        parameters = []
    if modules is None:
        modules = []
    # This will set the variational distribution and VI method
    self.set_q(
        q,
        parameters=parameters,
        modules=modules,
    )
    self.set_vi_method(vi_method)

    self._purpose = (
        "It provides Variational inference to .sample() from the posterior and "
        "can evaluate the _normalized_ posterior density with .log_prob()."
    )

`setstate(state_dict)` ¶

This method is called when unpickling the object.

Especially, we need to restore the removed attributes and ensure that the object e.g. remains deep copy compatible.

Parameters:

Name	Type	Description	Default
`state_dict`	`Dict`	Given state dictionary, we will restore the object from it.	required

Source code in sbi/inference/posteriors/vi_posterior.py

def __setstate__(self, state_dict: Dict):
    """This method is called when unpickling the object.

    Especially, we need to restore the removed attributes and ensure that the object
    e.g. remains deep copy compatible.

    Args:
        state_dict: Given state dictionary, we will restore the object from it.
    """
    self.__dict__ = state_dict
    q = deepcopy(self._q)
    # Restore removed attributes
    self.set_q(*self._q_arg)
    self._q = q
    make_object_deepcopy_compatible(self)
    make_object_deepcopy_compatible(self.q)

`evaluate(quality_control_metric='psis', N=int(50000.0))` ¶

This function will evaluate the quality of the variational posterior distribution. We currently support two different metrics of type psis, which checks the quality based on the tails of importance weights (there should not be much with a large one), or prop which checks the proportionality between q and potential_fn.

NOTE: In our experience prop is sensitive to distinguish good from ok whereas psis is more sensitive in distinguishing very bad from ok.

Parameters:

Name	Type	Description	Default
`quality_control_metric`	`str`	The metric of choice, we currently support [psis, prop, prop_prior].	`'psis'`
`N`	`int`	Number of samples which is used to evaluate the metric.	`int(50000.0)`

Source code in sbi/inference/posteriors/vi_posterior.py

def evaluate(self, quality_control_metric: str = "psis", N: int = int(5e4)) -> None:
    """This function will evaluate the quality of the variational posterior
    distribution. We currently support two different metrics of type `psis`, which
    checks the quality based on the tails of importance weights (there should not be
    much with a large one), or `prop` which checks the proportionality between q
    and potential_fn.

    NOTE: In our experience `prop` is sensitive to distinguish ``good`` from ``ok``
    whereas `psis` is more sensitive in distinguishing `very bad` from `ok`.

    Args:
        quality_control_metric: The metric of choice, we currently support [psis,
            prop, prop_prior].
        N: Number of samples which is used to evaluate the metric.
    """
    quality_control_fn, quality_control_msg = get_quality_metric(
        quality_control_metric
    )
    metric = round(float(quality_control_fn(self, N=N)), 3)
    print(f"Quality Score: {metric} " + quality_control_msg)

`log_prob(theta, x=None, track_gradients=False)` ¶

Returns the log-probability of theta under the variational posterior.

Parameters:

Name	Type	Description	Default
`theta`	`Tensor`	Parameters	required
`track_gradients`	`bool`	Whether the returned tensor supports tracking gradients. This can be helpful for e.g. sensitivity analysis but increases memory consumption.	`False`

Returns:

Type	Description
`Tensor`	`len($\theta$)`-shaped log-probability.

Source code in sbi/inference/posteriors/vi_posterior.py

def log_prob(
    self,
    theta: Tensor,
    x: Optional[Tensor] = None,
    track_gradients: bool = False,
) -> Tensor:
    r"""Returns the log-probability of theta under the variational posterior.

    Args:
        theta: Parameters
        track_gradients: Whether the returned tensor supports tracking gradients.
            This can be helpful for e.g. sensitivity analysis but increases memory
            consumption.

    Returns:
        `len($\theta$)`-shaped log-probability.
    """
    x = self._x_else_default_x(x)
    if self._trained_on is None or (x != self._trained_on).all():
        raise AttributeError(
            f"The variational posterior was not fit using observation {x}.\
                 Please train."
        )
    with torch.set_grad_enabled(track_gradients):
        theta = ensure_theta_batched(torch.as_tensor(theta))
        return self.q.log_prob(theta)

`map(x=None, num_iter=1000, num_to_optimize=100, learning_rate=0.01, init_method='proposal', num_init_samples=10000, save_best_every=10, show_progress_bars=False, force_update=False)` ¶

Returns the maximum-a-posteriori estimate (MAP).

The method can be interrupted (Ctrl-C) when the user sees that the log-probability converges. The best estimate will be saved in self._map and can be accessed with self.map(). The MAP is obtained by running gradient ascent from a given number of starting positions (samples from the posterior with the highest log-probability). After the optimization is done, we select the parameter set that has the highest log-probability after the optimization.

Warning: The default values used by this function are not well-tested. They might require hand-tuning for the problem at hand.

For developers: if the prior is a BoxUniform, we carry out the optimization in unbounded space and transform the result back into bounded space.

Parameters:

Name	Type	Description	Default
`x`	`Optional[TorchTensor]`	Deprecated - use `.set_default_x()` prior to `.map()`.	`None`
`num_iter`	`int`	Number of optimization steps that the algorithm takes to find the MAP.	`1000`
`learning_rate`	`float`	Learning rate of the optimizer.	`0.01`
`init_method`	`Union[str, TorchTensor]`	How to select the starting parameters for the optimization. If it is a string, it can be either [`posterior`, `prior`], which samples the respective distribution `num_init_samples` times. If it is a tensor, the tensor will be used as init locations.	`'proposal'`
`num_init_samples`	`int`	Draw this number of samples from the posterior and evaluate the log-probability of all of them.	`10000`
`num_to_optimize`	`int`	From the drawn `num_init_samples`, use the `num_to_optimize` with highest log-probability as the initial points for the optimization.	`100`
`save_best_every`	`int`	The best log-probability is computed, saved in the `map`-attribute, and printed every `save_best_every`-th iteration. Computing the best log-probability creates a significant overhead (thus, the default is `10`.)	`10`
`show_progress_bars`	`bool`	Whether to show a progressbar during sampling from the posterior.	`False`
`force_update`	`bool`	Whether to re-calculate the MAP when x is unchanged and have a cached value.	`False`
`log_prob_kwargs`		Will be empty for SNLE and SNRE. Will contain {‘norm_posterior’: True} for SNPE.	required

Returns:

Type	Description
`Tensor`	The MAP estimate.

Source code in sbi/inference/posteriors/vi_posterior.py

def map(
    self,
    x: Optional[TorchTensor] = None,
    num_iter: int = 1_000,
    num_to_optimize: int = 100,
    learning_rate: float = 0.01,
    init_method: Union[str, TorchTensor] = "proposal",
    num_init_samples: int = 10_000,
    save_best_every: int = 10,
    show_progress_bars: bool = False,
    force_update: bool = False,
) -> Tensor:
    r"""Returns the maximum-a-posteriori estimate (MAP).

    The method can be interrupted (Ctrl-C) when the user sees that the
    log-probability converges. The best estimate will be saved in `self._map` and
    can be accessed with `self.map()`. The MAP is obtained by running gradient
    ascent from a given number of starting positions (samples from the posterior
    with the highest log-probability). After the optimization is done, we select the
    parameter set that has the highest log-probability after the optimization.

    Warning: The default values used by this function are not well-tested. They
    might require hand-tuning for the problem at hand.

    For developers: if the prior is a `BoxUniform`, we carry out the optimization
    in unbounded space and transform the result back into bounded space.

    Args:
        x: Deprecated - use `.set_default_x()` prior to `.map()`.
        num_iter: Number of optimization steps that the algorithm takes
            to find the MAP.
        learning_rate: Learning rate of the optimizer.
        init_method: How to select the starting parameters for the optimization. If
            it is a string, it can be either [`posterior`, `prior`], which samples
            the respective distribution `num_init_samples` times. If it is a
            tensor, the tensor will be used as init locations.
        num_init_samples: Draw this number of samples from the posterior and
            evaluate the log-probability of all of them.
        num_to_optimize: From the drawn `num_init_samples`, use the
            `num_to_optimize` with highest log-probability as the initial points
            for the optimization.
        save_best_every: The best log-probability is computed, saved in the
            `map`-attribute, and printed every `save_best_every`-th iteration.
            Computing the best log-probability creates a significant overhead
            (thus, the default is `10`.)
        show_progress_bars: Whether to show a progressbar during sampling from
            the posterior.
        force_update: Whether to re-calculate the MAP when x is unchanged and
            have a cached value.
        log_prob_kwargs: Will be empty for SNLE and SNRE. Will contain
            {'norm_posterior': True} for SNPE.

    Returns:
        The MAP estimate.
    """
    self.proposal = self.q
    return super().map(
        x=x,
        num_iter=num_iter,
        num_to_optimize=num_to_optimize,
        learning_rate=learning_rate,
        init_method=init_method,
        num_init_samples=num_init_samples,
        save_best_every=save_best_every,
        show_progress_bars=show_progress_bars,
        force_update=force_update,
    )

`sample(sample_shape=torch.Size(), x=None, **kwargs)` ¶

Samples from the variational posterior distribution.

Parameters:

Name	Type	Description	Default
`sample_shape`	`Shape`	Shape of samples	`Size()`

Returns:

Type	Description
`Tensor`	Samples from posterior.

Source code in sbi/inference/posteriors/vi_posterior.py

def sample(
    self,
    sample_shape: Shape = torch.Size(),
    x: Optional[Tensor] = None,
    **kwargs,
) -> Tensor:
    """Samples from the variational posterior distribution.

    Args:
        sample_shape: Shape of samples

    Returns:
        Samples from posterior.
    """
    x = self._x_else_default_x(x)
    if self._trained_on is None or (x != self._trained_on).all():
        raise AttributeError(
            f"The variational posterior was not fit on the specified `default_x` "
            f"{x}. Please train using `posterior.train()`."
        )
    samples = self.q.sample(torch.Size(sample_shape))
    return samples.reshape((*sample_shape, samples.shape[-1]))

`set_q(q, parameters=None, modules=None)` ¶

Defines the variational family.

You can specify over which parameters/modules we optimize. This is required for custom distributions which e.g. do not inherit nn.Modules or has the function parameters or modules to give direct access to trainable parameters. Further, you can pass a function, which constructs a variational distribution if called.

Parameters:

Name	Type	Description	Default
`q`	`Union[str, PyroTransformedDistribution, VIPosterior, Callable]`	Variational distribution, either string, distribution, or a VIPosterior object. This specifies a parametric class of distribution over which the best possible posterior approximation is searched. For string input, we currently support [nsf, scf, maf, mcf, gaussian, gaussian_diag]. Of course, you can also specify your own variational family by passing a `parameterized` distribution object i.e. a torch.distributions Distribution with methods `parameters` returning an iterable of all parameters (you can pass them within the paramters/modules attribute). Additionally, we allow a `Callable`, which allows you the pass a `builder` function, which if called returns an distribution. This may be useful for setting the hyperparameters e.g. `num_transfroms:int` by using the `get_flow_builder` method specifying the hyperparameters. If q is already a `VIPosterior`, then the arguments will be copied from it (relevant for multi-round training).	required
`parameters`	`Optional[Iterable]`	List of parameters associated with the distribution object.	`None`
`modules`	`Optional[Iterable]`	List of modules associated with the distribution object.	`None`

Source code in sbi/inference/posteriors/vi_posterior.py

def set_q(
    self,
    q: Union[str, PyroTransformedDistribution, "VIPosterior", Callable],
    parameters: Optional[Iterable] = None,
    modules: Optional[Iterable] = None,
) -> None:
    """Defines the variational family.

    You can specify over which parameters/modules we optimize. This is required for
    custom distributions which e.g. do not inherit nn.Modules or has the function
    `parameters` or `modules` to give direct access to trainable parameters.
    Further, you can pass a function, which constructs a variational distribution
    if called.

    Args:
        q: Variational distribution, either string, distribution, or a VIPosterior
            object. This specifies a parametric class of distribution over which
            the best possible posterior approximation is searched. For string input,
            we currently support [nsf, scf, maf, mcf, gaussian, gaussian_diag]. Of
            course, you can also specify your own variational family by passing a
            `parameterized` distribution object i.e. a torch.distributions
            Distribution with methods `parameters` returning an iterable of all
            parameters (you can pass them within the paramters/modules attribute).
            Additionally, we allow a `Callable`, which allows you the pass a
            `builder` function, which if called returns an distribution. This may be
            useful for setting the hyperparameters e.g. `num_transfroms:int` by
            using the `get_flow_builder` method specifying the hyperparameters. If q
            is already a `VIPosterior`, then the arguments will be copied from it
            (relevant for multi-round training).
        parameters: List of parameters associated with the distribution object.
        modules: List of modules associated with the distribution object.

    """
    if parameters is None:
        parameters = []
    if modules is None:
        modules = []
    self._q_arg = (q, parameters, modules)
    if isinstance(q, Distribution):
        q = adapt_variational_distribution(
            q,
            self._prior,
            self.link_transform,
            parameters=parameters,
            modules=modules,
        )
        make_object_deepcopy_compatible(q)
        self_custom_q_init_cache = deepcopy(q)
        self._q_build_fn = lambda *args, **kwargs: self_custom_q_init_cache
        self._trained_on = None
    elif isinstance(q, (str, Callable)):
        if isinstance(q, str):
            self._q_build_fn = get_flow_builder(q)
        else:
            self._q_build_fn = q

        q = self._q_build_fn(
            self._prior.event_shape,
            self.link_transform,
            device=self._device,
        )
        make_object_deepcopy_compatible(q)
        self._trained_on = None
    elif isinstance(q, VIPosterior):
        self._q_build_fn = q._q_build_fn
        self._trained_on = q._trained_on
        self.vi_method = q.vi_method  # type: ignore
        self._device = q._device
        self._prior = q._prior
        self._x = q._x
        self._q_arg = q._q_arg
        make_object_deepcopy_compatible(q.q)
        q = deepcopy(q.q)
    move_all_tensor_to_device(q, self._device)
    assert isinstance(
        q, Distribution
    ), """Something went wrong when initializing the variational distribution.
        Please create an issue on github https://github.com/mackelab/sbi/issues"""
    check_variational_distribution(q, self._prior)
    self._q = q

`set_vi_method(method)` ¶

Sets variational inference method.

Parameters:

Name	Type	Description	Default
`method`	`str`	One of [rKL, fKL, IW, alpha].	required

Returns:

Type	Description
`VIPosterior`	`VIPosterior` for chainable calls.

Source code in sbi/inference/posteriors/vi_posterior.py

def set_vi_method(self, method: str) -> "VIPosterior":
    """Sets variational inference method.

    Args:
        method: One of [rKL, fKL, IW, alpha].

    Returns:
        `VIPosterior` for chainable calls.
    """
    self._vi_method = method
    self._optimizer_builder = get_VI_method(method)
    return self

`to(device)` ¶

Move potential_fn, _prior and x_o to device, and change the device attribute.

Reinstantiates the posterior and re sets the default x.

Parameters:

Name	Type	Description	Default
`device`	`Union[str, device]`	The device to move the posterior to.	required

Source code in sbi/inference/posteriors/vi_posterior.py

def to(self, device: Union[str, torch.device]) -> None:
    """
    Move potential_fn, _prior and x_o to device, and change the device attribute.

    Reinstantiates the posterior and re sets the default x.

    Args:
        device: The device to move the posterior to.
    """
    self.device = device
    self.potential_fn.to(device)  # type: ignore
    self._prior.to(device)  # type: ignore
    if self._x is not None:
        x_o = self._x.to(device)
    self.theta_transform = mcmc_transform(self._prior, device=device)
    super().__init__(
        self.potential_fn, self.theta_transform, device, x_shape=self.x_shape
    )
    # super().__init__ erases the self._x, so we need to set it again
    if self._x is not None:
        self.set_default_x(x_o)

    if self.theta_transform is None:
        self.link_transform = mcmc_transform(self._prior).inv
    else:
        self.link_transform = self.theta_transform.inv

`train(x=None, n_particles=256, learning_rate=0.001, gamma=0.999, max_num_iters=2000, min_num_iters=10, clip_value=10.0, warm_up_rounds=100, retrain_from_scratch=False, reset_optimizer=False, show_progress_bar=True, check_for_convergence=True, quality_control=True, quality_control_metric='psis', **kwargs)` ¶

This method trains the variational posterior.

Parameters:

Name	Type	Description	Default
`x`	`Optional[TorchTensor]`	The observation.	`None`
`n_particles`	`int`	Number of samples to approximate expectations within the variational bounds. The larger the more accurate are gradient estimates, but the computational cost per iteration increases.	`256`
`learning_rate`	`float`	Learning rate of the optimizer.	`0.001`
`gamma`	`float`	Learning rate decay per iteration. We use an exponential decay scheduler.	`0.999`
`max_num_iters`	`int`	Maximum number of iterations.	`2000`
`min_num_iters`	`int`	Minimum number of iterations.	`10`
`clip_value`	`float`	Gradient clipping value, decreasing may help if you see invalid values.	`10.0`
`warm_up_rounds`	`int`	Initialize the posterior as the prior.	`100`
`retrain_from_scratch`	`bool`	Retrain the variational distributions from scratch.	`False`
`reset_optimizer`	`bool`	Reset the divergence optimizer	`False`
`show_progress_bar`	`bool`	If any progress report should be displayed.	`True`
`quality_control`	`bool`	If False quality control is skipped.	`True`
`quality_control_metric`	`str`	Which metric to use for evaluating the quality.	`'psis'`
`kwargs`		Hyperparameters check corresponding `DivergenceOptimizer` for detail eps: Determines sensitivity of convergence check. retain_graph: Boolean which decides whether to retain the computation graph. This may be required for some `exotic` user-specified q’s. optimizer: A PyTorch Optimizer class e.g. Adam or SGD. See `DivergenceOptimizer` for details. scheduler: A PyTorch learning rate scheduler. See `DivergenceOptimizer` for details. alpha: Only used if vi_method=`alpha`. Determines the alpha divergence. K: Only used if vi_method=`IW`. Determines the number of importance weighted particles. stick_the_landing: If one should use the STL estimator (only for rKL, IW, alpha). dreg: If one should use the DREG estimator (only for rKL, IW, alpha). weight_transform: Callable applied to importance weights (only for fKL)	`{}`

Returns: VIPosterior: VIPosterior (can be used to chain calls).

Source code in sbi/inference/posteriors/vi_posterior.py

def train(
    self,
    x: Optional[TorchTensor] = None,
    n_particles: int = 256,
    learning_rate: float = 1e-3,
    gamma: float = 0.999,
    max_num_iters: int = 2000,
    min_num_iters: int = 10,
    clip_value: float = 10.0,
    warm_up_rounds: int = 100,
    retrain_from_scratch: bool = False,
    reset_optimizer: bool = False,
    show_progress_bar: bool = True,
    check_for_convergence: bool = True,
    quality_control: bool = True,
    quality_control_metric: str = "psis",
    **kwargs,
) -> "VIPosterior":
    """This method trains the variational posterior.

    Args:
        x: The observation.
        n_particles: Number of samples to approximate expectations within the
            variational bounds. The larger the more accurate are gradient
            estimates, but the computational cost per iteration increases.
        learning_rate: Learning rate of the optimizer.
        gamma: Learning rate decay per iteration. We use an exponential decay
            scheduler.
        max_num_iters: Maximum number of iterations.
        min_num_iters: Minimum number of iterations.
        clip_value: Gradient clipping value, decreasing may help if you see invalid
            values.
        warm_up_rounds: Initialize the posterior as the prior.
        retrain_from_scratch: Retrain the variational distributions from scratch.
        reset_optimizer: Reset the divergence optimizer
        show_progress_bar: If any progress report should be displayed.
        quality_control: If False quality control is skipped.
        quality_control_metric: Which metric to use for evaluating the quality.
        kwargs: Hyperparameters check corresponding `DivergenceOptimizer` for detail
            eps: Determines sensitivity of convergence check.
            retain_graph: Boolean which decides whether to retain the computation
                graph. This may be required for some `exotic` user-specified q's.
            optimizer: A PyTorch Optimizer class e.g. Adam or SGD. See
                `DivergenceOptimizer` for details.
            scheduler: A PyTorch learning rate scheduler. See
                `DivergenceOptimizer` for details.
            alpha: Only used if vi_method=`alpha`. Determines the alpha divergence.
            K: Only used if vi_method=`IW`. Determines the number of importance
                weighted particles.
            stick_the_landing: If one should use the STL estimator (only for rKL,
                IW, alpha).
            dreg: If one should use the DREG estimator (only for rKL, IW, alpha).
            weight_transform: Callable applied to importance weights (only for fKL)
    Returns:
        VIPosterior: `VIPosterior` (can be used to chain calls).
    """
    # Update optimizer with current arguments.
    if self._optimizer is not None:
        self._optimizer.update({**locals(), **kwargs})

    # Init q and the optimizer if necessary
    if retrain_from_scratch:
        self.q = self._q_build_fn()  # type: ignore
        self._optimizer = self._optimizer_builder(
            self.potential_fn,
            self.q,
            lr=learning_rate,
            clip_value=clip_value,
            gamma=gamma,
            n_particles=n_particles,
            prior=self._prior,
            **kwargs,
        )

    if (
        reset_optimizer
        or self._optimizer is None
        or not isinstance(self._optimizer, self._optimizer_builder)
    ):
        self._optimizer = self._optimizer_builder(
            self.potential_fn,
            self.q,
            lr=learning_rate,
            clip_value=clip_value,
            gamma=gamma,
            n_particles=n_particles,
            prior=self._prior,
            **kwargs,
        )

    # Check context
    x = atleast_2d_float32_tensor(self._x_else_default_x(x)).to(  # type: ignore
        self._device
    )

    already_trained = self._trained_on is not None and (x == self._trained_on).all()

    # Optimize
    optimizer = self._optimizer
    optimizer.to(self._device)
    optimizer.reset_loss_stats()

    if show_progress_bar:
        iters = tqdm(range(max_num_iters))
    else:
        iters = range(max_num_iters)

    # Warmup before training
    if reset_optimizer or (not optimizer.warm_up_was_done and not already_trained):
        if show_progress_bar:
            iters.set_description(  # type: ignore
                "Warmup phase, this may take a few seconds..."
            )
        optimizer.warm_up(warm_up_rounds)

    for i in iters:
        optimizer.step(x)
        mean_loss, std_loss = optimizer.get_loss_stats()
        # Update progress bar
        if show_progress_bar:
            assert isinstance(iters, tqdm)
            iters.set_description(  # type: ignore
                f"Loss: {np.round(float(mean_loss), 2)}"
                f"Std: {np.round(float(std_loss), 2)}"
            )
        # Check for convergence
        if check_for_convergence and i > min_num_iters and optimizer.converged():
            if show_progress_bar:
                print(f"\nConverged with loss: {np.round(float(mean_loss), 2)}")
            break
    # Training finished:
    self._trained_on = x

    # Evaluate quality
    if quality_control:
        try:
            self.evaluate(quality_control_metric=quality_control_metric)
        except Exception as e:
            print(
                f"Quality control showed a low quality of the variational "
                f"posterior. We are automatically retraining the variational "
                f"posterior from scratch with a smaller learning rate. "
                f"Alternatively, if you want to skip quality control, please "
                f"retrain with `VIPosterior.train(..., quality_control=False)`. "
                f"\nThe error that occured is: {e}"
            )
            self.train(
                learning_rate=learning_rate * 0.1,
                retrain_from_scratch=True,
                reset_optimizer=True,
            )

    return self

Name	Type	Description	Default
`theta`	`Tensor`	Parameters \(\theta\).	required
`norm_posterior`	`bool`	Whether to enforce a normalized posterior density. Renormalization of the posterior is useful when some probability falls out or leaks out of the prescribed prior support. The normalizing factor is calculated via rejection sampling, so if you need speedier but unnormalized log posterior estimates set here `norm_posterior=False`. The returned log posterior is set to -∞ outside of the prior support regardless of this setting.	`True`
`track_gradients`	`bool`	Whether the returned tensor supports tracking gradients. This can be helpful for e.g. sensitivity analysis, but increases memory consumption.	`False`
`leakage_correction_params`	`Optional[dict]`	A `dict` of keyword arguments to override the default values of `leakage_correction()`. Possible options are: `num_rejection_samples`, `force_update`, `show_progress_bars`, and `rejection_sampling_batch_size`. These parameters only have an effect if `norm_posterior=True`.	`None`

Posteriors¶

DirectPosterior ¶

__init__(posterior_estimator, prior, max_sampling_batch_size=10000, device=None, x_shape=None, enable_transform=True) ¶

leakage_correction(x, num_rejection_samples=10000, force_update=False, show_progress_bars=False, rejection_sampling_batch_size=10000) ¶

log_prob(theta, x=None, norm_posterior=True, track_gradients=False, leakage_correction_params=None) ¶

log_prob_batched(theta, x, norm_posterior=True, track_gradients=False, leakage_correction_params=None) ¶

map(x=None, num_iter=1000, num_to_optimize=100, learning_rate=0.01, init_method='posterior', num_init_samples=1000, save_best_every=10, show_progress_bars=False, force_update=False) ¶

sample(sample_shape=torch.Size(), x=None, max_sampling_batch_size=10000, sample_with=None, show_progress_bars=True) ¶

sample_batched(sample_shape, x, max_sampling_batch_size=10000, show_progress_bars=True) ¶

to(device) ¶

ImportanceSamplingPosterior ¶

__init__(potential_fn, proposal, theta_transform=None, method='sir', oversampling_factor=32, max_sampling_batch_size=10000, device=None, x_shape=None) ¶

estimate_normalization_constant(x, num_samples=10000, force_update=False) ¶

log_prob(theta, x=None, track_gradients=False, normalization_constant_params=None) ¶

map(x=None, num_iter=1000, num_to_optimize=100, learning_rate=0.01, init_method='proposal', num_init_samples=1000, save_best_every=10, show_progress_bars=False, force_update=False) ¶

sample(sample_shape=torch.Size(), x=None, method=None, oversampling_factor=32, max_sampling_batch_size=10000, sample_with=None, show_progress_bars=False) ¶

to(device) ¶

MCMCPosterior ¶

mcmc_method property writable ¶

posterior_sampler property ¶

__getstate__() ¶

__init__(potential_fn, proposal, theta_transform=None, method='slice_np_vectorized', thin=-1, warmup_steps=200, num_chains=20, init_strategy='resample', init_strategy_parameters=None, init_strategy_num_candidates=None, num_workers=1, mp_context='spawn', device=None, x_shape=None) ¶

get_arviz_inference_data() ¶

log_prob(theta, x=None, track_gradients=False) ¶

map(x=None, num_iter=1000, num_to_optimize=100, learning_rate=0.01, init_method='proposal', num_init_samples=1000, save_best_every=10, show_progress_bars=False, force_update=False) ¶

sample_batched(sample_shape, x, method=None, thin=None, warmup_steps=None, num_chains=None, init_strategy=None, init_strategy_parameters=None, num_workers=None, mp_context=None, show_progress_bars=True) ¶

set_mcmc_method(method) ¶

to(device) ¶

RejectionPosterior ¶

__init__(potential_fn, proposal, theta_transform=None, max_sampling_batch_size=10000, num_samples_to_find_max=10000, num_iter_to_find_max=100, m=1.2, device=None, x_shape=None) ¶

log_prob(theta, x=None, track_gradients=False) ¶

map(x=None, num_iter=1000, num_to_optimize=100, learning_rate=0.01, init_method='proposal', num_init_samples=1000, save_best_every=10, show_progress_bars=False, force_update=False) ¶

sample(sample_shape=torch.Size(), x=None, max_sampling_batch_size=None, num_samples_to_find_max=None, num_iter_to_find_max=None, m=None, sample_with=None, show_progress_bars=True) ¶

to(device) ¶

VectorFieldPosterior ¶

__init__(vector_field_estimator, prior, max_sampling_batch_size=10000, device=None, enable_transform=True, sample_with='sde', **kwargs) ¶

log_prob(theta, x=None, track_gradients=False, ode_kwargs=None) ¶

map(x=None, num_iter=1000, num_to_optimize=1000, learning_rate=0.01, init_method='posterior', num_init_samples=1000, save_best_every=1000, show_progress_bars=False, force_update=False) ¶

sample(sample_shape=torch.Size(), x=None, predictor='euler_maruyama', corrector=None, predictor_params=None, corrector_params=None, steps=500, ts=None, iid_method=None, iid_params=None, max_sampling_batch_size=10000, sample_with=None, show_progress_bars=True) ¶

sample_batched(sample_shape, x, predictor='euler_maruyama', corrector=None, predictor_params=None, corrector_params=None, steps=500, ts=None, max_sampling_batch_size=10000, show_progress_bars=True) ¶

sample_via_ode(sample_shape=torch.Size(), **kwargs) ¶

to(device) ¶

VIPosterior ¶

References¶

q property writable ¶

vi_method property writable ¶

__deepcopy__(memo=None) ¶

__getstate__() ¶

__init__(potential_fn, prior=None, q='maf', theta_transform=None, vi_method='rKL', device='cpu', x_shape=None, parameters=None, modules=None) ¶

__setstate__(state_dict) ¶

evaluate(quality_control_metric='psis', N=int(50000.0)) ¶

log_prob(theta, x=None, track_gradients=False) ¶

map(x=None, num_iter=1000, num_to_optimize=100, learning_rate=0.01, init_method='proposal', num_init_samples=10000, save_best_every=10, show_progress_bars=False, force_update=False) ¶

sample(sample_shape=torch.Size(), x=None, **kwargs) ¶

set_q(q, parameters=None, modules=None) ¶

set_vi_method(method) ¶

to(device) ¶

`DirectPosterior` ¶

`init(posterior_estimator, prior, max_sampling_batch_size=10000, device=None, x_shape=None, enable_transform=True)` ¶

`leakage_correction(x, num_rejection_samples=10000, force_update=False, show_progress_bars=False, rejection_sampling_batch_size=10000)` ¶

`log_prob(theta, x=None, norm_posterior=True, track_gradients=False, leakage_correction_params=None)` ¶

`log_prob_batched(theta, x, norm_posterior=True, track_gradients=False, leakage_correction_params=None)` ¶

`map(x=None, num_iter=1000, num_to_optimize=100, learning_rate=0.01, init_method='posterior', num_init_samples=1000, save_best_every=10, show_progress_bars=False, force_update=False)` ¶

`sample(sample_shape=torch.Size(), x=None, max_sampling_batch_size=10000, sample_with=None, show_progress_bars=True)` ¶

`sample_batched(sample_shape, x, max_sampling_batch_size=10000, show_progress_bars=True)` ¶

`to(device)` ¶

`ImportanceSamplingPosterior` ¶

`init(potential_fn, proposal, theta_transform=None, method='sir', oversampling_factor=32, max_sampling_batch_size=10000, device=None, x_shape=None)` ¶

`estimate_normalization_constant(x, num_samples=10000, force_update=False)` ¶

`log_prob(theta, x=None, track_gradients=False, normalization_constant_params=None)` ¶

`map(x=None, num_iter=1000, num_to_optimize=100, learning_rate=0.01, init_method='proposal', num_init_samples=1000, save_best_every=10, show_progress_bars=False, force_update=False)` ¶

`sample(sample_shape=torch.Size(), x=None, method=None, oversampling_factor=32, max_sampling_batch_size=10000, sample_with=None, show_progress_bars=False)` ¶

`to(device)` ¶

`MCMCPosterior` ¶

`mcmc_method` `property` `writable` ¶

`posterior_sampler` `property` ¶

`getstate()` ¶

`init(potential_fn, proposal, theta_transform=None, method='slice_np_vectorized', thin=-1, warmup_steps=200, num_chains=20, init_strategy='resample', init_strategy_parameters=None, init_strategy_num_candidates=None, num_workers=1, mp_context='spawn', device=None, x_shape=None)` ¶

`get_arviz_inference_data()` ¶

`log_prob(theta, x=None, track_gradients=False)` ¶

`map(x=None, num_iter=1000, num_to_optimize=100, learning_rate=0.01, init_method='proposal', num_init_samples=1000, save_best_every=10, show_progress_bars=False, force_update=False)` ¶

`sample_batched(sample_shape, x, method=None, thin=None, warmup_steps=None, num_chains=None, init_strategy=None, init_strategy_parameters=None, num_workers=None, mp_context=None, show_progress_bars=True)` ¶

`set_mcmc_method(method)` ¶

`to(device)` ¶

`RejectionPosterior` ¶

`init(potential_fn, proposal, theta_transform=None, max_sampling_batch_size=10000, num_samples_to_find_max=10000, num_iter_to_find_max=100, m=1.2, device=None, x_shape=None)` ¶

`log_prob(theta, x=None, track_gradients=False)` ¶

`map(x=None, num_iter=1000, num_to_optimize=100, learning_rate=0.01, init_method='proposal', num_init_samples=1000, save_best_every=10, show_progress_bars=False, force_update=False)` ¶

`sample(sample_shape=torch.Size(), x=None, max_sampling_batch_size=None, num_samples_to_find_max=None, num_iter_to_find_max=None, m=None, sample_with=None, show_progress_bars=True)` ¶

`to(device)` ¶

`VectorFieldPosterior` ¶

`init(vector_field_estimator, prior, max_sampling_batch_size=10000, device=None, enable_transform=True, sample_with='sde', **kwargs)` ¶

`log_prob(theta, x=None, track_gradients=False, ode_kwargs=None)` ¶

`map(x=None, num_iter=1000, num_to_optimize=1000, learning_rate=0.01, init_method='posterior', num_init_samples=1000, save_best_every=1000, show_progress_bars=False, force_update=False)` ¶

`sample(sample_shape=torch.Size(), x=None, predictor='euler_maruyama', corrector=None, predictor_params=None, corrector_params=None, steps=500, ts=None, iid_method=None, iid_params=None, max_sampling_batch_size=10000, sample_with=None, show_progress_bars=True)` ¶

`sample_batched(sample_shape, x, predictor='euler_maruyama', corrector=None, predictor_params=None, corrector_params=None, steps=500, ts=None, max_sampling_batch_size=10000, show_progress_bars=True)` ¶

`sample_via_ode(sample_shape=torch.Size(), **kwargs)` ¶

`to(device)` ¶

`VIPosterior` ¶

`q` `property` `writable` ¶

`vi_method` `property` `writable` ¶

`deepcopy(memo=None)` ¶

`getstate()` ¶

`init(potential_fn, prior=None, q='maf', theta_transform=None, vi_method='rKL', device='cpu', x_shape=None, parameters=None, modules=None)` ¶

`setstate(state_dict)` ¶

`evaluate(quality_control_metric='psis', N=int(50000.0))` ¶

`log_prob(theta, x=None, track_gradients=False)` ¶

`map(x=None, num_iter=1000, num_to_optimize=100, learning_rate=0.01, init_method='proposal', num_init_samples=10000, save_best_every=10, show_progress_bars=False, force_update=False)` ¶

`sample(sample_shape=torch.Size(), x=None, **kwargs)` ¶

`set_q(q, parameters=None, modules=None)` ¶

`set_vi_method(method)` ¶

`to(device)` ¶