metacast.sensitivity_analyses.lhs_and_prcc
- Creation:
Author: Martin Grunnill Date: 2024-03-08
Description: Generate a Latin Hypercube Sample (LHS), run simulations and calculate Partial Rank Correlation Coefficients (PRCCs). For a description of the use of LHS and PRCCs in model sensitivity analyses, see:
Marino, S., Hogue, I. B., Ray, C. J., & Kirschner, D. E. (2008). A methodology for performing global uncertainty and sensitivity analysis in systems biology. In Journal of Theoretical Biology (Vol. 254, Issue 1, pp. 178–196). https://doi.org/10.1016/j.jtbi.2008.04.011
Notes
Serial processing of LHS can be slow, but parallel processing of LHS can take up a lot of computing resources.
Module Contents
Functions
_format_sample | Formats Latin Hypercube sample generated by scipy.stats.qmc.
calculate_prcc | Calculate Partial Correlation Coefficient PCC (default Rank 'Spearman' PRCC).
lhs_prcc | Generate a Latin Hypercube sample, run model with sample and calculate PRCC for sampled parameters.
run_samples_serially |
run_samples_in_parallel | Runs model simulations using parameters in parameters_df using parallel processing.
- metacast.sensitivity_analyses.lhs_and_prcc._format_sample(parameters_df, LH_samples, other_samples_to_repeat=None)
Formats Latin Hypercube sample generated by scipy.stats.qmc.
Scales LH_samples to the boundaries outlined in parameters_df.
- Parameters:
parameters_df (pd.DataFrame or dictionary) – DataFrame outlining the boundaries for each parameter. Must contain fields ‘Lower Bound’ and ‘Upper Bound’. Name of the parameters is assumed to be in the index. Alternatively, may be a dictionary that maps parameter names to percentage-point (inverse-CDF) functions.
LH_samples (numpy.array) – Output from scipy.stats.qmc.LatinHypercube.
other_samples_to_repeat (pandas.DataFrame) – Samples to be resampled and merged with LH samples.
- Returns:
samples_df (pandas.DataFrame) – Fully formatted samples.
parameters_sampled (list) – A list of the parameters being sampled.
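The scaling step described above can be sketched with public SciPy calls. This is a minimal illustration, not the code of _format_sample itself: the parameter names (beta, gamma), their bounds and the distributions in ppf_by_parameter are made up, and scipy.stats.qmc.scale is assumed to be an acceptable stand-in for whatever scaling the function performs internally.

```python
import numpy as np
import pandas as pd
from scipy import stats
from scipy.stats import qmc

# Hypothetical bounds table: parameter names in the index,
# with the 'Lower Bound' and 'Upper Bound' fields the docs require.
parameters_df = pd.DataFrame(
    {"Lower Bound": [0.1, 2.0], "Upper Bound": [0.5, 10.0]},
    index=["beta", "gamma"],
)

# Draw a Latin Hypercube sample on the unit hypercube (one column per parameter).
sampler = qmc.LatinHypercube(d=len(parameters_df))
LH_samples = sampler.random(n=8)

# Rescale each column from [0, 1) to its parameter's bounds (uniform case).
scaled = qmc.scale(
    LH_samples,
    parameters_df["Lower Bound"].to_numpy(),
    parameters_df["Upper Bound"].to_numpy(),
)
samples_df = pd.DataFrame(scaled, columns=parameters_df.index)
parameters_sampled = list(parameters_df.index)

# Dictionary alternative: map each parameter name to a percentage-point
# (inverse-CDF) function and apply it to the matching unit-interval column.
ppf_by_parameter = {
    "beta": stats.uniform(loc=0.1, scale=0.4).ppf,   # uniform on [0.1, 0.5]
    "gamma": stats.norm(loc=5.0, scale=1.0).ppf,     # illustrative normal
}
samples_from_ppf = pd.DataFrame(
    {
        name: ppf(LH_samples[:, i])
        for i, (name, ppf) in enumerate(ppf_by_parameter.items())
    }
)
```

For a uniform distribution the two routes agree: applying the uniform inverse CDF to a unit-interval column gives the same values as rescaling that column to the bounds.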
- metacast.sensitivity_analyses.lhs_and_prcc.calculate_prcc(results_and_sample_df, parameter, output, covariables, method='spearman')
Calculate Partial Correlation Coefficient PCC (default Rank ‘Spearman’ PRCC).
A wrapper for pingouin.partial_corr. Partial Rank Correlation Coefficients (PRCCs) can be used to evaluate the sensitivity of a model to a parameter[1].
- Parameters:
results_and_sample_df (pandas.DataFrame) – DataFrame of results and parameters for calculating PCC.
parameter (string) – Parameter for which PCC will be calculated.
output (string) – Model output for which PCC will be calculated.
covariables (list of strings) – Parameters whose effects will be discounted.
method (string, default 'spearman' (rank correlations)) – Form of PCC; see the documentation of pingouin.partial_corr.
- Returns:
Partial Correlation Coefficient of parameter and output.
- Return type:
pandas.DataFrame
References
- [1] Marino, S., Hogue, I. B., Ray, C. J., & Kirschner, D. E. (2008). A methodology for performing global uncertainty and sensitivity analysis in systems biology. In Journal of Theoretical Biology (Vol. 254, Issue 1, pp. 178–196). https://doi.org/10.1016/j.jtbi.2008.04.011
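To make the quantity concrete, the following sketch computes a partial rank correlation by hand: rank-transform the parameter, the output and the covariables, regress the covariable ranks out of both, and correlate the residuals. It is an illustration of the general technique using only NumPy and SciPy, not the wrapper's implementation; the helper name prcc_by_residuals and the toy data are hypothetical.

```python
import numpy as np
from scipy.stats import rankdata


def prcc_by_residuals(x, y, covariates):
    """Partial rank correlation of x and y, discounting the covariates.

    covariates is a list of 1-D arrays, one per covariable.
    """
    # Rank-transform the parameter, the output and every covariable.
    rx, ry = rankdata(x), rankdata(y)
    rz = np.column_stack([rankdata(c) for c in covariates])
    # Design matrix with an intercept column plus the covariable ranks.
    design = np.column_stack([np.ones(len(rx)), rz])
    # Remove the linear effect of the covariable ranks from both rank vectors.
    res_x = rx - design @ np.linalg.lstsq(design, rx, rcond=None)[0]
    res_y = ry - design @ np.linalg.lstsq(design, ry, rcond=None)[0]
    # The PRCC is the Pearson correlation of the two residual vectors.
    return np.corrcoef(res_x, res_y)[0, 1]


# Toy data: the output depends monotonically on a and not at all on b.
rng = np.random.default_rng(0)
a = rng.uniform(size=200)
b = rng.uniform(size=200)
output = np.exp(3 * a) + rng.normal(scale=0.5, size=200)

prcc_a = prcc_by_residuals(a, output, [b])  # strongly positive
prcc_b = prcc_by_residuals(b, output, [a])  # close to zero
```

With the same columns held in a DataFrame df, the comparable pingouin call would presumably be pingouin.partial_corr(data=df, x="a", y="output", covar=["b"], method="spearman"), which returns a DataFrame rather than a single number.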
- metacast.sensitivity_analyses.lhs_and_prcc.lhs_prcc(parameters_df, sample_size, model_run_method, client=None, lhs_obj=None, other_samples_to_repeat=None, **kwargs)
Generate a Latin Hypercube sample, run model with sample and calculate PRCC for sampled parameters.
Latin Hypercube Sampling with Partial Rank Correlation Coefficients (PRCCs) can be used to evaluate the sensitivity of a model to a parameter[1]. Note: currently only uniform distributions are supported.
- Parameters:
parameters_df (pandas.DataFrame or dictionary) – DataFrame outlining the boundaries for each parameter. Must contain fields ‘Lower Bound’ and ‘Upper Bound’. The name of the parameters is assumed to be in the index of the DataFrame. Alternatively, may be a dictionary that maps parameter names to percentage-point (inverse-CDF) functions.
sample_size (int) – Sample size of Latin Hypercube.
model_run_method (function) – Method of running model’s simulations. Must accept parameters as a single dictionary. Must output dictionary of input parameters and model results.
client (dask.distributed.Client, default None) – Dask client for running simulations in parallel. If not given, simulations are run serially with a tqdm progress bar.
lhs_obj (scipy.stats.qmc.LatinHypercube, optional) – Pre-initialised Latin Hypercube sample generator. If not provided, one is generated within this function.
other_samples_to_repeat (pandas.DataFrame) – Samples to be resampled and merged with LH samples.
kwargs – Keyword arguments to be passed to model_run_method.
- Returns:
results_df (pandas.DataFrame) – Results of simulations preceded by parameters used in simulations.
prccs (pandas.DataFrame) – PRCC summary of the model’s sensitivity to its parameters.
References
- [1] Marino, S., Hogue, I. B., Ray, C. J., & Kirschner, D. E. (2008). A methodology for performing global uncertainty and sensitivity analysis in systems biology. In Journal of Theoretical Biology (Vol. 254, Issue 1, pp. 178–196). https://doi.org/10.1016/j.jtbi.2008.04.011
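The contract for model_run_method (accept the parameters as a single dictionary, return a dictionary of input parameters and model results) can be illustrated with a toy model. Everything here is hypothetical: the model, its scale keyword and the parameter names are invented, and the loop is a hand-rolled stand-in for serial execution rather than metacast's own code.

```python
import pandas as pd


def model_run_method(parameters, scale=1.0):
    """Toy model: takes one dict of parameters, returns inputs plus results."""
    results = {"peak": scale * parameters["beta"] / parameters["gamma"]}
    # Inputs first, then results, so parameters precede outputs in the table.
    return {**parameters, **results}


# Hypothetical parameter samples: one row per simulation, one column per parameter.
samples_df = pd.DataFrame({"beta": [0.2, 0.4, 0.3], "gamma": [2.0, 4.0, 6.0]})

# Run one simulation per row, passing extra keyword arguments through to the model.
records = [
    model_run_method(row, scale=10.0)
    for row in samples_df.to_dict(orient="records")
]
results_df = pd.DataFrame.from_records(records)
```

Returning the inputs alongside the results keeps each output row tied to the sample that produced it, which is what a later PRCC calculation needs.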
- metacast.sensitivity_analyses.lhs_and_prcc.run_samples_serially(parameters_df, model_run_method, **kwargs)
- metacast.sensitivity_analyses.lhs_and_prcc.run_samples_in_parallel(parameters_df, model_run_method, client, **kwargs)
Runs model simulations using the parameters in parameters_df using parallel processing.
- Parameters:
parameters_df (pandas.DataFrame) – Parameter samples being run. Column fields are parameters.
model_run_method (function) – Method of running model’s simulations. Must accept parameters as a single dictionary in the first argument. Must output dictionary of input parameters and model results.
client (dask.distributed.Client) – Dask client for running simulations in parallel.
kwargs – Keyword arguments to pass to model_run_method.
- Return type:
If return_focused_results is True a pandas DataFrame of results is returned.
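The parallel pattern can be sketched without a Dask cluster. The example below deliberately swaps in concurrent.futures.ThreadPoolExecutor from the standard library as a stand-in for a dask.distributed.Client, so it shows the shape of the computation rather than this function's implementation; the toy model and its scale keyword are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor
from functools import partial

import pandas as pd


def model_run_method(parameters, scale=1.0):
    """Toy model: takes one dict of parameters, returns inputs plus results."""
    return {**parameters, "peak": scale * parameters["beta"] / parameters["gamma"]}


# Hypothetical parameter samples: column fields are parameters.
samples_df = pd.DataFrame({"beta": [0.2, 0.4, 0.3], "gamma": [2.0, 4.0, 6.0]})

# Bind the keyword arguments once so each task only needs its parameter dict.
run_one = partial(model_run_method, scale=10.0)

# Submit one task per sample row; pool.map returns results in input order.
with ThreadPoolExecutor(max_workers=2) as pool:
    records = list(pool.map(run_one, samples_df.to_dict(orient="records")))

results_df = pd.DataFrame.from_records(records)
```

With a real Dask client the analogous calls would be Client.map to submit the tasks and Client.gather to collect the results.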