Parameter Balancing - Documentation

Introduction

Welcome to the parameter balancing documentation.
In the Overview you will find information on what parameter balancing is and how it can be used.
For a quick start with our Online Balancing, please refer to our Getting Started Section.
You can find the underlying Python3 source code on our parameter balancing Github page. There, you will also find information on how to embed parameter balancing in your programming workflows or how to use it via the Linux commandline. This information is also available in our Software Section.
This page also provides detailed information on the biochemical quantities employed in parameter balancing and a list of Frequently Asked Questions on parameter balancing.
Finally, our parameter balancing User Manual summarises the method and its applications.

Overview

What is parameter balancing? Parameter balancing is a way to determine consistent parameter sets for kinetic models of metabolism. Inserting experimentally measured values directly into a model will probably yield incomplete or inconsistent parameter sets, violating the thermodynamic Haldane relationships. Balanced parameter sets avoid this problem. They are computed based on kinetic constants and other data collected from experiments or the literature, but also based on known constraints between biochemical quantities and on assumptions about typical ranges, represented by prior values and bounds.

How can I run parameter balancing? After preparing your model and data files, you can run parameter balancing interactively on this website. The Getting Started Section will give a good quick start on that.
If you prefer working in the commandline, or if you would like to include parameter balancing in your programs and are interested in the source code, the parameter balancing Github page and our Software Section are of interest to you.

Which parameters of a metabolic model can be balanced? In general, parameter balancing concerns the kinetic and thermodynamic constants in kinetic metabolic models. It can also cover metabolite concentrations, chemical potentials, and reaction Gibbs free energies (or, equivalently, "reaction affinities" or "driving forces"). Metabolic fluxes cannot be balanced, but they can be included in the analysis (this is described below). There are different typical application cases:

Kinetic constants, where equilibrium constants are fixed and given

This can be done separately for each individual reaction. If a network is large and equilibrium constants can be predefined (for instance, by parameter balancing), we suggest splitting the network into single reactions and running parameter balancing separately for each reaction.

Kinetic constants and equilibrium constants in a network
Equilibrium constants and concentrations in a network
Equilibrium constants, kinetic constants, and concentrations in a network

What input data are needed? Parameter balancing employs SBML (Systems Biology Markup Language) files or SBtab files for model structures and SBtab table files for data and configuration files. Parameter balancing imports a model (SBML, obligatory) and a data table (SBtab, optional) with experimental data values. Furthermore, tables with information on the prior distributions and balancing options are possible. Parameter balancing produces tables with balanced parameters (SBtab) and a model with rate laws and balanced parameters included (SBML). Please prepare your SBtab files as described below. The validity of these files can be checked on the SBtab website. The quantities described in your data table will be linked to elements of the SBML model, via the entries in the columns !SBML:reaction:id and !SBML:species:id. The IDs in these columns must match the IDs chosen in the SBML file.

General information Parameter balancing has been developed at Humboldt Universität Berlin, Charité - Universitätsmedizin Berlin, and at INRA (Institut National de la Recherche Agronomique) Jouy-en-Josas. Parameter balancing is applied as part of larger workflows in Stanford et al. (Plos One) and Noor et al. (Plos Computational Biology).
Additional information on Systems Biology applications can be found in "Systems Biology: A textbook" by Edda Klipp, Wolfram Liebermeister, Christoph Wierling, and Axel Kowald.

Citing the parameter balancing method The concept of parameter balancing is described in The Journal of Physical Chemistry B. Please cite this publication if you use parameter balancing for your research.

Software

Parameter balancing is a tool for metabolic modelling in systems biology. It is implemented in Python3 and its code follows the PEP8 guidelines. These are the possible ways of employing parameter balancing for your project:

The online version The tool can be employed via www.parameterbalancing.net. All required knowledge can be found here on the webpage.
Parameter balancing as Pypi package and commandline tool
To install parameter balancing as a Python3 package, first of all you need Python3. Next, you will need the pip3 installer. You can find information on how to do this here: https://pip.pypa.io/en/stable/installing/. Afterwards, install parameter balancing by typing in your command line:
```
sudo pip3 install pbalancing
```
This will also install libsbml and tablib on your computer if these libraries are missing. You can now employ parameter balancing as a Python3 package by, e.g., writing a script such as
```
from pbalancing import parameter_balancing_core
parameter_balancing_core.parameter_balancing_wrapper('model.xml')
```
In this example case, 'model.xml' is the file name of an SBML model. Further optional arguments are an SBtab parameter file, an SBtab prior distribution file, and an SBtab configuration file (example files can be found in the Download Section).

To run parameter balancing as a commandline tool, the package needs to be installed as explained above. Then, it can be executed in the commandline as follows:
```
 python3 -m pbalancing.parameter_balancing_core model.xml
```
where model.xml corresponds to the path of your SBML model. It is also possible to provide further input files, such as an SBtab parameter file (.tsv), an SBtab prior information file (.tsv), and an SBtab options file (.tsv) for the configuration of parameter balancing. Providing complete file information would look like this:
```
 python3 -m pbalancing.parameter_balancing_core model.xml --sbtab_data data_file.tsv --sbtab_prior prior_file.tsv --sbtab_options options_file.tsv
```
You can create a log file by setting the flag -l, use pseudo values to account for a lack of data by setting the flag -p, and watch program outputs on your commandline by setting the flag -v. Information on the SBtab format can be found on www.sbtab.net, more information on the mentioned file types can be found in the parameter balancing manual, and example files can be found at the bottom of this page. If you do not want to install the pip package, you can still use the commandline modules from the standalone directory on Github. The usage works as a standard call of a Python module:
```
python3 cl_balancing.py model.xml
```
Here, too, you can provide the optional files as explained above. You will need to install several Python packages, though. A list of these packages can be found in the requirements.txt at the root of the Github repository.
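When many models have to be balanced, the commandline calls above can be assembled from a script. The following is a minimal sketch that only builds the argument list from the module name and flags documented in this section; the helper name build_balancing_command and its keyword arguments are hypothetical and not part of the pbalancing package.
```python
import shlex


def build_balancing_command(model, data=None, prior=None, options=None,
                            log=False, pseudo=False, verbose=False):
    """Assemble the documented commandline call as a list of arguments.

    Hypothetical helper: only the module name and the flags come from
    the documentation; everything else is illustrative.
    """
    cmd = ["python3", "-m", "pbalancing.parameter_balancing_core", model]
    if data:
        cmd += ["--sbtab_data", data]
    if prior:
        cmd += ["--sbtab_prior", prior]
    if options:
        cmd += ["--sbtab_options", options]
    if log:
        cmd.append("-l")       # create a log file
    if pseudo:
        cmd.append("-p")       # use pseudo values
    if verbose:
        cmd.append("-v")       # show program outputs
    return cmd


cmd = build_balancing_command("model.xml", data="data_file.tsv",
                              pseudo=True, verbose=True)
print(shlex.join(cmd))

# To actually run the call (requires an installed pbalancing package):
# import subprocess
# subprocess.run(cmd, check=True)
```
Building a list instead of a single string avoids quoting problems when file paths contain spaces.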
Python code for parameter balancing Python3 source code for parameter balancing can be freely downloaded. Please see the instructions that come with the code.
Web2py application The web2py application for parameter balancing can be freely downloaded from the Github repository. Please see the instructions that come with the code.
Matlab code for parameter balancing Matlab code for parameter balancing is included in the Metabolic Networks Toolbox. For more information, install the MNT toolbox and type 'help mnt_parameter_balancing'.

Table of biochemical quantities

At the beginning of parameter balancing, a metabolic network structure is read from an SBML file. It is assumed that all kinetic rate laws will be substituted by modular rate laws and the necessary kinetic constants and some dynamic quantities (referring to a specific metabolic state of the system) are estimated from collected data. The relevant quantities are listed below.

The table is too large to be sensibly displayed on mobile. Please review on desktop or download the table in the Download section.

Quantity Type	Symbol	Unit	BiologicalElement	MathematicalType	PhysicalType	Dependence	PriorMedian	PriorStd	PriorGeometricStd	LowerBound	UpperBound	DataStd	DataGeometricStd	SBML element	UseAsPriorInformation	MatrixInfo
Standard chemical potential	μ⁰	kJ/mol	Species	Additive	Thermodynamic	Basic	0	500		-500	500	10		Global parameter	1	[I_species, 0, 0, 0, 0, 0, 0, 0]
Catalytic rate constant geometric mean	k^V	1/s	Reaction	Multiplicative	Kinetic	Basic	10		100	0.00000001	10000	10	1.2	Local parameter	1	[0, I_reaction, 0, 0, 0, 0, 0, 0]
Michaelis constant	k^M	mM	Reaction/Species	Multiplicative	Kinetic	Basic	0.1		10	0.0000001	1000	1	1.2	Local parameter	1	[0, 0, I_KM, 0, 0, 0, 0, 0]
Activation constant	k^A	mM	Reaction/Species	Multiplicative	Kinetic	Basic	0.1		10	0.0001	100	1	1.2	Local parameter	1	[0, 0, 0, I_KA, 0, 0, 0, 0]
Inhibitory constant	k^I	mM	Reaction/Species	Multiplicative	Kinetic	Basic	0.1		10	0.0001	100	1	1.2	Local parameter	1	[0, 0, 0, 0, I_KI, 0, 0, 0]
Concentration	c	mM	Species	Multiplicative	Dynamic	Basic	0.1		10	0.0000001	1000	1	1.2	Species (conc.)	1	[0, 0, 0, 0, 0, I_species, 0, 0]
Concentration of enzyme	u	mM	Reaction	Multiplicative	Dynamic	Basic	0.001		100	0.0000001	0.5	0.05	1.2	Local parameter	1	[0, 0, 0, 0, 0, 0, I_reaction, 0]
pH	pH	dimensionless	None	Additive	Dynamic	Basic	7	1		0	14	1		Global parameter	1	[0, 0, 0, 0, 0, 0, 0, 1]
Standard Gibbs energy of reaction	dmuO	kJ/mol	Reaction	Additive	Thermodynamic	Derived	0	500		-1000	1000	10		Global parameter	0	[Nt, 0, 0, 0, 0, 0, 0, 0]
Equilibrium constant	k^eq	dimensionless	Reaction	Multiplicative	Thermodynamic	Derived	1		100	0.00000000001	10000000000	10	1.2	Local parameter	1	[[-1/RT * Nt], 0, 0, 0, 0, 0, 0, 0]
Substrate catalytic rate constant	k^cat+	1/s	Reaction	Multiplicative	Kinetic	Derived	10		100	0.01	1000000000	10	1.2	Local parameter	1	[[-0.5/RT * Nt], I_reaction, [-0.5 * Nkm], 0, 0, 0, 0, 0]
Product catalytic rate constant	k^cat-	1/s	Reaction	Multiplicative	Kinetic	Derived	10		100	0.00000000001	1000000000	10	1.2	Local parameter	1	[[0.5/RT * Nt], I_reaction, [0.5 * Nkm], 0, 0, 0, 0, 0]
Chemical potential	μ	kJ/mol	Species	Additive	Dynamic	Derived	0	500		-500	500	10			0	[I_species, 0, 0, 0, 0, [RT * I_species], 0, 0]
Reaction affinity	A	kJ/mol	Reaction	Additive	Dynamic	Derived	0	500		-100	100	10			0	[[-1 * Nt], 0, 0, 0, 0, [-RT * Nt], 0, 0]
Forward maximal velocity	v^max+	mM/s	Reaction	Multiplicative	Dynamic	Derived	0.01		100	0.000000001	10000000	0.1	2	Local parameter	0	[[-0.5/RT * Nt], I_reaction, [-0.5 * Nkm], 0, 0, 0, I_reaction, 0]
Reverse maximal velocity	v^max-	mM/s	Reaction	Multiplicative	Dynamic	Derived	0.01		100	0.000000001	10000000	0.1	2	Local parameter	0	[[0.5/RT * Nt], I_reaction, [0.5 * Nkm], 0, 0, 0, I_reaction, 0]
Forward mass action term	thetaf	1/s	Reaction	Multiplicative	Dynamic	Derived	1		1000	0.000000001	100000000	1	2		0	[[-1/(2RT) h * Nt], I_reaction, - 1/2 * h * abs(Nkm), 0, 0, h * Nft, 0, 0]
Reverse mass action term	thetar	1/s	Reaction	Multiplicative	Dynamic	Derived	1		1000	0.000000001	100000000	1	2		0	[[ 1/(2RT) h * Nt], I_reaction, - 1/2 * h * abs(Nkm), 0, 0, h * Nrt, 0, 0]
Forward enzyme mass action term	tauf	mM/s	Reaction	Multiplicative	Dynamic	Derived	1		1000	0.000000001	100000000	1	2		0	[[-1/(2RT) h * Nt], I_reaction, - 1/2 * h * abs(Nkm), 0, 0, h * Nft, I_reaction, 0]
Reverse enzyme mass action term	taur	mM/s	Reaction	Multiplicative	Dynamic	Derived	1		1000	0.000000001	100000000	1	2		0	[[ 1/(2RT) h * Nt], I_reaction, - 1/2 * h * abs(Nkm), 0, 0, h * Nrt, I_reaction, 0]
Michaelis constant product	KMprod	mM	Reaction	Multiplicative	Kinetic	Derived	1	1000		0.001	1000	1	2	Local parameter	0	[0, 0, Nkm, 0, 0, 0, 0, 0]
Catalytic constant ratio	Kcatratio	dimensionless	Reaction	Multiplicative	Kinetic	Derived	1		10	0.00000000001	10000000000	1	2	Local parameter	0	[[-1/RT * Nt], I_reaction, [-1 * Nkm], 0, 0, 0, 0, 0]
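
The MatrixInfo column describes how each quantity depends linearly on the basic quantities (on logarithmic scale for multiplicative quantities). As a minimal sketch, the rows for k^eq, k^cat+ and k^cat- can be written out for a hypothetical reaction A <-> B and used to check that the derived values satisfy the Haldane relationship. All numbers, the temperature of 298.15 K and the variable names are illustrative assumptions, and only the columns for μ⁰, k^V and k^M are kept.
```python
import numpy as np

RT = 8.314462618e-3 * 298.15   # kJ/mol, assuming T = 298.15 K

# Hypothetical reaction A <-> B
Nt = np.array([[-1.0, 1.0]])    # transposed stoichiometric matrix (reactions x species)
Nkm = np.array([[-1.0, 1.0]])   # stoichiometric coefficients aligned with the KM values

# Basic quantities (illustrative values only)
mu0 = np.array([-10.0, -15.0])  # standard chemical potentials, kJ/mol
kV = np.array([10.0])           # catalytic rate constant geometric mean, 1/s
kM = np.array([0.1, 0.5])       # Michaelis constants, mM
q_basic = np.concatenate([mu0, np.log(kV), np.log(kM)])

# Rows built from the MatrixInfo entries; column blocks: mu0 | kV | kM
I_reaction = np.eye(1)
Q_keq = np.hstack([-1.0 / RT * Nt, np.zeros((1, 1)), np.zeros((1, 2))])
Q_kcat_f = np.hstack([-0.5 / RT * Nt, I_reaction, -0.5 * Nkm])
Q_kcat_r = np.hstack([0.5 / RT * Nt, I_reaction, 0.5 * Nkm])

keq = np.exp(Q_keq @ q_basic)
kcat_f = np.exp(Q_kcat_f @ q_basic)
kcat_r = np.exp(Q_kcat_r @ q_basic)

# Haldane relationship: keq = (kcat+ / kcat-) * prod_i kM_i ** n_i
haldane = kcat_f / kcat_r * np.prod(kM ** Nkm, axis=1)
print(keq, haldane)
```
Because k^cat+ and k^cat- are computed from the same basic quantities, the two printed values agree for any choice of μ⁰, k^V and k^M.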

Remarks

Transformed thermodynamic quantities Note that the thermodynamic quantities refer to biochemical reactants (e.g., ATP) rather than chemical species (e.g., ATP^4-). Therefore, they represent transformed quantities. In Alberty's exact notation, they would be written as K' (for k^eq), μ' (for μ), and μ'⁰ (for μ⁰).
Additive and multiplicative quantities In parameter balancing, all quantities representing energies (in kJ/mol) are treated in their original scale, while all other quantities are converted to logarithmic scale (column "MathematicalType"). Since the latter quantities are described by log-normal (instead of normal) distributions, we need to distinguish between their mean and median values. For more information on this topic, see the review Biochemical thermodynamics by R. Alberty.
Basic quantities and derived quantities We further distinguish between basic and derived quantities (column "Dependence"). The difference is that the basic quantities can be freely chosen (e.g., as a result of an estimation), while the derived quantities depend on the basic quantities and are computed from them. In parameter balancing, we define typical ranges for the basic quantities by prior distributions and for the derived quantities by pseudo values.
Parameters in SBML When inserted into the SBML model, most quantities are inserted into kinetic rate laws as local parameters. Exceptions are concentrations (initial concentration attribute in species), standard chemical potentials (global parameters), and reaction affinities and chemical potentials, which could just be computed from other elements.
Importance of reaction orientation Some quantities depend on the specific definition of the reaction sum formula (nominal direction, appearance of small molecules like water); in particular for equilibrium constants, catalytic rate constants, and maximal velocities, make sure that the definitions match between model and data set.
Chemical potentials Chemical potentials and standard chemical potentials are estimated during parameter balancing (e.g., based on equilibrium constants and metabolite concentrations in the data file), but they cannot be provided directly as data. The purpose of this restriction is to avoid consistency problems due to different scaling conventions (e.g., the choice of offset values in Gibbs free energies of formation).

Frequently asked questions (FAQ)

Where is the parameter balancing method described? Parameter balancing is described in Lubitz et al. (2010). If you use parameter balancing in your work, please refer to this article. A modelling workflow based on parameter balancing is described in Stanford et al. (2013).
Clearing the session The variables of the browser session can be reset by clicking the 'Clear Session' button at the bottom of the online balancing page. This can be handy if, e.g., large files have caused proxy errors and the tool subsequently behaves erratically.
How does parameter balancing work mathematically? Parameter balancing employs Bayesian estimation to determine a consistent set of all model parameters. To use it efficiently, it is good to know about some of its details. For technical reasons, all quantities are internally converted to natural scaling. This means that for energy quantities (in kJ/mol), we keep the original values while for all other quantities, we take the natural logarithms. Furthermore, we distinguish between basic quantities and derived quantities (which are uniquely determined by the basic quantities). See the overview of all quantities shown above. During balancing, we integrate information from data (values and standard errors), prior distributions (typical values and spread for basic quantities), and pseudo values (typical values and spread for derived quantities). All these values and spreads are represented by normal distributions (priors, data with standard errors, pseudo values, and posteriors) for the naturally scaled quantities. When converting back to non-logarithmic values, we obtain log-normal distributions, which makes it crucial to distinguish between median and mean values. Eventually, the median values (which are more realistic and guaranteed to satisfy the relevant constraints) are inserted into the model.
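The estimation step described above can be illustrated with a standard linear-Gaussian Bayesian update. The following is a minimal sketch for a hypothetical reaction A <-> B, not the implementation used in the package: pseudo values and bounds are left out, the data values are invented, the temperature is assumed to be 298.15 K, and the log-scale spread of a multiplicative quantity is assumed to be the natural logarithm of its geometric standard deviation.
```python
import numpy as np

RT = 8.314462618e-3 * 298.15   # kJ/mol, assuming T = 298.15 K

# Basic vector q = [mu0_A, mu0_B, ln kV, ln kM_A, ln kM_B]
Nt = np.array([-1.0, 1.0])      # stoichiometry of the hypothetical reaction A <-> B
Nkm = np.array([-1.0, 1.0])

# Priors on natural scale (medians and spreads as in the table of quantities)
q_prior = np.array([0.0, 0.0, np.log(10.0), np.log(0.1), np.log(0.1)])
prior_std = np.array([500.0, 500.0, np.log(100.0), np.log(10.0), np.log(10.0)])
C_prior_inv = np.diag(1.0 / prior_std ** 2)

# Rows map the basic quantities to the measured ones: ln k^eq, ln k^cat+, ln kM_A
Q = np.array([
    np.concatenate([-1.0 / RT * Nt, [0.0], [0.0, 0.0]]),
    np.concatenate([-0.5 / RT * Nt, [1.0], -0.5 * Nkm]),
    [0.0, 0.0, 0.0, 1.0, 0.0],
])

# Invented data values with a geometric standard deviation of 1.2
x = np.log(np.array([7.5, 50.0, 0.2]))
C_data_inv = np.eye(3) / np.log(1.2) ** 2

# Posterior of the basic quantities (normal distribution on natural scale)
C_post = np.linalg.inv(C_prior_inv + Q.T @ C_data_inv @ Q)
q_post = C_post @ (C_prior_inv @ q_prior + Q.T @ C_data_inv @ x)

# Back-transformation: exp of the natural-scale posterior mean gives the median
x_post_median = np.exp(Q @ q_post)
print(x_post_median)
```
Since the priors are much wider than the data uncertainties in this example, the posterior medians of the measured quantities stay close to the data values, while the unmeasured kM_B remains close to its prior.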
Which kinetic rate laws are assumed in parameter balancing? Parameter balancing is based on modular rate laws, a generalised version of the convenience kinetics. The modular rate laws include reversible mass-action and reversible Michaelis-Menten rate laws as special cases. Modular rate laws are also supported by SBMLsqueezer, which allows you to directly insert rate laws into SBML models. In parameter balancing, rate laws and rate constants can be directly inserted into your model at the end of the workflow. Note that all rate laws previously present in your model will be removed.
What files do I need to prepare? A valid SBML file and a corresponding data file, provided in the SBtab format.
Where can I find example files? A number of example files (SBML models and SBtab data tables for parameter balancing) can be found here.
What is the SBtab format? SBtab tables can express various kinds of data. For the parameter balancing you need an SBtab file of the type "Quantity". You can also edit the prior distribution for the different parameter types by providing an SBtab file of the type "QuantityInfo". Finally, you can configure several options for the parameter balancing process by providing an SBtab file of the type "Config". Examples for each of these SBtab files are on the corresponding pages of the workflow and in the Downloads section. The current SBtab format is explained here. SBtab files are provided in the .tsv-format (tab separated values).
What does an SBtab file look like? When writing SBtab files, please observe the following rules:
1. The SBtab file begins with a specification line: "!!SBtab TableType='QuantityType' Level='1' Version='0.1'". Please do not alter this specification row; altering it will lead to problems.
2. The next row holds the headers of the different columns. Each header has to start with an exclamation mark to be recognised as a header name.
3. There are several mandatory column headers: !QuantityType, !SBML:reaction:id, !SBML:species:id, !Mean, !Std, !Unit
4. Please consider the following rules concerning the columns:
  - !QuantityType can only hold the specific names of parameter types. The QuantityTypes.tsv file in the Download area lists these names. Watch the spelling!
  - !SBML:reaction:id and !SBML:species:id hold the identifiers of the corresponding reaction and species. They HAVE TO be identical to the identifiers in the SBML file that is used, including upper and lower case.
  - !Mean: this column HAS TO hold a numeric value. If you do not have one, you do not need this row.
  - !Std: the standard deviation of the mean value. This can be left blank, if you have no std.
  - !Unit: please enter the unit of the mean value here.
  - Please note that different parameter types need either the !SBML:reaction:id (e.g. equilibrium constant), or the !SBML:species:id (e.g. concentration), or even both (e.g. Michaelis constant). If this is not entered correctly, the value will not be taken into account. Whether you need Reaction, Species, or both you can see in the default prior table.
  - To support consistent style, please stick to the presented order of the columns.
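The rules above can be followed programmatically when a data table is generated from other sources. The following is a minimal sketch that writes a tab-separated table with the mandatory columns; the function name, the reaction and species IDs, the values and the shortened specification line are hypothetical. Please copy the specification line from rule 1 or from an official example file, and check the quantity type names against the QuantityTypes.tsv file.
```python
import csv
import io

COLUMNS = ["!QuantityType", "!SBML:reaction:id", "!SBML:species:id",
           "!Mean", "!Std", "!Unit"]


def write_quantity_table(rows, spec_line, handle):
    """Write rows (dicts keyed by column header) as a tab-separated table."""
    handle.write(spec_line + "\n")
    writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
    writer.writerow(COLUMNS)
    for row in rows:
        if row.get("!Mean") in (None, ""):
            continue  # rows without a numeric value are not needed
        writer.writerow([row.get(column, "") for column in COLUMNS])


# Hypothetical entries; IDs must match those in the SBML model
rows = [
    {"!QuantityType": "Equilibrium constant", "!SBML:reaction:id": "R1",
     "!Mean": 7.5, "!Std": 1.0, "!Unit": "dimensionless"},
    {"!QuantityType": "Concentration", "!SBML:species:id": "S1",
     "!Mean": 0.2, "!Unit": "mM"},
    {"!QuantityType": "Michaelis constant", "!SBML:reaction:id": "R1",
     "!SBML:species:id": "S1", "!Mean": 0.1, "!Std": 0.05, "!Unit": "mM"},
    {"!QuantityType": "Concentration", "!SBML:species:id": "S2",
     "!Mean": None, "!Unit": "mM"},
]

buffer = io.StringIO()
write_quantity_table(rows, "!!SBtab TableType='Quantity'", buffer)  # placeholder line
print(buffer.getvalue())
```
To write a file instead of a string buffer, pass an open file handle (for example, open('data_file.tsv', 'w', newline='')) as the last argument.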
Where can I find kinetic and other data for parameter balancing? Parameter balancing relies on prior distributions for the different parameter types and on a collection of kinetic data provided by the user. Typical input data for estimating kinetic parameters comprise catalytic constants (kcat values), Michaelis-Menten constants (KM values), equilibrium constants, standard reaction Gibbs free energies, and Gibbs free energies of formation. Typical input data for estimating metabolic states also comprise metabolite concentrations. Data obtained from experiments can be found in the literature and in web resources:
- A large collection of kinetic data is provided by the BRENDA Enzyme Database.
- Thermodynamic data for many reactions can be obtained from the website eQuilibrator. This comprises calculated equilibrium constants and standard reaction Gibbs free energies for different values of pH and ionic strength. We provide a collection of these data for parameter balancing here.
What physical units are used? The units are predefined in the prior table and cannot be changed, unless you provide your own customised prior table.
Can I also use flux data? Metabolic fluxes do not directly fit into the dependence scheme that parameter balancing uses internally to link different quantities. Therefore, fluxes cannot be used as input data, nor can they be predicted directly. However, they can be used in an indirect way, as described in Stanford et al. (2013). The idea is as follows: In parameter balancing including metabolite levels and reaction Gibbs free energies, known flux directions can be used to define the signs of all reaction Gibbs free energies. The resulting rate laws and metabolite levels will be consistent with the predefined flux directions. The kinetic model, parametrised in this way, and with the balanced metabolite levels, will yield reaction rates with the same signs as the predefined fluxes. By rescaling the Vmax values (i.e., scaling the catalytic constants, enzyme levels, or both), reaction rates and fluxes can be matched. The resulting model will correspond to the predefined flux distribution by construction. Note that, in order for this to work, the predefined fluxes must be thermodynamically feasible (i.e., loop-free and realisable for the (potentially predefined) external metabolite levels).
How can I define or modify the priors on model parameters? Experimental data alone will usually not suffice to determine all model parameters. To determine underdetermined parameters, and to keep parameters in realistic ranges, parameter balancing uses prior distributions and constraints for each type of parameter. These priors and constraints are defined in a data table, which can be customised by the user. The table is described here.
What known caveats exist in parameter balancing? Parameter balancing makes specific assumptions about the rate laws used. In some cases, this can lead to problems, or parameter balancing may not be suitable for your modelling. Here we list points that can typically lead to problems:
- Large models Large networks lead to large parameter sets, which increase the numerical effort of parameter balancing. One possibility to avoid this problem is to use fixed, precalculated equilibrium constants. Then, the kinetic constants of each reaction can be balanced separately, which reduces the effort. In our code for parameter balancing, there is a restriction on model size to avoid numerical problems.
- Biomass reaction or polymerisation reactions Many metabolic models (especially models used in flux balance analysis) contain a "biomass" reaction that involves a large number of compounds with widely varying stoichiometric coefficients. Modular rate laws, as assumed in parameter balancing, use the stoichiometric coefficients as exponents in the formula. For biomass reactions or polymerisation reactions, this is not a very realistic assumption. Furthermore, these reactions usually do not have to be thermodynamically consistent. For both reasons, it is advisable to discard the automatically proposed kinetics and to insert more realistic kinetics instead (see, for instance, the rate laws proposed in Hofmeyr et al. (2013)).
- Very large or small parameter values Very large or small parameter values can lead to unrealistic models and numerical problems. Extreme values should be avoided by using proper bounds. In any case, we suggest looking at all balanced values and checking whether they are in realistic ranges.
- Large uncertainties and strongly shifted mean values Each balanced parameter value comes with an uncertainty. The uncertainties are described by normal distributions for the logarithmic parameter values, so median and mean value on logarithmic scale will be identical. For the non-logarithmic values, we obtain a log-normal distribution, and median and mean value will differ. If the uncertainty is large (which can easily happen if a value is not constrained by any data values), the mean and median can become very different, and the mean value can become very high. This can be avoided by reducing the uncertainty range - by providing more data that constrain the parameter value, by using narrower priors or constraints, or by using "pseudo" values.
- What if the model cannot be simulated? Problems in simulating the model (e.g., using COPASI) may be caused by unrealistically high or low parameter values. If you notice such parameter values in your model and would like to avoid them, you may use tighter priors or pseudo values to exclude extreme parameter values.
- How are enzymes handled in the balancing? In SBML, enzymes are sometimes treated as SBML species and sometimes as SBML parameters; this is mainly up to the modeller. In parameter balancing, enzymes need to be parameters. If they are provided as species, they cannot correctly be assigned as reaction modifiers and thus are ignored.
- Annotation of modifiers as activators or inhibitors To recognise a reaction modifier, listed in an SBML model, as an activator or inhibitor, the parameter balancing code relies on SBO terms in the model. If these terms are missing (which is the case in many existing models), the allosteric regulator will be ignored and no activation or inhibition constant will be estimated.
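The caveat on large uncertainties and strongly shifted mean values can be made concrete with the standard formula for the log-normal distribution: if the logarithm of a parameter has standard deviation σ, the mean equals the median multiplied by exp(σ²/2). The following is a minimal sketch that assumes σ is the natural logarithm of the geometric standard deviation; the function name is hypothetical.
```python
import math


def lognormal_mean(median, geometric_std):
    """Mean of a log-normal distribution with the given median and geometric std."""
    sigma = math.log(geometric_std)   # standard deviation on natural-log scale
    return median * math.exp(0.5 * sigma ** 2)


# The same median of 10 with increasingly large uncertainties
for geometric_std in (1.2, 10.0, 100.0):
    print(geometric_std, lognormal_mean(10.0, geometric_std))
```
For a small geometric standard deviation the mean stays close to the median, whereas for a geometric standard deviation of 100 it exceeds the median by several orders of magnitude.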
Can I sample data sets from the posterior distribution? The Matlab version of parameter balancing allows for sampling parameter sets. To generate a sample, a parameter set is randomly sampled from the (multi-variate Gaussian) posterior. If the sample violates constraints, it is replaced by a parameter vector that satisfies all constraints and is closest to the initially sampled vector, where "closeness" is defined by a quadratic norm based on the posterior covariance matrix. The resulting constrained samples satisfy all constraints and give an impression about the posterior uncertainties, but they do not strictly represent the posterior distribution (which is defined as the multi-variate unconstrained Gaussian posterior, constrained to the feasible region).
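The sampling procedure can be sketched in Python for a strongly simplified case. The sketch assumes a diagonal posterior covariance and simple lower and upper bounds; under these assumptions, the closest feasible point in the covariance-weighted quadratic norm is obtained by clipping each component to its bounds. For a full covariance matrix, a quadratic programme would have to be solved instead. This is an illustration only, not the Matlab implementation, and the posterior values are invented.
```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical posterior for ln kV and ln kM (natural-log scale, uncorrelated)
post_mean = np.log(np.array([10.0, 0.1]))
post_std = np.array([4.0, 4.0])

# Bounds for kV and kM as listed in the table of quantities
lower = np.log(np.array([1e-8, 1e-7]))
upper = np.log(np.array([1e4, 1e3]))

# Unconstrained samples from the Gaussian posterior
samples = rng.normal(post_mean, post_std, size=(1000, 2))

# Replace infeasible samples by the closest feasible point
# (component-wise clipping is exact only for a diagonal covariance)
constrained = np.clip(samples, lower, upper)

n_changed = int(np.any(constrained != samples, axis=1).sum())
print(n_changed, "of", len(samples), "samples were moved onto the bounds")
```
As noted above, samples obtained in this way accumulate on the boundary of the feasible region and therefore do not strictly represent the constrained posterior.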
What happens to reactions with many reactants? Reactions with many reactants (e.g., biomass-producing reactions) are not properly described by modular rate laws, and parameters estimated for such reactions should be treated with care. We recommend removing such reactions from the model before running parameter balancing.
What happens to reactions without any substrate or without any product? Reactions without any substrate or without any product are not properly described by modular rate laws, and parameters estimated for such reactions should be treated with care. We recommend removing such reactions from the model before running parameter balancing.
What happens to reactions with unusual stoichiometric coefficients? Reactions with high stoichiometric coefficients are not properly described by modular rate laws; in reactions with non-integer stoichiometric coefficients, it is likely that these coefficients do not properly describe molecularities. To avoid problems in such cases, our code allows only stoichiometric coefficients of 1 and 2. All other values (non-integer or values larger than 2) are internally replaced by values of 1. This rule is likely to yield realistic balanced parameter sets; however, when checking Haldane relationships with these parameters, please note that the molecularities in these Haldane relationships must represent the adjusted (and not the original) stoichiometric coefficients. In case of doubt, we recommend removing such reactions from the model before running parameter balancing.
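The replacement rule for stoichiometric coefficients can be written as a short function. This is a minimal sketch, not the code used in the package: the function name is hypothetical, and it assumes that the rule applies to the absolute value of a coefficient while its sign (substrate or product) is kept.
```python
def adjust_coefficient(coefficient):
    """Return the coefficient used as molecularity in the rate law.

    Coefficients with absolute value 1 or 2 are kept; all other non-zero
    values are replaced by 1, keeping the sign (an assumption of this sketch).
    """
    if coefficient == 0:
        return 0                      # species does not take part in the reaction
    if abs(coefficient) in (1, 2):
        return coefficient
    return -1 if coefficient < 0 else 1


original = [-1, 2, 0.5, -3, 6]
adjusted = [adjust_coefficient(c) for c in original]
print(adjusted)
```
When checking Haldane relationships, the adjusted values are the ones to use as exponents.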
What happens to irreversible reactions? Parameter balancing is designed to assume thermodynamic correctness, which implies reversible rate laws. Some reactions (e.g., macromolecule synthesis or biomass-producing reactions) are practically irreversible. In parameter balancing, these reactions will still be treated as reversible, but will obtain very large equilibrium constants. In some cases, this may lead to numerical problems. It is recommended to remove such reactions from the model before running parameter balancing. Likewise, the rate laws inserted in the SBML model will have a reversible form.
What is the purpose of pseudo values? In parameter balancing, some types of parameters are treated as "basic" (e.g., standard chemical potentials), while others are treated as "derived" (e.g., equilibrium constants). By considering independent marginal priors for all basic parameters, we obtain an uncorrelated prior distribution for the subset of basic parameters, and an ensuing distribution for the derived parameters. However, the variances of derived parameters in this distribution are still large. Therefore, the prior is modified by assuming "pseudo values" for derived parameters. The result is a correlated prior distribution for all (basic and derived) parameter types. In this prior distribution "with pseudo values", basic and derived parameters are treated on an equal footing, resulting in realistic variances for all parameters. Please note that pseudo values are not a simple replacement for missing data values: instead, a pseudo value and a data value for the same parameter will be used at the same time.
Enzymes as model species The parameter balancing tool assumes that the SBML model contains all reactions and metabolites, but not the catalysing enzymes. If enzymes appear explicitly in the model as species, they need to be tagged as enzymes by an SBO term. Otherwise, they will be treated as metabolites. In the parameter balancing results, this may lead to redundant (and contradictory) results, defining a "concentration" of the enzyme, and a (contradictory) "concentration of enzyme" of the enzyme-catalysed reaction.
How can I impose lower or upper bounds on data values? For each parameter type (e.g. substrate catalytic constant), lower and upper bounds are defined in the (modifiable) pb_prior file. For individual parameters (e.g., the substrate catalytic constant of a specific reaction in the model), lower and upper bounds can be defined in the data file. Specifically in the case of catalytic constants, lower bounds could be derived from measured flux and proteomics data, and an upper bound is given by the diffusion limit.
How are the data values preprocessed? When the data file is read, some checks are performed and changes are made in the following order. (i) If a zero value is given for a multiplicative quantity, this value is ignored. (ii) If a data value is outside the allowed bounds for this type of parameter, it is ignored. (iii) If a zero standard deviation is given for a value, this standard deviation is ignored. (iv) If a data value has no standard deviation, a default standard deviation (column DataStd from the prior file) is inserted. In the Matlab version, it is also possible to use default geometric standard deviations (column DataGeometricStd from the prior file) instead. (v) If several data values are given for the same parameter (e.g., several values for the same equilibrium constant), the arithmetic mean of these values is used as the final data value (both for additive and multiplicative quantities), and the arithmetic mean of the standard deviations is used as the final standard deviation. After this procedure, each model parameter has either no or one data value, and a data value is characterised by arithmetic mean and standard deviation. For multiplicative parameters, the arithmetic mean and standard deviation (for an assumed log-normal distribution) are then translated into the mean and standard deviation of the logarithmic parameter values (assumed to follow a normal distribution).
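The preprocessing steps can be summarised in a short function. This is a minimal sketch under stated assumptions, not the code used in the package: the function name and signature are hypothetical, a zero standard deviation is treated like a missing one, and the final conversion uses the standard log-normal moment relations σ² = ln(1 + s²/m²) and μ = ln(m) - σ²/2 for arithmetic mean m and standard deviation s.
```python
import math


def preprocess(values, stds, multiplicative, lower, upper, default_std):
    """Combine all data values given for one parameter into one mean and std."""
    kept = []
    for value, std in zip(values, stds):
        if multiplicative and value == 0:
            continue                     # (i) zero value of a multiplicative quantity
        if not lower <= value <= upper:
            continue                     # (ii) value outside the allowed bounds
        if std is None or std == 0:
            std = default_std            # (iii) and (iv) use the default DataStd
        kept.append((value, std))
    if not kept:
        return None                      # no usable data value for this parameter
    mean = sum(value for value, _ in kept) / len(kept)   # (v) arithmetic means
    std = sum(std for _, std in kept) / len(kept)
    if multiplicative:
        # Translate to mean and std of the logarithmic values
        log_var = math.log(1.0 + (std / mean) ** 2)
        return math.log(mean) - 0.5 * log_var, math.sqrt(log_var)
    return mean, std


# Hypothetical Michaelis constant data (mM) with bounds and DataStd from the table
km_result = preprocess([0.0, 0.2, 0.4, 5000.0], [0.1, 0.0, 0.1, 0.1],
                       multiplicative=True, lower=1e-7, upper=1000.0,
                       default_std=1.0)
# Hypothetical pH data
ph_result = preprocess([7.0, 7.4], [None, 0.2],
                       multiplicative=False, lower=0.0, upper=14.0,
                       default_std=1.0)
print(km_result, ph_result)
```
In the Michaelis constant example, the zero value and the value above the upper bound are dropped, and the zero standard deviation is replaced by the default before the remaining two entries are averaged.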
Is parameter balancing limited to metabolic models or could it be applied to other biological models? A specific problem in metabolic systems (which, in parameter balancing, is turned into an advantage) is the fact that kinetic constants are dependent through Wegscheider conditions (for equilibrium constants) and Haldane relationships (between equilibrium constants, turnover rates, and Michaelis-Menten constants); and, given the flux directions, equilibrium constants and metabolite concentrations are dependent as well. This is the reason why in parameter balancing those parameters (and also state variables such as metabolite concentrations) are not estimated or sampled independently, but using a linear dependence scheme in the background. Some parameters, e.g. the allosteric inhibition and activation constants, are independent of all other parameters and can be manually chosen by the modeller. The parameter balancing would not be necessary in this case, and they are only included in our software for convenience. Coming back to non-metabolic systems: even though the laws of thermodynamics apply to any biochemical network (including, for example, signalling pathways), thermodynamic constraints are usually not considered in these models: in particular, many reactions (e.g., phosphorylation by kinases) are described by irreversible kinetics, which renders Wegscheider conditions and Haldane relationships obsolete. As in the case of allosteric rate constants, applying parameter balancing is simply not necessary, and the different model parameters can be chosen at will, without having to care about (and without having the opportunity of exploiting) their interdependencies.
Aside from these theoretical considerations, there is also a practical answer. The formalism of Parameter Balancing applies to any sorts of model parameters and to any sorts of dependencies between them, as long as these dependencies are linear (either between the model parameters themselves, or between their logarithms). The user can customise our software to handle any parameters that satisfy this condition by editing the “Prior distribution” file. The necessary steps are: (i) Identify which new parameter types should be used, and whether they are “additive” (satisfying linear relationships) or “multiplicative” (satisfying linear relationships on logarithmic scale). (ii) Choose a subset of parameter types to be “independent”, and specify the dependencies of all other parameter types in the form of symbolic matrices (column “MatrixInfo” in the “Prior distribution” file). (iii) Modify the “Prior distribution” file by adding all the new information.
Is there a maximal model size? In order to avoid long calculation times, models are currently limited to a maximal size of 250 reactions. You can bypass this at your own risk by modifying the code.
Who can answer my other questions? Please refer to the Contact Page.