
QRPEM – A New Standard of Accuracy, Precision, and Efficiency
in NLME Population PK/PD Methods

Pharsight® – A Certara Company

Bob Leary
Mike Dunlavey
Jason Chittenden
Brett Matzuka
Serge Guzy

June 3, 2011

Summary: A new accurate likelihood EM estimation method, QRPEM (Quasi-Random
Parametric Expectation Maximization), is under development for a near-future
release of Phoenix® NLME. The method is in the same general accurate
likelihood EM class as the recently introduced methods IMPEM in NONMEM 7,
MCPEM in S-ADAPT, and SAEM in MONOLIX, S-ADAPT, and NONMEM 7. The QRPEM
method is distinguished by its use of low discrepancy (also called
'quasi-random') Sobol sequences as the core sampling technique in the
expectation step, as opposed to the stochastic Monte Carlo sampling
techniques used in the other EM methods. The theoretical best-case accuracy
for QR sampling is an error that decays as N^-1, where N is the number of
samples. This represents an enormous advantage over the slower N^-1/2 error
decay rate characteristic of stochastic sampling. The problems typically
encountered in the population PK/PD NLME domain are characterized by
relatively low dimensionality and a high degree of smoothness in the function
being sampled. This is known to be the ideal case for application of QR
techniques and suggests that the best-case N^-1 behavior may in fact be
achievable.

Implementation of the basic QRPEM algorithm has been completed and initial
testing begun. Test results so far indicate conclusively that the theoretical
advantages of QR are being realized. For an extreme example, a difficult,
very nonlinear PD test problem with sparse data from the recent
pharmacometrics literature required 2.5 minutes and a sample size of N=500 to
obtain a high quality, accurate estimate with QRPEM. The stochastic sampling
method IMPEM in NONMEM 7, when run for the comparable length of time of five
minutes with 500 samples, produced a significantly less accurate estimate
when compared to the known values of the parameters used to simulate the
data. However, as the number of samples was increased with IMPEM, the results
steadily improved toward the QRPEM estimate. At a sample size of N=200,000
and 50.2 hours, the IMPEM method converged almost exactly to the QRPEM
result. Similarly, NONMEM IMPEM and QRPEM were compared on a test problem
used by the French National Institute of Health and Medical Research in 2005
to evaluate all available methods at that time, including precursors to the
current MCPEM and SAEM algorithms in NONMEM and MONOLIX. All ten of the
methods available in 2005 failed to accurately estimate a particularly
difficult parameter, with SAEM obtaining the best result with an overestimate
of 50%. When QRPEM and the current NONMEM IMPEM were both run on this problem
with a sample size of N=300, NONMEM IMPEM required approximately 5 hours and
produced an overestimate of 76% on the parameter in question, while QRPEM
obtained a very accurate estimate within 2.2% of the true parameter value
within an hour. Further extensive testing on the 150 test cases in the
MONOLIX test set that we have been using to evaluate the current release of
Phoenix NLME confirms that QRPEM is very reliable, faster, and much more
precise, accurate, and repeatable from different starts than either NONMEM
IMPEM or NONMEM SAEM.

Introduction

The "Holy Grail" of population PK/PD methodology is an NLME algorithm that
reliably computes maximum likelihood (ML) parameter estimates in reasonable
times on the types of models and data sets typically encountered in the
pharmacometrics community. ML estimates have many optimal statistical
properties and are usually considered the most desirable type of estimate, at
least in the frequentist (as opposed to Bayesian) approach to statistics,
which dominates pharmacometric analysis.

Unfortunately, the most widely used methods, starting with the introduction
of FO in the initial 1978 release of NONMEM and continuing with FOCE, FOCEI,
and LAPLACE introduced in later versions of NONMEM, SPLUS, and SAS, have
fallen considerably short of this mark. These methods are based on the
maximum likelihood approach but, due to difficulties in computing exact
marginal likelihoods, employ likelihood approximations of varying quality. In
the case of FO, the approximation is usually relatively poor and the overall
quality of an FO estimator can be correspondingly very low (sometimes
appallingly so). In general, the FOCE (particularly with interaction) and
LAPLACE estimation methods are typically better, but still do not achieve the
quality level of a true ML estimate. There is no way to know in advance how
much of an accuracy penalty will be incurred by using a particular
approximate likelihood method. However, it is known that, for example, high
degrees of model nonlinearity, large inter-individual variability in the
structural parameter population distribution, large residual error variances,
and sparse data in the form of relatively few observations per subject all
tend to magnify this penalty. These types of difficult conditions are often
encountered in practice.

In addition to the accuracy penalty, another major drawback to the
approximate likelihood methods is poor reliability. They all use formal
gradient-based likelihood optimization methods that are numerically delicate
and prone to failure.



During the past 5 years, new NLME (nonlinear mixed effects) population PK/PD
methods based on various versions of the stochastic EM (Expectation
Maximization) algorithm have become widely available. These have started to
gain traction in the pharmacometrics community as potentially superior
alternatives to the traditional approximate likelihood methods. One of the
two major types of the new accurate likelihood methods is generically called
MCPEM (Monte Carlo Parametric Expectation Maximization), which is implemented
in NONMEM 7 as IMPEM (importance sampling PEM) and in S-ADAPT and PDx-MCPEM
as MCPEM. All three of these are based on importance sampling Monte Carlo
implementations of the E-step and are fundamentally similar. The other type
is called SAEM (stochastic approximation EM), which uses a Markov chain Monte
Carlo implementation of the E-step. This is implemented in different but
again fundamentally similar versions in NONMEM 7, S-ADAPT, and MONOLIX under
the name SAEM.

MCPEM and SAEM address both the accuracy and reliability problems associated
with the classical likelihood methods. They do not use likelihood
approximations, and therefore avoid the inherent bias and inaccuracy issue.
They also do not use formal numerical optimization methods to maximize the
likelihood, so they avoid the catastrophic failure modes inherent in that
approach. Unlike the approximate likelihood methods, at least in principle
MCPEM and SAEM will converge to the true maximum likelihood estimate if
arbitrarily large computational effort is invested. Of course, any practical
algorithm must be terminated in reasonable wall clock times, so the actual
results may fall somewhat short of the true maximum likelihood estimate goal.
However, experience with MCPEM and SAEM shows they often produce better
estimates than the approximate likelihood methods in similar or even smaller
amounts of computational time, and also are far more numerically reliable and
stable.

Both MCPEM and SAEM work by imputing successively more plausible and likely
collections of sample values of the structural parameters of the models from
a conditional distribution of those parameters. The estimates of the fixed
effects (THETAs), random effect parameters (OMEGA), and residual error
parameters (SIGMA) are performed at each iteration by applying simple
algebraic statistical formulas that compute means and covariance matrices of
the imputed values. MCPEM and SAEM differ primarily in how the conditional
distributions are sampled. MCPEM typically collects several hundred (or more)
samples at a time from the conditional distribution for each subject using
importance sampling, and then updates the parameters of interest from the
means and covariances of these samples (a sketch of this update idea appears
below). SAEM uses Markov chain Monte Carlo techniques to sample from the
conditional distribution and updates much more frequently, often after just a
single sample from each subject. The SAEM parameter estimates are obtained as
running averages of the means and covariances of these small samples over
successive iterations, after an initial burn-in period.
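
The following minimal sketch (in Python; it is illustrative only, not taken
from any of the named packages, and all function and variable names are
hypothetical) shows the importance sampling E-step and moment-based update
for a single subject, assuming the random effects eta are modeled as
N(theta, Omega):

    import numpy as np

    def subject_e_step(theta, omega, log_lik, n_samples, rng):
        """One MCPEM-style E-step for one subject (illustrative only)."""
        # Proposal: sample eta from the current population distribution.
        etas = rng.multivariate_normal(theta, omega, size=n_samples)
        # Importance weights; with the population prior as the proposal,
        # the weights reduce to the individual data likelihoods.
        log_w = log_lik(etas)              # user-supplied model likelihood
        w = np.exp(log_w - log_w.max())    # stabilize before normalizing
        w /= w.sum()
        # "Simple algebraic statistical formulas": weighted mean/covariance.
        mean = w @ etas
        dev = etas - mean
        cov = dev.T @ (dev * w[:, None])
        return mean, cov

    # M-step across subjects (sketch): theta becomes the average of the
    # subject conditional means, and Omega the average of each subject's
    # conditional covariance plus the spread of the means around theta.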

Both methods are inherently stochastic, and in principle accuracy and
precision can be increased to achieve results arbitrarily close to the true
maximum likelihood estimate by simply taking more samples (in the case of
MCPEM) or running for more iterations (in the case of SAEM). In general, the
error (imprecision) associated with stochastic estimates typically decays
with the sample size (or, in the case of SAEM, the iteration count) N as
N^-1/2. So in order to reduce the error by a factor of 10, the sample size
must be increased by a factor of 100.



We have recently adapted a sampling methodology based on low discrepancy
Sobol sequences (a particular type of so-called quasi-random (QR) numbers,
or their more recent generalization, (t,m,s) nets) to an MCPEM-like
importance sampling algorithm. In theory, quasi-random sampling provides much
better error decay behavior of approximately N^-1 for the computation of the
means and variances that go into the parameter update formulas for MCPEM.
This is a huge advantage if indeed this theoretical behavior is realized in
practice. For example, sparse data sets with relatively few observations per
subject typically require much more intensive sampling to obtain 'good' EM
estimates than denser data. A quite practical and fairly modest QR sample
size of 500 samples per subject is theoretically equivalent to a Monte Carlo
(MC) random sample size of 250,000, a level that stretches the limits of
practicality. As shown in Test Case 1 described below, it is quite easy to
find examples that require this sampling intensity level, and for such
examples the QR-based version should ideally run 500 times faster than the MC
version to produce equivalently good estimates.
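
As a concrete illustration of the two decay rates (a standalone sketch, not
drawn from the QRPEM code base), the short Python script below integrates a
smooth function over the unit square with plain Monte Carlo and with a Sobol
sequence, using the quasi-Monte Carlo generators available in SciPy:

    import numpy as np
    from scipy.stats import qmc

    def f(x):
        # Smooth integrand with known integral over [0,1]^2: (e - 1)^2.
        return np.exp(x[:, 0] + x[:, 1])

    exact = (np.e - 1.0) ** 2
    rng = np.random.default_rng(0)

    for n in [2**8, 2**12, 2**16]:      # powers of 2 suit Sobol points
        mc = f(rng.random((n, 2))).mean()
        qr = f(qmc.Sobol(d=2, scramble=False).random(n)).mean()
        print(f"N={n:6d}  MC error={abs(mc - exact):.1e}  "
              f"QR error={abs(qr - exact):.1e}")

In runs of this kind, the QR error typically falls roughly in proportion to
1/N while the MC error falls only as 1/sqrt(N), mirroring the argument above.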

We call the new algorithm QRPEM, and plan to introduce it in a near-future
release of Phoenix NLME. It is currently undergoing extensive testing to
verify that the theoretical advantages of quasi-random sampling are indeed
realized in practice. This white paper places QRPEM in historical context and
summarizes some of the initial results of our current testing effort. The key
result is that all testing to date indicates that the large theoretical
advantage of the QR approach indeed translates into greatly improved
accuracy, precision, and efficiency of QRPEM relative to the random sampling
based methods. Moreover, the level of improvement is in quantitative
agreement with that predicted by best-case QR theory.

Historical Overview of NLME Methods and a Perspective on the Importance of
Error Scaling Behavior

The first method to achieve widespread use for parametric population PK/PD
estimation was FO, as introduced in 1978 in the initial version of NONMEM. FO
is now known to have very poor statistical properties – for example, it is
probably the only method still widely used that is not consistent. Consistent
population NLME methods have the property that if the underlying model is
correct and both the number of subjects and the number of observations per
subject increase without bound, the estimates converge to the true model
parameter values. Alan Schumitzky at USC showed that not only is FO not
consistent, it has the unfortunate property that as the number of subjects
and amount of data per subject increase without bound, the FO estimates can
actually diverge to arbitrarily poor values. FO is still in use today despite
its poor statistical properties because it is the fastest and most
numerically reliable of the approximate likelihood methods, and sometimes on
difficult models it is the only approximate likelihood method that will
produce any result at all.

Starting in the 1990's, the more accurate FOCE, FOCEI, and LAPLACE
approximate likelihood methods were introduced in NONMEM, SPLUS, and SAS.
These methods have much better statistical properties than FO, but are much
more computationally expensive, much less numerically reliable, and still can
be quite biased and inaccurate, particularly for sparse data, large
inter-individual variability, and very nonlinear models. Nevertheless, FOCE
with interaction is probably the most widely used population PK/PD method
currently available.

The first widely available and practical accurate likelihood method was NPEM
(NonParametric EM), introduced by Schumitzky of USC in 1991 [1]. This method
is fundamentally different from the more usual parametric methods in that it
makes no distributional assumptions for random effects. NPEM is a grid-based
method that has relatively poor error scaling properties – error decays as
N^-1/d, where d is the number of random effects and N is the number of grid
points used. Therefore NPEM can be computationally expensive, particularly if
very accurate results are required. For example, in 2000 Bob Leary at UCSD
implemented NPEM on Blue Horizon at the San Diego Supercomputer Center, at
the time the world's fastest non-classified supercomputer. In what is
probably still the computationally most intensive single population PK/PD job
ever attempted, a high accuracy NPEM model of piperacillin with 6 random
effects and several tens of millions of grid points was successfully run in
approximately 2300 CPU-hours (1152 processors for 2 wall-clock hours). As an
indication of the significance of error scaling properties, we note that
Leary later developed a much more efficient version called NPAG
(nonparametric adaptive grid) that preserved the use of exact nonparametric
likelihoods but improved error scaling behavior to approximately 1/(number of
iterations) and used only a small number of grid points. NPAG was able to
compute an even more accurate piperacillin model estimate than the
large-scale 2000+ CPU-hour NPEM computation from scratch in less than 10
minutes on a single PC. An improved version of NPAG is currently the
nonparametric method implemented in Phoenix NLME.

The first accurate likelihood parametric methods began to appear as research
implementations in the early 2000's. Bob Bauer and Serge Guzy developed
MCPEM, the precursor to the current S-ADAPT and NONMEM importance sampling EM
methods. Leary at UCSD/SDSC developed PEM (Parametric EM method), which used
a much cruder version of importance sampling than MCPEM but introduced the
use of quasi-random sampling techniques. The first version of SAEM, which
later evolved into MONOLIX, was introduced in France by B. Delyon, M.
Lavielle, and E. Moulines (with many subsequent contributors).

By 2004, these accurate likelihood methods had attracted sufficient attention
that INSERM, the French National Institute of Health and Medical Research,
sponsored an inter-method blind comparison exercise that compared SAEM, PEM,
and MCPEM against each other and against various implementations of the
traditional FO, FOCE, and LAPLACE approximate likelihood methods. In the
initial 2004 exercise, 100 replicates of a simulated, fairly sparse dataset
for a simple EMAX PD model were distributed to all participants, who did not
know the true values of the parameters used in the simulations. The
participants returned the results to the organizers for analysis and scoring.
A follow-on blind comparison exercise was done in 2005 on a one-compartment
PK model with oral first order absorption (with a rather unusual
parameterization that avoids the 'flip-flop' phenomenon associated with this
model), again with somewhat sparse data. Each of the methods was run by an
acknowledged expert (in the case of each of the new EM methods, one of the
originators of that method). Results were revealed at a meeting of all
participants (approximately 10 methods in all were compared) in July 2005 in
Lyon, France, and also presented at PAGE in 2005 in Pamplona [3]. The main
evaluation criteria were degree of bias in the parameter estimates and
relative precision (root mean square error between estimates and the true
values of the parameters used to simulate the data). All three accurate
likelihood EM methods were the top performers in both categories, with SAEM
being the most precise and PEM the least biased. As expected, the approximate
likelihood methods LAPLACE and FOCE were ranked significantly lower on both
criteria, and FO was the worst performer by far in both categories.

The QRPEM method currently under implementation for Phoenix NLME is a greatly
improved version of the original PEM method that now uses a much more
sophisticated and efficient version of importance sampling. It implements
quasi-random sampling using Sobol low discrepancy sequences, including some
recent 'scrambling' techniques [6] developed by Owen at Stanford that further
improve the numerical integration performance of the basic quasi-random
approach.

Quasi-random Numerical Integration

Stochastic EM methods such as MCPEM require the numerical integration of a
conditional density function over a d-dimensional parameter space for each
subject to find a normalizing factor, mean, and covariance. Here d is the
number of random effects. Usually in NLME population PK/PD models, d ranges
from 1 to a practical maximum of around 20, with 2 to 6 being fairly typical
values. This integral can always be transformed to an integral over a
d-dimensional unit box (hypercube) with all coordinates ranging between 0 and
1.
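
As a minimal sketch of this transformation (Python; the Gaussian proposal and
all names are illustrative assumptions, not the Phoenix NLME internals),
unit-cube points can be mapped through the inverse normal CDF and a Cholesky
factor to produce samples from a multivariate normal proposal:

    import numpy as np
    from scipy.stats import norm, qmc

    d = 3                                          # number of random effects
    u = qmc.Sobol(d=d, scramble=True).random(512)  # points in the unit cube
    z = norm.ppf(u)                                # coordinate-wise inverse CDF
    mu, sigma = np.zeros(d), np.eye(d)             # proposal N(mu, sigma)
    eta = mu + z @ np.linalg.cholesky(sigma).T     # candidate random effects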

The numerical integral values are used in simple algebraic formulas to
compute updates for the model parameters to be estimated. If the numerical
integrals are sufficiently accurate, it can be shown that the likelihood
improves after each update. Since numerical integration is usually a very
stable process, much more so than the gradient-based formal numerical
optimization methods used by the traditional methods that optimize
approximate likelihoods, the stochastic EM methods are much more reliable in
terms of avoiding catastrophic numerical failures. However, the quality of
the estimates and the convergence properties of the EM algorithms depend
strongly on the accuracy of these numerical integrals.

In MCPEM using random sampling, the error in the integrals is proportional to
N^-1/2, where N is the number of samples. Thus to reduce the error by a
factor of 10, the number of samples must be increased by a factor of 100. The
primary reason for this relatively slow error decay rate is that random
samples do not cover the unit d-dimensional hypercube very evenly – by
chance, some areas are always populated more densely than others, and
relatively large areas may not be sampled at all.

Perhaps somewhat counter-intuitively, a regular rectangular grid on the
hypercube, which seemingly samples quite uniformly, actually will perform
much worse than random sampling on all spaces of dimension d>2. The error is
proportional to the 'discrepancy' of the sequence of sampled points, which
roughly speaking is the volume of the largest empty rectangular 'brick' in
the hypercube. A rectangular grid in d dimensions leaves many completely
unpopulated thin slab-like bricks of length 1 in d-1 dimensions and length
N^-1/d in the remaining dimension, for a very large discrepancy and thus a
very slow error decay rate N^-1/d. In a typical population PK/PD model with
d=5, the number of samples must be increased by a factor of 100,000 in order
to decrease the error by a factor of 10. Regular rectangular grids are thus
relatively impractical for EM-based general NLME pop PK/PD estimation.
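
A quick arithmetic check of these scaling claims (a standalone Python
snippet, not part of any of the packages discussed): with error proportional
to N^-p, reducing the error by a factor of 10 requires multiplying N by
10^(1/p):

    for name, p in [("QR, N^-1", 1.0),
                    ("MC, N^-1/2", 0.5),
                    ("regular grid, d=5, N^-1/5", 0.2)]:
        print(f"{name}: N must grow by {10 ** (1 / p):,.0f}x")
    # QR: 10x, MC: 100x, regular grid with d=5: 100,000x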

An alternative sampling technique for numerical integration, with lower
discrepancy and hence much faster error decay rates than either regular or
random grids, is obtained by covering the unit hypercube with so-called low
discrepancy or quasi-random d-dimensional sequences (see [2] for a good
general reference on quasi-random sequences and methods). Figure 1 shows the
relatively high discrepancy of a uniform random distribution of 2000 points
on the unit square vs. the much lower discrepancy of a 2000-point Sobol
sequence of the type used in QRPEM.

[Figure 1 image: top panel, "2000 2-dimensional Uniformly Distributed Random
Points"; bottom panel, "2000 2-dimensional Uniformly Distributed Quasi-random
Points"; both plotted on the unit square.]

Figure 1. Relatively high discrepancy sequence of 2000 uniformly distributed
random points (top) vs. much lower discrepancy sequence of 2000 uniformly
distributed Sobol quasi-random points (bottom). Note the discrepancy is the
area of the largest rectangular unpopulated white space that can be found
within the bounding square.
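
Point sets like those in Figure 1 are easy to regenerate. The sketch below
(Python, with Matplotlib assumed available; not the code used to produce the
original figure) draws both panels:

    import matplotlib.pyplot as plt
    import numpy as np
    from scipy.stats import qmc

    pts_mc = np.random.default_rng(1).random((2000, 2))          # top panel
    pts_qr = qmc.Sobol(d=2, scramble=False).random(2048)[:2000]  # bottom panel

    fig, axes = plt.subplots(2, 1, figsize=(5, 10))
    for ax, pts, label in [(axes[0], pts_mc, "random"),
                           (axes[1], pts_qr, "Sobol quasi-random")]:
        ax.plot(pts[:, 0], pts[:, 1], ".", markersize=2)
        ax.set_title(f"2000 2-dimensional uniformly distributed {label} points")
    plt.show()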



The basic concept of a low discrepancy sequence is relatively modern, having
originated in the 1960s from a variety of contributors. It was later
generalized by Niederreiter to (t,m,s) nets (here 'net' is used in a similar
but more general sense than 'grid'). Practical low discrepancy sequences
(some common types are Faure, Halton, Niederreiter, Hammersley, and Sobol
sequences) first started to appear in the late 1960's. Very fast
implementations comparable in speed to the more usual pseudorandom number
generators are now available for several important cases. Initial
applications were primarily to numerical integration, particularly in physics
and computational finance. Later the technique was extended as a potentially
more efficient alternative to some, but not all, types of Monte Carlo
simulation (for example, there are notorious difficulties in applying low
discrepancy sequences to Markov chain Monte Carlo based methods such as SAEM,
although some recent progress has been made in this area with the
introduction of the idea of 'scrambling' [6] by Owen and others).
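
One practical payoff of scrambling, sketched below with SciPy's scrambled
Sobol generator (an illustrative snippet, not the QRPEM implementation), is
that independent scrambles give replicate estimates whose spread provides a
practical error estimate that a single fixed, unscrambled net cannot:

    import numpy as np
    from scipy.stats import qmc

    f = lambda x: np.exp(x.sum(axis=1))     # smooth test integrand on [0,1]^2
    reps = [f(qmc.Sobol(d=2, scramble=True, seed=s).random(2**12)).mean()
            for s in range(8)]
    print(np.mean(reps), np.std(reps))      # QR estimate and replicate spread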

The Sobol sequence used in QRPEM is widely regarded as particularly effective
for numerical integration problems in low dimensional spaces.


TEST 1 – A difficult sparse and highly nonlinear EMAX model

Summary of results: QRPEM and NMIMP were compared on a difficult sparse,
highly nonlinear PD model adapted from a recent set of models used in Mats
Karlsson's lab at Uppsala to compare NM SAEM and MONOLIX SAEM to traditional
approximate likelihood methods. At identical sample sizes of 500, QRPEM gave
excellent results for all parameters in 150 seconds, while NMIMP in 323
seconds gave good results on fixed effects but relatively poor estimates for
the random effects matrix Omega. The sample size for NM IMPEM was then
gradually increased to see if the Omega estimates could be improved. Indeed,
at a sample size of N=200,000, the NMIMP Omega results converged almost
exactly to the QRPEM N=500 Omega results. QR theory in fact predicts
convergence at roughly this size, i.e. a QR sample size of 500 should be
approximately as accurate as an MC sample size of 500^2 = 250,000 on smooth,
low dimensional integrands. NMIMP took 50.2 hours vs. 2.5 minutes for QRPEM
to reach the same high quality estimate.

It is important to note that this three-orders-of-magnitude time difference
is not representative of the relative timings that should be expected when
QRPEM and NMIMP are run in 'normal' operating mode. Under usual
circumstances, both would probably be run with a few hundred samples. In
these circumstances, QRPEM has proved to be faster than NMIMP run in default
mode by typical factors of 1.5 to 4. These relative timings may change as
more features are added to QRPEM, or if NMIMP is run in some non-default mode
that happens to improve performance on a given model. But we believe the real
significance is the greatly improved accuracy, precision, and repeatability
at equivalent sample sizes that QRPEM offers. EM methods like QRPEM, SAEM,
and MCPEM are often described as 'exact' likelihood methods, as opposed to
the approximate likelihood methods like FO and FOCE. This is not exactly
true. FO and FOCE are indeed approximate likelihood methods, and they produce
a certain inherent level of accuracy for any given model and data set –
there's nothing the user can do to improve it. But SAEM, MCPEM, and QRPEM are
more correctly described as "likelihood as exact as you want to make it"
methods. The accuracy of the likelihood depends on the number of samples N in
the case of QRPEM, and on the number of iterations in the case of SAEM, and
the accuracy can be made as high as desired if the user is willing to pay for
it with computer time. The real advantage of QRPEM is that it achieves high
accuracy at sample sizes that are computationally reasonable, as opposed to
NMIMP, which often has to be run at very high sample sizes. At a sample size
of 300 (the NMIMP default), QRPEM obtains results that are often within a few
tenths of a unit of the true ML ELS OBJ value (see results in other test
cases below). At this same size, MCPEM and SAEM methods get results with ten
times more error, so they are within a few units of the ML ELS OBJ value. For
some purposes, the MCPEM and SAEM results are probably good enough – for
example, it is unlikely that the moderate precision MCPEM result would lead
to a different conclusion in a visual predictive check than the high
precision QRPEM result. But for other purposes, for example covariate model
analysis, the additional precision is important. The critical value for
determining whether addition of a covariate is statistically significant is
an improvement of around 3.5 ELS OBJ units. If the method precision is near
this level, covariate analyses become very problematic. But precision at the
QRPEM level is more than adequate for this purpose.
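
For reference, that cutoff comes from the likelihood ratio test: treating the
ELS OBJ as a quantity on the -2 log-likelihood scale (an assumption stated
here, not spelled out in the test output), the chi-square critical value for
one added covariate parameter at the 0.05 level can be checked directly:

    from scipy.stats import chi2
    print(chi2.ppf(0.95, df=1))   # ~3.84 for 1 parameter at alpha = 0.05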



Details

At the recent 2011 ACoP conference (and also at the 2010 PAGE conference),
Elodie Plan et al. [4] presented a series of EMAX PD models

E = E0 + Emax*Dose^gamma / (Dose^gamma + ED50^gamma)

with simulated data in order to compare the performance of NM SAEM and
MONOLIX SAEM to the traditional approximate likelihood methods (NM IMPEM was
not included in the comparison). Here gamma is a Hill coefficient that
governs the degree of nonlinearity (higher is more nonlinear).
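
Written out as code (a small illustrative Python function; names chosen to
mirror the text), the model is:

    import numpy as np

    def emax_effect(dose, E0, Emax, ED50, gamma):
        """Sigmoidal EMAX response; gamma is the Hill coefficient."""
        dg = np.asarray(dose, dtype=float) ** gamma
        return E0 + Emax * dg / (dg + ED50 ** gamma)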

On several of the test cases SAEM (both MONOLIX and NM7) gave relatively poor
estimates. Perhaps the most difficult single case was the sparsest (2
observations per subject) and most nonlinear (Hill coefficient gamma=3). We
selected this as a test case for QRPEM and simulated sparse data sets with
1000 subjects, 2 observations per subject, and a proportional error model,
with the same true parameter values as used in the Plan model (our model
differed only in the number of subjects, 1000 vs. 100, and the fact that we
used a fixed value of gamma rather than treating it as a fixed effect to be
estimated).

When run in PHX QRPEM with a sample size of 500, excellent results were
obtained for all parameters, namely the fixed effects THETA, the random
effect parameters OMEGA, and the residual error standard deviation. The
corresponding NM7 IMPEM results with a sample size of 500 were reasonably
good for fixed effects but only fair for the random effect Omega parameters,
and in particular relatively poor for the Omega(EMAX,ED50) parameter.

True Omega values from the simulation are:

            E0         EMAX       ED50
  E0        0.0900
  EMAX      0.000      0.490
  ED50      0.000      0.245      0.490

Omega estimates from QRPEM and NMIMP with N=500 are:

QRPEM N=500 (150 sec):
            E0         EMAX       ED50
  E0        0.0939
  EMAX     -0.00735    0.536
  ED50     -0.00775    0.261      0.472

NMIMP N=500 (323 sec):
            E0         EMAX       ED50
  E0        0.0932
  EMAX     -0.0260     0.464
  ED50     -0.0227     0.137      0.365

Note the Omega estimates for QRPEM with a sample size of 500 are considerably
different from and better than the NMIMP estimates with the same sample size,
particularly in the Omega(EMAX,ED50) element, 0.261 vs. 0.137, and the
Omega(ED50,ED50) estimate, 0.472 vs. 0.365. The true values used to simulate
the data were 0.245 and 0.490, respectively.



When the sample size for NMIMP is increased to 200,000, the Omega matrix
estimate is remarkably close to that obtained by QRPEM with a sample size of
500:

NMIMP N=200,000 (50.2 hours):
            E0         EMAX       ED50
  E0        0.0939
  EMAX     -0.00987    0.534
  ED50     -0.0102     0.257      0.471

As can be seen from the table below, the NMIMP cov(EMAX,ED50) and ELS OBJ
value estimates steadily improve toward the PHX QRPEM estimate and ELS OBJ
value as the sample size increases to very high values, so the result for the
200,000 sample size is not a fluke, and a sample this large really is
required to match the QRPEM results.

NMIMP cov(EMAX,ED50) and ELS OBJ estimates:

  N           cov(EMAX,ED50)    ELS OBJ
  500         0.137             7359.276
  1000        0.158             7351.957
  3000        0.193             7353.849
  10000       0.219             7347.394
  250000      0.225             7343.776
  500000      0.237             7337.368
  2000000     0.257             7340.930

The QRPEM cov(EMAX,ED50) and ELS OBJ value at ISAMPLE=500 are 0.261 and
7340.241, respectively. The 'true value' of cov(EMAX,ED50) used to simulate
the data is 0.245.



TEST 2 – A moderately difficult sparse test model from the 2005 Lyon INSERM
blind inter-method comparison exercise (see [3] for details of this exercise
and the results)

The original 2005 test model was a 1-compartment first order oral absorption
model with linear first order elimination (the usual V, Ke, Ka
parameterization was changed to V, Ke, Ka-Ke to avoid the flip-flop
identifiability problem associated with this model). The original exercise
involved 100 sets of simulated data with 100 subjects each, approximately 3
observations per subject. All data sets were simulated with the same
parameter values. For the QRPEM/NMIMP comparison here, all data sets were
merged into a single large set with 10000 subjects and 30000 observations.

Summary of results: QRPEM was much faster than NMIMP (3208 sec vs. 17502 sec;
210 iterations vs. 397 iterations), with both run with N=300 samples. The
QRPEM ELS OBJ was far better, lower by about 55 points (7721.070 vs.
7776.479). Both methods gave parameters in excellent agreement with the true
values used in the original simulations, except for NMIMP on
Omega(Ka-Ke,Ka-Ke). This was estimated at 0.0220 by QRPEM and 0.0395 by NMIMP
vs. a "true" value of 0.0225. The NMIMP 76% overestimate is generally
consistent with the results found in the original 2005 exercise. All 10
methods evaluated at that time had difficulties with overestimating this
value, with the best results being obtained by SAEM at about a 50%
overestimate averaged over the 100 individual data sets. The result obtained
here by QRPEM (a 2.2% underestimate) is by far the best result obtained by
any method.

Details:

The model was a simple one-compartment oral first order linear absorption
model:

Conc = ((Dose*Ka)/(V*(Ka-Ke)))*(exp(-Ke*t) - exp(-Ka*t))*exp(eps)

where eps is a normally distributed random residual error.

The parameterization was designed to avoid the possibility of flip-flop
associated with this model:

V = tvV*exp(etaV)
Ke = tvKe*exp(etaKe)
Ka = Ke + tvKaKe*exp(etaKaKe)

where tvKaKe and etaKaKe denote the fixed and random effects for the Ka-Ke
difference.
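
For concreteness, a small Python sketch of the simulated concentration model
under this parameterization (illustrative names only, matching those above):

    import numpy as np

    def conc(t, dose, tvV, tvKe, tvKaKe, etaV, etaKe, etaKaKe, eps):
        V = tvV * np.exp(etaV)
        Ke = tvKe * np.exp(etaKe)
        Ka = Ke + tvKaKe * np.exp(etaKaKe)   # keeps Ka > Ke: no flip-flop
        c = (dose * Ka) / (V * (Ka - Ke)) * (np.exp(-Ke * t) - np.exp(-Ka * t))
        return c * np.exp(eps)               # proportional (log-normal) error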

100 data sets with 100 subjects each, with approximately 3 observations per
subject, were generated by simulation, so 'true' parameter values were known
(to the organizers but not the participants). Detailed results for the
performance of 10 different methods, including SAEM (the precursor to
MONOLIX), PEM (the precursor to QRPEM), and MCPEM (the precursor to NONMEM
IMPEM), are given in [3]. Basically, the EM methods in the original exercise
all did well estimating all the parameters except Omega(Ka-Ke,Ka-Ke), with
relatively little bias and reasonable root mean square errors. All 10 methods
severely overestimated Omega(Ka-Ke,Ka-Ke), with SAEM being the least biased
for this parameter at an average overestimate of 50%.

Here we concatenated all 100 data sets into a single large data set with
10000 subjects and approximately 30000 observations, and ran with 300 samples
per subject in both QRPEM and NMIMP.

The detailed estimates of primary interest are:

                        True       QRPEM      NMIMP
  tvV                   27.2       27.4       27.4
  tvKa-Ke (tvKe)        0.232      0.231      0.235
  tvKa-Ke               0.304      0.311      0.307
  stddev eps            0.250      0.254      0.253
  Omega(V,V)            0.218      0.213      0.214
  Omega(Ke,Ke)          0.655      0.654      0.666
  Omega(Ka-Ke,Ka-Ke)    0.0225     0.0220     0.0392
  ELS OBJ                          7721.070   7776.479
  Runtime                          3208 sec   17502 sec
  Iterations                       210        397

Note in particular that QRPEM achieves a far better estimate of
Omega(Ka-Ke,Ka-Ke), as well as a much better (lower) ELS OBJ value.



TEST 3: Repeatability and ELS OBJ accuracy test on a relatively easy IV bolus
model

Summary of results: The simplest of the MONOLIX test models accompanying the
MONOLIX 2.0 release is a rich data, single dose, 1-compartment IV bolus
model. This was run from a variety of starts with QRPEM and NMIMP. Also, very
high precision maximum likelihood estimates were obtained with the adaptive
Gaussian quadrature (AGQ) method in Phoenix NLME. All runs gave good
estimates in the sense that the parameter estimates were in good agreement
with the known simulated parameter values. However, the QRPEM results were
much more repeatable, reproducing each other and the correct ML estimate
value within 0.1 ELS OBJ units on each run. The NMIMP results were much more
variable, with results changing by as much as 1.5 ELS OBJ units from run to
run and typically agreeing with the true ML value only within 1.0 units. This
repeatability and accuracy is an important consideration when performing
covariate searches, where imprecision in the ELS OBJ value of the magnitude
observed in the NMIMP runs may lead to incorrect conclusions regarding the
significance of a covariate.

Details

AGQ 25 points    ELS OBJ = -4041.788
AGQ 100 points   ELS OBJ = -4041.790
AGQ 400 points   ELS OBJ = -4041.790

Further increases in the resolution of the AGQ grid resulted in no changes to
either the ELS OBJ or the parameter estimates.

QRPEM N=300           ELS OBJ = -4041.749
QRPEM N=300 start 2   ELS OBJ = -4041.749
QRPEM N=300 start 3   ELS OBJ = -4041.749
QRPEM N=500           ELS OBJ = -4041.784
QRPEM N=1000          ELS OBJ = -4041.788
QRPEM N=2000          ELS OBJ = -4041.790

Note that at three different starts, with initial values of random effects
changed by as much as 50%, QRPEM gave essentially identical results.

Parameter estimates of the maximum resolution versions of AGQ and QRPEM were
identical to at least 4 significant figures, and the ELS OBJ values were
identical to within 0.001. We may reasonably conclude that both have reached
essentially the exact ML estimate. Note that at the 'typical' sample size of
N=300, QRPEM estimates were nearly identical (within 1%) to the ML estimates
and the ELS OBJ values were within 0.24 units of the true ML values. From a
statistical point of view, a 0.24 unit discrepancy is trivial.

NMIMP N=300           ELS OBJ = -4040.891
NMIMP N=300 start 2   ELS OBJ = -4041.368
NMIMP N=300 start 3   ELS OBJ = -4042.353

Note the much higher variability in the NMIMP ELS OBJ values, with a range of
1.5 ELS OBJ points vs. a range of 0.000 points over 3 starts for QRPEM. At
high precision (N=90000), the NMIMP value was obtained as ELS OBJ =
-4041.740, quite close to the ELS OBJ = -4041.749 value for QRPEM with N=300.



TEST 4: One compartment single dose models from the 2007 MONOLIX test set

Background: The MONOLIX 2.0 release in 2007 was accompanied by 150 test cases
spread over 24 basic types of 1- and 2-compartment models, with approximately
6 variations in dosing patterns and model parameterizations for each basic
model. Here we compared QRPEM, NMIMP, and NMSAEM on the 12 base one
compartment single dose models with a Ke-V type of parameterization. The
models consisted of 6 linear elimination models and 6 nonlinear
Michaelis-Menten elimination models. Despite the relative richness of the
data, many of these models proved to be quite difficult for NM FOCE, which
failed to converge on a majority of them. As reported in [5], MONOLIX SAEM
succeeded on all models in obtaining good estimates in general agreement with
the known parameter values used to simulate the data sets.

Summary of results:

Odd numbered models have linear elimination; even numbered models have
nonlinear Michaelis-Menten elimination. See the Exprimo report in [5] for
details of the models and true parameter values.

Both NMIMP and QRPEM were run at the NMIMP default sampling level of N=300
samples/subject and a maximum iteration count of 2000. NMIMP was run with the
convergence criterion CTYPE=3, while NMSAEM was run with a maximum of 1000
iterations and CTYPE=1. In all cases NMIMP and QRPEM converged well before
the maximum iteration limit, while SAEM required the full 1000 iterations.

NMIMP and QRPEM succeeded on all models, getting parameter estimates in good
agreement with the known parameter simulation values. SAEM succeeded on all
but one model, on which it was judged to have failed to get an acceptably
good result. Repeated starts from other initial conditions also did not
succeed, and revealed a large variability for SAEM on this model.

QRPEM was often significantly faster (typically 1.5X to 3X) than NMIMP for
the more complex models, and got much closer to the true ML ELS OBJ values as
evaluated by the techniques discussed in Test 3. NMIMP was faster than NMSAEM
by typical factors of about 1.5X to 2X.

Details

Values listed are the ELS OBJ values for each run, as well as the timings.
The notation 'sd' refers to 'single dose'. We plan to add comparisons to
other dosage regimens and parameterizations in the near future.

mlx101 sd
  QRPEM    -4041.749   14 sec
  NMIMP    -4040.891   20 sec
  NMSAEM   -4041.471   69 sec
  True ML ELS OBJ value from high precision AGQ: -4041.790

mlx102 sd
  QRPEM    -4090.768   140 sec
  NMIMP    -4090.424   165 sec
  NMSAEM   -4089.632   225 sec

mlx103 sd
  QRPEM    -4035.484   12 sec
  NMIMP    -4034.644   20 sec
  NMSAEM   -4035.201   74 sec

mlx104 sd
  QRPEM    -4185.945   120 sec
  NMIMP    -4186.020   176 sec
  NMSAEM   -4185.848   261 sec

mlx105 sd
  QRPEM    -3829.337   13 sec
  NMIMP    -3829.094   23 sec
  NMSAEM   -3828.736   90 sec

mlx106 sd
  QRPEM    -3876.126   122 sec
  NMIMP    -3875.227   221 sec
  NMSAEM   -3873.968   290 sec

mlx107 sd
  QRPEM    -3885.470   15 sec
  NMIMP    -3885.309   24 sec
  NMSAEM   -3885.061   91 sec

mlx108 sd
  QRPEM    -4080.355   180 sec
  NMIMP    -4079.661   567 sec
  NMSAEM   -4078.293   1051 sec

mlx109 sd
  QRPEM    -3231.600   43 sec
  NMIMP    -3230.105   29.5 sec
  NMSAEM   -3231.311   120 sec

mlx110 sd
  QRPEM    -3397.163   106 sec
  NMIMP    -3397.186   133 sec
  NMSAEM   -3395.371   232 sec

mlx111 sd (this proved to be a difficult model for SAEM – note the excellent
repeatability of the QRPEM results relative to NMIMP and NMSAEM)
  QRPEM    -3511.458   34 sec
  QRPEM    -3511.459   32 sec  (alternate start)
  NMIMP    -3510.741   29 sec
  NMIMP    -3508.197   50 sec  (alternate start)
  NMSAEM   -3458.423   118 sec (anomalous ELS OBJ value, poor Omega)
  NMSAEM   -3484.962   116 sec (alternate start, still poor Omega estimates)

mlx112 sd (as in mlx111 sd, QRPEM is much more repeatable from different
starts than NMIMP and NMSAEM)
  QRPEM    -3434.675   237 sec
  QRPEM    -3434.779   301 sec (alternate start)
  NMIMP    -3434.974   709 sec
  NMIMP    -3432.923   680 sec (alternate start)
  NMSAEM   -3431.192   861 sec
  NMSAEM   -3430.728   842 sec (alternate start)
17 | P age ©2011 Tripos, L.P. All Rights Reserved


TEST 5: Two-compartment models from the 2007 MONOLIX test set

The 12 single dose base case 2-compartment analogues to the 1-compartment
models discussed in Test 4 were analyzed with QRPEM and NMIMP. As in the one
compartment case, odd-numbered models correspond to linear elimination cases,
while even numbered models are considerably more difficult and time consuming
nonlinear Michaelis-Menten models. Here too, NM FOCE failed to converge on
most of these models. For each odd numbered model, two data sets were run: a
smaller, but still rich, single dose data set 'sd' and a considerably larger
(about 3X as many observations) multiple dose data set 'all'. For the even
numbered nonlinear models, currently only sd results are available.

Summary of results

Both QRPEM and NMIMP succeeded on all models, producing good parameter
estimates in general agreement with the known values used to simulate the
data. QRPEM was usually much faster (2X to 5X), and also, on selected cases
where high precision AGQ could be run, far more accurate in ELS OBJ value,
generally getting the true value within 0.2 ELS OBJ units, as opposed to
about 2.0 for NMIMP (data for this observation not yet shown).

Details

Method/Model   Dataset   Time       ELS OBJ      Iterations
QRPEM201       sd        97 sec     20296.006    80
NMIMP201       sd        245 sec    20296.391    113
QRPEM201       all       245 sec    65933.468    60
NMIMP201       all       223 sec    65932.981    39
QRPEM202       sd        286 sec    -4221.047    130
NMIMP202       sd        1008 sec   -4220.901    201
QRPEM203       sd        29 sec     19834.529    40
NMIMP203       sd        90 sec     19834.882    41
QRPEM203       all       78 sec     65133.507    40
NMIMP203       all       122 sec    65133.733    20
QRPEM204       sd        128 sec    -4150.415    50
NMIMP204       sd        809 sec    -4151.040    154
QRPEM205       sd        51 sec     -3801.372    60
NMIMP205       sd        222 sec    -3801.406    86
QRPEM205       all       152 sec    -13696.353   60
NMIMP205       all       509 sec    -13696.055   91
QRPEM206       sd        158 sec    -4088.784    60
NMIMP206       sd        652 sec    -4089.790    117
QRPEM207       sd        71 sec     19555.424    80
NMIMP207       sd        213 sec    19554.815    82
QRPEM207       all       154 sec    64282.186    60
NMIMP207       all       553 sec    64282.751    83
QRPEM208       sd        293 sec    -3996.374    80
NMIMP208       sd        902 sec    -3994.868    86
QRPEM209       sd        36 sec     -3356.841    40
NMIMP209       sd        180 sec    -3356.903    39
QRPEM209       all       166 sec    -13220.644   60
NMIMP209       all       378 sec    -13220.921   28
QRPEM210       sd        166 sec    -3384.766    60
NMIMP210       sd        630 sec    -3386.107    85
QRPEM211       sd        79 sec     -3569.209    80
NMIMP211       sd        557 sec    -3570.122    179
QRPEM211       all       234 sec    -13487.62    80
NMIMP211       all       1496 sec   -13488.557   179
QRPEM212       sd        199 sec    -3693.311    60
NMIMP212       sd        1129 sec   -3692.731    109



References

[1] A. Schumitzky. Nonparametric EM Algorithms for Estimating Prior
Distributions. App. Math. and Computation 45:143-158, 1991.

[2] H. Niederreiter. Random Number Generation and Quasi-Monte Carlo Methods.
SIAM, 1992.

[3] P. Girard and F. Mentré. A Comparison of Estimation Methods in Nonlinear
Mixed Effects Models Using a Blind Analysis. PAGE 2005, Pamplona, Spain.
http://www.pagemeeting.org/page/page2005/PAGE2005O08.pdf

[4] E. Plan, A. Maloney, F. Mentré, M. Karlsson, and J. Bertrand. Performance
Comparison of Various Maximum Likelihood Nonlinear Mixed-effects Estimation
Methods for Pharmacodynamic Models. American Conference on Pharmacometrics,
San Diego, April 2011. http://www.goacop.org/2011/posters
Also available in an earlier PAGE 2010 version at
http://www.page-meeting.org/pdf_assets/4694-Elodie_Plan_PAGE_Poster.pdf

[5] C. Laveille, M. Lavielle, K. Chatel, and P. Jacqmin. Evaluation of the PK
and PK-PD Libraries of MONOLIX: A Comparison with NONMEM. PAGE 2008. Also: P.
Jacqmin, C. Laveille, and M. Lavielle. Software Evaluation: Simulation of PK
Data Sets for Evaluation of the Monolix PK Library. Exprimo report, June 6,
2007.

[6] A. Owen. Scrambling Sobol and Niederreiter-Xing Points. J. of Complexity
14:466-489, 1998.

