NONMEM Users Network Archive

Hosted by Cognigen

RE: General question on modeling

From: Mark Sale - Next Level Solutions <mark>
Date: Tue, 20 Mar 2007 18:05:41 -0700

Tobias,
  Thanks very much for your perspective. I especially appreciate your
interest in evolutionary computation and machine learning, an area I
think has a lot to contribute to our field. I don't know the reference
you cite (but I will look it up). My reading in evolutionary computation and
machine learning (of which GA is one method) is that the "best" search
algorithm depends on the assumptions one can make about the structure
of the search space. Stepwise regression has its own set of
assumptions, some of which are likely true in our field in most cases,
some of which are certainly not true. But, evolutionary computation
and machine learning is a very different approach (and IMHO a more
rigorous approach) than what we currently do.


Mark Sale MD
Next Level Solutions, LLC
www.NextLevelSolns.com


> -------- Original Message --------
> Subject: Re: [NMusers] General question on modeling
> From: "Tobias Sing" <tobias.sing
> Date: Tue, March 20, 2007 8:57 pm
> To: "Sale Mark" <mark
> <nmusers
>
> Mark & list,
>
> I'm a newbie to the list. I hope I'm not duplicating anything
> mentioned yesterday (the archive seems to become available with a
> delay), but this is a topic I'm also very much interested in, so I'd
> like to share my current view on it (I'd be happy to hear both
> dissenting and agreeing opinions).
>
> > On Monday 19 March 2007 19:32, Mark Sale - Next Level Solutions wrote:
> > Dear Colleagues,
> > I've lately been reviewing the literature on model building/selection
> > algorithms. The structural
> > first, then variances/forward addition/backward elimination is
> > generally mentioned in a number of places
> > [...] Can anyone point
> > me to any rigorous discussion of this model building strategy?
>
> There can be no rigorous general (i.e. problem-independent) statement
> about the superiority of any variable or model selection strategy over
> another:
>
> * Wolpert, D.H. and W.G. Macready, 1997. No free lunch theorems for
> optimization. IEEE Transactions on Evolutionary Computation (cf.
> http://citeseer.ist.psu.edu/wolpert95no.html and
> http://en.wikipedia.org/wiki/No-free-lunch_theorem).
>
> Thus, the only justification for advocating the use of a particular
> strategy _without making use of problem-specific knowledge_ is the
> empirical observation that it often works well in practice. Other
> approaches besides forward addition/backward elimination also often
> work well. An up-to-date overview (opening a whole journal special
> issue on variable selection):
>
> * An Introduction to Variable and Feature Selection
> Isabelle Guyon, André Elisseeff;
> Journal of Machine Learning Research 3(Mar):1157--1182, 2003.
> http://www.jmlr.org/papers/volume3/guyon03a/guyon03a.pdf
>
> More or less subtle forms of overfitting always play a role in model
> selection, and with limited data, it is generally not possible to
> simultaneously select an optimal model _and_ obtain optimally accurate
> performance estimates, whether by relying on p-values, AIC/BIC/...,
> or (double-)bootstrap- or (double-)cross-validation-based procedures.
> However, the "double" versions for resampling the entire modeling
> process help a lot in obtaining more reliable estimates when doing a
> lot of "data dredging".
>
> Harrell's (fantastic) book was mentioned by some previous posters. In
> my personal opinion and experience, it is a bit too negative about
> stepwise variable selection or the simplified version of univariable
> screening (e.g. on pp. 56-60). In fact, Guyon/Elisseeff and many
> others have mentioned that greedy search strategies (such as
> forward/backward selection) are "particularly computationally
> advantageous and robust against overfitting", as compared to many more
> sophisticated approaches.
>
> Finally, for me, three important eye-openers on modeling, model
> uncertainty, and model selection in general (the first two also
> referenced in Harrell's book) were:
>
> * Model Specification: The Views of Fisher and Neyman, and Later Developments
> E. L. Lehmann
> Statistical Science 5:2 (1990), pp. 160-168.
>
> * Model uncertainty, data mining and statistical inference
> C. Chatfield
> Journal of the Royal Statistical Society A 158 (1995), pp. 419-466
>
> * Statistical modeling: the two cultures (+ lots of discussion
> articles in the same issue)
> Leo Breiman
> Statistical Science 16 (2001), pp. 199-231
>
> I hope this didn't sound too disappointing. Put positively, the fact
> that very few generic things can be said about the model selection
> process can be considered a "full employment theorem" for modelers...
> :)
>
> Cheers,
> Tobias.
>
> --
> Tobias Sing
> Computational Biology and Applied Algorithmics
> Max Planck Institute for Informatics
> Saarbrucken, Germany
> Phone: +49 681 9325 315
> Fax: +49 681 9325 399
> http://www.tobiassing.net
Received on Tue Mar 20 2007 - 21:05:41 EDT
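
The "double" resampling idea in Tobias's message (re-running the entire
selection procedure inside each outer resample) can be sketched roughly as
follows. This is only an illustration under stated assumptions: ordinary
least squares stands in for the real model, greedy forward addition scored
by inner cross-validation stands in for the selection step, and the data
are simulated. None of it is NONMEM code, and all function names (fit_ols,
cv_error, forward_select, double_cv) are hypothetical.

```python
import random

def fit_ols(X, y, cols):
    """Least-squares fit on the chosen columns plus an intercept."""
    rows = [[1.0] + [r[c] for c in cols] for r in X]
    p = len(rows[0])
    # Augmented normal equations [A'A | A'y]; the tiny ridge term is an
    # assumption added here only to avoid a zero pivot.
    A = [[sum(r[i] * r[j] for r in rows) + (1e-9 if i == j else 0.0)
          for j in range(p)] + [sum(r[i] * t for r, t in zip(rows, y))]
         for i in range(p)]
    # Gauss-Jordan elimination with partial pivoting.
    for k in range(p):
        piv = max(range(k, p), key=lambda i: abs(A[i][k]))
        A[k], A[piv] = A[piv], A[k]
        for i in range(p):
            if i != k:
                f = A[i][k] / A[k][k]
                A[i] = [a - f * b for a, b in zip(A[i], A[k])]
    return [A[i][p] / A[i][i] for i in range(p)]

def mse(beta, X, y, cols):
    """Mean squared prediction error of a fitted model."""
    pred = [beta[0] + sum(b * r[c] for b, c in zip(beta[1:], cols)) for r in X]
    return sum((q - t) ** 2 for q, t in zip(pred, y)) / len(y)

def folds(n, k):
    """Split indices 0..n-1 into k interleaved folds."""
    return [list(range(i, n, k)) for i in range(k)]

def split(X, y, test):
    """Return (X_train, y_train, X_test, y_test) for one fold."""
    held = set(test)
    train = [i for i in range(len(y)) if i not in held]
    return ([X[i] for i in train], [y[i] for i in train],
            [X[i] for i in test], [y[i] for i in test])

def cv_error(X, y, cols, k=5):
    """k-fold cross-validated MSE for a fixed set of columns."""
    total = 0.0
    for test in folds(len(y), k):
        Xtr, ytr, Xte, yte = split(X, y, test)
        total += mse(fit_ols(Xtr, ytr, cols), Xte, yte, cols) * len(test)
    return total / len(y)

def forward_select(X, y, k=5):
    """Greedy forward addition: keep adding the column that most lowers
    the inner CV error, and stop when no column improves it."""
    chosen, best = [], cv_error(X, y, [], k)
    remaining = set(range(len(X[0])))
    while remaining:
        err, col = min((cv_error(X, y, chosen + [c], k), c) for c in remaining)
        if err >= best:
            break
        chosen.append(col)
        remaining.discard(col)
        best = err
    return chosen, best

def double_cv(X, y, k_outer=5, k_inner=5):
    """Outer loop: repeat the whole selection on each training part and
    score the selected model on data the selection never saw."""
    total = 0.0
    for test in folds(len(y), k_outer):
        Xtr, ytr, Xte, yte = split(X, y, test)
        cols, _ = forward_select(Xtr, ytr, k_inner)
        total += mse(fit_ols(Xtr, ytr, cols), Xte, yte, cols) * len(test)
    return total / len(y)

# Simulated example: only column 0 carries signal, the rest are noise.
rng = random.Random(0)
X = [[rng.gauss(0, 1) for _ in range(6)] for _ in range(60)]
y = [2.0 * r[0] + rng.gauss(0, 0.5) for r in X]
cols, inner_err = forward_select(X, y)
print("selected columns:", cols)
print("inner CV error of the selected model:", inner_err)
print("double CV error of the whole procedure:", double_cv(X, y))
```

The inner error is the quantity the search itself minimized, so it tends
to be optimistic; the outer estimate describes the procedure as a whole
(selection plus fitting) rather than any single selected model.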
