NONMEM Users Network Archive

Hosted by Cognigen

Re: MPI installation on Win 7/ 64 bit

From: Nick Holford <n.holford>
Date: Tue, 24 May 2011 09:38:21 +0200

Rik,

The speed differences I noted between NM6, NM7.1 and the NM7.2 beta were
obtained with the same compiler (Intel 11.1) and the same ifort compiler
options in setup.bat:

/Gs /nologo /nbs /w /Ob1gyti /Qprec_div /4Yportlib /traceback

except that I do not use /traceback, because it makes the executable bigger
and slows down execution.

When I get around to installing 7.2 I will try using

/Gs /nologo /nbs /w /fp:strict

as you suggest.

So far I haven't seen anybody report their comparisons of NM6, NM7.1 and
NM7.2 without using the paralysing options.

Here are my results with NM7.2 beta.

The test problem had 376 subjects and 1942 observations, with an ADVAN6
differential-equation model (one DE) using 15 THETAs, 3 OMEGAs and one SIGMA.
Hardware: dual-core processor, Windows XP, solid-state disk (SSD).

Run (C_C Smax400)    Test+Tcov   Test   Tcov
toctwamby_ivf_NM6    1.02        1.02   na
toctwamby_ivf_NM7    1.28        0.46   0.82
toctwamby_ivf_mpi    1.06        0.40   0.66
toctwamby_ivf_NM72   1.49        0.53   0.96
toctwamby_ivf_fpi    1.85        0.83   1.02
toctwamby_gf_NM6     1.20        1.20   na
toctwamby_gf_NM7     2.54        0.93   1.61
toctwamby_gf_mpi     0.95        0.37   0.58
toctwamby_gf_NM72    1.31        0.48   0.83
toctwamby_gf_fpi     1.88        0.92   0.96

ivf = Intel Fortran 11.1, gf = gfortran
Test is the estimation time in minutes (for NM6 it includes the covariance time)
Tcov is the covariance time for NM7 in minutes

In general NM7.2 executes more slowly than NM7.1 and NM6.

FPI is slower than NM6, even though the SSD should give fast file access
times.



On 24/05/2011 8:59 a.m., Rik Schoemaker wrote:
> Dear all,
>
> In contrast to Dieter's findings, I can assure you it is perfectly possible
> to set up MPI on a Windows 7 64-bit installation if you use Intel Visual
> Fortran.
>
> If you use the following compiler settings:
>
> set op=/Gs /nologo /nbs /w /fp:strict            (Windows 7)
> -fp-model strict -Gs -nologo -nbs -w -static     (Linux/Mac OS X)
>
> you will get identical results for NONMEM 7.1 and NONMEM 7.2, with and without
> MPI, on Windows 7 64 bit, Linux and Mac OS X, provided you have the same
> Fortran compiler version (we use 11.1). I can assure you that this is
> definitely not the case when you use the default settings.
> This dependency on settings could perhaps explain the difference in speed
> that Nick noticed between different NONMEM versions: the minimisation path
> is not reproducible, so some runs that converged perfectly before now
> fail to do so, and vice versa.
> As far as the gain in speed is concerned, I have two examples. We have a dual
> hex-core machine that hyper-threads to 24 cores. The first, fairly
> intensive problem takes 29:40 min without parallel processing. Using MPI
> with 12 nodes, we go down to 2:57 min, a 10-fold decrease! Using 20 cores,
> the time goes up to 3:40, so we lose some speed. For a much smaller problem
> I get the following figures:
>
> Nodes   Time (sec)
>     0   183   (non-parallel)
>     1   179
>     2   101
>     3    74
>     4    63
>     5    54
>     6    48
>     7    44
>    10    38
>    12    35
>    14    36
>    16    39
>    18    38
>    20    42
>    24    47
>
> So for this run we get at most a 5.2-fold increase in speed (here the
> overhead takes its toll), and the optimum is 12 cores, so there is no
> gain in using hyper-threaded cores.
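These node timings follow Amdahl's law, T(n) = T(1) * (f/n + 1 - f), fairly closely. As a quick illustrative check (my own sketch in plain Python; the function names are invented here, and the timings are copied from the table above), the parallelizable fraction f can be estimated from the fastest observed point:

```python
def amdahl_fraction(t1, tn, n):
    """Solve Amdahl's law for the parallel fraction f, given T(1), T(n), n.

    T(n)/T(1) = f/n + (1 - f)  =>  f = (1 - T(n)/T(1)) / (1 - 1/n)
    """
    return (1 - tn / t1) / (1 - 1 / n)

def amdahl_time(t1, f, n):
    """Predicted run time on n nodes under Amdahl's law."""
    return t1 * (f / n + (1 - f))

t1 = 179.0                          # 1 node, seconds (from the table)
f = amdahl_fraction(t1, 35.0, 12)   # fit to the fastest point: 12 nodes, 35 s
print(f"parallel fraction ~ {f:.2f}")                     # ~0.88
print(f"predicted T(6) ~ {amdahl_time(t1, f, 6):.0f} s")  # ~48 s (measured: 48 s)
```

With f around 0.88 the best possible speed-up is about 1/(1 - f), roughly 8-fold, which is consistent with the observed plateau near 12 nodes once per-node overhead is added.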
>
> Cheers,
>
> Rik
>
>
> -----Original Message-----
> From: owner-nmusers On Behalf Of Dieter Menne
> Sent: 23 May 2011 9:43 PM
> To: nmuser list
> Subject: [NMusers] MPI installation on Win 7/ 64 bit
>
> The solution was to install the 32-bit version of MPICH2; both the version
> that comes with NONMEM and the version from
>
> http://www.mcs.anl.gov/research/projects/mpich2/
>
> work ok.
>
> I got many personal mails telling me that my 64-bit version of MPICH was not
> running. As I had noted, it was running:
>
> mpiexec -hosts 2 localhost computername.exe
> DIETERPC
> DIETERPC
>
> but it did not play well with NONMEM compiled with gfortran.
>
> Here is the summary with MPICH for a short Bayes run (20 iterations) on an i7
> with 4 real cores:
>
> CPUs   Time (s)
>    1   101
>    2    55
>    4    33   (50% CPU time)
>    8    35   (100% CPU time)
>
> So: almost perfect scaling with 2 CPUs and, as I would expect, no improvement
> beyond 4. The 100% CPU time indicated is simply bogus.
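The scaling in this table can be summarised as speed-up and per-CPU efficiency; the short Python sketch below (my own illustration, with the timings copied from the table above) makes the hyper-threading penalty explicit:

```python
# Speed-up and parallel efficiency relative to the single-CPU run,
# using the i7 timings reported above (illustrative sketch only).
timings = {1: 101.0, 2: 55.0, 4: 33.0, 8: 35.0}  # CPUs -> seconds

for cpus in sorted(timings):
    speedup = timings[1] / timings[cpus]   # how much faster than 1 CPU
    efficiency = speedup / cpus            # speed-up per CPU used
    print(f"{cpus} CPUs: {speedup:.2f}x speed-up, {efficiency:.0%} efficiency")
```

This shows near-perfect scaling at 2 CPUs (about 92% efficiency) and a slight regression when going from 4 real cores to 8 hyper-threaded ones.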
>
> Dieter
>
>
>> I am trying to get the MPI feature in 7.2 running.
>>
>> -- My system: Windows 7, 64 bit, German locale. gfortran 4.6.0
>>
>> -- All single-CPU tests work ok.
>>
>> -- File passing works ok:
>>
>>>> nmfe72 foce_parallel.ctl foce_parallel.res -parafile=fpiwini8.pnm
>> [nodes]=4
>>
>> Surprisingly slow (32 seconds vs. 3 seconds with one thread), but
>> never mind.
>>
>> -- Test if smpd/mpiexec is working (computername.exe in directory)
>>
>>> smpd -start
>> MPICH2 Daemon (C) 2003 Argonne National Lab started.
>>
>>> mpiexec -hosts 1 localhost computername.exe
>> DIETERPC
>>
>> ### Everything works ok up to here
>>
>>
>>> nmfe72 foce_parallel.ctl foce_parallel.res -parafile=mpiwini8.pnm
>> [nodes]=4
>> doing nmtran
>>
>> WARNINGS AND ERRORS (IF ANY) FOR PROBLEM 1
>>
>> (WARNING 2) NM-TRAN INFERS THAT THE DATA ARE POPULATION.
>> CREATING MUMODEL ROUTINE...
>> 1 Datei(en) kopiert. (1 file(s) copied)
>> Finished compiling fsubs
>>
>> USING PARALLEL PROFILE mpiwini8.pnm
>> MPI TRANSFER TYPE SELECTED
>> Completed call to gfcompile.bat
>> Starting MPI version of nonmem execution ...
>> C:\tmp\test1\output konnte nicht gefunden werden (could not be found)
>>
>> ----
>> All subdirectories are created ok, but the process returns immediately
>> with the above message.
>>

--
Nick Holford, Professor Clinical Pharmacology
Dept Pharmacology & Clinical Pharmacology
University of Auckland, 85 Park Rd, Private Bag 92019, Auckland, New Zealand
tel: +64 (9) 923-6730  fax: +64 (9) 373-7090  mobile: +64 (21) 46 23 53
email: n.holford
http://www.fmhs.auckland.ac.nz/sms/pharmacology/holford


Received on Tue May 24 2011 - 03:38:21 EDT

The NONMEM Users Network is maintained by ICON plc. Requests to subscribe to the network should be sent to: nmusers-request@iconplc.com.

Once subscribed, you may contribute to the discussion by emailing: nmusers@globomaxnm.com.