Fortran is one of the oldest programming languages around. It has survived and thrived through the years despite its disadvantages.
- Compiling & Linking
To compile a Fortran program in the file fortran_program.f, use the command below. On most systems Fortran 77 programs need to be in files with a .f extension; for Fortran 90 the extension is usually .f90, with the exception of the IBMs. Other extensions include .for, .ftn and .F. Check the man pages of the compilers for more information, including how to deal with the source code format, be it free or fixed:
% f77 fortran_program.f -o executable_name
If you omit -o executable_name the executable binary will be called a.out by default.
If your program is split over more than one file, you can either compile them all on the same line:
% f77 fortran_program1.f fortran_program2.f -o executable_name
or compile them separately and then link the resulting object files (".o" files) together:
% f77 -c fortran_program1.f
% f77 -c fortran_program2.f
% f77 fortran_program1.o fortran_program2.o -o executable_name
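The separate compile-then-link sequence can be wrapped in a small shell function. This is only a sketch under stated assumptions: the names FC, DRY_RUN, run and build are made up for illustration and are not part of any compiler, and it assumes a POSIX shell, source files ending in .f in the current directory, and a compiler that writes file.o for file.f.

```shell
#!/bin/sh
# Sketch: compile each source file to an object file, then link the objects.
# FC and DRY_RUN are illustrative variables, not compiler features.
FC=${FC:-f77}

# Print a command; also execute it unless DRY_RUN is set to a non-empty value.
run() {
    echo "% $*"
    if [ -z "$DRY_RUN" ]; then
        "$@"
    fi
}

# build EXECUTABLE SOURCE.f [SOURCE.f ...]
build() {
    exe=$1
    shift
    objs=
    for src in "$@"; do
        # Compile only (-c); stop at the first failure.
        run "$FC" -c "$src" || return 1
        objs="$objs ${src%.f}.o"
    done
    # $objs is left unquoted on purpose so it splits into one word per object file.
    run "$FC" $objs -o "$exe"
}
```

With DRY_RUN=1 set, build executable_name fortran_program1.f fortran_program2.f only prints the two compile commands followed by the link command; with DRY_RUN unset it also runs them.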
Include files are pulled in either by cpp style #include <include_file.h> directives (you can use these in Fortran) or by include 'include_file' statements in your Fortran source code. If some of the include files you require are not in the standard include file search paths of the C preprocessor cpp (called by the Fortran compiler in case it meets any include statements), you can specify the extra directories with the -Iinclude_search_path flag:
% f77 -I/usr/local/mpich/include -c mpi.f
Similarly, if some of the library functions that your program uses are not found among the standard libraries the linker searches for a Fortran program, you have to specify the libraries yourself. If a library is not in one of the standard directories the linker looks in, you also have to give its path:
% f77 mathfortran_program1.o mathfortran_program2.o -o executable_name -lblas
% f77 mpi.o -o mpi-test -L/usr/local/mpich/lib/IRIX/ch_shmem -lmpi
The -l flag takes the library name without the lib prefix and the extension: a library whose filename is libblas.a is specified as -lblas.
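The naming rule for libraries can be written down as a short shell function. This is an illustrative sketch: lib_flags is a hypothetical helper name, it handles only static libraries ending in .a, and the libmpi.a file name used in the usage note is inferred from the -lmpi flag above rather than taken from an actual directory listing.

```shell
# lib_flags PATH_TO_LIBRARY
# Turn a static library file name into the matching linker flags:
# libNAME.a becomes -lNAME, preceded by -L<directory> if a directory is given.
lib_flags() {
    dir=$(dirname "$1")          # "." when no directory was given
    name=$(basename "$1" .a)     # drop the directory and the .a extension
    name=${name#lib}             # drop the "lib" prefix
    if [ "$dir" = "." ]; then
        echo "-l$name"
    else
        echo "-L$dir -l$name"
    fi
}
```

For example, lib_flags libblas.a prints -lblas, and lib_flags /usr/local/mpich/lib/IRIX/ch_shmem/libmpi.a prints -L/usr/local/mpich/lib/IRIX/ch_shmem -lmpi.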
- Compilers
f77 is the standard name for a Fortran 77 compiler and f90 for a Fortran 90 one, but there are other options:
- On all platforms:
- pghpf (soon on all CFM platforms)
- The Portland Group's HPF compiler
- On all Suns:
- f77
- Sun's Fortran 77 compiler (With extensions).
- On the Suns running Solaris 2.*, 7, 8 (SunOS 5.*, 5.7, 5.8) we have:
- f90
- Sun's Fortran 90. Check the man pages for the flags that give the various language standards.
- nagacef90
- NAG/ACE Fortran 90 (SPARC) Compiler. Check the Web pages for documentation.
- On the SGIs we have:
- f77
- The MIPS (IRIX 5.*) and MIPSpro (IRIX 6.*) optimizing Fortran 77 compiler.
- f90
- The MIPSpro (IRIX 6.2) optimizing Fortran 90 compiler.
- On the IBMs we have:
- f77, xlf, xlf90
- IBM's very optimizing Fortran 77/90 XL compiler, with language standards conforming to strict Fortran 77, strict Fortran 90, pure Fortran 90 (free of obsolete features) or IBM extended Fortran 90, selected on the command line (see the man page).
- xlhpf
- IBM's HPF compiler
- Optimization flags
Once you have developed and debugged a code, please make sure you compile it optimized for any production runs. Running unoptimized production code wastes your time as well as CFM computing resources. You do need to take care with optimization flags, though: high optimization levels can alter the semantics of your code and produce significantly different, and hence erroneous, results. Check the man page for each compiler to see which optimization flags pose such a threat. If you use one of them, test your optimized code by comparing the results of one or more runs with those of the code compiled with no optimization. If the difference in the results is small (of the order of machine or algorithmic accuracy), go ahead and use the optimized code. If the difference is large enough for the results to be wrong, choose a lower optimization level and try again.
Despite the extra trouble you may have to go through, please try to compile your code optimized. You may be very surprised by how much the run time decreases, especially if the code is well written. And of course always remember that the best optimization is usually a better algorithm.
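The comparison between an optimized and an unoptimized run can be automated with a few lines of shell and awk. The following is a rough sketch, not a prescribed procedure: the function names max_rel_diff and results_agree are hypothetical, the output format (one number per line, same number of lines in both files) is an assumption, and the tolerance you pass in has to reflect the accuracy of your own algorithm.

```shell
# max_rel_diff FILE_A FILE_B
# Print the largest relative difference between corresponding numbers in two
# files that each hold one number per line (an assumed output format).
max_rel_diff() {
    paste "$1" "$2" | awk '
        {
            a = $1; b = $2
            d = a - b; if (d < 0) d = -d          # absolute difference
            s = (a < 0 ? -a : a); t = (b < 0 ? -b : b)
            m = (s > t ? s : t)                   # larger magnitude of the pair
            r = (m > 0 ? d / m : 0)               # relative difference
            if (r > max) max = r
        }
        END { printf "%.3e\n", max + 0 }'
}

# results_agree FILE_A FILE_B TOLERANCE
# Succeed if both files have the same number of lines and the largest
# relative difference does not exceed TOLERANCE.
results_agree() {
    [ "$(wc -l < "$1")" -eq "$(wc -l < "$2")" ] || return 2
    diff=$(max_rel_diff "$1" "$2")
    awk -v d="$diff" -v t="$3" 'BEGIN { exit !(d <= t) }'
}
```

A typical use would be to run the unoptimized and optimized executables, redirect their results to two files, and then call results_agree on them with a tolerance such as 1e-6. If it fails, rebuild with a lower optimization level and compare again.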
It is suggested that you look up the man pages for the compiler you plan to use for the best results. (Note that sometimes even supposedly safe optimization options could cause problems due to bugs in the optimizer.) Suggestions for optimization flags for the Fortran compilers on our systems are as follows:
- On the Suns with SunOS 4.1.* (Solaris 1.*):
- f77
- safe: -fast
- high: -fast -depend -native -O3 -bsdmalloc (-O4 will also inline routines)
- On the Suns with Solaris 2.*, 7, 8 (SunOS 5.*, 5.7, 5.8):
- f77 & f90
- safe: -xO3 -xdepend -xchip=generic -xarch=generic
- high: -fast -xO4 -xdepend -xtarget=native
- On UltraSparc based machines at least, this flag combination currently chooses the wrong libraries for some reason. Please use:
-fast -xO4 -xdepend -xtarget=native -xarch=v8plusa -xchip=ultra
instead.
- You can also experiment with use of -xO5 instead of -xO4.
- Adding -xsafe=mem can also help sometimes.
- for autoparallelizing add: -xautopar -xloopinfo -xreduction
- Link with: -lfast
- Currently the f90 compiler on the Sun doesn't support the -xdepend flag. That should change, maybe with the next version.
- On the SGIs with IRIX 5.3:
- f77
- safe: -O2
- high: -O2 -sopt,-r=3,-so=4,-o=5,-lo=s
- can use -O3 for even higher optimization but then -c has to be replaced by -j
- On the SGIs with IRIX 6.*:
- f77 & f90
- safe: -O2 -n32
- high: -O3 -n32 -OPT:roundoff=3:IEEE_arithmetic=3
- very high with interprocedural optimization: -Ofast=IP??
To find the platform number IP?? execute uname -m. This option may break the correctness of your code though.
- For an executable tuned for a specific architecture add:
- -r5000 for an R5000 based machine.
- -r8000 for an R8000 based machine.
- -r10000 for an R10000 based machine.
To find out the processor of your machine execute hinv.
- Certainly look at the man page for the compilers as there is a multitude of beneficial options one can try.
- For the time being, -v6 may need to be added to the flags above, as the MIPSpro7.1 compilers may produce really slow code in a very few cases. With -v6 the MIPSpro6.2 compilers are used instead.
- If the processor is an R4*00 -mips3 should be added to the lines above, otherwise -mips4 should be added. hinv -t cpu should give this information.
- If your machine is still running IRIX6.0.1 (but not for long) or you require 64 bit addressing for very large datasets (greater than 2 GB), then replace -n32 with -64.
- Sometimes -O3 will produce wrong code - try using -O2 or even -O1 instead, leaving all other flags as they are.
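The processor-specific advice for IRIX 6.* above can be collected in one shell function. This is a sketch only: sgi_arch_flags is a hypothetical helper, it takes the processor name as you read it from the hinv output (it does not parse hinv itself), and it simply encodes the rules listed above for the processors named there.

```shell
# sgi_arch_flags PROCESSOR
# Print the architecture flags suggested above for a given MIPS processor,
# e.g. R4400, R5000, R8000 or R10000.
sgi_arch_flags() {
    case $1 in
        R4?00)  echo "-mips3" ;;            # R4*00: MIPS III instruction set
        R5000)  echo "-r5000 -mips4" ;;
        R8000)  echo "-r8000 -mips4" ;;
        R10000) echo "-r10000 -mips4" ;;
        *)      echo "sgi_arch_flags: unknown processor '$1'" >&2
                return 1 ;;
    esac
}
```

The output can then be appended to one of the flag sets above, for example f77 -O2 -n32 followed by the flags printed for R10000.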
- On the IBMs:
- xlf & xlf90
- safe: on SP2 wide/thin nodes, like cws.cfm.brown.edu: -O2 -qarch=pwr2 -qtune=pwr2 -Q
- safe: on SP2 silver nodes, like control.cfm.brown.edu: -O2 -qarch=ppc -qtune=604 -Q
- safe: on Power3 nodes, like control.cfm.brown.edu: -O2 -qarch=pwr3 -qtune=pwr3 -Q
- high: -O3 -qarch=pwr2 -qtune=pwr2 -Q
- high but safer: -O3 -qarch=pwr2 -qtune=pwr2 -Q -qstrict
- for "hot" transformations (loop blocking, interchanging etc.) add:
-qhot -qcache=type=d:level=1:assoc=4:line=128:size=128 -qcache=type=i:level=1:assoc=2:line=128:size=32 -qcache=type=d:level=2:size=2048:assoc=2:line=4096 -qcache=type=i:level=2:size=512:assoc=2:line=4096
- Many more options are available, for example for porting Cray Fortran codes. Keep in mind that single precision calculations on the Power and Power2 processors on our IBMs are significantly slower than double precision ones, so use single precision only if you have to, and use the relevant flags (see the man page) to speed the code up.
- Also have a look at IBM's suggestions for optimizing code with XL Fortran.
- xlf & xlf90 with KAP preprocessor - please do not use both KAP and the "hot" transformations. KAP is usually beneficial to the performance of your code.
- high: -O3 -qarch=pwr2 -qtune=pwrx -Pk -Wp,-r=3,-inl,-f,-chs=128
- In rare cases, using -O2 instead of -O3 is actually the better option.
- For interprocedural optimizations add -qipa. Look at the man page for the compiler for more details.
- For cws.cfm.brown.edu replace -qarch=pwr2 -qtune=pwrx with -qarch=pwr -qtune=pwr and don't bother with KAP.
- IBM's Optimization and Tuning Guide for Fortran, C, and C++ can be seen online by typing: info -l xlf