OK, I have gprof-ed the code to see where the differences are. One routine grows massively in time expenditure, although I can't understand why, as it's a simple convolution routine.
Faster version (g77):
Code:
     % cumulative     self                self    total
  time    seconds  seconds      calls   s/call   s/call  name
 64.12       2.52     2.52      23679     0.00     0.00  vp_spvoigte__
 19.34       3.28     0.76   28574044     0.00     0.00  voigt_
  6.36       3.53     0.25         12     0.02     0.28  deriv_
  4.33       3.70     0.17   32747695     0.00     0.00  dexpf_
  3.05       3.82     0.12        877     0.00     0.00  vp_chspread__
  1.27       3.87     0.05          1     0.05     0.05  pr_sort__
  0.51       3.89     0.02          1     0.02     0.02  probks_
  0.25       3.90     0.01      23679     0.00     0.00  calcn_
  0.25       3.91     0.01        877     0.00     0.00  vp_chipconv__
  0.25       3.92     0.01         11     0.00     0.00  udchole_
  0.25       3.93     0.01          1     0.01     0.01  pldef_
  0.00       3.93     0.00     550174     0.00     0.00  ucase_
Slower version (gfortran):
Code:
     % cumulative     self                self    total
  time    seconds  seconds      calls   s/call   s/call  name
 43.27       3.28     3.28      33885     0.00     0.00  vp_spvoigte_
 27.04       5.33     2.05       1255     0.00     0.00  vp_subchspread_
 16.36       6.57     1.24   41100608     0.00     0.00  voigt_
  6.99       7.10     0.53  528972139     0.00     0.00  dexpf_
  3.56       7.37     0.27         17     0.02     0.38  deriv_
  0.66       7.42     0.05          1     0.05     0.05  pr_sort_
  0.66       7.47     0.05          1     0.05     0.05  probks_
  0.40       7.50     0.03     478767     0.00     0.00  varythis_
  0.40       7.53     0.03          1     0.03     7.52  vp_ucoptv_
  0.26       7.55     0.02      33878     0.00     0.00  vp_archwav_
  0.13       7.56     0.01      33885     0.00     0.00  calcn_
  0.13       7.57     0.01       1255     0.00     0.00  vp_chipconv_
  0.13       7.58     0.01          1     0.01     0.01  vp_gwclinfits_
  0.00       7.58     0.00     793778     0.00     0.00  ucase_
Even dexpf, which is part of glibc, takes massively more time in total with gfortran (0.53 s versus 0.17 s self time)... =\
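Since the s/call columns all round to 0.00, it may help to work out the per-call cost and the call-count ratio by hand. Below is a minimal sketch in Python that does this for a few routines, using only the self seconds and call counts from the two profiles above. The dictionary layout and the helper names (`per_call_ns`, `compare`) are made up for illustration, and the trailing underscores are dropped from the routine names because the two compilers mangle them differently. Keep in mind that gprof samples in 0.01 s steps, so the small self times are rough.

```python
# Self seconds and call counts copied from the two flat profiles above.
# Keys are the routine names without the compiler-specific trailing underscores.
g77 = {
    "vp_spvoigte": (2.52, 23679),
    "voigt": (0.76, 28574044),
    "dexpf": (0.17, 32747695),
    "deriv": (0.25, 12),
}
gfortran = {
    "vp_spvoigte": (3.28, 33885),
    "voigt": (1.24, 41100608),
    "dexpf": (0.53, 528972139),
    "deriv": (0.27, 17),
}


def per_call_ns(self_seconds, calls):
    """Average self time per call, in nanoseconds."""
    return self_seconds / calls * 1e9


def compare(name):
    """Return (call-count ratio gfortran/g77, g77 ns per call, gfortran ns per call)."""
    g77_secs, g77_calls = g77[name]
    gf_secs, gf_calls = gfortran[name]
    return (
        gf_calls / g77_calls,
        per_call_ns(g77_secs, g77_calls),
        per_call_ns(gf_secs, gf_calls),
    )


for name in g77:
    ratio, old_ns, new_ns = compare(name)
    print(f"{name:12s} calls x{ratio:6.2f}   g77 {old_ns:14.2f} ns/call   "
          f"gfortran {new_ns:14.2f} ns/call")
```

Going by these numbers, dexpf_ is called roughly 16 times as often in the gfortran build, while its self time per call actually comes out lower, so the extra time there looks like a call-count effect rather than a slower routine. vp_spvoigte, voigt_ and deriv_ are each called about 1.4 times as often, which suggests the two runs may not have done the same amount of work (more iterations, for example), so the totals are not directly comparable.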