
Paul Zimmermann ha scritto:
Dear Roberto,
// FIXME(0.10.1): the test seems to speed up the GMP computation. if (tableau_ij != 0) { scalar_value = tableau_ij * norm_factor[i]; add_mul_assign(challenger_den, scalar_value, scalar_value); }
inline void add_mul_assign(GMP_Integer& x, const GMP_Integer& y, const GMP_Integer& z) { mpz_addmul(x.get_mpz_t(), y.get_mpz_t(), z.get_mpz_t()); }
I'm ignorant about C++, but maybe the speedup you observe is due to the scalar_value = tableau_ij * norm_factor[i] instruction which is not done any more (which types have scalar_value, tableau_ij and norm_factor[i] ?), or maybe to some init/clear/conversion due to that instruction, or even to the xxx.get_mpz_t() calls.
Indeed, the code of mpz_aorsmul (which implements mpz_addmul) starts with:
mpz_aorsmul (mpz_ptr w, mpz_srcptr x, mpz_srcptr y, mp_size_t sub) { ...
xsize = SIZ(x); ysize = SIZ(y); if (xsize == 0 || ysize == 0) return;
My advice is to convert this part of the code to plain C calling directly the GMP routines, and/or moving scalar_value = tableau_ij * norm_factor[i] before the test.
I've just checked and running
ppl_lpsol -s -p1 -oobtained boeing1.mps
with that two checks to avoid calls to mpz_addmul and mpz_submul permits to avoid 110958265 calls to mpz_addmul (94.3%) and 50079219 calls to mpz_submul (88,6%).
With another small test program I've verified how much CPU time is needed to call 110958265 times mpz_addmul and 50079219 times mpz_submul with factors set to 0 and the result is similar to the benefit seen in ppl_lpsol.
This show me that the inlining of some trivial tests can give good results when the test is true very often.
I see some alternative policies:
1) the GMP library does not check for any special values unless this is needed for correctness and the user code is responsible to add the trivial tests that increase efficiency for specific data set
2) the GMP library move the trivial checks in gmp.h to permit their inlining. Note that this approach does not conflict with 1: it would be feasible to have mpz_addmul defined in gmp.h and mpz_addmul_raw defined in GMP library, then the user code could call the version more suitable for its needs, the important part is that library version does not check for special values.
3) (current situation) the GMP library does some special value checks and user code may decide to inline the very same checks. This makes the 'else' path heavier than needed.