py: Add MICROPY_USE_GCC_MUL_OVERFLOW_INTRINSIC.

Most MCUs apart from Cortex-M0 with Thumb 1 have an instruction
for computing the "high part" of a multiplication (e.g., the upper
32 bits of a 32x32 multiply).

When they do, gcc uses this to implement a small and fast
overflow check using the __builtin_mul_overflow intrinsic, which
is preferable to the guard division method previously used in smallint.c.
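As a rough illustration of the two approaches (a sketch only, not the code from this commit; the function names and the fixed int32_t type are assumptions made for the example), the intrinsic reduces the check to one call, while a guard-division check needs a division on every path:

```c
#include <stdbool.h>
#include <stdint.h>

// Hypothetical helper: returns true if x * y overflows int32_t.
// Otherwise the product is stored in *res. On overflow the intrinsic
// still writes the wrapped value to *res, so callers must ignore it.
static bool sketch_mul_overflow_intrinsic(int32_t x, int32_t y, int32_t *res) {
    return __builtin_mul_overflow(x, y, res);
}

// Hypothetical guard-division equivalent: each sign combination is
// checked against the type limits by dividing before multiplying.
static bool sketch_mul_overflow_division(int32_t x, int32_t y, int32_t *res) {
    if (x > 0) {
        if (y > 0) {
            if (x > INT32_MAX / y) {
                return true;
            }
        } else {
            if (y < INT32_MIN / x) {
                return true;
            }
        }
    } else {
        if (y > 0) {
            if (x < INT32_MIN / y) {
                return true;
            }
        } else {
            if (x != 0 && y < INT32_MAX / x) {
                return true;
            }
        }
    }
    *res = x * y; // safe: the checks above rule out overflow
    return false;
}
```

Both helpers report overflow of the C type only; neither knows about the narrower small-int range.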

However, the previous mp_small_int_mul_overflow routine checked not
only that the result fits within mp_int_t but also that it satisfies
SMALL_INT_FITS(), whereas __builtin_mul_overflow only checks for
overflow of the C type. As a result, a slight change in the code
flow is needed for MP_BINARY_OP_MULTIPLY.
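The extra step can be sketched as follows. This is a minimal illustration, not the code in this commit: the sketch_ names are hypothetical, and it assumes a 32-bit word whose small-int range is one bit narrower than the C type.

```c
#include <stdbool.h>
#include <stdint.h>

// Assumed small-int range for this sketch: [-2^30, 2^30 - 1].
#define SKETCH_SMALL_INT_MIN (-(INT32_C(1) << 30))
#define SKETCH_SMALL_INT_MAX ((INT32_C(1) << 30) - 1)

// Hypothetical stand-in for a small-int range check.
static bool sketch_small_int_fits(int32_t n) {
    return n >= SKETCH_SMALL_INT_MIN && n <= SKETCH_SMALL_INT_MAX;
}

// Returns true and stores the product in *res only when the product
// fits the small-int range. A false return means the caller would have
// to fall back to a wider representation; *res is left untouched.
static bool sketch_small_int_mul(int32_t x, int32_t y, int32_t *res) {
    int32_t tmp;
    if (__builtin_mul_overflow(x, y, &tmp)) {
        return false; // overflowed the C type
    }
    if (!sketch_small_int_fits(tmp)) {
        return false; // fits the C type but not the small-int range
    }
    *res = tmp;
    return true;
}
```

For instance, 32768 * 32768 equals 2^30, which fits in int32_t but lies just outside the assumed small-int range, so only the second check rejects it.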

Other sites using mp_small_int_mul_overflow already had the
result value flow through to a SMALL_INT_FITS check so they didn't
need any additional changes.

Do the same for the _ll and _ull multiply overflow checks.

Signed-off-by: Jeff Epler <jepler@gmail.com>
Jeff Epler
2025-07-23 16:14:22 -05:00
committed by Damien George
parent 3dd8073c29
commit a809132921
7 changed files with 105 additions and 86 deletions

@@ -68,10 +68,6 @@
// The number of bits in a MP_SMALL_INT including the sign bit.
#define MP_SMALL_INT_BITS (MP_IMAX_BITS(MP_SMALL_INT_MAX) + 1)
// Multiply two small ints.
// If returns false, the correct result is stored in 'res'
// If returns true, the multiplication would have overflowed. 'res' is unchanged.
bool mp_small_int_mul_overflow(mp_int_t x, mp_int_t y, mp_int_t *res);
mp_int_t mp_small_int_modulo(mp_int_t dividend, mp_int_t divisor);
mp_int_t mp_small_int_floor_divide(mp_int_t num, mp_int_t denom);