py: Add MICROPY_USE_GCC_MUL_OVERFLOW_INTRINSIC.

Most MCUs apart from Cortex-M0 with Thumb 1 have an instruction for computing the "high part" of a multiplication (e.g., the upper 32 bits of a 32x32 multiply).

When they do, gcc uses this instruction to implement a small and fast overflow check for the __builtin_mul_overflow intrinsic, which is preferable to the guard division method previously used in smallint.c.

However, the previous mp_small_int_mul_overflow routine checked not only that the result fits within mp_int_t but also that it satisfies SMALL_INT_FITS(), whereas __builtin_mul_overflow only checks for overflow of the C type. As a result, a slight change in the code flow is needed for MP_BINARY_OP_MULTIPLY.

Other sites using mp_small_int_mul_overflow already had the result value flow through to a SMALL_INT_FITS check, so they didn't need any additional changes.

Do similarly for the _ll and _ull multiply overflow checks.

Signed-off-by: Jeff Epler <jepler@gmail.com>
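
The commit message distinguishes overflow of the C type from the narrower small-int range. The following is a minimal sketch of that two-step check, not the code from this commit. The names example_int_t, EXAMPLE_SMALL_INT_MAX, EXAMPLE_SMALL_INT_MIN, EXAMPLE_SMALL_INT_FITS and example_small_int_mul are hypothetical stand-ins, and the sketch assumes a small int has one bit fewer than the machine word. Only __builtin_mul_overflow is a real gcc/clang builtin.

```c
#include <stdbool.h>
#include <stdint.h>

// Hypothetical stand-ins for illustration; not MicroPython's real definitions.
typedef intptr_t example_int_t;

// Assumption: one bit of the word is unavailable, so the small-int range
// is half the range of the underlying C type.
#define EXAMPLE_SMALL_INT_MAX ((example_int_t)(INTPTR_MAX >> 1))
#define EXAMPLE_SMALL_INT_MIN (-EXAMPLE_SMALL_INT_MAX - 1)
#define EXAMPLE_SMALL_INT_FITS(n) \
    ((n) >= EXAMPLE_SMALL_INT_MIN && (n) <= EXAMPLE_SMALL_INT_MAX)

// Returns true and stores the product in *res only if x * y fits in both
// the C type and the narrower small-int range. On failure *res is unchanged.
static bool example_small_int_mul(example_int_t x, example_int_t y,
                                  example_int_t *res) {
    example_int_t tmp;
    // Step 1: the builtin reports overflow of the C type only.
    if (__builtin_mul_overflow(x, y, &tmp)) {
        return false;
    }
    // Step 2: a separate range check is still required for small ints.
    if (!EXAMPLE_SMALL_INT_FITS(tmp)) {
        return false;
    }
    *res = tmp;
    return true;
}
```

Under these assumptions, multiplying EXAMPLE_SMALL_INT_MAX by 2 passes the builtin check because the product still fits in the C type, and is rejected only by the range check. This is the case that requires the result to flow through a SMALL_INT_FITS-style test.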
2026-01-06 12:10:13 +01:00 · 2025-07-23 16:14:22 -05:00
parent 3dd8073c29
commit a809132921
7 changed files with 105 additions and 86 deletions
--- a/py/smallint.h
+++ b/py/smallint.h
@@ -68,10 +68,6 @@
 // The number of bits in a MP_SMALL_INT including the sign bit.
 #define MP_SMALL_INT_BITS (MP_IMAX_BITS(MP_SMALL_INT_MAX) + 1)

-// Multiply two small ints.
-// If returns false, the correct result is stored in 'res'
-// If returns true, the multiplication would have overflowed. 'res' is unchanged.
-bool mp_small_int_mul_overflow(mp_int_t x, mp_int_t y, mp_int_t *res);
 mp_int_t mp_small_int_modulo(mp_int_t dividend, mp_int_t divisor);
 mp_int_t mp_small_int_floor_divide(mp_int_t num, mp_int_t denom);