py: Add MICROPY_USE_GCC_MUL_OVERFLOW_INTRINSIC.

Most MCUs apart from Cortex-M0 with Thumb 1 have an instruction
for computing the "high part" of a multiplication (e.g., the upper
32 bits of a 32x32 multiply).

When they do, gcc uses that instruction to implement the
__builtin_mul_overflow intrinsic as a small and fast overflow check, which
is preferable to the guard division method previously used in smallint.c.
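
As a rough illustration of the two approaches (not the code from
smallint.c), the sketch below checks a 32-bit multiply either with the
intrinsic or with a portable guard division. The names
USE_MUL_OVERFLOW_INTRINSIC and mul_overflow_i32 are hypothetical and
exist only for this example.

```c
#include <limits.h>
#include <stdbool.h>
#include <stdint.h>

// Hypothetical stand-in for the config macro added by this commit.
#ifndef USE_MUL_OVERFLOW_INTRINSIC
#if defined(__GNUC__) && (__GNUC__ >= 5)
#define USE_MUL_OVERFLOW_INTRINSIC (1)
#else
#define USE_MUL_OVERFLOW_INTRINSIC (0)
#endif
#endif

// Returns true if x * y overflows int32_t.
// When it returns false, the product has been stored in *res.
static bool mul_overflow_i32(int32_t x, int32_t y, int32_t *res) {
    #if USE_MUL_OVERFLOW_INTRINSIC
    // The compiler can lower this to a widening multiply plus a compare.
    return __builtin_mul_overflow(x, y, res);
    #else
    // Guard division: compare against the limit divided by one operand,
    // so the multiply itself is only performed when it is known to be safe.
    if (x > 0) {
        if (y > 0) {
            if (x > INT32_MAX / y) {
                return true;
            }
        } else if (y < INT32_MIN / x) {
            return true;
        }
    } else {
        if (y > 0) {
            if (x < INT32_MIN / y) {
                return true;
            }
        } else if (x != 0 && y < INT32_MAX / x) {
            return true;
        }
    }
    *res = x * y;
    return false;
    #endif
}
```

The guard division path needs up to one division per call, which is
costly on cores without a hardware divider; the intrinsic path avoids it.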

However, the previous mp_small_int_mul_overflow routine checked not
only that the result fits within mp_int_t but also that it satisfies
SMALL_INT_FITS(), whereas __builtin_mul_overflow only checks for
overflow of the C type. As a result, a slight change in the code
flow is needed for MP_BINARY_OP_MULTIPLY.

Other sites using mp_small_int_mul_overflow already had the
result value flow through to a SMALL_INT_FITS check so they didn't
need any additional changes.
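
The two-step flow described above might look roughly like the sketch
below. It assumes a 32-bit word with one bit reserved for tagging, so the
small-int range is narrower than the C type; the SKETCH_* macros,
sketch_result_t and sketch_multiply are hypothetical names for this
example and are not the code used in the runtime.

```c
#include <stdbool.h>
#include <stdint.h>

// Assumed small-int range: 31 usable bits out of a 32-bit word.
#define SKETCH_SMALL_INT_MIN (-(INT32_C(1) << 30))
#define SKETCH_SMALL_INT_MAX ((INT32_C(1) << 30) - 1)
#define SKETCH_SMALL_INT_FITS(n) \
    ((n) >= SKETCH_SMALL_INT_MIN && (n) <= SKETCH_SMALL_INT_MAX)

typedef enum {
    RESULT_SMALL_INT,        // product stored in *out
    RESULT_NEEDS_BIGGER_INT  // caller must fall back to a wider representation
} sketch_result_t;

static sketch_result_t sketch_multiply(int32_t lhs, int32_t rhs, int32_t *out) {
    int32_t res;
    // Step 1: the intrinsic only reports overflow of the C type (int32_t here).
    if (__builtin_mul_overflow(lhs, rhs, &res)) {
        return RESULT_NEEDS_BIGGER_INT;
    }
    // Step 2: a product that fits the C type may still exceed the small-int range.
    if (!SKETCH_SMALL_INT_FITS(res)) {
        return RESULT_NEEDS_BIGGER_INT;
    }
    *out = res;
    return RESULT_SMALL_INT;
}
```

For example, 32768 * 32768 fits in int32_t, so the intrinsic reports no
overflow, but the product is one past the assumed small-int maximum and
is caught only by the second check.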

Do likewise for the _ll and _ull multiply overflow checks.

Signed-off-by: Jeff Epler <jepler@gmail.com>
Jeff Epler
2025-07-23 16:14:22 -05:00
committed by Damien George
parent 3dd8073c29
commit a809132921
7 changed files with 105 additions and 86 deletions


@@ -2336,4 +2336,23 @@ typedef time_t mp_timestamp_t;
#define MP_WARN_CAT(x) (NULL)
#endif
// If true, use __builtin_mul_overflow (a gcc intrinsic supported by clang) for
// overflow checking when multiplying two small ints. Otherwise, use a portable
// algorithm.
//
// Most MCUs have a 32x32->64 bit multiply instruction, in which case the
// intrinsic is likely to be faster and generate smaller code. The main exception is
// cortex-m0 with __ARM_ARCH_ISA_THUMB == 1.
//
// The intrinsic is in GCC starting with version 5.
#ifndef MICROPY_USE_GCC_MUL_OVERFLOW_INTRINSIC
#if defined(__ARM_ARCH_ISA_THUMB) && (__GNUC__ >= 5)
#define MICROPY_USE_GCC_MUL_OVERFLOW_INTRINSIC (__ARM_ARCH_ISA_THUMB >= 2)
#elif (__GNUC__ >= 5)
#define MICROPY_USE_GCC_MUL_OVERFLOW_INTRINSIC (1)
#else
#define MICROPY_USE_GCC_MUL_OVERFLOW_INTRINSIC (0)
#endif
#endif
#endif // MICROPY_INCLUDED_PY_MPCONFIG_H
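
Because the default above is wrapped in #ifndef, a build can presumably
override it by defining the macro before that point is reached. A minimal
sketch, assuming the definition is placed in a port's own configuration
header (the placement is an assumption, not something shown in this diff):

```c
// Hypothetical port-level override: force the portable algorithm
// regardless of compiler version or target architecture.
#define MICROPY_USE_GCC_MUL_OVERFLOW_INTRINSIC (0)
```

Setting the value to (1) instead would request the intrinsic explicitly,
which only makes sense if the toolchain provides __builtin_mul_overflow.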