Validation of OSACA predictions for IVB, SKX, ZEN1, ZEN2, A64FX and TX2 with different kernels.
build_and_run.py contains the configuration used at RRZE's testcluster and UR's qpace4; Analysis.ipynb contains the analysis script and results. Raw data from measurements (122 MB) will be attached to the next OSACA release.
For now, find the raw data here: https://hawo.net/~sijuhamm/d/UPIhBOtz/validation-data.tar.gz
The analysis report can be viewed at https://nbviewer.jupyter.org/github/RRZE-HPC/OSACA/blob/validation/validation/Analysis.ipynb
Quite a few changes to OSACA are included:
Feature: register change tracking via semantic understanding of operations
Feature: recording LCD latency along path and exposing this to frontend
Feature: support for memory reference aliases
Feature: store throughput scaling (similar to load throughput scaling)
Fix: model importer works with latest uops.info export
Fix: immediate type tracking on ARM now preserves type in internal representation
Removed unused KerncraftAPI
att parser: workaround for crash with "jg,pt" mnemonic
For now, we ignore the branch taken/not-taken hint and keep only the condition in the mnemonic.
We found some 'jg,pt' instructions in icc/MKL-generated binaries that crashed the
parser. Here is an example:
dd8ccd: 3e 7f 90 jg,pt dd8c60 <mkl_blas_avx2_dtrsm_kernel...
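
The idea of the workaround can be illustrated with a minimal sketch. This is not OSACA's actual parser code: the helper name strip_branch_hint and the regular expression are hypothetical, and the sketch assumes that the hint appears as a ",pt" (predict taken) or ",pn" (predict not taken) suffix on the mnemonic, as in the disassembly above.

```python
import re

# Hypothetical helper, not part of OSACA: separates an AT&T mnemonic such as
# "jg,pt" into the plain mnemonic and the optional branch hint suffix.
# Assumption: the hint is either "pt" (taken) or "pn" (not taken).
_BRANCH_HINT = re.compile(r"^(?P<mnemonic>[A-Za-z][A-Za-z0-9.]*),(?P<hint>pt|pn)$")


def strip_branch_hint(mnemonic):
    """Return (mnemonic_without_hint, hint_or_None)."""
    match = _BRANCH_HINT.match(mnemonic)
    if match is None:
        # No hint present: pass the mnemonic through unchanged.
        return mnemonic, None
    return match.group("mnemonic"), match.group("hint")
```

Under these assumptions, strip_branch_hint("jg,pt") yields the mnemonic "jg", so only the condition is kept and the hint is dropped.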
Tool for semi-automatically creating an OSACA model from a PMEvo port
mapping, optionally using asmbench to measure latency and throughput,
which are otherwise not available in the port mapping.
This is designed to handle only AArch64 architectures, in particular the
Cortex-A72 used in the Raspberry Pi 4. Usefulness for other models may
be limited.