Commit Graph

30 Commits

Author SHA1 Message Date
JanLJL
af3b1fe3e8 add missing instruction for test 2023-03-14 17:51:20 +01:00
JanLJL
0b93766bdd Merge branch 'master' into pr-armcc 2023-03-14 17:50:48 +01:00
JanLJL
27eb8f62b6 more instructions 2023-03-14 17:00:23 +01:00
JanLJL
10d4c4b87e added instruction 2023-03-07 17:04:32 +01:00
JanLJL
dbfba9ce5b added another instruction 2023-03-03 14:39:28 +01:00
JanLJL
841a4a5724 resolve #81 2023-03-02 15:50:13 +01:00
JanLJL
ff61c65d58 added more load instrs 2021-07-21 02:34:31 +02:00
JanLJL
615c809fe3 updated a few DB entries 2021-06-02 16:37:18 +02:00
Julian Hammer
8d8eaa8e4f added LD2 and ST2 instructions to a64fx 2021-04-23 13:33:32 +02:00
Julian Hammer
1f32252f91 improved register range and list support on AArch64 2021-04-23 13:12:18 +02:00
Julian
08440ed5e1 Validation (#71)
Validation of OSACA predictions for IVB, SKX, ZEN1, ZEN2, A64FX and TX2 with different kernels.

build_and_run.py contains the configuration used at RRZE's testcluster and UR's qpace4; Analysis.ipynb contains the analysis script and results. Raw data from measurements (122MB) will be attached to the next OSACA release.

For now, find the raw data here: https://hawo.net/~sijuhamm/d/UPIhBOtz/validation-data.tar.gz

The analysis report can be viewed at https://nbviewer.jupyter.org/github/RRZE-HPC/OSACA/blob/validation/validation/Analysis.ipynb

Quite a few changes to OSACA are included:

Feature: register change tracking via semantic understanding of operations
Feature: recording LCD latency along path and exposing this to frontend
Feature: support for memory reference aliases
Feature: store throughput scaling (similar to load throughput scaling)
Fix: model importer works with latest uops.info export
Fix: immediate type tracking on ARM now preserves type in internal representation
Removed unused KerncraftAPI
2021-04-15 14:42:37 +02:00
Julian Hammer
25a0e0607d added missing instructions to all DBs 2021-04-05 16:47:52 +02:00
JanLJL
74a479fb95 fixed AArch64 parser for register shifts and new instructions for A64FX 2021-02-25 07:43:42 +01:00
JanLJL
9f87606ce8 minor model fixes 2021-01-26 12:56:19 +01:00
JanLJL
60f792c4b2 new instructions 2020-12-17 12:38:58 +01:00
JanLJL
8e3d613843 new instructions 2020-12-09 11:52:10 +01:00
JanLJL
e87ab5d6ca new instruction 2020-12-07 01:18:32 +01:00
JanLJL
82b35e7649 new instruction 2020-12-07 01:18:32 +01:00
JanLJL
b9e434d124 new instructions 2020-12-07 01:18:32 +01:00
JanLJL
92c162daa2 new instructions 2020-11-11 13:54:23 +01:00
JanLJL
a7918db145 enhanced handling for immediates with shifting 2020-10-21 12:14:21 +02:00
JanLJL
451ba62959 added vector mov 2020-09-23 10:07:43 +02:00
JanLJL
adeae88665 instr update 2020-09-17 21:21:15 +02:00
JanLJL
1698ed1776 gather enhancement 2020-09-03 13:48:00 +02:00
JanLJL
2ef6051e64 added gather load instruction 2020-09-03 09:30:19 +02:00
JanLJL
addcdeda85 added sve instructions 2020-08-03 08:55:37 +02:00
JanLJL
673da99fba minor enhancements for scheduling 2020-07-23 15:55:56 +02:00
JanLJL
5520362e65 adjustments and bugfixes 2020-07-13 18:53:19 +02:00
JanLJL
ce8c3ff9ab bugfixes for A64FX 2020-07-06 18:48:54 +02:00
JanLJL
6294e2e9da initial commit for trying to support a64fx 2020-06-26 05:20:40 +02:00