Commit Graph

142 Commits

Author SHA1 Message Date
JanLJL
671f7f5591 added ICX architecture 2022-08-29 11:14:56 +02:00
JanLJL
d81c53ef91 fixed #88 2022-06-22 17:09:24 +02:00
JanLJL
93c0753db3 formatting 2022-04-07 12:17:08 +02:00
JanLJL
9c966c2359 small bugfixes 2022-03-17 16:38:28 +01:00
JanLJL
e1a5272fdf formatting 2022-01-27 10:12:00 +01:00
JanLJL
d2a4749c39 added lane comparison for AArch64 reg operands 2022-01-26 14:24:48 +01:00
Qingcai Jiang
e70229aa32 Merge branch 'bug_fix/when_mov_is_the_last_instr' into feature/tsv110 2021-12-30 21:33:42 +08:00
Qingcai Jiang
71b9a17ab8 fix a bug when longest_path is not integer, try 'ldp w3, w1, [x0, #0x48]' in AArch64 2021-12-30 21:32:29 +08:00
Qingcai Jiang
203ea2dfb0 Merge branch 'bug_fix/when_mov_is_the_last_instr' into feature/tsv110 2021-12-30 20:32:34 +08:00
Qingcai Jiang
0e984f4ec7 fix a bug when 'mov' is the last instruction 2021-12-30 20:30:43 +08:00
Qingcai Jiang
7194e79beb simple implement for TSV110 2021-11-06 16:04:16 +08:00
JanLJL
df26edd075 Merge branch 'master' of github.com:RRZE-HPC/OSACA 2021-11-04 12:09:57 +01:00
JanLJL
ba45038ad7 add latency of last instruction in CP 2021-11-04 11:58:40 +01:00
JanLJL
9c16f8bc56 formatted 2021-10-14 10:59:55 +02:00
JanLJL
5735291d27 Merge branch 'master' into a72 2021-10-14 10:37:05 +02:00
JanLJL
5205cb5cc6 fixed formatting with correct line length 2021-10-04 15:00:17 +02:00
JanLJL
e6ce870ca0 black formatting 2021-10-04 14:33:28 +02:00
JanLJL
566fbc6bc4 black conformity 2021-09-30 15:53:56 +02:00
JanLJL
d181184788 enhanced parser 2021-09-29 17:26:27 +02:00
JanLJL
d418c16f4a applied flake8 and black rules 2021-08-26 16:58:19 +02:00
JanLJL
d59b100fa8 changed immediate type from str to int 2021-05-10 01:12:30 +02:00
Julian Hammer
88d5094bf1 Merge branch 'master' of github.com:RRZE-HPC/OSACA 2021-04-23 13:18:23 +02:00
Julian Hammer
1f32252f91 improved register range and list support on AArch64 2021-04-23 13:12:18 +02:00
JanLJL
1de644cd62 fixed incompatibility to py3.6 2021-04-20 13:59:56 +02:00
JanLJL
3f31235f8a added no timeout option 2021-04-19 10:57:51 +02:00
JanLJL
152360bad2 enhanced LCD analysis by making it parallel and added timeout flag 2021-04-19 00:04:03 +02:00
JanLJL
607d459569 keep dependency paths as generators instead of lists 2021-04-17 12:46:44 +02:00
JanLJL
b033b3b7aa allow different base with prefix for offset values 2021-04-17 11:06:39 +02:00
Julian
08440ed5e1 Validation (#71)
Validation of OSACA predictions for IVB, SKX, ZEN1, ZEN2, A64FX and TX2 with different kernels.

build_and_run.py contains the configuration used at RRZE's testcluster and UR's qpace4; Analysis.ipynb contains the analysis script and results. Raw data from measurements (122MB) will be attached to the next OSACA release.

For now, find the raw data here: https://hawo.net/~sijuhamm/d/UPIhBOtz/validation-data.tar.gz

The analysis report can be viewed at https://nbviewer.jupyter.org/github/RRZE-HPC/OSACA/blob/validation/validation/Analysis.ipynb

Quite a few changes to OSACA are included:

Feature: register change tracking via semantic understanding of operations
Feature: recording LCD latency along path and exposing this to frontend
Feature: support for memory reference aliases
Feature: store throughput scaling (similar to load throughput scaling)
Fix: model importer works with latest uops.info export
Fix: immediate type tracking on ARM now preserves type in internal representation
Removed unused KerncraftAPI
2021-04-15 14:42:37 +02:00
JanLJL
b0e35316f0 changed consideration of masking for database back to NO 2021-03-25 11:50:17 +01:00
Julian Hammer
63563ecabc flake8 to ignore some errors and small style improvements 2021-03-11 12:52:34 +01:00
Julian Hammer
b7625a4a25 making flake8 happy 2021-03-11 12:29:14 +01:00
Julian Hammer
6204c90934 migrate code style to Black 2021-03-11 12:02:45 +01:00
Julian Hammer
1ebe5ecfbd sanity check validity of operand entries 2021-03-11 11:38:25 +01:00
JanLJL
9a13e5cbc5 guarantee 0 latency for None values in DB 2021-03-11 01:55:57 +01:00
Git out :V
12044e3ac4 Initial support for the Cortex-A72 (Raspberry Pi 4) 2020-12-16 18:49:16 +01:00
JanLJL
23623ca18a enhancements for lookup and parsing AArch64 instrs 2020-12-07 01:18:32 +01:00
JanLJL
596a323dfb bugfixes 2020-11-21 21:00:58 +01:00
Julian Hammer
314ff4cf9d improved performance of arch_semantics and reg dependency matching 2020-11-09 19:27:47 +01:00
Julian Hammer
f64253b2b9 added dict for instruction lookup 2020-11-09 17:00:46 +01:00
Julian Hammer
a2dd6f752d added comment 2020-11-09 12:35:13 +01:00
Julian Hammer
2fb36406a7 performance improvement of throughput summation 2020-11-09 12:01:00 +01:00
JanLJL
207c53aaad minor bugfix in HW model and added user warnings for more insight 2020-11-06 15:06:36 +01:00
Julian Hammer
6b0adb5d68 improved cache handling (always hashing original file) 2020-11-06 12:27:34 +01:00
JanLJL
f9f382a948 bugfixes 2020-11-06 12:03:54 +01:00
Julian Hammer
decec86e56 fixed py3.5 compatibility 2020-10-29 10:59:00 +01:00
JanLJL
9af689b28c fixed bug in tests and removed unused imports 2020-10-28 19:29:48 +01:00
Julian Hammer
9d2ea8603f new caching structure with support for distribution 2020-10-28 16:29:55 +01:00
Julian
dd59af16b2 Merge pull request #51 from RRZE-HPC/A64FX
A64FX support and several Arm bugfixes and enhancements including better TP scheduling
2020-10-16 10:44:47 +02:00
JanLJL
e8b78e4cc6 Merge branch 'master' into A64FX 2020-10-15 22:44:12 +02:00