JanLJL
671f7f5591
added ICX architecture
2022-08-29 11:14:56 +02:00
JanLJL
d81c53ef91
fixed #88
2022-06-22 17:09:24 +02:00
JanLJL
93c0753db3
formatting
2022-04-07 12:17:08 +02:00
JanLJL
9c966c2359
small bugfixes
2022-03-17 16:38:28 +01:00
JanLJL
e1a5272fdf
formatting
2022-01-27 10:12:00 +01:00
JanLJL
d2a4749c39
added lane comparison for AArch64 reg operands
2022-01-26 14:24:48 +01:00
Qingcai Jiang
e70229aa32
Merge branch 'bug_fix/when_mov_is_the_last_instr' into feature/tsv110
2021-12-30 21:33:42 +08:00
Qingcai Jiang
71b9a17ab8
fix a bug when longest_path is not an integer, try 'ldp w3, w1, [x0, #0x48]' in AArch64
2021-12-30 21:32:29 +08:00
Qingcai Jiang
203ea2dfb0
Merge branch 'bug_fix/when_mov_is_the_last_instr' into feature/tsv110
2021-12-30 20:32:34 +08:00
Qingcai Jiang
0e984f4ec7
fix a bug when 'mov' is the last instruction
2021-12-30 20:30:43 +08:00
Qingcai Jiang
7194e79beb
simple implementation for TSV110
2021-11-06 16:04:16 +08:00
JanLJL
df26edd075
Merge branch 'master' of github.com:RRZE-HPC/OSACA
2021-11-04 12:09:57 +01:00
JanLJL
ba45038ad7
add latency of last instruction in CP
2021-11-04 11:58:40 +01:00
JanLJL
9c16f8bc56
formatted
2021-10-14 10:59:55 +02:00
JanLJL
5735291d27
Merge branch 'master' into a72
2021-10-14 10:37:05 +02:00
JanLJL
5205cb5cc6
fixed formatting with correct line length
2021-10-04 15:00:17 +02:00
JanLJL
e6ce870ca0
black formatting
2021-10-04 14:33:28 +02:00
JanLJL
566fbc6bc4
black conformity
2021-09-30 15:53:56 +02:00
JanLJL
d181184788
enhanced parser
2021-09-29 17:26:27 +02:00
JanLJL
d418c16f4a
applied flake8 and black rules
2021-08-26 16:58:19 +02:00
JanLJL
d59b100fa8
changed immediate type from str to int
2021-05-10 01:12:30 +02:00
Julian Hammer
88d5094bf1
Merge branch 'master' of github.com:RRZE-HPC/OSACA
2021-04-23 13:18:23 +02:00
Julian Hammer
1f32252f91
improved register range and list support on AArch64
2021-04-23 13:12:18 +02:00
JanLJL
1de644cd62
fixed incompatibility with py3.6
2021-04-20 13:59:56 +02:00
JanLJL
3f31235f8a
added no timeout option
2021-04-19 10:57:51 +02:00
JanLJL
152360bad2
enhanced LCD analysis by making it parallel and added timeout flag
2021-04-19 00:04:03 +02:00
JanLJL
607d459569
keep dependency paths as generators instead of lists
2021-04-17 12:46:44 +02:00
JanLJL
b033b3b7aa
allow different base with prefix for offset values
2021-04-17 11:06:39 +02:00
Julian
08440ed5e1
Validation (#71)
Validation of OSACA predictions for IVB, SKX, ZEN1, ZEN2, A64FX and TX2 with different kernels.
build_and_run.py contains the configuration used at RRZE's testcluster and UR's qpace4; Analysis.ipynb contains the analysis script and results. Raw data from measurements (122MB) will be attached to the next OSACA release.
For now, find the raw data here: https://hawo.net/~sijuhamm/d/UPIhBOtz/validation-data.tar.gz
The analysis report can be viewed at https://nbviewer.jupyter.org/github/RRZE-HPC/OSACA/blob/validation/validation/Analysis.ipynb
Quite a few changes to OSACA are included:
Feature: register change tracking via semantic understanding of operations
Feature: recording LCD latency along path and exposing this to frontend
Feature: support for memory reference aliases
Feature: store throughput scaling (similar to load throughput scaling)
Fix: model importer works with latest uops.info export
Fix: immediate type tracking on ARM now preserves type in internal representation
Removed unused KerncraftAPI
2021-04-15 14:42:37 +02:00
JanLJL
b0e35316f0
changed consideration of masking for database back to NO
2021-03-25 11:50:17 +01:00
Julian Hammer
63563ecabc
flake8 to ignore some errors and small style improvements
2021-03-11 12:52:34 +01:00
Julian Hammer
b7625a4a25
making flake8 happy
2021-03-11 12:29:14 +01:00
Julian Hammer
6204c90934
migrate code style to Black
2021-03-11 12:02:45 +01:00
Julian Hammer
1ebe5ecfbd
sanity check validity of operand entries
2021-03-11 11:38:25 +01:00
JanLJL
9a13e5cbc5
guarantee 0 latency for None values in DB
2021-03-11 01:55:57 +01:00
Git out :V
12044e3ac4
Initial support for the Cortex-A72 (Raspberry Pi 4)
2020-12-16 18:49:16 +01:00
JanLJL
23623ca18a
enhancements for lookup and parsing AArch64 instrs
2020-12-07 01:18:32 +01:00
JanLJL
596a323dfb
bugfixes
2020-11-21 21:00:58 +01:00
Julian Hammer
314ff4cf9d
improved performance of arch_semantics and reg dependency matching
2020-11-09 19:27:47 +01:00
Julian Hammer
f64253b2b9
added dict for instruction lookup
2020-11-09 17:00:46 +01:00
Julian Hammer
a2dd6f752d
added comment
2020-11-09 12:35:13 +01:00
Julian Hammer
2fb36406a7
performance improvement of throughput summation
2020-11-09 12:01:00 +01:00
JanLJL
207c53aaad
minor bugfix in HW model and added user warnings for more insight
2020-11-06 15:06:36 +01:00
Julian Hammer
6b0adb5d68
improved cache handling (always hashing original file)
2020-11-06 12:27:34 +01:00
JanLJL
f9f382a948
bugfixes
2020-11-06 12:03:54 +01:00
Julian Hammer
decec86e56
fixed py3.5 compatibility
2020-10-29 10:59:00 +01:00
JanLJL
9af689b28c
fixed bug in tests and removed unused imports
2020-10-28 19:29:48 +01:00
Julian Hammer
9d2ea8603f
new caching structure with support for distribution
2020-10-28 16:29:55 +01:00
Julian
dd59af16b2
Merge pull request #51 from RRZE-HPC/A64FX
A64FX support and several Arm bugfixes and enhancements including better TP scheduling
2020-10-16 10:44:47 +02:00
JanLJL
e8b78e4cc6
Merge branch 'master' into A64FX
2020-10-15 22:44:12 +02:00