JanLJL
d181184788
enhanced parser
2021-09-29 17:26:27 +02:00
JanLJL
d418c16f4a
applied flake8 and black rules
2021-08-26 16:58:19 +02:00
JanLJL
d59b100fa8
changed immediate type from str to int
2021-05-10 01:12:30 +02:00
Julian Hammer
88d5094bf1
Merge branch 'master' of github.com:RRZE-HPC/OSACA
2021-04-23 13:18:23 +02:00
Julian Hammer
1f32252f91
improved register range and list support on AArch64
2021-04-23 13:12:18 +02:00
JanLJL
1de644cd62
fixed incompatibility with py3.6
2021-04-20 13:59:56 +02:00
JanLJL
3f31235f8a
added no timeout option
2021-04-19 10:57:51 +02:00
JanLJL
152360bad2
enhanced LCD analysis by making it parallel and added timeout flag
2021-04-19 00:04:03 +02:00
JanLJL
607d459569
keep dependency paths as generators instead of lists
2021-04-17 12:46:44 +02:00
JanLJL
b033b3b7aa
allow different base with prefix for offset values
2021-04-17 11:06:39 +02:00
Julian
08440ed5e1
Validation (#71)
...
Validation of OSACA predictions for IVB, SKX, ZEN1, ZEN2, A64FX and TX2 with different kernels.
build_and_run.py contains the configuration used at RRZE's test cluster and UR's qpace4; Analysis.ipynb contains the analysis script and results. Raw data from measurements (122MB) will be attached to the next OSACA release.
For now, find the raw data here: https://hawo.net/~sijuhamm/d/UPIhBOtz/validation-data.tar.gz
The analysis report can be viewed at https://nbviewer.jupyter.org/github/RRZE-HPC/OSACA/blob/validation/validation/Analysis.ipynb
Quite a few changes to OSACA are included:
Feature: register change tracking via semantic understanding of operations
Feature: recording LCD latency along path and exposing this to frontend
Feature: support for memory reference aliases
Feature: store throughput scaling (similar to load throughput scaling)
Fix: model importer works with latest uops.info export
Fix: immediate type tracking on ARM now preserves type in internal representation
Removed unused KerncraftAPI
2021-04-15 14:42:37 +02:00
JanLJL
b0e35316f0
changed consideration of masking for database back to NO
2021-03-25 11:50:17 +01:00
Julian Hammer
63563ecabc
flake8 to ignore some errors and small style improvements
2021-03-11 12:52:34 +01:00
Julian Hammer
b7625a4a25
making flake8 happy
2021-03-11 12:29:14 +01:00
Julian Hammer
6204c90934
migrate code style to Black
2021-03-11 12:02:45 +01:00
Julian Hammer
1ebe5ecfbd
sanity check validity of operand entries
2021-03-11 11:38:25 +01:00
JanLJL
9a13e5cbc5
guarantee 0 latency for None values in DB
2021-03-11 01:55:57 +01:00
Git out :V
12044e3ac4
Initial support for the Cortex-A72 (Raspberry Pi 4)
2020-12-16 18:49:16 +01:00
JanLJL
23623ca18a
enhancements for lookup and parsing AArch64 instrs
2020-12-07 01:18:32 +01:00
JanLJL
596a323dfb
bugfixes
2020-11-21 21:00:58 +01:00
Julian Hammer
314ff4cf9d
improved performance of arch_semantics and reg dependency matching
2020-11-09 19:27:47 +01:00
Julian Hammer
f64253b2b9
added dict for instruction lookup
2020-11-09 17:00:46 +01:00
Julian Hammer
a2dd6f752d
added comment
2020-11-09 12:35:13 +01:00
Julian Hammer
2fb36406a7
performance improvement of throughput summation
2020-11-09 12:01:00 +01:00
JanLJL
207c53aaad
minor bugfix in HW model and added user warnings for more insight
2020-11-06 15:06:36 +01:00
Julian Hammer
6b0adb5d68
improved cache handling (always hashing original file)
2020-11-06 12:27:34 +01:00
JanLJL
f9f382a948
bugfixes
2020-11-06 12:03:54 +01:00
Julian Hammer
decec86e56
fixed py3.5 compatibility
2020-10-29 10:59:00 +01:00
JanLJL
9af689b28c
fixed bug in tests and removed unused imports
2020-10-28 19:29:48 +01:00
Julian Hammer
9d2ea8603f
new caching structure with support for distribution
2020-10-28 16:29:55 +01:00
Julian
dd59af16b2
Merge pull request #51 from RRZE-HPC/A64FX
...
A64FX support and several Arm bugfixes and enhancements including better TP scheduling
2020-10-16 10:44:47 +02:00
JanLJL
e8b78e4cc6
Merge branch 'master' into A64FX
2020-10-15 22:44:12 +02:00
Julian Hammer
c80088b628
Merge branch 'master' into fix/increment_handling
2020-10-15 16:36:29 +02:00
Julian Hammer
748474cd81
added more cmp versions
2020-10-15 16:23:14 +02:00
Julian Hammer
cf4a9cddcb
Merge branch 'master' into fix/increment_handling
2020-10-15 13:17:02 +02:00
Julian Hammer
4865e7ea72
fixed ignoring of last line without end marker
2020-10-15 11:59:51 +02:00
Julian Hammer
d03398ddf9
no longer treating post- and pre-incremented memory references as src_dst
...
the incremented register is now considered src_dst instead
2020-10-13 19:25:29 +02:00
Julian Hammer
1def12ee79
if no markers were found, use whole code
2020-10-12 15:04:55 +02:00
Julian Hammer
bd61b94669
ignoring b.none branches in basic block detection
2020-08-03 19:23:33 +02:00
JanLJL
b052ab4151
bugfix in OoO scheduling
2020-07-28 14:52:30 +02:00
JanLJL
6c72281d65
prepared for aarch64 8.2 support
2020-07-23 15:54:54 +02:00
JanLJL
93060eee43
Merge branch 'master' into A64FX
2020-07-13 14:41:49 +02:00
JanLJL
0e77b7bc9a
enhanced TP scheduling
2020-07-06 18:49:46 +02:00
Cloud User
34e978d2ae
initial implementation of Neoverse N1 support
2020-06-30 20:28:57 +00:00
JanLJL
6294e2e9da
initial commit for trying to support a64fx
2020-06-26 05:20:40 +02:00
Julian Hammer
9624e6c109
closing cache file after dump
2020-03-24 15:20:49 +01:00
Julian Hammer
c5801cfe2f
closing cache file
2020-03-21 17:18:04 +01:00
JanLJL
1aa710f195
enhanced MachineModel to support mask/zeroing differentiation for instruction forms
2020-03-17 12:55:37 +01:00
JanLJL
17e7f0e0d8
more instruction forms and added wildcard support for registers in ISA DB
2020-03-12 15:07:51 +01:00
JanLJL
666512d54d
added reg-only fallback for mem-instructions not found in ISA DB
2020-03-10 17:15:57 +01:00