JanLJL
607d459569
keep dependency paths as generators instead of lists
2021-04-17 12:46:44 +02:00
JanLJL
b033b3b7aa
allow different base with prefix for offset values
2021-04-17 11:06:39 +02:00
Julian
08440ed5e1
Validation (#71)
Validation of OSACA predictions for IVB, SKX, ZEN1, ZEN2, A64FX and TX2 with different kernels.
build_and_run.py contains the configuration used at RRZE's testcluster and UR's qpace4, Analysis.ipynb contains the analysis script and results. Raw data from measurements (122MB) will be attached to next OSACA release.
For now, find the raw data here: https://hawo.net/~sijuhamm/d/UPIhBOtz/validation-data.tar.gz
The analysis report can be viewed at https://nbviewer.jupyter.org/github/RRZE-HPC/OSACA/blob/validation/validation/Analysis.ipynb
Quite a few changes to OSACA are included:
Feature: register change tracking via semantic understanding of operations
Feature: recording LCD latency along path and exposing this to frontend
Feature: support for memory reference aliases
Feature: store throughput scaling (similar to load throughput scaling)
Fix: model importer works with latest uops.info export
Fix: immediate type tracking on ARM now preserves type in internal representation
Removed unused KerncraftAPI
2021-04-15 14:42:37 +02:00
JanLJL
b0e35316f0
changed consideration of masking for database back to NO
2021-03-25 11:50:17 +01:00
Julian Hammer
63563ecabc
flake8 to ignore some errors and small style improvements
2021-03-11 12:52:34 +01:00
Julian Hammer
b7625a4a25
making flake8 happy
2021-03-11 12:29:14 +01:00
Julian Hammer
6204c90934
migrate code style to Black
2021-03-11 12:02:45 +01:00
Julian Hammer
1ebe5ecfbd
sanity check validity of operand entries
2021-03-11 11:38:25 +01:00
JanLJL
9a13e5cbc5
guarantee 0 latency for None values in DB
2021-03-11 01:55:57 +01:00
JanLJL
23623ca18a
enhancements for lookup and parsing AArch64 instrs
2020-12-07 01:18:32 +01:00
JanLJL
596a323dfb
bugfixes
2020-11-21 21:00:58 +01:00
Julian Hammer
314ff4cf9d
improved performance of arch_semantics and reg dependency matching
2020-11-09 19:27:47 +01:00
Julian Hammer
f64253b2b9
added dict for instruction lookup
2020-11-09 17:00:46 +01:00
Julian Hammer
a2dd6f752d
added comment
2020-11-09 12:35:13 +01:00
Julian Hammer
2fb36406a7
performance improvement of throughput summation
2020-11-09 12:01:00 +01:00
JanLJL
207c53aaad
minor bugfix in HW model and added user warnings for more insight
2020-11-06 15:06:36 +01:00
Julian Hammer
6b0adb5d68
improved cache handling (always hashing original file)
2020-11-06 12:27:34 +01:00
JanLJL
f9f382a948
bugfixes
2020-11-06 12:03:54 +01:00
Julian Hammer
decec86e56
fixed py3.5 compatibility
2020-10-29 10:59:00 +01:00
JanLJL
9af689b28c
fixed bug in tests and removed unused imports
2020-10-28 19:29:48 +01:00
Julian Hammer
9d2ea8603f
new caching structure with support for distribution
2020-10-28 16:29:55 +01:00
Julian
dd59af16b2
Merge pull request #51 from RRZE-HPC/A64FX
A64FX support and several Arm bugfixes and enhancements including better TP scheduling
2020-10-16 10:44:47 +02:00
JanLJL
e8b78e4cc6
Merge branch 'master' into A64FX
2020-10-15 22:44:12 +02:00
Julian Hammer
c80088b628
Merge branch 'master' into fix/increment_handling
2020-10-15 16:36:29 +02:00
Julian Hammer
748474cd81
added more cmp versions
2020-10-15 16:23:14 +02:00
Julian Hammer
cf4a9cddcb
Merge branch 'master' into fix/increment_handling
2020-10-15 13:17:02 +02:00
Julian Hammer
4865e7ea72
fixed ignoring of last line without end marker
2020-10-15 11:59:51 +02:00
Julian Hammer
d03398ddf9
no longer treating post- and pre-incremented memory references as src_dst
the incremented register is now considered src_dst instead
2020-10-13 19:25:29 +02:00
Julian Hammer
1def12ee79
if no markers were found, use whole code
2020-10-12 15:04:55 +02:00
Julian Hammer
bd61b94669
ignoring b.none branches in basic block detection
2020-08-03 19:23:33 +02:00
JanLJL
b052ab4151
bugfix in OoO scheduling
2020-07-28 14:52:30 +02:00
JanLJL
6c72281d65
prepared for aarch64 8.2 support
2020-07-23 15:54:54 +02:00
JanLJL
93060eee43
Merge branch 'master' into A64FX
2020-07-13 14:41:49 +02:00
JanLJL
0e77b7bc9a
enhanced TP scheduling
2020-07-06 18:49:46 +02:00
Cloud User
34e978d2ae
initial implementation of Neoverse N1 support
2020-06-30 20:28:57 +00:00
JanLJL
6294e2e9da
initial commit for trying to support a64fx
2020-06-26 05:20:40 +02:00
Julian Hammer
9624e6c109
closing cache file after dump
2020-03-24 15:20:49 +01:00
Julian Hammer
c5801cfe2f
closing cache file
2020-03-21 17:18:04 +01:00
JanLJL
1aa710f195
enhanced MachineModel to support mask/zeroing differentiation for instruction forms
2020-03-17 12:55:37 +01:00
JanLJL
17e7f0e0d8
more instruction forms and added wildcard support for registers in ISA DB
2020-03-12 15:07:51 +01:00
JanLJL
666512d54d
added reg-only fallback for mem-instructions not found in ISA DB
2020-03-10 17:15:57 +01:00
JanLJL
4e73e24b99
added documentation
2020-03-09 16:35:06 +01:00
JanLJL
dcd5b8fd61
more documentation
2020-03-05 18:39:38 +01:00
JanLJL
c9000f74bc
enabled kerncraft marker insertion for aarch64 and more tests
2020-02-27 16:00:23 +01:00
Julian Hammer
0adde7b9fc
added ice lake abbreviation
2020-02-05 10:05:57 +01:00
JanLJL
9c7907ee21
Merge branch 'master' of github.com:RRZE-HPC/osaca
2020-01-29 13:04:11 +01:00
JanLJL
5574a93a5e
made detection of flag dependencies opt-in for now
2020-01-29 13:03:43 +01:00
Julian Hammer
530ad8484e
frontend returns strings; added helper function to calc. unmatched ratio
2020-01-28 17:24:00 +01:00
JanLJL
421cf55af7
minor enhancements and bugfixes
2020-01-27 16:37:28 +01:00
JanLJL
2fc1f3a186
added new instructions and fixed false positive assignment of stores by dynamic TP/LT combination for aarch64
2020-01-22 21:40:11 +01:00