Validation (#71)

Validation of OSACA predictions for IVB, SKX, ZEN1, ZEN2, A64FX and TX2 with different kernels.

build_and_run.py contains the configuration used at RRZE's testcluster and UR's qpace4; Analysis.ipynb contains the analysis script and results. Raw data from the measurements (122 MB) will be attached to the next OSACA release.

For now, find the raw data here: https://hawo.net/~sijuhamm/d/UPIhBOtz/validation-data.tar.gz

The analysis report can be viewed at https://nbviewer.jupyter.org/github/RRZE-HPC/OSACA/blob/validation/validation/Analysis.ipynb

Quite a few changes to OSACA are included:

Feature: register change tracking via semantic understanding of operations
Feature: recording LCD latency along path and exposing this to frontend
Feature: support for memory reference aliases
Feature: store throughput scaling (similar to load throughput scaling)
Fix: model importer works with latest uops.info export
Fix: immediate type tracking on ARM now preserves type in internal representation
Removed unused KerncraftAPI
Julian
2021-04-15 14:42:37 +02:00
committed by GitHub
parent 25a0e0607d
commit 08440ed5e1
44 changed files with 72388 additions and 149982 deletions

@@ -0,0 +1,18 @@
mov x1, #111 // OSACA START MARKER
.byte 213,3,32,31 // OSACA START MARKER
// pointer_increment=8 bcc2ad06facad03d27f4cce90dbe3f50
.L4:
ldr d0, [x2]
ldr d3, [x1]
ldr d2, [x1, 16]
ldr d1, [x2, x4, lsl 3]
add x2, x2, 8
fadd d0, d0, d3
fadd d0, d0, d2
fadd d0, d0, d1
fmul d0, d0, d4
str d0, [x1, 8]!
cmp x5, x1
bne .L4
mov x1, #222 // OSACA END MARKER
.byte 213,3,32,31 // OSACA END MARKER

@@ -0,0 +1,17 @@
# OSACA-BEGIN
.L4:
vmovsd %xmm0, 8(%rax)
addq $8, %rax
vmovsd %xmm0, 8(%rax,%rcx,8)
vaddsd (%rax), %xmm0, %xmm0 # depends on line 3, 8(%rax) == (%rax+8)
subq $-8, %rax
vaddsd -8(%rax), %xmm0, %xmm0 # depends on line 3, 8(%rax) == -8(%rax+16)
dec %rcx
vaddsd 8(%rax,%rcx,8), %xmm0, %xmm0 # depends on line 5, 8(%rax,%rcx,8) == 8(%rax+8,%rcx-1,8)
movq %rcx, %rdx
vaddsd 8(%rax,%rdx,8), %xmm0, %xmm0 # depends on line 5, 8(%rax,%rdx,8) == 8(%rax+8,%rdx-1,8)
vmulsd %xmm1, %xmm0, %xmm0
addq $8, %rax
cmpq %rsi, %rax
jne .L4
# OSACA-END
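
The dependency comments in the kernel above can be checked with a small symbolic tracker. The following Python sketch is not OSACA's implementation; it only illustrates the idea behind register change tracking and memory reference aliases under simplifying assumptions. The names `MEM`, `address` and `find_memory_dependencies` are hypothetical, only the mnemonics that modify registers in this kernel (`addq`, `subq`, `dec`, `movq`) are understood, and an instruction whose last operand is a memory reference is assumed to be a store. Line numbers start at 3 to match the numbering used in the kernel comments.

```python
import re
from collections import Counter

# Instructions of the x86 kernel above that touch memory or address registers.
KERNEL = """\
vmovsd %xmm0, 8(%rax)
addq $8, %rax
vmovsd %xmm0, 8(%rax,%rcx,8)
vaddsd (%rax), %xmm0, %xmm0
subq $-8, %rax
vaddsd -8(%rax), %xmm0, %xmm0
dec %rcx
vaddsd 8(%rax,%rcx,8), %xmm0, %xmm0
movq %rcx, %rdx
vaddsd 8(%rax,%rdx,8), %xmm0, %xmm0
"""

# AT&T memory operand: disp(base) or disp(base,index,scale)
MEM = re.compile(r"(-?\d+)?\((%\w+)(?:,(%\w+),(\d+))?\)")


def address(match, regs):
    """Resolve a memory operand to (symbolic terms, constant offset)."""
    disp, base, index, scale = match.groups()
    terms = Counter()
    const = int(disp or 0)
    for reg, factor in ((base, 1), (index, int(scale or 1))):
        if reg is None:
            continue
        # An untouched register still holds its (unknown) initial value.
        syms, offset = regs.get(reg, ({reg: 1}, 0))
        for sym, coeff in syms.items():
            terms[sym] += coeff * factor
        const += offset * factor
    return frozenset(terms.items()), const


def find_memory_dependencies(kernel, first_line=3):
    """Map each load line to the line of the store it aliases with."""
    regs = {}    # register -> (initial-value symbols, accumulated offset)
    stores = {}  # resolved address -> line of the most recent store
    deps = {}
    for lineno, line in enumerate(kernel.strip().splitlines(), start=first_line):
        mnemonic, operands = line.split(None, 1)
        mem = MEM.search(operands)
        if mem:
            addr = address(mem, regs)
            if operands.rstrip().endswith(")"):  # memory is the destination
                stores[addr] = lineno
            elif addr in stores:
                deps[lineno] = stores[addr]
            continue
        ops = [op.strip() for op in operands.split(",")]
        dst = ops[-1]
        syms, offset = regs.get(dst, ({dst: 1}, 0))
        if mnemonic == "addq":
            regs[dst] = (syms, offset + int(ops[0].lstrip("$")))
        elif mnemonic == "subq":
            regs[dst] = (syms, offset - int(ops[0].lstrip("$")))
        elif mnemonic == "dec":
            regs[dst] = (syms, offset - 1)
        elif mnemonic == "movq":
            regs[dst] = regs.get(ops[0], ({ops[0]: 1}, 0))
    return deps


if __name__ == "__main__":
    print(find_memory_dependencies(KERNEL))
```

Each register is kept as its initial value plus an accumulated constant, so `8(%rax)` before `addq $8, %rax` and `(%rax)` after it resolve to the same address. Copying `%rcx` into `%rdx` carries the symbolic value along, which is why the load through `%rdx` is still matched to the store on line 5.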