mirror of
https://github.com/RRZE-HPC/OSACA.git
synced 2025-12-15 16:40:05 +01:00
Update README.rst
32
README.rst
@@ -71,7 +71,9 @@ The usage of OSACA can be listed as:
 
 .. code:: bash
 
-    osaca [-h] [-V] [--arch ARCH] [--fixed] [--db-check] [--import MICROBENCH] [--insert-marker] [--export-graph GRAPHNAME] FILEPATH
+    osaca [-h] [-V] [--arch ARCH] [--fixed] [--db-check]
+          [--import MICROBENCH] [--insert-marker]
+          [--export-graph GRAPHNAME] FILEPATH
 
 -h, --help
   prints out the help message.
@@ -79,7 +81,6 @@ The usage of OSACA can be listed as:
   shows the program’s version number.
 --arch ARCH
   needs to be replaced with the wished architecture abbreviation.
-  This flag is necessary for the throughput analysis (default function) and the inclusion of an ibench output (``-i``).
   Possible options are ``SNB``, ``IVB``, ``HSW``, ``BDW``, ``SKX`` and ``CSX`` for the latest Intel micro architectures starting from Intel Sandy Bridge and ``ZEN1`` for AMD Zen (17h family) architecture.
   Furthermore, ``TX2`` for Marvell`s ARM-based ThunderX2 architecture is available.
 --fixed
@@ -208,33 +209,6 @@ It shows the whole kernel together with the optimized port pressure of each inst
 Furthermore, in the two columns on the right, the critical path (CP) and the longest loop-carried dependency (LCD) of the loop kernel.
 In the bottom, all loop-carried dependencies are shown, each with a list of line numbers being part of this dependency chain on the right.
 
-.. For measuring the instruction forms with ibench or asmbench we highly recommend to use an exclusively allocated node, so there is no other workload falsifying the results. For the correct function of ibench the benchmark files from OSACA need to be placed in a subdirectory of src in root so ibench can create the a folder with the subdirectory’s name and the shared objects. For running the tests the frequencies of all cores must set to a constant value and this has to be given as an argument together with the directory of the shared objects to ibench, e.g.:
-
-.. .. code:: bash
-
-    ./ibench ./AVX 2.2
-
-.. for running ibench in the directory ``AVX`` with a core frequency of 2.2 GHz. We get an output like:
-
-.. .. code:: bash
-
-    Using frequency 2.20GHz.
-    add-mem_imd-TP: 1.023 (clock cycles) [DEBUG - result: 1.000000]
-    add-mem_imd: 6.050 (clock cycles) [DEBUG - result: 1.000000]
-
-.. The debug output as resulting value of register ``xmm0`` is additional validation information depending on the executed instruction form meant for the user and is not considered by OSACA. The ibench output information can be included by OSACA running the program with the flag ``--import ibench`` and the specify micro architecture:
-
-.. .. code-block:: bash
-
-    osaca --arch IVB --import ibench PATH/TO/IBENCH-OUTPUTFILE
-
-.. For now no automatic allocation of ports for a instruction form is implemented, so for getting an output in the Ports Pressure table, one must add the port occupation by hand. We know that the inserted instruction form must be assigned always to Port 2, 3 and 4 and additionally to either 0, 1 or 5, a valid data file therefore would look like this:
-
-.. .. code:: yaml
-
-    name: addl-mem_imd,1.0,6.0,"(0.33,0.33,1.00,1.00,1.00,0.33)"
-
-
 Credits
 =======
 Implementation: Jan Laukemann