Initial SME Support - SE-122 #248

Merged
FinnWilkinson merged 98 commits into dev from SME-PR
Nov 1, 2022

@FinnWilkinson
Contributor

@FinnWilkinson FinnWilkinson commented Sep 15, 2022

This PR adds the groundwork needed to support the Armv9.2-a SME extension. An overview of the changes is given below:

  • Added Config file options Streaming-Vector-Length and MatrixRow-Count
  • Added SME instruction groups to the AArch64/InstructionGroups.hh file
  • Added the ZA matrix register to the AArch64 register files
  • Added decode logic for SME instructions
    • This includes a function that assigns each sub-tile its correct rows
  • Added the SVCR system register to keep track of Streaming SVE Mode and ZA-Enabled Modes
  • Implemented some initial instructions (without helper functions)
    • AArch64_FMOPA_MPPZZ_S Perform an outer-product accumulate operation on two 32-bit vectors
    • AArch64_LD1_MXIPXX_H_S Load a horizontal slice of a 32-bit ZA tile
    • AArch64_LD1_MXIPXX_V_S Load a vertical slice of a 32-bit ZA tile
    • AArch64_ST1_MXIPXX_H_S Store a horizontal slice of a 32-bit ZA tile
    • AArch64_ST1_MXIPXX_V_S Store a vertical slice of a 32-bit ZA tile
    • AArch64_ZERO_M Zero out the given sub-tiles of, or the whole of, the ZA register
    • AArch64_MSRpstatesvcrImm1 Update the SVCR register; used by aliases SMSTART and SMSTOP
  • Added tests for each new instruction
  • Implemented new SME related test suite helper functions
  • Added 5 new exception types to ensure the core is in the correct context mode for SME execution
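
To illustrate the sub-tile row allocation and the outer-product accumulate listed above, here is a minimal sketch. It assumes the architectural SME layout, in which N-byte elements give N tiles whose rows are interleaved through the ZA array. The names `tileRows` and `outerProductAccumulate` are hypothetical and are not taken from SimEng, and predication is deliberately left out.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper (not SimEng's actual function): lists the ZA array rows
// that belong to tile `tile` when elements are `elemBytes` wide.
// `svlBytes` is the streaming vector length in bytes; ZA has svlBytes rows.
std::vector<uint16_t> tileRows(uint16_t tile, uint16_t elemBytes,
                               uint16_t svlBytes) {
  // N-byte elements give N tiles, interleaved row by row through ZA.
  assert(elemBytes > 0 && tile < elemBytes && svlBytes % elemBytes == 0);
  std::vector<uint16_t> rows;
  for (uint16_t i = 0; i < svlBytes / elemBytes; i++) {
    rows.push_back(static_cast<uint16_t>(tile + i * elemBytes));
  }
  return rows;
}

// Unpredicated FP32 outer-product accumulate into one tile:
// tile[i][j] += zn[i] * zm[j]. The real FMOPA also takes two governing
// predicates, which this sketch leaves out.
void outerProductAccumulate(std::vector<std::vector<float>>& za,
                            const std::vector<uint16_t>& rows,
                            const std::vector<float>& zn,
                            const std::vector<float>& zm) {
  assert(zn.size() == rows.size() && zm.size() == rows.size());
  for (std::size_t i = 0; i < rows.size(); i++) {
    for (std::size_t j = 0; j < zm.size(); j++) {
      za[rows[i]][j] += zn[i] * zm[j];
    }
  }
}
```

With a 128-bit streaming vector length (16 bytes), `tileRows(1, 4, 16)` yields ZA rows 1, 5, 9 and 13 for the second 32-bit tile.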


Other non-SME changes include:

  • Updated the test suite to be compatible with LLVM 14.0.5
    • Test suite still works for previous versions of LLVM used (i.e. 12.0.0)
  • Set the default LLVM version in CMakeLists.txt to LLVM 14.0.5
  • Optimised source and destination operand arrays in AArch64-Instruction
    • Due to the implementation of SME, the old arrays (4 in total) would be initialised to 260 Register Values / Registers as standard
    • This caused a large slowdown, so the arrays have been replaced with vectors to mitigate this performance regression
  • Added dedicated zero register to the architectural register file to reduce decode logic
  • Removed unused functions in AArch64-Instruction
  • Implemented a Copy Constructor for Instruction objects
    • This is typically defined implicitly; however, as two member variables are references, the copy constructor needs to be defined explicitly
  • Fixed Jenkins pipeline building scripts to actually use the targeted compiler at all stages of the build/linking process
  • Updated the Capstone repo used to UoB-HPC's fork at the next branch

@FinnWilkinson FinnWilkinson added the enhancement New feature or request label Sep 15, 2022
@FinnWilkinson FinnWilkinson self-assigned this Sep 15, 2022
Comment thread src/lib/arch/aarch64/Instruction_execute.cc
@FinnWilkinson FinnWilkinson changed the title Initial SME Support SE-122 Initial SME Support Sep 27, 2022
@FinnWilkinson FinnWilkinson changed the title SE-122 Initial SME Support Initial SME Support - SE-122 Oct 11, 2022
FinnWilkinson and others added 19 commits October 17, 2022 17:08
…ency, rather than update the VCT register to total cycles completed.
A fix for handling missing system registers in the aarch64 systemRegisterMap_ map. A missing entry will return a -1 and a decoded instruction accessing an unmapped system register will raise a new UnmappedSysReg fatal exception.
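
A minimal sketch of the lookup behaviour described in the commit message above, assuming the map goes from an encoded system register ID to a register file tag. The function names, the `DecodeException` enum, and the map's key and value types are hypothetical; only the -1 return value and the UnmappedSysReg exception name come from the commit message.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <unordered_map>
#include <vector>

// Hypothetical exception kinds; only UnmappedSysReg is named in the source.
enum class DecodeException { None, UnmappedSysReg };

// Returns the register file tag for `encoding`, or -1 when it is unmapped.
int32_t getSysRegTag(const std::unordered_map<uint16_t, uint16_t>& regMap,
                     uint16_t encoding) {
  auto it = regMap.find(encoding);
  return it == regMap.end() ? -1 : static_cast<int32_t>(it->second);
}

// Decode-time check: an unmapped register becomes a fatal exception.
DecodeException checkSysRegAccess(
    const std::unordered_map<uint16_t, uint16_t>& regMap, uint16_t encoding) {
  return getSysRegTag(regMap, encoding) < 0 ? DecodeException::UnmappedSysReg
                                            : DecodeException::None;
}

// Reads a system register by tag; callers must have rejected -1 beforehand.
uint64_t readSysReg(const std::vector<uint64_t>& file, int32_t tag) {
  assert(tag >= 0 && static_cast<std::size_t>(tag) < file.size());
  return file[static_cast<std::size_t>(tag)];
}
```
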
Fixes output error present for miniBUDE when compiled with GCC-10.3.0 targeting armv8.4-a+sve, caused by an incorrect implementation of the FNEG sve instruction.

Additionally, other SVE instructions were updated to accommodate optional patterns.
A new generic branch predictor containing parameterisable BTB and RAS structures, global indexing, and better identification of branch instructions. Additionally, a parameterisable loop buffer has been implemented in the fetch unit and a loop detection scheme in the ROB unit.
* Moved counter timer logic from main into Architecture, allowing the implementation to be architecture agnostic.

* Added test for CNTVCT register.

* Updated sveGetPattern auxiliary function to work for any instruction string.

* Ensured all necessary SVE instructions included pattern recognition.

* Changed specialFiles generation directory to be the build location.

* Fixed AArch64_LD1RQ_D_IMM's invalid increments of its index variable.

* Improved the conditional branch not-taken target and removed the loop-closing direction due to an emergent bug.

* Resolved LSQ bugs for comparisons against the total req limit and forwarding operands from flushed loads.

* Updated comment in sveGetPattern Aux function.
This pull request updates SimEng to use the Armv9.2-update branch of the UoB-HPC Capstone Fork.

The CMake files have been changed to reflect build changes in the upstream Capstone:next branch.

A pre-upstreamed update to Capstone has been merged into the Armv9.2-update branch, which adds support for the AArch64 Armv9.2-a ISA (including SVE2 and SME instructions). As such, minor fixes have been made to accommodate changes to instruction enums, aliasing logic, and other changes.
This PR has reduced the number of unused copies of the memory image and thus reduced the memory requirements of a SimEng simulation. The process memory image is instantiated once through malloc/realloc calls and shared between simulation objects through shared pointers.
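
The sharing scheme in the commit message above can be sketched as follows: one allocation, wrapped in a `std::shared_ptr` with a deleter that calls `free`, handed to every object that needs it. `makeProcessImage` and `MemoryInterface` are hypothetical names used for illustration and are not SimEng's classes.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>
#include <memory>

// Allocates a zeroed process image once and returns a shared handle to it.
// The custom deleter releases the buffer with free() when the last owner
// goes away. Returns nullptr if the allocation fails.
std::shared_ptr<char> makeProcessImage(std::size_t size) {
  assert(size > 0);
  char* raw = static_cast<char*>(std::malloc(size));
  if (raw == nullptr) return nullptr;
  std::memset(raw, 0, size);
  return std::shared_ptr<char>(raw, [](char* p) { std::free(p); });
}

// Hypothetical consumer: holds a handle to the image rather than its own copy.
struct MemoryInterface {
  std::shared_ptr<char> image;
  std::size_t size;
};
```

Two `MemoryInterface` objects built from the same handle see each other's writes, and the buffer is freed only once both are destroyed.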
This PR introduces a new CoreInstance class. The class supports the creation of a SimEng core model, storing all the relevant simulation objects within shared pointers.

A key factor in this change being introduced was to improve the ease of SimEng's interactions with other frameworks, e.g. SST.
This PR adds prefixes to all SimEng outputs to help distinguish between simulated workload outputs and the outputs from the framework.
This PR moves the dispatch rate restriction from dispatch unit wide to the individual reservation stations. This improves the parameterization of the unit as a whole.
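
A minimal sketch of a per-reservation-station dispatch limit, as described in the commit message above. The `ReservationStation` struct and both functions are hypothetical and only illustrate the idea of each station tracking its own per-cycle budget.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical per-station state: each station has its own dispatch rate.
struct ReservationStation {
  uint16_t dispatchRate;         // Max instructions accepted per cycle
  uint16_t dispatchedThisCycle;  // Instructions accepted so far this cycle
};

// Tries to dispatch one instruction to station `index`; returns false when
// that station has already reached its own per-cycle limit.
bool tryDispatch(std::vector<ReservationStation>& stations,
                 std::size_t index) {
  assert(index < stations.size());
  ReservationStation& rs = stations[index];
  if (rs.dispatchedThisCycle >= rs.dispatchRate) return false;
  rs.dispatchedThisCycle++;
  return true;
}

// Resets every station's counter at the start of a new cycle.
void startCycle(std::vector<ReservationStation>& stations) {
  for (ReservationStation& rs : stations) rs.dispatchedThisCycle = 0;
}
```

A saturated station then stalls only the instructions bound for it, while other stations keep accepting work in the same cycle.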
Comment thread src/include/simeng/ModelConfig.hh Outdated
Comment thread src/include/simeng/arch/aarch64/Architecture.hh Outdated
Comment thread src/include/simeng/arch/aarch64/Instruction.hh Outdated
Comment thread src/include/simeng/arch/aarch64/InstructionGroups.hh Outdated
Comment thread src/lib/ModelConfig.cc
Comment thread src/lib/arch/aarch64/ExceptionHandler.cc Outdated
Comment thread src/lib/arch/aarch64/Instruction.cc Outdated
Comment thread src/lib/arch/aarch64/Instruction.cc
Comment thread src/lib/arch/aarch64/Instruction_address.cc
Comment thread src/lib/arch/aarch64/Instruction_decode.cc
@FinnWilkinson
Contributor Author

Missing SME docs at the moment. Will add ASAP

Comment thread src/include/simeng/arch/aarch64/helpers/sve.hh
Comment thread src/include/simeng/arch/aarch64/helpers/sve.hh
Comment thread src/lib/ModelConfig.cc
Comment thread src/lib/arch/aarch64/Instruction_address.cc
Comment thread src/lib/arch/aarch64/Instruction_address.cc
Contributor

@jj16791 jj16791 left a comment

A few comments and clarification needed

Comment thread CMakeLists.txt Outdated
Comment thread docs/sphinx/developer/arch/supported/aarch64.rst Outdated
Comment thread docs/sphinx/developer/arch/supported/aarch64.rst Outdated
Comment thread docs/sphinx/developer/arch/supported/aarch64.rst Outdated
Comment thread docs/sphinx/developer/arch/supported/aarch64.rst Outdated
Comment thread src/lib/arch/aarch64/Instruction_decode.cc
Comment thread src/lib/arch/aarch64/Instruction_execute.cc Outdated
Comment thread src/lib/arch/aarch64/Instruction_execute.cc Outdated
Comment thread src/include/simeng/arch/Architecture.hh Outdated
Comment thread src/include/simeng/arch/aarch64/Instruction.hh Outdated
Contributor

@jj16791 jj16791 left a comment

One small comment

Comment thread src/lib/models/outoforder/Core.cc Outdated
Contributor

@jj16791 jj16791 left a comment

All looks good to me

@rahahahat
Contributor

All looks good to me.

Contributor

@dANW34V3R dANW34V3R left a comment

One possible major issue, other minor comments

Comment thread src/include/simeng/arch/aarch64/InstructionGroups.hh Outdated
Comment thread .jenkins/build_gcc8.sh Outdated
Comment thread CMakeLists.txt Outdated
Comment thread CMakeLists.txt
Comment thread src/include/simeng/arch/aarch64/Instruction.hh Outdated
Comment thread docs/sphinx/user/building_simeng.rst Outdated
Comment thread test/regression/aarch64/AArch64RegressionTest.hh
Comment thread test/regression/aarch64/instructions/sme.cc Outdated
Comment thread src/include/simeng/arch/aarch64/Architecture.hh Outdated
Comment thread src/lib/arch/aarch64/ExceptionHandler.cc Outdated
Contributor

@jj16791 jj16791 left a comment

The updates look good to me. Good job!

Contributor

@rahahahat rahahahat left a comment

New changes look good to me

Contributor

@dANW34V3R dANW34V3R left a comment

All looks good to me

@FinnWilkinson FinnWilkinson linked an issue Nov 1, 2022 that may be closed by this pull request
@FinnWilkinson FinnWilkinson merged commit 4165d09 into dev Nov 1, 2022
FinnWilkinson added a commit that referenced this pull request Nov 29, 2022
This PR adds functionality into SimEng to support the AArch64 SME extension. 6 new SME instructions have been implemented, along with the SVCR Streaming-SVE-Mode context switching functionality. 

Additionally, the default LLVM version has been updated to 14.0.5 in order to support the SME regression tests.
The AArch64 Instruction class now has a copy constructor, optimising the use of cached instructions.
Updated Jenkins pipeline scripts to work with LLVM 14
Changes the Capstone usage to fix an existing memory leak, as mentioned in Issue #261.
jj16791 pushed a commit that referenced this pull request May 19, 2023
@FinnWilkinson FinnWilkinson deleted the SME-PR branch June 8, 2023 10:06
Labels

enhancement New feature or request

Development

Successfully merging this pull request may close these issues.

Architecture::predecode leaks memory for every core tick

5 participants