In the last couple of months, the Microsoft C++ team has been working on improving MSVC ARM64 backend performance, and we are excited to have a couple of optimizations available in Visual Studio 2022 version 17.6. These optimizations improve code-generation for both the scalar ISA and the SIMD ISA (NEON). Let's review some interesting optimizations in this blog.

Before diving into technical details, we'd encourage you to create feedback at Developer Community if you have found performance issues. The feedback helps us prioritize work items in our backlog. The report "optimize neon right shift into cmp" is an example of good feedback. Including a tagged subject title, a detailed description of the issue, and a simple repro simplifies our analysis work and helps us deliver a fix more quickly.

Auto-Vectorizer supports more NEON instructions with asymmetric operands

The ARM64 backend already supports some NEON instructions with asymmetric typed operands, like the Add/Subtract Long operations (SADDL/UADDL/SSUBL/USUBL). These instructions add each vector element in the lower or upper half of the first source SIMD register to the corresponding vector element of the second source SIMD register (or subtract it, for the subtract forms) and write the vector result to the destination SIMD register. The destination vector elements are twice as long as the source vector elements.

Now, we have extended such support to Multiply-Add Long and Multiply-Subtract Long (SMLAL/UMLAL/SMLSL/UMLSL). For example:

void smlal(int * __restrict dst, int * __restrict a,
           short * __restrict b, short * __restrict c)

In Visual Studio 2022 17.5, there was no vectorization, and the code-generation was:

ldrsb w8,

In the 17.6 release, the code-generation has been improved into a single abs v16.8h,v16.8h.

Scalar code-generation improved based on value range analysis

When one register is compared with an immediate value, the compiler can deduce the value range of the register, and this information is useful for later optimizations, for example evaluating comparison results statically. MSVC already has an infrastructure for doing value range deduction; the backend just needs to teach the middle-end about the semantics of its supported comparison instructions. The ARM64 backend previously missed this support for CBZ and CMP.

For cbz reg, label, the reg must equal zero on the true path, and the same applies for CBNZ on the false path. For cmp reg, #imm, the reg must equal imm on the true path. Knowing such equivalence, the compiler can simplify code-generation.

Let's see a simple example:

int cal(int a)

In Visual Studio 2022 17.5, MSVC was generating the following instruction sequence:

|int cal(int)| PROC

... employs a branch and contains multiple basic blocks.

In Visual Studio 2022 17.5, the code-generation was:

|int test(int)| PROC
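To make the value-range idea for CBZ and CMP concrete, here is a minimal sketch. The function names are hypothetical and are not the `cal` or `test` examples from this post, whose bodies are not reproduced here. The comments about which instruction a comparison maps to are assumptions about typical ARM64 lowering, not a statement of what MSVC emits for this exact code.

```cpp
#include <cassert>

// Hypothetical function: inside each equality branch the value of `a` is
// fully known, so the arithmetic on that path can be evaluated statically.
int scale_or_default(int a) {
    if (a == 0) {          // a zero test may be lowered to CBZ/CBNZ
        return a * 7 + 3;  // a is known to be 0 here, so this is just 3
    }
    if (a == 16) {         // an immediate compare may be lowered to CMP w0, #16
        return a / 4;      // a is known to be 16 here, so this is just 4
    }
    return a + 1;          // nothing is known about a beyond a != 0, a != 16
}

// Hand-simplified form that a compiler with value-range information
// could arrive at on its own.
int scale_or_default_folded(int a) {
    if (a == 0) return 3;
    if (a == 16) return 4;
    return a + 1;
}

// Checks that both forms agree on a small range of inputs.
void check_equivalence() {
    for (int a = -64; a <= 64; ++a) {
        assert(scale_or_default(a) == scale_or_default_folded(a));
    }
}
```

The point of the sketch is only the equivalence: once the backend tells the middle-end that the register equals the compared immediate on the true path, the first form can be reduced to the second without any source change.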
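For the Multiply-Add Long support in the auto-vectorizer, the following is a sketch of the kind of source loop that has asymmetric operand types: 16-bit inputs multiplied and accumulated into 32-bit results. The function name, the element-count parameter, and the loop body are assumptions made for illustration; they are not the body of the `smlal` example in this post, and whether a given compiler version vectorizes this loop with SMLAL depends on its options and heuristics.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical widening multiply-accumulate loop: each 16-bit product is
// computed in 32 bits and added to a 32-bit accumulator input, which is
// the per-lane operation that SMLAL performs on signed elements.
void widening_mul_add(std::int32_t* __restrict dst,
                      const std::int32_t* __restrict acc,
                      const std::int16_t* __restrict b,
                      const std::int16_t* __restrict c,
                      int n) {
    for (int i = 0; i < n; ++i) {
        dst[i] = acc[i] +
                 static_cast<std::int32_t>(b[i]) * static_cast<std::int32_t>(c[i]);
    }
}

// Small self-check with values whose products do not fit in 16 bits.
void check_widening_mul_add() {
    const std::int32_t acc[4] = {1, 2, 3, 4};
    const std::int16_t b[4] = {100, -200, 300, -32768};
    const std::int16_t c[4] = {5, 6, -7, 2};
    std::int32_t dst[4] = {0, 0, 0, 0};
    widening_mul_add(dst, acc, b, c, 4);
    assert(dst[0] == 501);
    assert(dst[1] == -1198);
    assert(dst[2] == -2097);
    assert(dst[3] == -65532);  // -65536 would overflow a 16-bit lane
}
```

The last element shows why the destination lanes must be twice as wide as the source lanes: the product of two 16-bit values can exceed the 16-bit range.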