Analyzing and Optimizing* Speed**

* Improving, but probably not optimizing
** And boot time and power use 

Class # Topic Slides and Notes Assigned Readings and Videos
Supplementary Information
Performance Analysis and Scalar Optimization
19 3/20/25 Memory System
20 3/25/25 Performance Analysis
20 3/25/25 Key ARM ISA Features for Performance
21 3/27/25 Cortex-A72 Microarchitecture
22 4/1/25 Getting Better Object Code; Optimizing for the Cortex-A72 Microarchitecture and Memory System
22 4/1/25 Examining SG1 and SG2 Performance on the Cortex-A72 and Cortex-A53; Comparing SG on Arm32 and Arm64
  • SG1 on A72
    • SG1-1 Introduction, v1 (15:14)
    • SG1-2 Using perf, v1-3 (15:05)
    • SG1-3 More optimizations, v4-9 (4:32)
  • SG2 on A72
    • SG2-1 Even more optimizations, v11-13 (6:57)
  • Speed/Scalar/SG1
  • Speed/Scalar/SG2
  • Drongowski Perf Tutorial
    • Part 1: Introduction & Hot Spots
    • Part 2: Counting Hardware Events
    • Part 3: Profiling Hardware Events

Class # Topic Slides and Notes Assigned Readings and Videos
Supplementary Information
SIMD Optimization
23 4/3/25 Introduction
  • Read NEON PG, Ch.1 – 3
  • Introducing NEON (22:00)
  • Using NEON 1: Documentation, approaches, libraries, generic compiler opts. pp. 1-13. (11:41)
  • Using NEON 2: Loop unrolling example. pp. 14-21. (30:15)

24 4/8/25 Using NEON; YUV Color Model, find_chroma_matches

  • Read NEON PG, Ch. 4 – 6
  • Using NEON 3: Loop unrolling + vectorization details, helping the compiler. pp. 22-35. (23:07)
  • Using NEON 4: Intrinsics. pp. 36-54. (31:58)

25 4/10/25 More on Optimization: Help from the Compiler, SG2, N-Body Simulation
TBD  and Scalar Performance
TBD Applying NEON: N-Body Gravitational Simulation
TBD Applying NEON: find_chroma_matches
N/A Applying NEON: Fractals and the Mandelbrot Set
  • Speed/Vector/MandelTime

Class # Topic Slides and Notes Assigned Readings and Videos
Supplementary Information
Other Analyses and Optimizations
N/A RPi Power Basics
Basic Multicore Programming
Examining and Improving Boot Time
Examining and Improving Power Use