Coding

While my primary research focuses on statistics, I have a strong interest in high-performance computing with Julia across both GPU and CPU architectures. I've been focusing first on reduction operations, which generalize fundamental computations such as scalar products for vectors and matrix-vector multiplications for higher-dimensional arrays.
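To make the connection concrete, here is a small illustrative sketch (not code from my packages) showing how a scalar product and a matrix-vector product are both instances of a fused map-plus-reduce:

```julia
# A scalar product is a map step (elementwise *) fused with a reduction (+);
# mapreduce combines the two without allocating an intermediate array.
x = Float32[1, 2, 3]
y = Float32[4, 5, 6]
dot_xy = mapreduce(*, +, x, y)   # 1*4 + 2*5 + 3*6 = 32

# The same reduction pattern, applied row-wise, yields a matrix-vector product.
A  = Float32[1 2 3; 4 5 6]
Av = [mapreduce(*, +, row, y) for row in eachrow(A)]
```

Because the combining operator only needs to be associative, the same pattern covers sums, minima/maxima, and norms, which is what makes a fast generic reduction so broadly useful.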

Luma

I’m developing Luma.jl, a Julia package for high-performance GPU algorithms. This work aims to provide efficient, portable implementations of parallel primitives that match the performance of vendor-optimized libraries while maintaining cross-platform compatibility.
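The core pattern behind GPU reductions of this kind is a tree-shaped (pairwise) fold: at each step, independent pairs of elements are combined in parallel, halving the problem size until one value remains. The following is a minimal serial sketch of that pattern for illustration only; it is not Luma.jl's implementation:

```julia
# Illustrative tree (pairwise) reduction: each pass folds the tail of the
# buffer onto its head, so every combine within a pass is independent and
# could run in parallel on a GPU. Runs in O(log n) parallel steps.
function tree_reduce(op, xs::AbstractVector)
    buf = copy(xs)            # work on a scratch copy; leave input intact
    n = length(buf)
    while n > 1
        half = n ÷ 2
        for i in 1:half       # independent pair combines (parallel on GPU)
            buf[i] = op(buf[i], buf[n - i + 1])
        end
        n -= half             # an odd leftover element carries over as-is
    end
    return buf[1]
end
```

On a GPU, each pass would map to a kernel (or a synchronized step within a block), with shared memory holding the buffer; the serial loop above only shows the dependency structure.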

Luma.jl on GitHub, with benchmarks in the README

CPUs

I’ve also worked on optimizing Julia’s mapreduce function for CPU execution. This implementation achieves:

  • 20% performance improvement over Julia Base
  • Consistent speedups across most array sizes
  • Support for multiple numeric types, including Float32
  • Maintained numerical precision for floating-point operations
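One standard ingredient behind speedups like these is using several independent accumulators, which breaks the sequential dependency chain of a naive loop and lets the CPU pipeline and vectorize the combines. The sketch below illustrates that idea for a sum; it is a simplified stand-in, not my actual implementation:

```julia
# Illustrative multi-accumulator sum: four independent accumulators remove
# the loop-carried dependency, enabling instruction-level parallelism/SIMD.
function blocked_sum(x::AbstractVector{T}) where {T}
    a1 = a2 = a3 = a4 = zero(T)
    i, n = 1, length(x)
    @inbounds while i + 3 <= n        # main unrolled loop, 4 elements per pass
        a1 += x[i]; a2 += x[i + 1]; a3 += x[i + 2]; a4 += x[i + 3]
        i += 4
    end
    s = (a1 + a2) + (a3 + a4)         # combine partial accumulators
    @inbounds while i <= n            # scalar cleanup for the remainder
        s += x[i]
        i += 1
    end
    return s
end
```

For floating-point inputs this reassociates the additions, so part of the work in a real implementation is verifying that accuracy is preserved, which the benchmark notebook linked below addresses.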

View the CPU mapreduce performance notebook