Coding
While my primary research focuses on statistics, I have a strong interest in high-performance computing with Julia across both GPU and CPU architectures. I’ve been focusing first on reduction operations, which generalize fundamental computations such as scalar products for vectors and matrix-vector multiplication for higher-dimensional arrays.
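As a minimal illustration of this generalization, Julia's built-in mapreduce already expresses a scalar product as a map (elementwise multiply) fused with a reduce (sum), and the same reduction pattern extends to reducing arrays along a dimension:

```julia
x = [1.0, 2.0, 3.0]
y = [4.0, 5.0, 6.0]

# Scalar product as a reduction: map with *, reduce with +.
dot_xy = mapreduce(*, +, x, y)    # 1*4 + 2*5 + 3*6 = 32.0

# The same pattern along one dimension of a matrix:
A = [1 2; 3 4]
colsums = reduce(+, A; dims=1)    # 1×2 matrix [4 6]
```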
Luma
I’m developing Luma.jl, a Julia package for high-performance GPU algorithms. The package aims to provide efficient implementations of parallel primitives that match the performance of vendor-optimized libraries while remaining portable across GPU platforms.
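Luma.jl's own API is not shown here; as a hedged sketch of the kind of portable kernel such a package can build on, KernelAbstractions.jl lets a single kernel definition run on the CPU or on any supported GPU backend (the kernel name and array below are illustrative, not Luma.jl's interface):

```julia
using KernelAbstractions

# Illustrative kernel (not Luma.jl's API): scale each element in place.
@kernel function scale!(A, α)
    i = @index(Global)
    @inbounds A[i] *= α
end

# The same kernel definition runs on any backend; for a plain Array
# get_backend returns the CPU backend, for a CuArray it would return CUDA.
A = ones(Float32, 1024)
backend = get_backend(A)
scale!(backend)(A, 2f0; ndrange = length(A))
KernelAbstractions.synchronize(backend)
```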
CPUs
I’ve also worked on optimizing Julia’s mapreduce function for CPU execution. This implementation achieves:
- 20% performance improvement over Julia Base
- Speedups across most array sizes
- Support for multiple numeric types, including Float32
- Maintained numerical precision for floating-point operations
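The actual Base changes are not reproduced here; as a hedged sketch of one technique this kind of CPU optimization relies on, a sum can be split across several independent accumulators so the processor can overlap floating-point additions instead of waiting on one serial dependency chain (the function name and unrolling factor are illustrative):

```julia
# Illustrative only: sum with four independent accumulators, which breaks
# the serial dependency chain between successive floating-point additions.
function unrolled_sum(x::AbstractVector{Float32})
    n = length(x)
    a = b = c = d = 0f0
    i = 1
    @inbounds while i + 3 <= n
        a += x[i]; b += x[i+1]; c += x[i+2]; d += x[i+3]
        i += 4
    end
    @inbounds while i <= n       # handle the leftover elements
        a += x[i]
        i += 1
    end
    (a + b) + (c + d)            # pairwise combine keeps rounding error modest
end

unrolled_sum(Float32[1, 2, 3, 4, 5])   # 15.0f0
```

Combining the accumulators pairwise at the end, rather than left to right, is also one simple way to keep floating-point error growth in check while still reordering the additions.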