Your port was quite helpful for my own version of llm.java, particularly its file handling and (especially) its endian handling.
Like you, I used Java Stream parallelization. In a second step, I leveraged TornadoVM to run the methods that implement the layers as CUDA kernels (blog).
Regards
Jürgen