{2025.06}[SYSTEM] CUDA 12.6.0,12.8.0, cuDNN 9.5.0.50,9.10.1.4 #1351
casparvl wants to merge 4 commits into EESSI:main from
…if we can get it to work
|
Let's first try this for a single CPU arch for each supported CC.
Native builds:
bot: build repo:eessi.io-2025.06-software instance:eessi-bot-surf for:arch=x86_64/intel/icelake,accel=nvidia/cc80
Cross compiles:
bot: build repo:eessi.io-2025.06-software instance:eessi-bot-mc-aws on:arch=zen4 for:arch=x86_64/amd/zen4,accel=nvidia/cc100
|
New job on instance
|
|
bot: help
Updates by the bot instance
|
|
bot: help
Updates by the bot instance
|
|
So... the updated hooks are not picked up, because they were used from the repository (
bot: build repo:eessi.io-2025.06-software instance:eessi-bot-surf for:arch=x86_64/amd/zen4,accel=nvidia/cc90
|
New job on instance
|
|
This is the status of all the
|
|
bot: build repo:eessi.io-2025.06-software instance:eessi-bot-rug for:arch=x86_64/intel/skylake_avx512,accel=nvidia/cc70
|
New job on instance
|
|
FYI: we should hold off on deploying this until we've resolved EESSI/compatibility-layer#226
Edit: essentially, that should be done by deploying EESSI/compatibility-layer#227, but building that with the bot is proving problematic.
|
bot:status last_build
|
This is the status of all the
|
|
bot:show_config
|
Instance
|
|
bot:show_config
|
Instance
|
|
bot:status last_build
|
This is the status of all the
|
|
Many things still need to be done...

The software-layer-scripts PR should make sure to
--module-only if they target CC100 or above

9.5.0 comes, I think, only with 9.0 device code, not 9.0a. Thus, we should change the requested CC to 9.0 for that particular software name & version. For cuDNN 9.10.1.4, I think 9.0a is supported, but 10.0f is not and it should be changed to 10.0. I'd prefer to make those changes in hooks to avoid having to open multiple different software-layer PR, each with custom options for the build. Added advantage is that by doing it in the hooks, it also fixes things for EESSI-extend-based installations.

Edit 08-01:

cuDNN-9.5.0.50 indeed contains device code for 7.0, 8.0 and 9.0, but not for 9.0a, which causes the sanity check to fail. So we should make a conversion from 9.0a to 9.0 (in a hook?) for this version.

cuDNN-9.10.1.4 contains device code for 7.0, 8.0, 9.0a, 10.0, 12.0, but not for 10.0f and 12.0f, so needs stripping of the suffix for those as well.

This PR should replace #1278, #1286 and #1287
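The suffix conversion described in the Edit 08-01 note could look roughly like the sketch below. It is only an illustration of the mapping logic, not the hook code from this PR: the table `SUPPORTED_DEVICE_CODE` and the function `adjust_compute_capabilities` are hypothetical names, the supported sets are copied from the note above, and how the result would be passed on to EasyBuild from a hook is not shown.

```python
# Hypothetical lookup table; the values are the device code listed in the
# "Edit 08-01" note for each cuDNN version.
SUPPORTED_DEVICE_CODE = {
    ("cuDNN", "9.5.0.50"): {"7.0", "8.0", "9.0"},
    ("cuDNN", "9.10.1.4"): {"7.0", "8.0", "9.0a", "10.0", "12.0"},
}


def adjust_compute_capabilities(name, version, requested):
    """Return the requested CCs, with an 'a'/'f' suffix stripped where the
    given software version ships no device code for the suffixed target.

    Software that is not in the table is returned unchanged.
    """
    supported = SUPPORTED_DEVICE_CODE.get((name, version))
    if supported is None:
        return list(requested)

    adjusted = []
    for cc in requested:
        if cc in supported:
            candidate = cc
        else:
            # e.g. '9.0a' -> '9.0', '10.0f' -> '10.0'
            candidate = cc.rstrip("af")
            if candidate not in supported:
                # Assumption: failing early is preferable to a sanity check
                # failure later in the build.
                raise ValueError(
                    f"{name}-{version} has no device code for CC {cc}"
                )
        # Avoid duplicates, e.g. when both '9.0' and '9.0a' were requested.
        if candidate not in adjusted:
            adjusted.append(candidate)
    return adjusted


if __name__ == "__main__":
    print(adjust_compute_capabilities("cuDNN", "9.5.0.50", ["9.0a"]))
    print(adjust_compute_capabilities("cuDNN", "9.10.1.4", ["9.0a", "10.0f"]))
```

Keeping the table keyed on software name and version matches the preference stated above for a single place (the hooks) rather than per-PR build options, so EESSI-extend-based installations would see the same conversion.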