{2023.06}[2023a,2023b] rebuild CUDA/* modules to update module files #918
trz42 wants to merge 11 commits into EESSI:2023.06-software.eessi.io from
bot: build instance:eessi-bot-mc-azure repository:eessi.io-2023.06-software architecture:x86_64/amd/zen4 accelerator:nvidia/cc90 |
Force rebuilding CUDA/12.1.1 for |
Latest jobs will fail rebuilding CUDA/12.1.1 with a known error (example below for |
Force rebuilding CUDA/12.1.1 for zen2 and zen3 and try previous workaround for permission denied issue (reverting #907)... |
One more... |
Should be better... |
Looks better. If rebuilding works, rollback changes related to alternative removal of packages, but keep grep for |
  # if this script is run as root, use PR patch file to determine if software needs to be removed first
- if [ $EUID -eq 0 ]; then
+ if [ $EUID -ne 0 ]; then
If it is, the comments will need updating
Yeah, just trying to get it working by using an alternative approach that doesn't use fakeroot ;)
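The effect of flipping the condition can be shown in isolation. This is a minimal sketch, not code from the PR: the helper names `runs_as_root` and `describe_branch` are hypothetical, and the optional UID argument exists only so the check can be exercised without actually being root. The script under review tests `$EUID` inline.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the PR): succeeds when the given UID is 0.
# Without an argument it falls back to bash's $EUID, the effective user ID.
runs_as_root() {
    local uid="${1:-$EUID}"
    [ "$uid" -eq 0 ]
}

# The original condition ([ $EUID -eq 0 ]) enters the removal step only as root;
# the inverted one ([ $EUID -ne 0 ]) enters it only as a non-root user.
describe_branch() {
    local uid="$1"
    if runs_as_root "$uid"; then
        echo "root: original condition (-eq 0) is true"
    else
        echo "non-root: inverted condition (-ne 0) is true"
    fi
}
```

Calling `describe_branch 0` and `describe_branch 1000` shows which of the two conditions holds for a root and a non-root UID, which is why the comment above the `if` no longer matches once the test is inverted.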
Didn't work to remove existing installations. However, CUDA/12.1.1 was identified as an installation to be removed (in addition to CUDA/12.4.0). So this part works now. Opting to open another PR based on the standard way to remove/rebuild a package that also includes the changes to consider CUDA/12.1.1. |
Superseded by #919 |
PR merged! Moved |
After easybuilders/easybuild-easyblocks#3516 was merged, we need to update the module files for CUDA/12.{1.1,4.0}.
We need to do that for the following architecture combinations:
- zen2+cc80
- zen3+cc80
- zen4+cc90
For the first two we use the build cluster on AWS; for the third we use the build cluster on Azure. Because CUDA is just a binary installation, this should be fine.
Note: while we only need to rebuild the module files, we cannot use `--module-only` as an EasyBuild argument because the rebuild procedure removes the whole installation.
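The rebuild matrix described above (two CUDA versions, three CPU/accelerator combinations, two build clusters) can be enumerated with a short script. This is only an illustrative sketch: the function names `cluster_for` and `list_rebuilds` are hypothetical and not part of the EESSI tooling, and the versions are listed explicitly where bash would expand `CUDA/12.{1.1,4.0}` to the same two names.

```shell
#!/usr/bin/env bash
# Map each CPU target from the PR description to the cluster that builds it.
cluster_for() {
    case "$1" in
        zen2|zen3) echo "AWS" ;;
        zen4)      echo "Azure" ;;
        *)         echo "unknown target: $1" >&2; return 1 ;;
    esac
}

# Print one line per (module, architecture combination) that needs a rebuild.
list_rebuilds() {
    local version combo cpu
    for version in 12.1.1 12.4.0; do
        for combo in zen2+cc80 zen3+cc80 zen4+cc90; do
            cpu="${combo%%+*}"   # part before '+', e.g. zen4
            echo "CUDA/${version} ${combo} $(cluster_for "$cpu")"
        done
    done
}
```

Running `list_rebuilds` prints six lines, one per module and architecture combination, each ending with the cluster expected to handle it.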