WIP: Introduced user configurable memory specification #349

Merged
4 commits merged on Feb 25, 2024

Conversation

wdeshazer
Contributor

Reverts #346

Warning

WIP

Note

The previous PR was a WIP but was not marked as such.

New approach: Memory Defaults are moving to Platform qsubs

Introduces User Configurable Memory Specification to gacode/shared/bin/gacode_qsub

  • Previously it was hardwired in Site Supplemental files in gacode/platform/qsub

Provisioned for -mem and -mem-per-cpu

Applied Precedence rules

  • Default -mem=16GB
  • Double specification by user
    • Warning
    • Default to user specified `-mem`
    • Otherwise use user-specified values
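
A minimal sketch of how these precedence rules could be written in bash. The function name `resolve_mem` and the output variables `MEM_KIND` and `MEM_VALUE` are hypothetical and are not taken from `gacode_qsub`; only the option names and the 16GB default come from the list above.

```shell
#!/bin/bash
# Hypothetical helper (not the actual gacode_qsub code).
# Scans the arguments for -mem and -mem-per-cpu, then sets
#   MEM_KIND  = "node" or "cpu"
#   MEM_VALUE = the memory string to request
resolve_mem() {
  local mem="" mem_per_cpu=""
  while [ $# -gt 0 ]; do
    case "$1" in
      -mem)         mem="${2:-}" ;;
      -mem-per-cpu) mem_per_cpu="${2:-}" ;;
    esac
    shift
  done

  # Double specification: warn, then fall through to the -mem branch.
  if [ -n "$mem" ] && [ -n "$mem_per_cpu" ]; then
    echo "WARNING: both -mem and -mem-per-cpu given; using -mem=$mem" >&2
  fi

  if [ -z "$mem" ] && [ -n "$mem_per_cpu" ]; then
    MEM_KIND="cpu"
    MEM_VALUE="$mem_per_cpu"
  else
    # Either the user gave -mem, or nothing was given and the default applies.
    MEM_KIND="node"
    MEM_VALUE="${mem:-16GB}"
  fi
}
```

Called as `resolve_mem "$@"`, this leaves the choice in two shell variables that the rest of the script can act on.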

Communicates the selection through environment variables

  • MEMPERNODE
  • MEMPERCPU
  • Logic nullifies other variable
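
As an illustration, the hand-off could look roughly like this. Only the names `MEMPERNODE` and `MEMPERCPU` come from this PR; the helper `export_mem_selection` is hypothetical, and using an empty string as the "nullified" value is an assumption.

```shell
#!/bin/bash
# Hypothetical helper (not the actual gacode_qsub code).
# $1 = "node" or "cpu", $2 = memory string.
# Exports the selected variable and empties the other one, so that a
# site file only ever sees one non-empty memory variable.
export_mem_selection() {
  local kind="$1" value="$2"
  if [ "$kind" = "cpu" ]; then
    export MEMPERCPU="$value"
    export MEMPERNODE=""   # assumption: "nullify" means set to empty
  else
    export MEMPERNODE="$value"
    export MEMPERCPU=""    # assumption: "nullify" means set to empty
  fi
}
```

For example, `export_mem_selection node 16GB` would leave `MEMPERNODE=16GB` and an empty `MEMPERCPU` in the environment.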

Site Supplemental file responds correspondingly

  • Modified gacode/shared/bin/gacode_qsub
  • Modified Site Files
    • gacode/platform/qsub/qsub.PPPL
    • gacode/platform/qsub/qsub.OMEGA
  • Ran regression tests
    • Write regression tests to confirm behavior
  • Assess if gacode/shared/bin/gacode_qsub_multi needs same medicine
  • Peruse the remaining files in gacode/shared/bin to ensure compatibility
  • Peruse the remaining Site Files in gacode/platform/qsub to ensure compatibility

I have taken a cursory look at all files and don't see anything glaring, but it deserves more consideration
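
One way to make that review less manual is a quick scan for site files that mention neither variable. This is only a sketch: `list_unaware_site_files` is a hypothetical helper, and a file that does not reference the variables is not necessarily incompatible.

```shell
#!/bin/bash
# Hypothetical helper: prints the names of qsub.* files in the given
# directory that reference neither MEMPERNODE nor MEMPERCPU.
list_unaware_site_files() {
  local dir="$1" f
  for f in "$dir"/qsub.*; do
    [ -f "$f" ] || continue          # skip an unmatched glob
    grep -qE 'MEMPERNODE|MEMPERCPU' "$f" || echo "${f##*/}"
  done
}

# Usage (run from the directory that contains gacode/):
#   list_unaware_site_files gacode/platform/qsub
```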

@wdeshazer wdeshazer requested review from smithsp and jmcclena February 7, 2024 07:22
@wdeshazer wdeshazer self-assigned this Feb 7, 2024
@wdeshazer
Contributor Author

wdeshazer commented Feb 7, 2024

Checklist of Confirmed Compatible files

gacode/platform/qsub

AZURE-COMET
  • qsub.AZURE
  • qsub.AZURE_GPU
  • qsub.AZURE_HB2
  • qsub.AZURE_HB2_1k
  • qsub.AZURE_HC
  • qsub.AZURE_SLURM
  • qsub.BANACH
  • qsub.CLOUD_GPU
  • qsub.COMET
  • qsub.CRUSHER
DAINT_PGI-HPC_ITER
  • qsub.DAINT_PGI
  • qsub.DROP
  • qsub.EDISON_CRAY
  • qsub.EDISON_IFORT
  • qsub.FREIA
  • qsub.Frontera_GCC
  • qsub.Frontera_IFORT
  • qsub.FRONTIER
  • qsub.GASUMMIT_GPU
  • qsub.HPC_ITER
IRIS-OMEGA
  • qsub.IRIS
  • qsub.KAIROS
  • qsub.LOKI
  • qsub.MARCONI
  • qsub.MARCONI_KNL
  • qsub.MARCONI_LEONARDO
  • qsub.MARCONI_SKL
  • qsub.MINT
  • qsub.NURION
  • qsub.OMEGA
PPPL-STAMPEDE
  • qsub.PPPL
  • qsub.PPPL_atom
  • qsub.PPPL_gcc
  • qsub.PSFCLUSTER
  • qsub.SATURN
  • qsub.SATURN_GCC
  • qsub.SHENMA
  • qsub.STAMPEDE
  • qsub.STAMPEDE2_KNL_HT2_IFORT
  • qsub.STAMPEDE2_KNL_IFORT
  • qsub.STAMPEDE2_SKX_IFORT
SUMMIT-MARCONI
  • qsub.SUMMIT
  • qsub.SUMMITDEV_GPU
  • qsub.PERLMUTTER_GPU
  • qsub.PERLMUTTER_GPU_80G
  • qsub.PERLMUTTER_CPU
  • qsub.PERLMUTTER_CPU_Cray
  • qsub.SHELL
  • qsub.MARCONI_100G
  • qsub.MARCONI_A

gacode/shared/bin

gacode_getversion-update_gacode.csh
  • gacode_getversion
  • gacode_mpi_tool
  • gacode_platforms
  • gacode_printversion
  • gacode_qsub
  • gacode_qsub_multi
  • gacode_reg
  • gacode_reg_do
  • gacode_regression.py
  • gacode_release_new_version.sh
  • gacode_setup
  • gacode_setup.tcsh
  • gacode_sim_warn
  • osxlist.py
  • summit_tool
  • update_gacode.csh

@wdeshazer
Contributor Author

@smithsp, @jcandy, and @jmcclena. We are finally back where we started. I noted it above, but I am refactoring and verifying that it works and reproduces everyone's existing settings.

@wdeshazer wdeshazer changed the title Introduced user configurable memory specification WIP: Introduced user configurable memory specification Feb 7, 2024
@wdeshazer
Contributor Author

wdeshazer commented Feb 7, 2024

Progress: Platform specific SLURM settings

This is an example of a qsub file in which I am not sure where the memory is requested.

bfile=$SIMDIR/batch.src
echo "#!/bin/bash " > $bfile
echo "#SBATCH -J $LOCDIR" >> $bfile
echo "#SBATCH -o $SIMDIR/batch.out" >> $bfile
echo "#SBATCH -e $SIMDIR/batch.err" >> $bfile
echo "#SBATCH -t $WALLTIME" >> $bfile
echo "#SBATCH -n $cores_used" >> $bfile
if [ "$QUEUE" = "null_queue" ]
then
  echo "#SBATCH -p sque" >> $bfile
else
  echo "#SBATCH -p $QUEUE" >> $bfile
fi
echo "$CODE -e $LOCDIR -n $nmpi -nomp $nomp -numa $numa -mpinuma $mpinuma -p $SIMROOT" >> $bfile
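
The snippet above writes no `--mem` or `--mem-per-cpu` line, so the memory would presumably come from the scheduler's configured default. A site file like this could respond to the new variables along the following lines. This is a sketch, not the merged change: `append_mem_directive` is a hypothetical helper, and it assumes an unselected variable arrives empty or unset.

```shell
#!/bin/bash
# Hypothetical helper: appends at most one SLURM memory directive to the
# batch file, depending on which variable gacode_qsub left non-empty.
append_mem_directive() {
  local bfile="$1"
  if [ -n "${MEMPERCPU:-}" ]; then
    echo "#SBATCH --mem-per-cpu=$MEMPERCPU" >> "$bfile"
  elif [ -n "${MEMPERNODE:-}" ]; then
    echo "#SBATCH --mem=$MEMPERNODE" >> "$bfile"
  fi
  # If both are empty, nothing is written and the scheduler default applies.
}
```

In the site file this would be called as `append_mem_directive "$bfile"` before the final `echo "$CODE ..."` line. The memory string is passed through verbatim, so its unit suffix has to be one the local SLURM installation accepts.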

Completed:

  • qsub.OMEGA
  • qsub.PPPL
  • qsub.PPPL_gcc
  • qsub.SATURN_GCC

Logic added, but the file has an unusual call

  • qsub.AZURE_GPU
  • What is 32 MPI ranks per node?
  • It has the code in question

Reviewed these files, for which I need consultation (@jmcclena or @smithsp)

  • qsub.AZURE
  • qsub.PERLMUTTER_CPU
  • qsub.PERLMUTTER_CPU_cray
  • qsub.PERLMUTTER_GPU
  • qsub.PERLMUTTER_GPU_cray
  • qsub.PPPL_atom
  • qsub.PSFCLUSTER
  • qsub.SATURN
  • qsub.SHENMA
  • qsub.STAMPEDE
  • qsub.STAMPEDE2_KNL_HT2_IFORT
  • qsub.STAMPEDE2_KNL_IFORT
  • qsub.STAMPEDE2_SKX_IFORT
  • qsub.STELLAR
  • qsub.SUMMIT
  • qsub.SUMMITDEV_GPU

jmcclena
jmcclena previously approved these changes Feb 7, 2024
Contributor

@jmcclena jmcclena left a comment

Looks fine to me.

@jcandy jcandy closed this pull request by merging all changes into master in 85f4e34 Feb 25, 2024
@jcandy jcandy deleted the user_mem_with_defaults branch September 3, 2024 21:16