
User‐Mode

Karl W. Schulz edited this page Mar 17, 2024 · 8 revisions

The following example shows how to enable Omniwatch monitoring in user mode on ORNL's Frontier or Crusher systems. Consider the following SLURM batch script:

#!/bin/bash
#SBATCH -J omniwatch
#SBATCH -o rochpl.128nodes.%j.out
#SBATCH -N 128
#SBATCH -n 1024
#SBATCH -t 0:45:00
#SBATCH -A ven114
#SBATCH --cpu-freq=high
#SBATCH -S 0

# Setup Omniwatch environment
ml use /autofs/nccs-svm1_sw/crusher/amdsw/modules
ml omniwatch/0.1.0

# Enable data collectors and polling (60 sec interval)
${OMNIWATCH_DIR}/omni_util.py --startexporters --use_pdsh
${OMNIWATCH_DIR}/omni_util.py --startserver --interval 60

# Run your desired application(s) as normal
run_app


# Tear-down data collection
${OMNIWATCH_DIR}/query.py --job ${SLURM_JOB_ID}
${OMNIWATCH_DIR}/omni_util.py --stop
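
In the script above, the tear-down commands run only if the job reaches them in sequence. As a minimal sketch, the start, run, and tear-down steps can be wrapped in a shell function that always performs the tear-down and returns the application's exit status. The function names `omniwatch_teardown` and `run_with_omniwatch` are hypothetical helpers and not part of Omniwatch. The sketch uses only the `omni_util.py` and `query.py` flags shown above, and it assumes these scripts return a non-zero status on failure.

```shell
#!/bin/bash
# Sketch only: omniwatch_teardown and run_with_omniwatch are hypothetical
# helpers, not part of Omniwatch. OMNIWATCH_DIR is assumed to be set by the
# omniwatch module and SLURM_JOB_ID by SLURM, as in the job script above.

# Query the data collected for this job, then stop data collection.
omniwatch_teardown() {
    "${OMNIWATCH_DIR}/query.py" --job "${SLURM_JOB_ID}"
    "${OMNIWATCH_DIR}/omni_util.py" --stop
}

# Start the collectors, run the given command, and always tear down afterwards.
# Returns the exit status of the command (or 1 if monitoring failed to start).
run_with_omniwatch() {
    local status=0

    # Assumption: omni_util.py exits non-zero when startup fails.
    "${OMNIWATCH_DIR}/omni_util.py" --startexporters --use_pdsh || return 1
    if ! "${OMNIWATCH_DIR}/omni_util.py" --startserver --interval 60; then
        "${OMNIWATCH_DIR}/omni_util.py" --stop
        return 1
    fi

    # Run the application; remember its status without aborting the function.
    "$@" || status=$?

    omniwatch_teardown
    return "${status}"
}
```

With these definitions placed after the module setup, the application would be launched as `run_with_omniwatch run_app`. This covers an application that exits with an error, but not a job that is killed by SLURM, for example on reaching its time limit.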