Merge branch 'main' into allow_pam_access_group
mboisson authored Jan 20, 2025

2 parents 97edcfb + 5de4c7f commit f05a21d
Showing 52 changed files with 1,202 additions and 935 deletions.
63 changes: 62 additions & 1 deletion CHANGELOG.md
@@ -3,6 +3,68 @@ All notable changes to this project will be documented in this file.

The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/).

## [14.1.2] 2024-11-19

No changes to Puppet code.

Refer to [magic_castle changelog](https://github.com/ComputeCanada/magic_castle/blob/main/CHANGELOG.md)

## [14.1.1] 2024-11-19

No changes to Puppet code.

Refer to [magic_castle changelog](https://github.com/ComputeCanada/magic_castle/blob/main/CHANGELOG.md)

## [14.1.0] 2024-11-17

### Added

- Added a dependency of the ipa-install-client exec on the resources that take most of the configuration time.
This levels instance configuration times so that the overall cluster configuration time corresponds to
the management instance's configuration time.

## [14.0.0] 2024-11-12

### Added

- Added ability to define hieradata per instance group (PR #324)
- Added option to disable CVMFS_STRICT_MOUNT (PR #335)
- Added option to install additional OS packages (PR #290)
- Added enable_scrontab to slurm::base (PR #329)
- Added support for vector.dev (PR #356)
- Added user quota support with XFS (PR #308)
- Added puppet-forge's rsyslog (PR #321)
- Added support for Slurm 23.11 (PR #359)
- Added support for Slurm 24.05 (PR #364)
- Added perl-Sys-hostname in base slurm when os major >= 9 (PR #366)
- Added option to disable Slurm's spank plugin to manage tmpfs mounts (PR #337)
- Added authenticationmethods param to local user (PR #340)
- Added option for user to specify CephFS version (PR #380)
- Provided JupyterHub the ability to create users in FreeIPA (PR #397)

### Changed

- Fixed issue #351 - "prepare4image.sh fails to run to completion..." (PR #352)
- Generalized ceph.pp to allow multiple cephfs mounting (PR #313)
- Adjusted user limits on compute node (PR #311)
- Refactored NFS server and client to allow running nfs and mgmt on distinct instances (PR #300)
- Improved prepare4image.sh handling of mounted volumes (#338)
- Added firewall definition to bootstrap
- Made profile::sssd module work on its own with or without profile::freeipa (PR #330)
- Replaced file_line by sshd_config provider for UseDNS and HostbasedAuthentication (PR #367)
- Fixed Arbutus vgpu rpm src (PR #369)
- Changed Nvidia driver source to stream - proprietary version (PR #373)
- Made prepare4image.sh remove only puppet data cache directories
- Replaced nvidia-persistenced user by dynamic systemd user (PR #383)
- Bumped nvidia driver to v550-dkms
- Upgraded default Compute Canada environment to StdEnv/2023 (PR #390)
- Bumped puppet-jupyterhub to 6.3.0

### Removed

- Removed branching related to CentOS 7 (PR #358)
- Pruned slurm versions no longer supported by SchedMD (PR #365)

## [13.5.0] 2024-04-11

### Added
@@ -145,7 +207,6 @@ definition.
- Disabled unused epel repos
- Removed profile::nfs exec `exportfs -ua; cat...; exportfs -a`
- Removed puppet alias from etc/hosts

## [12.6.7] 2023-09-29

6 changes: 4 additions & 2 deletions Puppetfile
@@ -2,7 +2,7 @@

forge "https://forgeapi.puppetlabs.com"

mod 'cmdntrf-consul_template', '2.3.6'
mod 'cmdntrf-consul_template', '2.3.8'
mod 'derdanne-nfs', '2.1.11'
mod 'heini-wait_for', '2.2.0'
mod 'puppet-augeasproviders_core', '4.0.1'
@@ -20,6 +20,7 @@ mod 'puppet-healthcheck', '1.0.1'
mod 'puppet-kmod', '4.0.0'
mod 'puppet-logrotate', '5.0.0'
mod 'puppet-prometheus', '12.5.0'
mod 'puppet-rsyslog', '7.1.0'
mod 'puppet-selinux', '3.4.1'
mod 'puppet-squid', '3.0.0'
mod 'puppet-systemd', '3.10.0'
@@ -34,7 +35,8 @@ mod 'puppetlabs-mysql', '13.3.0'
mod 'puppetlabs-stdlib', '5.2.0'
mod 'puppetlabs-transition', '0.1.3'
mod 'treydock-globus', '9.0.0'
mod 'saz-limits', '3.0.4'

mod 'computecanada-jupyterhub',
:git => 'https://github.com/ComputeCanada/puppet-jupyterhub.git',
:ref => 'v5.0.3'
:ref => 'v6.7.0'
198 changes: 129 additions & 69 deletions README.md
@@ -40,10 +40,12 @@ The `profile::` sections list the available classes, their role and their parame
- [`profile::rsyslog::base`](#profilersyslogbase)
- [`profile::rsyslog::client`](#profilersyslogclient)
- [`profile::rsyslog::server`](#profilersyslogserver)
- [`profile::vector`](#profilevector)
- [`profile::slurm::base`](#profileslurmbase)
- [`profile::slurm::node`](#profileslurmnode)
- [`profile::slurm::accounting`](#profileslurmaccounting)
- [`profile::slurm::controller`](#profileslurmcontroller)
- [`profile::slurm::node`](#profileslurmnode)
- [`profile::software_stack`](#profilesoftware_stack)
- [`profile::squid::server`](#profilesquidserver)
- [`profile::sssd::client`](#profilesssdclient)
@@ -53,6 +55,7 @@ The `profile::` sections list the available classes, their role and their parame
- [`profile::ssh::hostbased_auth::server`](#profilesshhostbased_authserver)
- [`profile::users::ldap`](#profileusersldap)
- [`profile::users::local`](#profileuserslocal)
- [`profile::volumes`](#profilevolumes)

For classes with parameters, a folded **default values** subsection provides the default
value of each parameter as it would be defined in hieradata. For some parameters, the value is
@@ -182,13 +185,18 @@ This class configures two services to bridge LDAP users, Slurm accounts and user
| :-------------- | :------------------------------------------------------------ | :-------- |
| `project_regex` | Regex identifying FreeIPA groups that require a corresponding Slurm account | String |
| `skel_archives` | Archives extracted in each FreeIPA user's home when created | Array[Struct[{filename => String[1], source => String[1]}]] |
| `manage_home` | When true, `mkhome` creates a home folder for each new FreeIPA user | Boolean |
| `manage_scratch`| When true, `mkhome` creates a scratch folder for each new FreeIPA user | Boolean |
| `manage_project`| When true, `mkproject` creates a project folder for each new FreeIPA group | Boolean |
<details>
<summary>default values</summary>

```yaml
profile::accounts::project_regex: '(ctb\|def\|rpp\|rrg)-[a-z0-9_-]*'
profile::accounts::skel_archives: []
profile::accounts::manage_home: true
profile::accounts::manage_scratch: true
profile::accounts::manage_project: true
```
</details>
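
The `manage_*` flags replace the previous auto-detection based on `profile::nfs::server::devices`. A minimal hieradata sketch for a cluster without scratch and project filesystems (values are illustrative):

<details>
<summary>example</summary>

```yaml
profile::accounts::manage_home: true
profile::accounts::manage_scratch: false
profile::accounts::manage_project: false
```
</details>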

@@ -223,13 +231,15 @@ cluster operations.
| :------------- | :------------------------------------------------------------------------------------- | :----- |
| `version` | Current version number of Magic Castle | String |
| `admin_email`  | Email of the cluster administrator, used to send logs and report cluster-related issues | String |
| `packages` | List of additional OS packages that should be installed | Array[String] |

<details>
<summary>default values</summary>

```yaml
profile::base::version: '13.0.0'
profile::base::admin_email: ~ # undef
profile::base::packages: []
```
</details>

@@ -239,6 +249,9 @@ profile::base::admin_emain: ~ #undef
```yaml
profile::base::version: '13.0.0-rc.2'
profile::base::admin_email: "you@email.com"
profile::base::packages:
- gcc-c++
- make
```
</details>

@@ -278,50 +291,35 @@ that provides object storage, block storage, and file storage built on a common
cluster foundation.
[reference](https://en.wikipedia.org/wiki/Ceph_(software))

This class install Ceph packages, and configure and mount a CephFS share.
This class installs the Ceph packages, and configures and mounts CephFS shares.

### parameters

| Variable | Description | Type |
| :--------------------------- | :---------------------------------------------------------- | ------------- |
| `share_name` | CEPH share name | String |
| `access_key` | CEPH share access key | String |
| `export_path` | Path of the share as exported by the monitors | String |
| `mon_host` | List of CEPH monitor hostnames | Array[String] |
| `mount_binds` | List of CEPH share folders that will bind mounted under `/` | Array[String] |
| `mount_name` | Name to give to the CEPH share once mounted under `/mnt` | String |
| `binds_fcontext_equivalence` | SELinux file context equivalence for the CEPH share | String |

<details>
<summary>default values</summary>

```yaml
profile::ceph::client::mount_binds: []
profile::ceph::client::mount_name: 'cephfs01'
profile::ceph::client::binds_fcontext_equivalence: '/home'
```
</details>
| Variable | Description | Type |
| :------------ | :---------------------------------------------------------- | -------------------- |
| `mon_host` | List of Ceph monitor hostnames | Array[String] |
| `shares`      | Ceph share definitions, keyed by mount name under `/mnt`    | Hash[String, CephFS] |

<details>
<summary>example</summary>

```yaml
profile::ceph::client::share_name: "your-project-shared-fs"
profile::ceph::client::access_key: "MTIzNDU2Nzg5cHJvZmlsZTo6Y2VwaDo6Y2xpZW50OjphY2Nlc3Nfa2V5"
profile::ceph::client::export_path: "/volumes/_nogroup/"
profile::ceph::client::mon_host:
- 192.168.1.3:6789
- 192.168.2.3:6789
- 192.168.3.3:6789
profile::ceph::client::mount_binds:
- home
- project
- software
profile::ceph::client::mount_name: 'cephfs'
profile::ceph::client::binds_fcontext_equivalence: '/home'
profile::ceph::client::shares:
home:
project:
```
</details>
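
Each value in `shares` follows the `CephFS` struct defined in `site/profile/manifests/ceph.pp`. A fuller sketch for a single share, with illustrative names, key and paths:

<details>
<summary>example</summary>

```yaml
profile::ceph::client::shares:
  cephfs01:
    share_name: "your-project-shared-fs"
    access_key: "BASE64KEY=="
    export_path: "/volumes/_nogroup/"
    bind_mounts:
      - src: "/home"
        dst: "/home"
    binds_fcontext_equivalence: "/home"
```
</details>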

## profile::ceph::client::install

This class only installs the Ceph packages.

## `profile::consul`

> [Consul](https://www.consul.io/) is a service networking platform developed by HashiCorp.
@@ -392,14 +390,16 @@ This class installs CVMFS client and configure repositories.
| Variable | Description | Type |
| :------------------------ | :--------------------------------------------- | -------------- |
| `quota_limit` | Instance local cache directory soft quota (MB) | Integer |
| `repositories` | List of CVMFS repositories to mount | Array[String] |
| `alien_cache_repositories`| List of CVMFS repositories that need an alien cache | Array[String] |
| `strict_mount` | If true, mount only the repositories listed in `repositories` | Boolean |
| `repositories` | Fully qualified repository names to mount and make available to utilities such as `cvmfs_config` | Array[String] |
| `alien_cache_repositories`| List of repositories that require an alien cache | Array[String] |

<details>
<summary>default values</summary>

```yaml
profile::cvmfs::client::quota_limit: 4096
profile::cvmfs::client::strict_mount: false
profile::cvmfs::client::repositories:
- pilot.eessi-hpc.org
- software.eessi.io
@@ -647,13 +647,6 @@ profile::freeipa::mokey::access_tags: "%{alias('profile::users::ldap::access_tag
```
</details>

<details>
<summary>example</summary>

```yaml
```
</details>

## `profile::gpu`

This class installs and configures the NVIDIA GPU drivers if an NVIDIA GPU
@@ -805,41 +798,19 @@ When `profile::nfs::client` is included, these classes are included too:

## `profile::nfs::server`

This class install NFS and configure an NFS server that will export all provided devices.
The class also make sure that devices sharing a common export name form an LVM volume group
that is exported as a single LVM logical volume formated as XFS.

If a volume's size associated with an NFS server device is expanded after the initial configuration,
the class will not expand the LVM volume automatically. These operations currently have to be
accomplished manually.
This class installs NFS and configures an NFS server that exports all volumes tagged as `nfs`.

### parameters

| Variable | Description | Type |
| :-------- | :----------------------------------------------- | :---------------------------- |
| `devices` | Mapping between NFS shares and the devices to export | Hash[String, Array[String]] |
| `no_root_squash_tags` | Array of tags identifying instances that can mount NFS exports without root squash | Array[String] |

<details>
<summary>default values</summary>

```yaml
profile::nfs::server::devices: "%{alias('terraform.volumes.nfs')}"
```
</details>

<details>
<summary>example</summary>

```yaml
profile::nfs::server::devices:
home:
- /dev/disk/by-id/b0b686f6-62c8-11ee-8c99-0242ac120002
- /dev/disk/by-id/b65acc52-62c8-11ee-8c99-0242ac120002
scratch:
- bfd50252-62c8-11ee-8c99-0242ac120002
project:
- c3b99e00-62c8-11ee-8c99-0242ac120002
profile::nfs::server::no_root_squash_tags: ['mgmt']
```
</details>

@@ -872,10 +843,10 @@ internal services to the Internet.
```yaml
profile::reverse_proxy::domain_name: "%{alias('terraform.data.domain_name')}"
profile::reverse_proxy::subdomains:
ipa: "ipa.int.%{lookup('terraform.data.domain_name')}"
ipa: "ipa.%{lookup('profile::freeipa::base::ipa_domain')}"
mokey: "%{lookup('terraform.tag_ip.mgmt.0')}:%{lookup('profile::freeipa::mokey::port')}"
jupyter: "https://127.0.0.1:8000"
profile::reverse_proxy::main2sub_redit: "jupyter"
profile::reverse_proxy::main2sub_redir: "jupyter"
profile::reverse_proxy::remote_ips: {}
```
</details>
@@ -922,6 +893,17 @@ When `profile::rsyslog::server` is included, these classes are included too:
- [profile::consul](#profileconsul)
- [profile::rsyslog::base](#profilersyslogbase)

## `profile::vector`

This class installs and configures the vector.dev service to manage logs.
Refer to the [documentation](https://vector.dev/docs/) for configuration.

### parameters

| Variable | Description                             | Type   | Optional? |
| :------- | :-------------------------------------- | :----- | :-------- |
| `config` | Content of the YAML configuration file  | String | Yes       |
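
A minimal sketch of a `config` value, reading logs from journald and printing them to the console; the source and sink names are illustrative, and the schema is vector.dev's, not this module's:

```yaml
profile::vector::config: |
  sources:
    journald_in:
      type: "journald"
  sinks:
    console_out:
      inputs:
        - "journald_in"
      type: "console"
      encoding:
        codec: "text"
```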

## `profile::slurm::base`

> The [Slurm](https://github.com/schedmd/slurm) Workload Manager, formerly
@@ -944,11 +926,12 @@ to all Slurm's roles. It also installs and configure Munge service.
| :---------------------- | :----------------------- | :------ |
| `cluster_name` | Name of the cluster | String |
| `munge_key` | Base64 encoded Munge key | String |
| `slurm_version` | Slurm version to install | Enum[20.11, 21.08, 22.05, 23.02] |
| `slurm_version` | Slurm version to install | Enum['23.02', '23.11', '24.05'] |
| `os_reserved_memory` | Memory in MB reserved for the operating system on the compute nodes | Integer |
| `suspend_time`          | Idle time (seconds) for nodes to become eligible for suspension. | Integer |
| `resume_timeout` | Maximum time permitted (seconds) between a node resume request and its availability. | Integer |
| `force_slurm_in_path` | Enable Slurm's bin path in all users (local and LDAP) PATH environment variable | Boolean |
| `enable_scrontab` | Enable user's Slurm-managed crontab | Boolean |
| `enable_x11_forwarding` | Enable Slurm's built-in X11 forwarding capabilities | Boolean |
| `config_addendum` | Additional parameters included at the end of slurm.conf. | String |

@@ -958,7 +941,7 @@ to all Slurm's roles. It also installs and configure Munge service.
```yaml
profile::slurm::base::cluster_name: "%{alias('terraform.data.cluster_name')}"
profile::slurm::base::munge_key: ENC[PKCS7, ...]
profile::slurm::base::slurm_version: '21.08'
profile::slurm::base::slurm_version: '23.11'
profile::slurm::base::os_reserved_memory: 512
profile::slurm::base::suspend_time: 3600
profile::slurm::base::resume_timeout: 3600
@@ -1105,6 +1088,38 @@ When `profile::slurm::accounting` is included, these classes are included too:
- [`profile::slurm::base`](#profileslurmbase)
- [`profile::mail::server`](#profilemailserver)


## `profile::slurm::node`

This class installs and configures the Slurm node daemon - **slurmd**.

### parameters

| Variable | Description | Type |
| :---------------------- | :-------------------------------------------------------------------------------------------- | :------ |
| `enable_tmpfs_mounts` | Enable [spank-cc-tmpfs_mounts](https://github.com/ComputeCanada/spank-cc-tmpfs_mounts) plugin | Boolean |

<details>
<summary>default values</summary>

```yaml
profile::slurm::node::enable_tmpfs_mounts: true
```
</details>

<details>
<summary>example</summary>

```yaml
profile::slurm::node::enable_tmpfs_mounts: false
```
</details>

### dependency

When `profile::slurm::node` is included, this class is included too:
- [`profile::slurm::base`](#profileslurmbase)

## `profile::software_stack`

This class configures the initial shell profile that users will load on login and
@@ -1385,7 +1400,7 @@ A `profile::users::local_user` is defined as a dictionary with the following key
| `sudoer` | If enabled, the user can sudo without a password | Boolean | Yes |
| `selinux_user` | SELinux context for the user | String | Yes |
| `mls_range` | MLS Range for the user | String | Yes |
| `authenticationmethods` | Specifies the AuthenticationMethods value for this user in sshd_config | String | Yes |

<details>
<summary>default values</summary>
@@ -1396,6 +1411,7 @@ profile::users::local::users:
public_keys: "%{alias('terraform.data.public_keys')}"
groups: ['adm', 'wheel', 'systemd-journal']
sudoer: true
authenticationmethods: 'publickey'
```

If `profile::users::local::users` is present in more than one YAML file in the hierarchy,
@@ -1416,5 +1432,49 @@ profile::users::local::users:
# sudoer: false
# selinux_user: 'unconfined_u'
#     mls_range: 's0-s0:c0.c1023'
# authenticationmethods: 'publickey,password publickey,keyboard-interactive'
```
</details>

## `profile::volumes`

This class creates and mounts LVM volumes. Each volume is formatted as XFS by default.

If a volume is expanded after the initial configuration, the class will not expand the
LVM volume automatically. These operations currently have to be accomplished manually.

### parameters

| Variable | Description | Type |
| :--------- | :----------------------------------------------------------------------------- | :---------------------------------------- |
| `devices`  | Hash of volume definitions, keyed by tag then volume name                       | Hash[String, Hash[String, Hash]]           |

<details>
<summary>default values</summary>

```yaml
profile::volumes::devices: "%{lookup('terraform.self.volumes')}"
```
</details>


<details>
<summary>examples</summary>

```yaml
profile::volumes::devices:
  "local":
    "tmp":
      "glob": "/dev/vdc"
      "size": 100
      # "bind_mount": true
      # "bind_target": "/tmp"
      # "owner": "root"
      # "group": "root"
      # "mode": "0755"
      # "seltype": "home_root_t"
      # "enable_resize": false
      # "filesystem": "xfs"
      # "quota": ~
```
</details>
2 changes: 1 addition & 1 deletion data/cloud/openstack/arbutus.yaml
@@ -1,5 +1,5 @@
profile::gpu::install::vgpu::installer: rpm
profile::gpu::install::vgpu::rpm::source: http://repo.arbutus.cloud.computecanada.ca/pulp/repos/centos/arbutus-cloud-vgpu-repo.el%{facts.os.release.major}.noarch.rpm
profile::gpu::install::vgpu::rpm::source: http://repo.arbutus.cloud.computecanada.ca/pulp/repos/alma%{facts.os.release.major}/Packages/a/arbutus-cloud-vgpu-repo-1.0-1.el%{facts.os.release.major}.noarch.rpm
profile::gpu::install::vgpu::rpm::packages:
- nvidia-vgpu-kmod
- nvidia-vgpu-gridd
39 changes: 26 additions & 13 deletions data/common.yaml
@@ -9,7 +9,8 @@ lookup_options:
prometheus::alerts:
merge: 'deep'

profile::base::version: 13.5.0
profile::base::version: 14.1.2
profile::base::packages: []

motd::content: ""

@@ -55,21 +56,21 @@ jupyterhub::jupyterhub_config_hash:
ui_args:
notebook:
name: Jupyter Notebook
args: ['--SingleUserNotebookApp.default_url=/tree']
url: '/tree'
lab:
name: JupyterLab
terminal:
name: Terminal
args: ['--SingleUserNotebookApp.default_url=/terminals/1']
url: '/terminals/1'
rstudio:
name: RStudio
args: ['--SingleUserNotebookApp.default_url=/rstudio']
url: '/rstudio'
code-server:
name: VS Code
args: ['--SingleUserNotebookApp.default_url=/code-server']
url: '/code-server'
desktop:
name: Desktop
args: ['--SingleUserNotebookApp.default_url=/Desktop']
url: '/Desktop'

SbatchForm:
ui:
@@ -185,6 +186,7 @@ prometheus::server::scrape_configs:
- __meta_consul_node
target_label: instance
- job_name: jupyterhub
metrics_path: "/hub/metrics"
scrape_interval: 10s
scrape_timeout: 10s
honor_labels: true
@@ -249,9 +251,10 @@ profile::freeipa::mokey::access_tags: "%{alias('profile::users::ldap::access_tag
profile::freeipa::server::id_start: 60001
profile::software_stack::min_uid: "%{alias('profile::freeipa::server::id_start')}"

profile::slurm::base::slurm_version: '23.02'
profile::slurm::base::slurm_version: '24.05'
profile::slurm::base::os_reserved_memory: 512
profile::slurm::controller::autoscale_version: '0.5.1'
profile::slurm::node::enable_tmpfs_mounts: true

profile::accounts::project_regex: '(ctb|def|rpp|rrg)-[a-z0-9_-]*'
profile::users::ldap::access_tags: ['login:sshd', 'node:sshd', 'proxy:jupyterhub-login']
@@ -267,26 +270,36 @@ profile::users::local::users:
public_keys: "%{alias('terraform.data.public_keys')}"
groups: ['adm', 'wheel', 'systemd-journal']
sudoer: true
authenticationmethods: 'publickey'


profile::freeipa::base::domain_name: "%{alias('terraform.data.domain_name')}"
profile::freeipa::base::ipa_domain: "int.%{lookup('terraform.data.domain_name')}"

profile::slurm::base::cluster_name: "%{alias('terraform.data.cluster_name')}"

profile::freeipa::client::server_ip: "%{alias('terraform.tag_ip.mgmt.0')}"
profile::consul::servers: "%{alias('terraform.tag_ip.puppet')}"

profile::nfs::server::domain_name: "%{hiera('profile::freeipa::base::domain_name')}"
profile::nfs::client::domain_name: "%{hiera('profile::freeipa::base::domain_name')}"
profile::nfs::domain: "%{lookup('profile::freeipa::base::ipa_domain')}"
profile::nfs::client::server_ip: "%{alias('terraform.tag_ip.nfs.0')}"

profile::nfs::server::devices: "%{alias('terraform.volumes.nfs')}"
profile::volumes::devices: "%{alias('terraform.self.volumes')}"

profile::reverse_proxy::domain_name: "%{alias('terraform.data.domain_name')}"
profile::reverse_proxy::subdomains:
ipa: "ipa.int.%{lookup('terraform.data.domain_name')}"
ipa: "ipa.%{lookup('profile::freeipa::base::ipa_domain')}"
mokey: "%{lookup('terraform.tag_ip.mgmt.0')}:%{lookup('profile::freeipa::mokey::port')}"
jupyter: "https://127.0.0.1:8000"

profile::jupyterhub::hub::register_url: "https://mokey.%{lookup('terraform.data.domain_name')}/auth/signup"
profile::jupyterhub::hub::reset_pw_url: "https://mokey.%{lookup('terraform.data.domain_name')}/auth/forgotpw"

profile::gpu::install::passthrough::packages:
- nvidia-driver-cuda-libs
- nvidia-driver
- nvidia-driver-devel
- nvidia-driver-libs
- nvidia-driver-NVML
- nvidia-modprobe
- nvidia-xconfig
- nvidia-persistenced
- nvidia-driver-cuda
16 changes: 0 additions & 16 deletions data/os/RedHat/7.yaml

This file was deleted.

12 changes: 0 additions & 12 deletions data/os/RedHat/8.yaml
@@ -1,14 +1,2 @@
---
profile::freeipa::server::regen_cert_cmd: ipa-getcert list | grep -oP "Request ID '\K[^']+" | xargs -I '{}' ipa-getcert resubmit -i '{}' -w
profile::gpu::install::passthrough::packages:
- kmod-nvidia-latest-dkms # require to be first package, otherwise kmod-nivida is installed
- nvidia-driver-cuda-libs
- nvidia-driver
- nvidia-driver-devel
- nvidia-driver-libs
- nvidia-driver-NVML
- nvidia-modprobe
- nvidia-xconfig
- nvidia-persistenced

os::redhat::python3::version: 3.6
11 changes: 0 additions & 11 deletions data/os/RedHat/9.yaml
@@ -1,13 +1,2 @@
---
os::redhat::python3::version: 3.9
profile::freeipa::server::regen_cert_cmd: ipa-getcert list | grep -oP "Request ID '\K[^']+" | xargs -I '{}' ipa-getcert resubmit -i '{}' -w
profile::gpu::install::passthrough::packages:
- kmod-nvidia-latest-dkms # require to be first package, otherwise kmod-nivida is installed
- nvidia-driver-cuda-libs
- nvidia-driver
- nvidia-driver-devel
- nvidia-driver-libs
- nvidia-driver-NVML
- nvidia-modprobe
- nvidia-xconfig
- nvidia-persistenced
5 changes: 5 additions & 0 deletions data/site.yaml
@@ -4,6 +4,8 @@ lookup_options:
merge: 'first'
magic_castle::site::tags:
merge: 'hash'
terraform:
merge: 'hash'

magic_castle::site::all:
- profile::base
@@ -13,6 +15,7 @@ magic_castle::site::all:
- profile::sssd::client
- profile::metrics::node_exporter
- profile::rsyslog::client
- profile::volumes
- swap_file

magic_castle::site::tags:
@@ -37,6 +40,7 @@ magic_castle::site::tags:
- profile::freeipa::mokey
- profile::slurm::accounting
- profile::accounts
- profile::nfs
- profile::users::ldap
node:
- profile::gpu
@@ -52,6 +56,7 @@ magic_castle::site::tags:
- profile::cvmfs::alien_cache
proxy:
- profile::jupyterhub::hub
- profile::jupyterhub::hub::keytab
- profile::reverse_proxy
efa:
- profile::efa
16 changes: 8 additions & 8 deletions data/software_stack/computecanada.yaml
@@ -1,21 +1,21 @@
profile::software_stack::initial_profile: "/cvmfs/soft.computecanada.ca/config/profile/bash.sh"
profile::software_stack::lmod_default_modules:
- gentoo/2020
- imkl/2020.1.217
- gcc/9.3.0
- openmpi/4.0.3
- StdEnv/2023

jupyterhub::kernel::venv::python: /cvmfs/soft.computecanada.ca/easybuild/software/2020/%{facts.cpu_ext}/Core/python/3.9.6/bin/python
jupyterhub::kernel::venv::python: /cvmfs/soft.computecanada.ca/easybuild/software/2023/%{facts.cpu_microarch}/Compiler/gcccore/python/3.11.5/bin/python
jupyterhub::kernel::venv::prefix: /opt/ipython-kernel-computecanada
jupyterhub::kernel::venv::pip_environment:
PYTHONPATH: "/cvmfs/soft.computecanada.ca/custom/python/site-packages"
PIP_CONFIG_FILE: "/cvmfs/soft.computecanada.ca/config/python/pip-%{facts.cpu_ext}-gentoo.conf"
PYTHONPATH: "/cvmfs/soft.computecanada.ca/easybuild/python/site-packages:/cvmfs/soft.computecanada.ca/custom/python/site-packages"
PIP_CONFIG_FILE: "/cvmfs/soft.computecanada.ca/config/python/pip-%{facts.cpu_microarch}-gentoo2023.conf"
jupyterhub::kernel::venv::kernel_environment:
"PYTHONPATH": "/cvmfs/soft.computecanada.ca/easybuild/python/site-packages:${PYTHONPATH}"
"EBPYTHONPREFIXES": "${SLURM_TMPDIR}:${EBPYTHONPREFIXES}"

jupyterhub::jupyterhub_config_hash:
SlurmFormSpawner:
ui_args:
rstudio:
modules: ['gcc/9.3.0', 'rstudio-server']
modules: ['rstudio-server']
code-server:
modules: ['code-server']

32 changes: 27 additions & 5 deletions hiera.yaml
@@ -8,14 +8,37 @@ defaults:
# datadir: data
# data_hash: yaml_data
hierarchy:
- name: "Terraform and user data"
- name: "Per hostname"
globs:
- "user_data/hostnames/%{facts.networking.hostname}/*.yaml"
- "user_data/hostnames/%{facts.networking.hostname}.yaml"
lookup_key: eyaml_lookup_key # eyaml backend
paths:
options:
pkcs7_private_key: /etc/puppetlabs/puppet/eyaml/private_key.pkcs7.pem
- name: "Per prefix"
globs:
- "user_data/prefixes/%{facts.prefix}/*.yaml"
- "user_data/prefixes/%{facts.prefix}.yaml"
lookup_key: eyaml_lookup_key # eyaml backend
options:
pkcs7_private_key: /etc/puppetlabs/puppet/eyaml/private_key.pkcs7.pem
- name: "Rest of user data"
globs:
- "user_data/*.yaml"
- "user_data.yaml"
- "terraform_data.yaml"
lookup_key: eyaml_lookup_key # eyaml backend
options:
pkcs7_private_key: /etc/puppetlabs/puppet/eyaml/private_key.pkcs7.pem
pkcs7_public_key: /etc/puppetlabs/puppet/eyaml/public_key.pkcs7.pem
- name: "Terraform data"
path: "terraform_data.yaml"
lookup_key: eyaml_lookup_key # eyaml backend
options:
pkcs7_private_key: /etc/puppetlabs/puppet/eyaml/private_key.pkcs7.pem
- name: "Terraform self"
data_hash: terraform_self
path: "terraform_data.yaml"
options:
hostname: "%{facts.networking.hostname}"
- name: "Software stack"
path: "software_stack/%{facts.software_stack}.yaml"
- name: "Cloud provider"
@@ -31,6 +54,5 @@ hierarchy:
- "bootstrap.yaml"
options:
pkcs7_private_key: /etc/puppetlabs/puppet/eyaml/boot_private_key.pkcs7.pem
pkcs7_public_key: /etc/puppetlabs/puppet/eyaml/boot_public_key.pkcs7.pem
- name: "site.pp definition"
path: "site.yaml"
19 changes: 19 additions & 0 deletions lib/puppet/functions/terraform_self.rb
@@ -0,0 +1,19 @@
require 'yaml'

Puppet::Functions.create_function(:terraform_self) do
  dispatch :terraform_self do
    param 'Hash', :options
    param 'Puppet::LookupContext', :context
  end

  def terraform_self(options, context)
    path = options['path']
    hostname = options['hostname']
    data = context.cached_file_data(path) do |content|
      Puppet::Util::Yaml.safe_load(content, [Symbol], path)
    end
    return { 'terraform' => { 'self' => data['terraform']['instances'][hostname] || {} } }
  end
end
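
For reference, a hypothetical `terraform_data.yaml` excerpt that this function consumes; the backend exposes the entry matching the instance's hostname under `terraform.self`, which is what lookups such as `lookup('terraform.self.tags')` in `manifests/site.pp` resolve against:

```yaml
terraform:
  instances:
    mgmt1:
      local_ip: "10.0.0.5"
      tags: ["mgmt", "nfs", "puppet"]
```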
2 changes: 1 addition & 1 deletion manifests/site.pp
@@ -1,5 +1,5 @@
node default {
$instance_tags = lookup("terraform.instances.${facts['networking']['hostname']}.tags")
$instance_tags = lookup('terraform.self.tags')

$include_all = lookup('magic_castle::site::all', undef, undef, [])

12 changes: 11 additions & 1 deletion site/profile/facts.d/cpu_ext.sh
@@ -8,4 +8,14 @@ case "$cpu_ext" in
cpu_ext="sse3"
;;
esac
echo "{ 'cpu_ext' : '${cpu_ext}' }"

case "$cpu_ext" in
avx512)
cpu_microarch="x86-64-v4"
;;
avx2)
cpu_microarch="x86-64-v3"
;;
esac

echo "{ 'cpu_ext' : '${cpu_ext}', 'cpu_microarch': '${cpu_microarch}' }"
9 changes: 4 additions & 5 deletions site/profile/facts.d/dev_disk.sh
@@ -1,7 +1,6 @@
#!/bin/bash

echo "{ '/dev/disk' : {"
echo "---"
echo \"/dev/disk\":
for i in $(find /dev/disk -type l); do
echo \"$i\":\"$(readlink -f $i)\";
done | paste -sd,
echo '}}'
echo " "\"$i\": \"$(readlink -f $i)\"
done
5 changes: 5 additions & 0 deletions site/profile/facts.d/ipa.sh
@@ -0,0 +1,5 @@
#!/bin/sh
echo "---"
echo '"ipa":'
echo ' "installed":' $(test -f /etc/ipa/default.conf && echo "true" || echo "false")
echo ' "domain":' $(test -f /etc/ipa/default.conf && grep -oP 'domain\s*=\s*\K(.*)' /etc/ipa/default.conf)
91 changes: 42 additions & 49 deletions site/profile/files/accounts/account_functions.sh
@@ -37,10 +37,9 @@ mkhome () {
return 1
fi

local MNT_USER_HOME="/mnt${USER_HOME}"
local RSYNC_DONE=0
for i in $(seq 1 5); do
rsync -opg -r -u --chown=$USER_UID:$USER_UID --chmod=Dg-rwx,o-rwx,Fg-rwx,o-rwx,u+X /etc/skel.ipa/ ${MNT_USER_HOME}
rsync -opg -r -u --chown=$USER_UID:$USER_UID --chmod=Dg-rwx,o-rwx,Fg-rwx,o-rwx,u+X /etc/skel.ipa/ ${USER_HOME}
if [ $? -eq 0 ]; then
RSYNC_DONE=1
break
@@ -49,12 +48,12 @@ mkhome () {
fi
done
if [ ! $RSYNC_DONE -eq 1 ]; then
echo "ERROR::${FUNCNAME} ${USERNAME}: cannot copy /etc/skel.ipa in ${MNT_USER_HOME}"
echo "ERROR::${FUNCNAME} ${USERNAME}: cannot copy /etc/skel.ipa in ${USER_HOME}"
return 1
else
echo "INFO::${FUNCNAME} ${USERNAME}: created ${MNT_USER_HOME}"
echo "INFO::${FUNCNAME} ${USERNAME}: created ${USER_HOME}"
fi
restorecon -F -R ${MNT_USER_HOME}
restorecon -F -R ${USER_HOME}
}

mkscratch () {
@@ -88,25 +87,24 @@ mkscratch () {
fi

local USER_SCRATCH="/scratch/${USERNAME}"
local MNT_USER_SCRATCH="/mnt${USER_SCRATCH}"
if [[ ! -d "${MNT_USER_SCRATCH}" ]]; then
mkdir -p ${MNT_USER_SCRATCH}
if [[ ! -d "${USER_SCRATCH}" ]]; then
mkdir -p ${USER_SCRATCH}
if [ "$WITH_HOME" == "true" ]; then
local MNT_USER_HOME="/mnt${USER_HOME}"
ln -sfT ${USER_SCRATCH} "${MNT_USER_HOME}/scratch"
chown -h ${USER_UID}:${USER_UID} "${MNT_USER_HOME}/scratch"
ln -sfT ${USER_SCRATCH} "${USER_HOME}/scratch"
chown -h ${USER_UID}:${USER_UID} "${USER_HOME}/scratch"
fi
chown -h ${USER_UID}:${USER_UID} ${MNT_USER_SCRATCH}
chmod 750 ${MNT_USER_SCRATCH}
restorecon -F -R ${MNT_USER_SCRATCH}
echo "INFO::${FUNCNAME} ${USERNAME}: created ${MNT_USER_SCRATCH}"
chown -h ${USER_UID}:${USER_UID} ${USER_SCRATCH}
chmod 750 ${USER_SCRATCH}
restorecon -F -R ${USER_SCRATCH}
echo "INFO::${FUNCNAME} ${USERNAME}: created ${USER_SCRATCH}"
fi
return 0
}

mkproject() {
local GROUP=$1
local WITH_FOLDER=$2
local PROJECT_GROUP="/project/$GROUP"

if [ -z "${GROUP}" ]; then
echo "ERROR::${FUNCNAME}: group unspecified"
@@ -116,27 +114,27 @@ mkproject() {
if mkdir /var/lock/mkproject.$GROUP.lock; then
# A new group has been created
if [ "$WITH_FOLDER" == "true" ]; then
GID=$(SSS_NSS_USE_MEMCACHE=no getent group $GROUP 2> /dev/null | cut -d: -f3)
local GID=$(SSS_NSS_USE_MEMCACHE=no getent group $GROUP 2> /dev/null | cut -d: -f3)
if [ $? -eq 0 ]; then
GID=$(kexec ipa group-show ${GROUP} | grep -oP 'GID: \K([0-9].*)')
local GID=$(kexec ipa group-show ${GROUP} | grep -oP 'GID: \K([0-9].*)')
fi

if [ -z "${GID}" ]; then
echo "ERROR::${FUNCNAME} ${GROUP}: GID not defined"
return 1
fi

MNT_PROJECT_GID="/mnt/project/$GID"
if [ ! -d ${MNT_PROJECT_GID} ]; then
MNT_PROJECT_GROUP="/mnt/project/$GROUP"
mkdir -p ${MNT_PROJECT_GID}
chown root:${GID} ${MNT_PROJECT_GID}
chmod 2770 ${MNT_PROJECT_GID}
ln -sfT "/project/$GID" ${MNT_PROJECT_GROUP}
restorecon -F -R ${MNT_PROJECT_GID} ${MNT_PROJECT_GROUP}
echo "INFO::${FUNCNAME} ${GROUP}: created ${MNT_PROJECT_GID}"
local PROJECT_GID="/project/$GID"
if [ ! -d ${PROJECT_GID} ]; then
local PROJECT_GROUP="/project/$GROUP"
mkdir -p ${PROJECT_GID}
chown root:${GID} ${PROJECT_GID}
chmod 2770 ${PROJECT_GID}
ln -sfT "/project/$GID" ${PROJECT_GROUP}
restorecon -F -R ${PROJECT_GID} ${PROJECT_GROUP}
echo "INFO::${FUNCNAME} ${GROUP}: created ${PROJECT_GID}"
else
echo "WARN::${FUNCNAME} ${GROUP}: ${MNT_PROJECT_GID} already exists"
echo "WARN::${FUNCNAME} ${GROUP}: ${PROJECT_GID} already exists"
fi
fi
# We create the associated account in slurm
@@ -163,12 +161,13 @@ modproject() {
return 3
fi

local PROJECT_GROUP="/project/$GROUP"
# mkproject is currently running, we skip adding more folders under the project
if [ -d /var/lock/mkproject.$GROUP.lock ]; then
echo "ERROR::${FUNCNAME}: $GROUP $USERNAMES group folder is locked"
return 1
fi
local GROUP_LINK=$(readlink /mnt/project/${GROUP})
local GROUP_LINK=$(readlink /project/${GROUP})
# mkproject has not yet been run for this group, skip it
if [[ "${WITH_FOLDER}" == "true" ]]; then
if [[ -z "${GROUP_LINK}" ]]; then
@@ -185,7 +184,6 @@ modproject() {
# If we found none, $USERNAMES will be empty, and it means we don't have
# anything to add to Slurm and /project
if [[ ! -z "${USERNAMES}" ]]; then
local MNT_PROJECT="/mnt${GROUP_LINK}"
if [ "$WITH_FOLDER" == "true" ]; then
for USERNAME in $USERNAMES; do
# Slurm needs the UID to be available via SSSD
@@ -201,23 +199,20 @@ modproject() {
return 1
fi

local MNT_USER_HOME="/mnt${USER_HOME}"

mkdir -p "${MNT_USER_HOME}/projects"
chgrp "${USER_UID}" "${MNT_USER_HOME}/projects"
chmod 0755 "${MNT_USER_HOME}/projects"
ln -sfT "/project/${GROUP}" "${MNT_USER_HOME}/projects/${GROUP}"

local PRO_USER="${MNT_PROJECT}/${USERNAME}"
if [ ! -d "${PRO_USER}" ]; then
mkdir -p ${PRO_USER}

chown "${USER_UID}" "${PRO_USER}"
chmod 2700 "${PRO_USER}"
restorecon -F -R "${PRO_USER}"
echo "INFO::${FUNCNAME} ${GROUP} ${USERNAME}: created ${PRO_USER}"
mkdir -p "${USER_HOME}/projects"
chgrp "${USER_UID}" "${USER_HOME}/projects"
chmod 0755 "${USER_HOME}/projects"
ln -sfT "${PROJECT_GROUP}" "${USER_HOME}/projects/${GROUP}"

local PROJECT_USER="${PROJECT_GROUP}/${USERNAME}"
if [ ! -d "${PROJECT_USER}" ]; then
mkdir -p ${PROJECT_USER}
chown "${USER_UID}" "${PROJECT_USER}"
chmod 2700 "${PROJECT_USER}"
restorecon -F -R "${PROJECT_USER}"
echo "INFO::${FUNCNAME} ${GROUP} ${USERNAME}: created ${PROJECT_USER}"
else
echo "WARN::${FUNCNAME} ${GROUP} ${USERNAME}: ${PRO_USER} already exists"
echo "WARN::${FUNCNAME} ${GROUP} ${USERNAME}: ${PROJECT_USER} already exists"
fi
done
fi
@@ -252,8 +247,7 @@ modproject() {
return 1
fi

local MNT_USER_HOME="/mnt${USER_HOME}"
rm "${MNT_USER_HOME}/projects/$GROUP" &> /dev/null
rm "${USER_HOME}/projects/$GROUP" &> /dev/null
if [ $? -eq 0 ]; then
echo "INFO::${FUNCNAME} ${GROUP}: removed symlink $USER_HOME/projects/$GROUP"
else
@@ -298,8 +292,7 @@ delproject() {
return 1
fi

local MNT_USER_HOME="/mnt${USER_HOME}"
rm "${MNT_USER_HOME}/projects/$GROUP"
rm "${USER_HOME}/projects/$GROUP"
done
fi
fi
16 changes: 11 additions & 5 deletions site/profile/files/base/prepare4image.sh
@@ -14,19 +14,21 @@ rm -f /var/log/ipaclient-install.log
rm -rf /etc/sssd/sssd.conf.deleted

rm -rf /etc/puppetlabs
rm -rf /opt/puppetlabs/puppet/cache
rm -rf /opt/puppetlabs/puppet/cache/{clientbucket,client_data,client_yaml,state}
rm /opt/consul/node-id /opt/consul/checkpoint-signature /opt/consul/serf/local.snapshot

# Turn off swap
swapoff -a
grep -q "swap" /etc/fstab && rm -f $(grep "swap" /etc/fstab | cut -f 1)
# Unmount filesystems
grep -v -P '(ext4|xfs|vfat|swap|^#|^$)' /etc/fstab | cut -f 2 | xargs umount
grep -P '(ext4|xfs|vfat|^#|^$)' /etc/fstab > /etc/fstab.new
umount -a --types cephfs,nfs4
# for xfs, we unmount only what's in /mnt, not things like / or /boot
grep xfs /etc/fstab | cut -f 2 | grep /mnt | xargs --no-run-if-empty umount
grep -P '(ext4|xfs|vfat|^#|^$)' /etc/fstab | grep -v /mnt > /etc/fstab.new
mv -f /etc/fstab.new /etc/fstab
systemctl daemon-reload

systemctl stop syslog
systemctl stop rsyslog
: > /var/log/messages
: > /var/log/munge/munged.log
: > /var/log/secure
@@ -50,10 +52,14 @@ rm -f /etc/hostname
rm -f /etc/udev/rules.d/70-persistent-net.rules
: > /etc/sysconfig/network
: > /etc/machine-id

rm /etc/NetworkManager/conf.d/zzz-puppet.conf
: > /etc/resolv.conf

cat > /etc/sysconfig/network-scripts/ifcfg-eth0 << EOF
DEVICE=eth0
TYPE=Ethernet
ONBOOT=yes
BOOTPROTO=dhcp
EOF
halt -p
halt -p
18 changes: 0 additions & 18 deletions site/profile/files/slurm/slurm-consul.tpl

This file was deleted.

15 changes: 9 additions & 6 deletions site/profile/files/users/ipa_create_user.py
@@ -109,13 +109,16 @@ def main(users, posix_groups, nonposix_groups, passwd, sshpubkeys):
)
if user is not None:
added_users.add(username)
for group in posix_groups:
group_add(group)
group_add_members(group, users)

for group in nonposix_groups:
group_add(group, nonposix=True)
group_add_members(group, users)
if posix_groups:
for group in posix_groups:
group_add(group)
group_add_members(group, users)

if nonposix_groups:
for group in nonposix_groups:
group_add(group, nonposix=True)
group_add_members(group, users)

if passwd:
# configure user password
11 changes: 11 additions & 0 deletions site/profile/files/vector/default_config.yaml
@@ -0,0 +1,11 @@
sources:
  in:
    type: "stdin"

sinks:
  out:
    inputs:
      - "in"
    type: "console"
    encoding:
      codec: "text"
13 changes: 10 additions & 3 deletions site/profile/functions/generate_slurm_node_line.pp
@@ -1,14 +1,21 @@
function profile::generate_slurm_node_line($name, $attr, $weight) >> String {
function profile::generate_slurm_node_line($name, $attr, $comp_weight) >> String {
if $attr['specs']['gpus'] > 0 {
if $attr['specs']['mig'] and ! $attr['specs']['mig'].empty {
$gres = $attr['specs']['mig'].map|$key,$value| {
$gpu = $attr['specs']['mig'].map|$key,$value| {
['gpu', $key, $value * $attr['specs']['gpus']].join(':')
}.join(',')
} else {
$gres = "gpu:${attr['specs']['gpus']}"
$gpu = "gpu:${attr['specs']['gpus']}"
}
if $attr['specs']['shard'] and ! $attr['specs']['shard'].empty {
$shard = ",shard:${attr['specs']['shard']}"
} else {
$shard = ''
}
$gres = "${gpu}${shard}"
} else {
$gres = 'gpu:0'
}
$weight = pick($attr['specs']['weight'], $comp_weight)
"NodeName=${name} CPUs=${attr['specs']['cpus']} RealMemory=${attr['specs']['ram']} Gres=${gres} Weight=${weight}"
}
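
As a sketch of the new output format, a hypothetical node with `cpus: 4`, `ram: 15000`, `gpus: 1`, `shard: 4`, no per-node weight, and a computed weight of 1 would yield a line like:

```
NodeName=node1 CPUs=4 RealMemory=15000 Gres=gpu:1,shard:4 Weight=1
```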
16 changes: 16 additions & 0 deletions site/profile/functions/gethostnames_with_class.pp
@@ -0,0 +1,16 @@
function profile::gethostnames_with_class($class_name) >> Array[String] {
  $instances = lookup('terraform.instances')
  $site_all = lookup('magic_castle::site::all')
  $site_tags = lookup('magic_castle::site::tags')

  if $class_name in $site_all {
    return $instances.keys()
  } else {
    $tags = keys($site_tags).filter |$tag| {
      $class_name in $site_tags[$tag]
    }
    return keys($instances).filter |$hostname| {
      !intersection($tags, $instances[$hostname]['tags']).empty
    }
  }
}
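
A hypothetical call site; `profile::nfs` is listed under the `mgmt` tag in `data/site.yaml`, so this would resolve to the hostnames of the `mgmt`-tagged instances:

```puppet
$nfs_hosts = profile::gethostnames_with_class('profile::nfs')
```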
2 changes: 1 addition & 1 deletion site/profile/functions/getlocalinterface.pp
@@ -1,5 +1,5 @@
function profile::getlocalinterface() >> String {
$local_ip = lookup("terraform.instances.${facts['networking']['hostname']}.local_ip")
$local_ip = lookup('terraform.self.local_ip')
$interfaces = keys($facts['networking']['interfaces'])
$search = $interfaces.filter | $interface | {
$facts['networking']['interfaces'][$interface]['ip'] == $local_ip
23 changes: 12 additions & 11 deletions site/profile/manifests/accounts.pp
@@ -1,8 +1,14 @@
# @summary Class configuring services to bridge LDAP users, Slurm accounts and users' folders in filesystems
# @param project_regex Regex identifying FreeIPA groups that require a corresponding Slurm account
# @param manage_home
# @param manage_scratch
# @param manage_project
# @param skel_archives Archives extracted in each FreeIPA user's home when created
class profile::accounts (
String $project_regex,
Boolean $manage_home = true,
Boolean $manage_scratch = true,
Boolean $manage_project = true,
Array[Struct[{ filename => String[1], source => String[1] }]] $skel_archives = [],
) {
Service <| tag == profile::slurm |> -> Service['mkhome']
@@ -12,11 +18,6 @@
Mount <| |> -> Service['mkhome']
Mount <| |> -> Service['mkproject']

$nfs_devices = lookup('profile::nfs::server::devices', undef, undef, {})
$with_home = 'home' in $nfs_devices
$with_project = 'project' in $nfs_devices
$with_scratch = 'scratch' in $nfs_devices

package { 'rsync':
ensure => 'installed',
}
@@ -29,10 +30,10 @@
file { '/sbin/mkhome.sh':
content => epp('profile/accounts/mkhome.sh',
{
with_home => $with_home,
with_scratch => $with_scratch,
project_regex => $project_regex,
with_project => $with_project,
manage_home => $manage_home,
manage_scratch => $manage_scratch,
project_regex => $project_regex,
manage_project => $manage_project,
}
),
mode => '0755',
@@ -93,7 +94,7 @@
path => ['/bin/', '/usr/bin'],
}

$mkhome_running = $with_home or $with_scratch
$mkhome_running = $manage_home or $manage_scratch
service { 'mkhome':
ensure => $mkhome_running,
enable => $mkhome_running,
@@ -113,7 +114,7 @@
content => epp('profile/accounts/mkproject.sh',
{
project_regex => $project_regex,
with_folder => $with_project,
manage_folder => $manage_project,
}
),
mode => '0755',
43 changes: 18 additions & 25 deletions site/profile/manifests/base.pp
@@ -1,5 +1,6 @@
class profile::base (
String $version,
Array[String] $packages,
Optional[String] $admin_email = undef,
) {
include stdlib
@@ -18,12 +19,6 @@
mode => '0755',
}

if dig($::facts, 'os', 'release', 'major') == '7' {
package { 'yum-plugin-priorities':
ensure => 'installed',
}
}

file { '/etc/localtime':
ensure => link,
target => '/usr/share/zoneinfo/UTC',
@@ -65,13 +60,16 @@
ensure => 'absent',
}

class { 'firewall': }
class { 'firewall':
tag => 'mc_bootstrap',
}

firewall { '001 accept all from local network':
chain => 'INPUT',
proto => 'all',
source => profile::getcidr(),
action => 'accept',
tag => 'mc_bootstrap',
}

firewall { '001 drop access to metadata server':
@@ -80,6 +78,7 @@
destination => '169.254.169.254',
action => 'drop',
uid => '! root',
tag => 'mc_bootstrap',
}

package { 'haveged':
@@ -98,6 +97,8 @@
require => Package['haveged'],
}

ensure_packages($packages, { ensure => 'installed', require => Yumrepo['epel'] })

if $::facts.dig('cloud', 'provider') == 'azure' {
include profile::base::azure
}
@@ -132,8 +133,7 @@

# build /etc/hosts
class profile::base::etc_hosts {
$domain_name = lookup('profile::freeipa::base::domain_name')
$int_domain_name = "int.${domain_name}"
$ipa_domain = lookup('profile::freeipa::base::ipa_domain')
$instances = lookup('terraform.instances')

# build /etc/hosts
@@ -144,28 +144,21 @@
content => epp('profile/base/hosts',
{
'instances' => $instances,
'int_domain_name' => $int_domain_name,
'int_domain_name' => $ipa_domain,
}
),
}
}

class profile::base::powertools {
if versioncmp($::facts['os']['release']['major'], '8') >= 0 {
if versioncmp($::facts['os']['release']['major'], '8') == 0 {
$repo_name = 'powertools'
} else {
$repo_name = 'crb'
}
exec { 'enable_powertools':
command => "dnf config-manager --set-enabled ${$repo_name}",
unless => "dnf config-manager --dump ${repo_name} | grep -q \'enabled = 1\'",
path => ['/usr/bin'],
}
if versioncmp($::facts['os']['release']['major'], '8') == 0 {
$repo_name = 'powertools'
} else {
exec { 'enable_powertools':
command => '/bin/true',
refreshonly => true,
}
$repo_name = 'crb'
}
exec { 'enable_powertools':
command => "dnf config-manager --set-enabled ${$repo_name}",
unless => "dnf config-manager --dump ${repo_name} | grep -q \'enabled = 1\'",
path => ['/usr/bin'],
}
}
155 changes: 79 additions & 76 deletions site/profile/manifests/ceph.pp
@@ -1,130 +1,133 @@
type BindMount = Struct[{
'src' => Stdlib::Unixpath,
'dst' => Stdlib::Unixpath,
'type' => Optional[Enum['file', 'directory']],
}]

type CephFS = Struct[
{
'share_name' => String,
'access_key' => String,
'export_path' => Stdlib::Unixpath,
'bind_mounts' => Optional[Array[BindMount]],
'binds_fcontext_equivalence' => Optional[Stdlib::Unixpath],
}
]

class profile::ceph::client (
String $share_name,
String $access_key,
String $export_path,
Array[String] $mon_host,
Array[String] $mount_binds = [],
String $mount_name = 'cephfs01',
String $binds_fcontext_equivalence = '/home',
Hash[String, CephFS] $shares,
) {
class { 'profile::ceph::client::config':
share_name => $share_name,
access_key => $access_key,
export_path => $export_path,
mon_host => $mon_host,
}

file { "/mnt/${mount_name}":
ensure => directory,
}
require profile::ceph::client::install

$mon_host_string = join($mon_host, ',')
mount { "/mnt/${mount_name}":
ensure => 'mounted',
fstype => 'ceph',
device => "${mon_host_string}:${export_path}",
options => "name=${share_name},secretfile=/etc/ceph/client.keyonly.${share_name}",
require => Class['profile::ceph::client::config'],
}

$mount_binds.each |$mount| {
file { "/mnt/${mount_name}/${mount}":
ensure => directory,
require => Class['profile::ceph::client::config'],
}
file { "/${mount}":
ensure => directory,
require => Class['profile::ceph::client::config'],
}
mount { "/${mount}":
ensure => 'mounted',
fstype => 'none',
options => 'rw,bind',
device => "/mnt/${mount_name}/${mount}",
require => [
File["/mnt/${mount_name}/${mount}"],
File["/${mount}"],
],
}
$ceph_conf = @("EOT")
[client]
client quota = true
mon host = ${mon_host_string}
| EOT

if ($binds_fcontext_equivalence != '' and "/${mount}" != $binds_fcontext_equivalence) {
selinux::fcontext::equivalence { "/${mount}":
ensure => 'present',
target => $binds_fcontext_equivalence,
require => Mount["/${mount}"],
notify => Selinux::Exec_restorecon["/${mount}"],
}
selinux::exec_restorecon { "/${mount}": }
}
file { '/etc/ceph/ceph.conf':
content => $ceph_conf,
}

ensure_resources(profile::ceph::client::share, $shares, { 'mon_host' => $mon_host, 'bind_mounts' => [] })
}

class profile::ceph::client::install {
class profile::ceph::client::install (
String $release = 'reef',
Optional[String] $version = undef,
) {
include epel

if $version != undef and $version != '' {
$repo = "rpm-${version}"
} else {
$repo = "rpm-${release}"
}

yumrepo { 'ceph-stable':
ensure => present,
enabled => true,
baseurl => "https://download.ceph.com/rpm-nautilus/el${$::facts['os']['release']['major']}/${::facts['architecture']}/",
baseurl => "https://download.ceph.com/${repo}/el${$::facts['os']['release']['major']}/${::facts['architecture']}/",
gpgcheck => 1,
gpgkey => 'https://download.ceph.com/keys/release.asc',
repo_gpgcheck => 0,
}

if versioncmp($::facts['os']['release']['major'], '8') >= 0 {
$argparse_pkgname = 'python3-ceph-argparse'
} else {
$argparse_pkgname = 'python-ceph-argparse'
}

package {
[
'libcephfs2',
'python-cephfs',
'ceph-common',
$argparse_pkgname,
'python3-ceph-argparse',
# 'ceph-fuse',
]:
ensure => installed,
require => [Yumrepo['epel'], Yumrepo['ceph-stable']],
}
}

class profile::ceph::client::config (
define profile::ceph::client::share (
Array[String] $mon_host,
String $share_name,
String $access_key,
String $export_path,
Array[String] $mon_host,
Stdlib::Unixpath $export_path,
Array[BindMount] $bind_mounts,
Optional[Stdlib::Unixpath] $binds_fcontext_equivalence = undef,
) {
require profile::ceph::client::install

$client_fullkey = @("EOT")
[client.${share_name}]
[client.${name}]
key = ${access_key}
| EOT

file { "/etc/ceph/client.fullkey.${share_name}":
file { "/etc/ceph/client.fullkey.${name}":
content => $client_fullkey,
mode => '0600',
owner => 'root',
group => 'root',
}

file { "/etc/ceph/client.keyonly.${share_name}":
file { "/etc/ceph/client.keyonly.${name}":
content => Sensitive($access_key),
mode => '0600',
owner => 'root',
group => 'root',
}
file { "/mnt/${name}":
ensure => directory,
}

$mon_host_string = join($mon_host, ',')
$ceph_conf = @("EOT")
[client]
client quota = true
mon host = ${mon_host_string}
| EOT
mount { "/mnt/${name}":
ensure => 'mounted',
fstype => 'ceph',
device => "${mon_host_string}:${export_path}",
options => "name=${share_name},secretfile=/etc/ceph/client.keyonly.${name}",
require => File['/etc/ceph/ceph.conf'],
}

file { '/etc/ceph/ceph.conf':
content => $ceph_conf,
$bind_mounts.each |$mount| {
file { $mount['dst']:
ensure => pick($mount['type'], 'directory'),
}
mount { $mount['dst']:
ensure => 'mounted',
fstype => 'none',
options => 'rw,bind',
device => "/mnt/${name}${mount['src']}",
require => [
File[$mount['dst']],
Mount["/mnt/${name}"]
],
}

if ($binds_fcontext_equivalence and $binds_fcontext_equivalence != $mount['dst']) {
selinux::fcontext::equivalence { $mount['dst']:
ensure => 'present',
target => $binds_fcontext_equivalence,
require => Mount[$mount['dst']],
}
}
}
}
4 changes: 1 addition & 3 deletions site/profile/manifests/consul.pp
@@ -3,9 +3,7 @@

include consul_template

$interface = profile::getlocalinterface()
$ipaddress = $facts['networking']['interfaces'][$interface]['ip']

$ipaddress = lookup('terraform.self.local_ip')
if $ipaddress in $servers {
$is_server = true
$bootstrap_expect = length($servers)
4 changes: 3 additions & 1 deletion site/profile/manifests/cvmfs.pp
@@ -1,6 +1,7 @@
class profile::cvmfs::client (
Integer $quota_limit,
Array[String] $repositories,
Boolean $strict_mount = false,
Array[String] $repositories = [],
Array[String] $alien_cache_repositories = [],
) {
include profile::consul
@@ -41,6 +42,7 @@

file { '/etc/cvmfs/default.local.ctmpl':
content => epp('profile/cvmfs/default.local', {
'strict_mount' => $strict_mount ? { true => 'yes', false => 'no' }, # lint:ignore:selector_inside_resource
'quota_limit' => $quota_limit,
'repositories' => $repositories + $alien_cache_repositories,
}),
333 changes: 131 additions & 202 deletions site/profile/manifests/freeipa.pp

Large diffs are not rendered by default.

2 changes: 1 addition & 1 deletion site/profile/manifests/globus.pp
@@ -3,7 +3,7 @@
ensure => installed,
}

$public_ip = lookup("terraform.instances.${facts['networking']['hostname']}.public_ip")
$public_ip = lookup('terraform.self.public_ip')
class { 'globus':
display_name => $globus::display_name,
client_id => $globus::client_id,
42 changes: 19 additions & 23 deletions site/profile/manifests/gpu.pp
@@ -82,21 +82,27 @@

class profile::gpu::install::passthrough (
Array[String] $packages,
String $nvidia_driver_stream = '550-dkms'
) {
$os = "rhel${::facts['os']['release']['major']}"
$arch = $::facts['os']['architecture']
if versioncmp($::facts['os']['release']['major'], '8') >= 0 {
$repo_config_cmd = 'dnf config-manager'
} else {
$repo_config_cmd = 'yum-config-manager'
}

exec { 'cuda-repo':
command => "${repo_config_cmd} --add-repo http://developer.download.nvidia.com/compute/cuda/repos/${os}/${arch}/cuda-${os}.repo",
command => "dnf config-manager --add-repo http://developer.download.nvidia.com/compute/cuda/repos/${os}/${arch}/cuda-${os}.repo",
creates => "/etc/yum.repos.d/cuda-${os}.repo",
path => ['/usr/bin'],
}

package { 'nvidia-stream':
ensure => $nvidia_driver_stream,
name => 'nvidia-driver',
provider => dnfmodule,
enable_only => true,
require => [
Exec['cuda-repo'],
],
}

$mig_profile = lookup("terraform.instances.${facts['networking']['hostname']}.specs.mig", Variant[Undef, Hash[String, Integer]], undef, {})
class { 'profile::gpu::config::mig':
mig_profile => $mig_profile,
@@ -106,6 +112,7 @@
package { $packages:
ensure => 'installed',
require => [
Package['nvidia-stream'],
Package['kernel-devel'],
Exec['cuda-repo'],
Yumrepo['epel'],
@@ -115,26 +122,15 @@
# Used by slurm-job-exporter to export GPU metrics
-> package { 'datacenter-gpu-manager': }

-> file { '/run/nvidia-persistenced':
ensure => directory,
owner => 'nvidia-persistenced',
group => 'nvidia-persistenced',
mode => '0755',
}

-> augeas { 'nvidia-persistenced.service':
context => '/files/lib/systemd/system/nvidia-persistenced.service/Service',
changes => [
'set User/value nvidia-persistenced',
'set Group/value nvidia-persistenced',
'set DynamicUser/value yes',
'set StateDirectory/value nvidia-persistenced',
'set RuntimeDirectory/value nvidia-persistenced',
'rm ExecStart/arguments',
],
}

file { '/usr/lib/tmpfiles.d/nvidia-persistenced.conf':
content => 'd /run/nvidia-persistenced 0755 nvidia-persistenced nvidia-persistenced -',
mode => '0644',
}
}

class profile::gpu::config::mig (
@@ -232,7 +228,7 @@

# Used by slurm-job-exporter to export GPU metrics
# DCGM does not work with GRID VGPU, most of the stats are missing
ensure_packages(['python3'], { ensure => 'present' })
ensure_packages(['python3', 'python3-pip'], { ensure => 'present' })
$py3_version = lookup('os::redhat::python3::version')

exec { 'pip install nvidia-ml-py':
@@ -247,9 +243,9 @@
String $source,
Array[String] $packages,
) {
$source_pkg_name = split(split($source, '[/]')[-1], '[.]')[0]
$source_pkg_name = (split($source, '[/]')[-1]).regsubst(/\.rpm/, '', 'G')
package { 'vgpu-repo':
ensure => 'latest',
ensure => 'installed',
provider => 'rpm',
name => $source_pkg_name,
source => $source,
61 changes: 60 additions & 1 deletion site/profile/manifests/jupyterhub.pp
@@ -3,8 +3,8 @@
String $reset_pw_url = '', # lint:ignore:params_empty_string_assignment
) {
contain jupyterhub
ensure_resource('service', 'sssd', { 'ensure' => running, 'enable' => true })

Service <| tag == profile::sssd |> ~> Service['jupyterhub']
Yumrepo['epel'] -> Class['jupyterhub']

file { '/etc/jupyterhub/templates/login.html':
@@ -21,6 +21,16 @@
tags => ['jupyterhub'],
token => lookup('profile::consul::acl_api_token'),
}

file { "${jupyterhub::prefix}/bin/ipa_create_user.py":
source => 'puppet:///modules/profile/users/ipa_create_user.py',
mode => '0755',
}

file { "${jupyterhub::prefix}/bin/kinit_wrapper":
source => 'puppet:///modules/profile/freeipa/kinit_wrapper',
mode => '0755',
}
}

class profile::jupyterhub::node {
@@ -31,3 +41,52 @@
}
}
}

class profile::jupyterhub::hub::keytab {
$ipa_domain = lookup('profile::freeipa::base::ipa_domain')
$jupyterhub_prefix = lookup('jupyterhub::prefix', undef, undef, '/opt/jupyterhub')

$fqdn = "${facts['networking']['hostname']}.${ipa_domain}"
$service_name = "jupyterhub/${fqdn}"
$service_register_script = @("EOF")
api.Command.batch(
{ 'method': 'service_add', 'params': [['${service_name}'], {}]},
{ 'method': 'service_add_principal', 'params': [['${service_name}', 'jupyterhub/jupyterhub'], {}]},
{ 'method': 'role_add', 'params': [['JupyterHub'], {'description' : 'JupyterHub User management'}]},
{ 'method': 'role_add_privilege', 'params': [['JupyterHub'], {'privilege' : 'Group Administrators'}]},
{ 'method': 'role_add_privilege', 'params': [['JupyterHub'], {'privilege' : 'User Administrators'}]},
{ 'method': 'role_add_member', 'params': [['JupyterHub'], {'service' : '${service_name}'}]},
)
|EOF

file { "${jupyterhub_prefix}/bin/ipa_register_service.py":
content => $service_register_script,
require => Exec['jupyterhub_venv'],
}

$ipa_passwd = lookup('profile::freeipa::server::admin_password')
$keytab_command = @("EOT")
kinit_wrapper ipa console ${jupyterhub_prefix}/bin/ipa_register_service.py && \
kinit_wrapper ipa-getkeytab -p jupyterhub/jupyterhub -k /etc/jupyterhub/jupyterhub.keytab
|EOT
exec { 'jupyterhub_keytab':
command => $keytab_command,
creates => '/etc/jupyterhub/jupyterhub.keytab',
require => [
File['/etc/jupyterhub'],
File["${jupyterhub_prefix}/bin/kinit_wrapper"],
Exec['ipa-install'],
],
subscribe => File["${jupyterhub_prefix}/bin/ipa_register_service.py"],
environment => ["IPA_ADMIN_PASSWD=${ipa_passwd}"],
path => ['/bin', '/usr/bin', '/sbin','/usr/sbin', "${jupyterhub_prefix}/bin"],
}

file { '/etc/jupyterhub/jupyterhub.keytab':
owner => 'root',
group => 'jupyterhub',
mode => '0640',
subscribe => Exec['jupyterhub_keytab'],
require => Group['jupyterhub'],
}
}
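
A minimal way to sanity-check the resulting keytab outside of JupyterHub, assuming standard MIT Kerberos client tools; this exec is an illustrative sketch, not part of the profile:

```puppet
# Hedged verification sketch: obtain and immediately discard a ticket
# using the freshly fetched keytab.
exec { 'verify_jupyterhub_keytab':
  command     => 'kinit -kt /etc/jupyterhub/jupyterhub.keytab jupyterhub/jupyterhub && kdestroy',
  path        => ['/bin', '/usr/bin'],
  refreshonly => true,
  subscribe   => Exec['jupyterhub_keytab'],
}
```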
1 change: 0 additions & 1 deletion site/profile/manifests/metrics.pp
@@ -89,7 +89,6 @@
require => [
Package['prometheus-slurm-exporter'],
File['/etc/systemd/system/prometheus-slurm-exporter.service'],
Wait_for['slurmctldhost_set'],
],
}
}
142 changes: 28 additions & 114 deletions site/profile/manifests/nfs.pp
@@ -1,7 +1,6 @@
class profile::nfs {
class profile::nfs (String $domain) {
$server_ip = lookup('profile::nfs::client::server_ip')
$interface = profile::getlocalinterface()
$ipaddress = $facts['networking']['interfaces'][$interface]['ip']
$ipaddress = lookup('terraform.self.local_ip')

if $ipaddress == $server_ip {
include profile::nfs::server
@@ -12,19 +11,19 @@

class profile::nfs::client (
String $server_ip,
String $domain_name,
) {
$nfs_domain = "int.${domain_name}"

$nfs_domain = lookup('profile::nfs::domain')
class { 'nfs':
client_enabled => true,
nfs_v4_client => true,
nfs_v4_idmap_domain => $nfs_domain,
}

$devices = lookup('profile::nfs::server::devices', undef, undef, {})
if $devices =~ Hash[String, Array[String]] {
$nfs_export_list = keys($devices)
$instances = lookup('terraform.instances')
$nfs_server = Hash($instances.map| $key, $values | { [$values['local_ip'], $key] })[$server_ip]
$nfs_volumes = $instances.dig($nfs_server, 'volumes', 'nfs')
if $nfs_volumes =~ Hash[String, Hash] {
$nfs_export_list = keys($nfs_volumes)
$options_nfsv4 = 'proto=tcp,nosuid,nolock,noatime,actimeo=3,nfsvers=4.2,seclabel,x-systemd.automount,x-systemd.mount-timeout=30,_netdev'
$nfs_export_list.each | String $name | {
nfs::client::mount { "/${name}":
@@ -46,22 +45,17 @@
}

class profile::nfs::server (
String $domain_name,
# $devices is an empty string (i.e.: String[0, 0]) when
# the key terraform.data.volumes.nfs does not exist because
# "A lookup resulting in an interpolation of `alias` referencing
# a non-existent key returns an empty string"
Variant[Hash[String, Array[String]], String[0, 0]] $devices,
Array[String] $no_root_squash_tags = ['mgmt']
) {
$nfs_domain = "int.${domain_name}"
include profile::volumes

$cidr = profile::getcidr()
$nfs_domain = lookup('profile::nfs::domain')
class { 'nfs':
server_enabled => true,
nfs_v4 => true,
storeconfigs_enabled => false,
nfs_v4_export_root => '/export',
nfs_v4_export_root_clients => "${cidr}(ro,fsid=root,insecure,no_subtree_check,async,root_squash)",
nfs_v4_export_root_clients => "*.${nfs_domain}(ro,fsid=root,insecure,no_subtree_check,async,root_squash)",
nfs_v4_idmap_domain => $nfs_domain,
}

@@ -79,104 +73,24 @@
notify => Service[$nfs::server_service_name],
}

package { 'lvm2':
ensure => installed,
}

if $devices =~ Hash[String, Array[String]] {
$hostname = $facts['networking']['hostname']
$instance_tags = lookup("terraform.instances.${hostname}.tags")
$ldap_access_tags = lookup('profile::users::ldap::access_tags').map|$tag| { split($tag, /:/)[0] }
$users_tags = unique(
flatten(
lookup('profile::users::ldap::users').map|$key,$values| {
if has_key($values, 'access_tags') {
$values['access_tags'].map|$tag| { split($tag, /:/)[0] }
} else {
$ldap_access_tags
}
}
)
)
$devices = lookup('terraform.self.volumes.nfs', Hash, undef, {})
if $devices =~ Hash[String, Hash] {
# Allow instances with specific tags to mount NFS without root squash
$instances = lookup('terraform.instances')
$common_options = 'rw,async,no_all_squash,security_label'
$prefixes = $instances.filter|$key, $values| { ! intersection($values['tags'], $no_root_squash_tags ).empty }.map|$key, $values| { $values['prefix'] }.unique
$prefix_rules = $prefixes.map|$string| { "${string}*.${nfs_domain}(${common_options},no_root_squash)" }.join(' ')
$clients = "${prefix_rules} *.${nfs_domain}(${common_options},root_squash)"
$devices.each | String $key, $glob | {
profile::nfs::server::export_volume { $key:
glob => $glob,
root_bind_mount => ! intersection($instance_tags, $users_tags).empty,
nfs::server::export { "/mnt/nfs/${key}":
ensure => 'mounted',
clients => $clients,
notify => Service[$nfs::server_service_name],
require => [
Profile::Volumes::Volume["nfs-${key}"],
Class['nfs'],
],
}
}
}
}
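
For reference, a sketch of the shape `terraform.self.volumes.nfs` is expected to take here, with invented device globs (keys become the export names under /mnt/nfs; `glob` and `size` mirror the parameters of `profile::volumes::volume` added later in this commit):

```puppet
# Hypothetical $devices hash consumed by the export loop above.
$devices = {
  'home'    => { 'glob' => '/dev/disk/by-id/*volume-home*',    'size' => 100 },
  'project' => { 'glob' => '/dev/disk/by-id/*volume-project*', 'size' => 500 },
}
```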

define profile::nfs::server::export_volume (
Array[String] $glob,
Boolean $root_bind_mount = false,
String $seltype = 'home_root_t',
) {
$regexes = regsubst($glob, /[?*]/, { '?' => '.', '*' => '.*' })

ensure_resource('file', "/mnt/${name}", { 'ensure' => 'directory', 'seltype' => $seltype })

$pool = $::facts['/dev/disk'].filter |$key, $values| {
$regexes.any|$regex| {
$key =~ Regexp($regex)
}
}.map |$key, $values| {
$values
}.unique

exec { "vgchange-${name}_vg":
command => "vgchange -ay ${name}_vg",
onlyif => ["test ! -d /dev/${name}_vg", "vgscan -t | grep -q '${name}_vg'"],
require => [Package['lvm2']],
path => ['/bin', '/usr/bin', '/sbin', '/usr/sbin'],
}

physical_volume { $pool:
ensure => present,
}

volume_group { "${name}_vg":
ensure => present,
physical_volumes => $pool,
createonly => true,
followsymlinks => true,
}

lvm::logical_volume { $name:
ensure => present,
volume_group => "${name}_vg",
fs_type => 'xfs',
mountpath => "/mnt/${name}",
mountpath_require => true,
}

selinux::fcontext::equivalence { "/mnt/${name}":
ensure => 'present',
target => '/home',
require => Mount["/mnt/${name}"],
notify => Selinux::Exec_restorecon["/mnt/${name}"],
}

selinux::exec_restorecon { "/mnt/${name}": }

$cidr = profile::getcidr()
nfs::server::export { "/mnt/${name}":
ensure => 'mounted',
clients => "${cidr}(rw,async,root_squash,no_all_squash,security_label)",
notify => Service[$nfs::server_service_name],
require => [
Mount["/mnt/${name}"],
Class['nfs'],
],
}
if $root_bind_mount {
ensure_resource('file', "/${name}", { 'ensure' => 'directory', 'seltype' => $seltype })
mount { "/${name}":
ensure => mounted,
device => "/mnt/${name}",
fstype => none,
options => 'rw,bind',
require => File["/${name}"],
}
}
}
10 changes: 4 additions & 6 deletions site/profile/manifests/rsyslog.pp
@@ -1,15 +1,13 @@
class profile::rsyslog::base {
package { 'rsyslog':
ensure => 'installed',
}
service { 'rsyslog':
ensure => running,
enable => true,
class { 'rsyslog':
purge_config_files => false,
override_default_config => false,
}
}
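
Because `purge_config_files` is false, extra rules can simply be dropped next to the module-managed configuration. A hedged sketch with an invented forwarding target:

```puppet
# Hedged sketch: forward auth logs to a central host (address is invented).
# Service['rsyslog'] is assumed to be declared by the rsyslog module.
file { '/etc/rsyslog.d/50-auth-forward.conf':
  content => "auth.*,authpriv.* @@logs.int.example.com:514\n",
  notify  => Service['rsyslog'],
}
```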

class profile::rsyslog::client {
include profile::rsyslog::base
include rsyslog::config

$remote_host_conf = @(EOT)
{{ with $local := node -}}
237 changes: 114 additions & 123 deletions site/profile/manifests/slurm.pp
@@ -9,11 +9,12 @@
class profile::slurm::base (
String $cluster_name,
String $munge_key,
Enum['21.08', '22.05', '23.02'] $slurm_version,
Enum['23.02', '23.11', '24.05'] $slurm_version,
Integer $os_reserved_memory,
Integer $suspend_time = 3600,
Integer $resume_timeout = 3600,
Boolean $enable_x11_forwarding = true,
Boolean $enable_scrontab = false,
String $config_addendum = '',
)
{
@@ -57,7 +58,6 @@

package { 'munge':
ensure => 'installed',
require => Yumrepo['epel']
}

file { '/var/log/slurm':
@@ -96,15 +96,6 @@
),
}

if versioncmp($slurm_version, '22.05') < 0 {
file { '/etc/slurm/cgroup_allowed_devices_file.conf':
ensure => 'present',
owner => 'slurm',
group => 'slurm',
source => 'puppet:///modules/profile/slurm/cgroup_allowed_devices_file.conf'
}
}

file { '/etc/slurm/epilog':
ensure => 'present',
owner => 'slurm',
@@ -160,20 +151,32 @@
require => [
Exec['enable_powertools'],
Package['munge'],
Yumrepo['slurm-copr-repo']
Yumrepo['slurm-copr-repo'],
Yumrepo['epel'],
],
}

package { ['slurm-contribs', 'slurm-perlapi' ]:
ensure => 'installed',
require => [Package['munge'],
Yumrepo['slurm-copr-repo']],
require => [
Package['slurm'],
Package['munge'],
Yumrepo['slurm-copr-repo']],
}

# slurm-contribs command "seff" requires Sys/Hostname.pm
# which is not packaged by default with perl in RHEL >= 9.
if versioncmp($facts['os']['release']['major'], '9') >= 0 {
ensure_packages(['perl-Sys-Hostname'], { 'ensure' => 'installed' })
}

package { 'slurm-libpmi':
ensure => 'installed',
require => [Package['munge'],
Yumrepo['slurm-copr-repo']]
require => [
Package['slurm'],
Package['munge'],
Yumrepo['slurm-copr-repo']
],
}

$instances = lookup('terraform.instances')
@@ -188,12 +191,15 @@
'cluster_name' => $cluster_name,
'slurm_version' => $slurm_version,
'enable_x11_forwarding' => $enable_x11_forwarding,
'enable_scrontab' => $enable_scrontab,
'nb_nodes' => length($nodes),
'suspend_exc_nodes' => join($suspend_exc_nodes, ','),
'resume_timeout' => $resume_timeout,
'suspend_time' => $suspend_time,
'memlimit' => $os_reserved_memory,
'partitions' => $partitions,
'slurmctl' => profile::gethostnames_with_class('profile::slurm::controller'),
'slurmdb' => profile::gethostnames_with_class('profile::slurm::accounting'),
}),
group => 'slurm',
owner => 'slurm',
@@ -215,25 +221,6 @@
require => File['/etc/slurm'],
}

file { '/etc/slurm/slurm-consul.tpl':
ensure => 'present',
source => 'puppet:///modules/profile/slurm/slurm-consul.tpl',
notify => Service['consul-template'],
}

wait_for { 'slurmctldhost_set':
query => 'cat /etc/slurm/slurm-consul.conf',
regex => '^SlurmctldHost=',
polling_frequency => 10, # Wait up to 5 minutes (30 * 10 seconds).
max_retries => 30,
require => [
Service['consul-template'],
Class['consul::reload_service'],
],
refreshonly => true,
subscribe => File['/etc/slurm/slurm-consul.tpl'],
}
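
With consul-template no longer generating `slurm-consul.conf`, the controller and database hosts are resolved at catalog time via `profile::gethostnames_with_class` (see the `slurmctl`/`slurmdb` template parameters above). A hedged sketch of what such a function could look like, assuming each `terraform.instances` entry lists its assigned classes; the real function ships with this repo and its exact lookup keys may differ:

```puppet
# Hypothetical implementation sketch only.
function profile::gethostnames_with_class(String $class) >> Array[String] {
  $instances = lookup('terraform.instances')
  keys($instances.filter |$host, $attr| { $class in pick($attr['classes'], []) })
}
```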

# SELinux policy required to allow confined users to submit jobs with Slurm 19, 20, 21.
# Slurm commands try to write to a socket in /var/run/munge.
# Confined users can neither stat this file nor write to it. The policy
@@ -257,32 +244,6 @@
}),
}

file { '/opt/software/slurm/bin/cond_restart_slurm_services':
require => Package['slurm'],
mode => '0755',
content => @("EOT"),
#!/bin/bash
{
/usr/bin/systemctl -q is-active slurmd && /usr/bin/systemctl restart slurmd || /usr/bin/true
/usr/bin/systemctl -q is-active slurmctld && /usr/bin/systemctl restart slurmctld || /usr/bin/true
} &> /var/log/slurm/cond_restart_slurm_services.log
|EOT
}


consul_template::watch { 'slurm-consul.conf':
require => [
File['/etc/slurm/slurm-consul.tpl'],
File['/opt/software/slurm/bin/cond_restart_slurm_services'],
],
config_hash => {
perms => '0644',
source => '/etc/slurm/slurm-consul.tpl',
destination => '/etc/slurm/slurm-consul.conf',
command => '/opt/software/slurm/bin/cond_restart_slurm_services',
}
}

}

# Slurm accounting. This is where the Slurm accounting database and daemon are run.
@@ -322,8 +283,11 @@
package { 'slurm-slurmdbd':
ensure => present,
name => "slurm-slurmdbd-${slurm_version}*",
require => [Package['munge'],
Yumrepo['slurm-copr-repo']],
require => [
Package['slurm'],
Package['munge'],
Yumrepo['slurm-copr-repo']
],
}

service { 'slurmdbd':
@@ -377,7 +341,6 @@
require => [
Service['slurmdbd'],
Wait_for['slurmdbd_started'],
Wait_for['slurmctldhost_set'],
],
before => [
Service['slurmctld']
@@ -501,39 +464,36 @@
|EOT
}


$slurm_version = lookup('profile::slurm::base::slurm_version')
if versioncmp($slurm_version, '21.08') >= 0 {
file { '/etc/slurm/job_submit.lua':
ensure => 'present',
owner => 'slurm',
group => 'slurm',
content => epp('profile/slurm/job_submit.lua',
{
'selinux_context' => $selinux_context,
}
),
}
file { '/etc/slurm/job_submit.lua':
ensure => 'present',
owner => 'slurm',
group => 'slurm',
content => epp('profile/slurm/job_submit.lua',
{
'selinux_context' => $selinux_context,
}
),
}

consul::service { 'slurmctld':
port => 6817,
require => Tcp_conn_validator['consul'],
token => lookup('profile::consul::acl_api_token'),
before => Wait_for['slurmctldhost_set'],
}

package { 'slurm-slurmctld':
ensure => 'installed',
require => Package['munge']
require => [
Package['munge'],
Package['slurm'],
],
}

service { 'slurmctld':
ensure => 'running',
enable => true,
require => [
Package['slurm-slurmctld'],
Wait_for['slurmctldhost_set'],
],
subscribe => [
File['/etc/slurm/slurm.conf'],
@@ -561,48 +521,46 @@

# Slurm node class. This is where slurmd is run.
class profile::slurm::node (
Boolean $enable_tmpfs_mounts = true,
Array[String] $pam_access_groups = ['wheel'],
) {
contain profile::slurm::base

$slurm_version = lookup('profile::slurm::base::slurm_version')
if versioncmp($slurm_version, '22.05') >= 0 {
$cc_tmpfs_mounts_url = "https://download.copr.fedorainfracloud.org/results/cmdntrf/spank-cc-tmpfs_mounts-${slurm_version}/"
} else {
$cc_tmpfs_mounts_url = 'https://download.copr.fedorainfracloud.org/results/cmdntrf/spank-cc-tmpfs_mounts/'
}

yumrepo { 'spank-cc-tmpfs_mounts-copr-repo':
enabled => true,
descr => 'Copr repo for spank-cc-tmpfs_mounts owned by cmdntrf',
baseurl => "${cc_tmpfs_mounts_url}/epel-\$releasever-\$basearch/",
skip_if_unavailable => true,
gpgcheck => 1,
gpgkey => "${cc_tmpfs_mounts_url}/pubkey.gpg",
repo_gpgcheck => 0,
}

package { ['slurm-slurmd', 'slurm-pam_slurm']:
ensure => 'installed',
require => Package['slurm']
require => Package['slurm'],
}

package { 'spank-cc-tmpfs_mounts':
ensure => 'installed',
require => [
Package['slurm-slurmd'],
Yumrepo['spank-cc-tmpfs_mounts-copr-repo'],
]
if $enable_tmpfs_mounts {
$slurm_version = lookup('profile::slurm::base::slurm_version')
$cc_tmpfs_mounts_url = "https://download.copr.fedorainfracloud.org/results/cmdntrf/spank-cc-tmpfs_mounts-${slurm_version}/"

yumrepo { 'spank-cc-tmpfs_mounts-copr-repo':
enabled => true,
descr => 'Copr repo for spank-cc-tmpfs_mounts owned by cmdntrf',
baseurl => "${cc_tmpfs_mounts_url}/epel-\$releasever-\$basearch/",
skip_if_unavailable => true,
gpgcheck => 1,
gpgkey => "${cc_tmpfs_mounts_url}/pubkey.gpg",
repo_gpgcheck => 0,
}
package { 'spank-cc-tmpfs_mounts':
ensure => 'installed',
require => [
Package['slurm-slurmd'],
Yumrepo['spank-cc-tmpfs_mounts-copr-repo'],
]
}
$plugstack = 'required /opt/software/slurm/lib64/slurm/cc-tmpfs_mounts.so bindself=/tmp bindself=/dev/shm target=/localscratch bind=/var/tmp/'
} else {
$plugstack = ''
}

file { '/etc/slurm/plugstack.conf':
ensure => 'present',
owner => 'slurm',
group => 'slurm',
content => @(EOT/L),
required /opt/software/slurm/lib64/slurm/cc-tmpfs_mounts.so \
bindself=/tmp bindself=/dev/shm target=/localscratch bind=/var/tmp/
|EOT
content => $plugstack,
}

pam { 'Add pam_slurm_adopt':
@@ -649,9 +607,54 @@
source_pp => 'puppet:///modules/profile/slurm/slurmd.pp',
}

file { '/localscratch':
ensure => 'directory',
seltype => 'tmp_t'

# Implementation of user limits as recommended in
# https://cloud.google.com/architecture/best-practices-for-using-mpi-on-compute-engine
# plus some common values found on Compute Canada clusters
limits::limits{'*/core':
soft => '0',
hard => 'unlimited'
}

limits::limits{'*/nproc':
soft => '4096',
}

limits::limits{'root/nproc':
soft => 'unlimited',
}

limits::limits{'*/memlock':
both => 'unlimited',
}

limits::limits{'*/stack':
both => 'unlimited',
}

limits::limits{'*/nofile':
both => '1048576',
}

limits::limits{'*/cpu':
both => 'unlimited',
}

limits::limits{'*/rtprio':
both => 'unlimited',
}

ensure_resource('file', '/localscratch', { 'ensure' => 'directory', 'seltype' => 'tmp_t' })
if '/dev/disk/by-label/ephemeral0' in $facts['/dev/disk'] {
mount { '/localscratch':
ensure => mounted,
device => '/mnt/ephemeral0',
fstype => none,
options => 'rw,bind',
require => [
File['/localscratch'],
],
}
}

file { '/var/spool/slurmd':
@@ -712,7 +715,6 @@
],
require => [
Package['slurm-slurmd'],
Wait_for['slurmctldhost_set'],
],
}

@@ -730,17 +732,6 @@
create_group => 'root',
postrotate => '/usr/bin/pkill -x --signal SIGUSR2 slurmd',
}

$hostname = $facts['networking']['hostname']

# If the slurmctld server is rebooted, slurmd needs to be restarted.
# Otherwise, slurmd keeps running, but the node is not in any partition
# and no job can be scheduled on it.
exec { 'systemctl restart slurmd':
onlyif => "test $(sinfo -n ${hostname} -o %t -h | wc -l) -eq 0",
path => ['/usr/bin', '/opt/software/slurm/bin'],
require => Service['slurmd'],
}
}

# Slurm submitter class. This is for instances that neither run slurmd
2 changes: 1 addition & 1 deletion site/profile/manifests/software_stack.pp
@@ -4,7 +4,7 @@
Optional[Array[String]] $lmod_default_modules = undef,
Optional[Hash[String, String]] $extra_site_env_vars = undef,
) {
include consul
include profile::consul
include profile::cvmfs::client

package { 'cvmfs-config-eessi':
54 changes: 13 additions & 41 deletions site/profile/manifests/ssh.pp
@@ -47,51 +47,25 @@
source => 'puppet:///modules/profile/base/opensshserver.config',
notify => Service['sshd'],
}
} elsif versioncmp($::facts['os']['release']['major'], '8') >= 1 {
# In RedHat 9, the sshd policies are defined as an include that of the
} elsif versioncmp($::facts['os']['release']['major'], '9') >= 0 {
# In RedHat 9, the sshd policies are defined as an include of the
# crypto policies. Parameters defined before the include supersede
# the crypto policy. The include is done in a file named 50-redhat.conf.
file { '/etc/ssh/sshd_config.d/49-magic_castle.conf':
mode => '0700',
owner => 'root',
group => 'root',
source => 'puppet:///modules/profile/base/opensshserver-9.config',
notify => Service['sshd'],
}
} elsif versioncmp($::facts['os']['release']['major'], '8') < 0 {
file_line { 'MACs':
ensure => present,
path => '/etc/ssh/sshd_config',
line => 'MACs umac-128-etm@openssh.com,hmac-sha2-256-etm@openssh.com,hmac-sha2-512-etm@openssh.com',
notify => Service['sshd'],
}

file_line { 'KexAlgorithms':
ensure => present,
path => '/etc/ssh/sshd_config',
line => 'KexAlgorithms curve25519-sha256,curve25519-sha256@libssh.org',
notify => Service['sshd'],
}

file_line { 'HostKeyAlgorithms':
ensure => present,
path => '/etc/ssh/sshd_config',
line => 'HostKeyAlgorithms ssh-rsa',
notify => Service['sshd'],
}

file_line { 'Ciphers':
ensure => present,
path => '/etc/ssh/sshd_config',
line => 'Ciphers chacha20-poly1305@openssh.com,aes128-ctr,aes192-ctr,aes256-ctr,aes128-gcm@openssh.com,aes256-gcm@openssh.com',
notify => Service['sshd'],
}
}
}

# building /etc/ssh/ssh_known_hosts
# for host based authentication
class profile::ssh::known_hosts {
$instances = lookup('terraform.instances')
$domain_name = lookup('profile::freeipa::base::domain_name')
$int_domain_name = "int.${domain_name}"
$ipa_domain = lookup('profile::freeipa::base::ipa_domain')

file { '/etc/ssh/ssh_known_hosts':
content => '# This file is managed by Puppet',
@@ -109,7 +83,7 @@
{
'key' => split($v['hostkeys'][$type], /\s/)[1],
'type' => "ssh-${type}",
'host_aliases' => ["${k}.${int_domain_name}", $v['local_ip'],],
'host_aliases' => ["${k}.${ipa_domain}", $v['local_ip'],],
'require' => File['/etc/ssh/ssh_known_hosts'],
}
]
@@ -125,25 +99,23 @@
include profile::ssh::known_hosts

$instances = lookup('terraform.instances')
$domain_name = lookup('profile::freeipa::base::domain_name')
$ipa_domain = lookup('profile::freeipa::base::ipa_domain')
$hosts = $instances.filter |$k, $v| { ! intersection($v['tags'], $shosts_tags).empty }
$shosts = join($hosts.map |$k, $v| { "${k}.int.${domain_name}" }, "\n")
$shosts = join($hosts.map |$k, $v| { "${k}.${ipa_domain}" }, "\n")

file { '/etc/ssh/shosts.equiv':
content => $shosts,
}

file_line { 'HostbasedAuthentication':
sshd_config { 'HostbasedAuthentication':
ensure => present,
path => '/etc/ssh/sshd_config',
line => 'HostbasedAuthentication yes',
value => 'yes',
notify => Service['sshd'],
}

file_line { 'UseDNS':
sshd_config { 'UseDNS':
ensure => present,
path => '/etc/ssh/sshd_config',
line => 'UseDNS yes',
value => 'yes',
notify => Service['sshd'],
}
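
The `sshd_config` type from augeasproviders can also scope a key to a Match block via `condition` (the same mechanism used for per-user AuthenticationMethods later in this commit). A hedged sketch with an invented host pattern:

```puppet
# Hedged sketch: limit host-based authentication to cluster-internal names
# (the Host pattern is illustrative only).
sshd_config { 'HostbasedAuthentication internal only':
  ensure    => present,
  key       => 'HostbasedAuthentication',
  condition => 'Host *.int.example.com',
  value     => 'yes',
  notify    => Service['sshd'],
}
```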

40 changes: 30 additions & 10 deletions site/profile/manifests/sssd.pp
@@ -8,7 +8,7 @@
package { 'sssd-ldap': }

if ! defined('$deny_access') {
$tags = lookup("terraform.instances.${facts['networking']['hostname']}.tags")
$tags = lookup('terraform.self.tags')
$deny_access = intersection($tags, $access_tags).empty
}

@@ -54,15 +54,35 @@
break()
}

$domain_name = lookup('profile::freeipa::base::domain_name')
$ipa_domain = "int.${domain_name}"
$domain_list = join([$ipa_domain] + keys($domains), ',')
file_line { 'sssd_domains':
ensure => present,
path => '/etc/sssd/sssd.conf',
line => "domains = ${domain_list}",
match => "^domains = ${$ipa_domain}$",
if $facts['ipa']['installed'] {
$domain_list = join([$facts['ipa']['domain']] + keys($domains), ',')
} else {
$domain_list = join(keys($domains), ',')
}

if ! $domain_list.empty {
$augeas_domains = "set target[ . = 'sssd']/domains ${domain_list}"
} else {
$augeas_domains = ''
}

file { '/etc/sssd/sssd.conf':
ensure => 'file',
owner => 'root',
group => 'root',
mode => '0600',
notify => Service['sssd'],
}

augeas { 'sssd.conf':
lens => 'sssd.lns',
incl => '/etc/sssd/sssd.conf',
changes => [
"set target[ . = 'sssd'] 'sssd'",
"set target[ . = 'sssd']/services 'nss, sudo, pam, ssh'",
$augeas_domains,
],
require => File['/etc/sssd/sssd.conf'],
notify => Service['sssd'],
require => Exec['ipa-install'],
}
}
25 changes: 17 additions & 8 deletions site/profile/manifests/users.pp
@@ -25,10 +25,10 @@
group => 'root',
}

file { '/etc/sudoers.d/90-cloud-init-users':
ensure => absent,
require => $users.map | $k, $v | { Profile::Users::Local_user[$k] },
}
# file { '/etc/sudoers.d/90-cloud-init-users':
# ensure => absent,
# require => $users.map | $k, $v | { Profile::Users::Local_user[$k] },
# }

ensure_resources(profile::users::local_user, $users)
}
@@ -94,10 +94,9 @@

if $manage_password and $passwd {
$ds_password = lookup('profile::freeipa::server::ds_password')
$domain_name = lookup('profile::freeipa::base::domain_name')
$int_domain_name = "int.${domain_name}"
$fqdn = "${facts['networking']['hostname']}.${int_domain_name}"
$ldap_dc_string = join(split($int_domain_name, '[.]').map |$dc| { "dc=${dc}" }, ',')
$ipa_domain = lookup('profile::freeipa::base::ipa_domain')
$fqdn = "${facts['networking']['hostname']}.${ipa_domain}"
$ldap_dc_string = join(split($ipa_domain, '[.]').map |$dc| { "dc=${dc}" }, ',')

$ldad_passwd_cmd = @("EOT")
ldappasswd -ZZ -H ldap://${fqdn} \
@@ -128,6 +127,7 @@
Boolean $sudoer = false,
String $selinux_user = 'unconfined_u',
String $mls_range = 's0-s0:c0.c1023',
String $authenticationmethods = '',
) {
# Configure local account and ssh keys
user { $name:
@@ -185,4 +185,13 @@
line => "${name} ALL=(ALL) NOPASSWD:ALL",
require => File['/etc/sudoers.d/90-puppet-users'],
}

if $authenticationmethods != '' {
sshd_config { "${name} authenticationmethods":
ensure => present,
condition => "User ${name}",
key => 'AuthenticationMethods',
value => $authenticationmethods
}
}
}
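
A hedged usage sketch of the new parameter, with invented account details; the define's other parameters (e.g. `public_keys`, `groups`) are assumed from its use via `ensure_resources` above:

```puppet
# Hypothetical local account that must present both a key and a password.
profile::users::local_user { 'alice':
  public_keys           => ['ssh-ed25519 AAAAC3...invented... alice@laptop'],
  groups                => ['wheel'],
  sudoer                => true,
  authenticationmethods => 'publickey,password',
}
```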
36 changes: 36 additions & 0 deletions site/profile/manifests/vector.pp
@@ -0,0 +1,36 @@
class profile::vector
(
String $config = file('profile/vector/default_config.yaml')
)
{
yumrepo { 'vector':
ensure => present,
enabled => true,
baseurl => "https://yum.vector.dev/stable/vector-0/${::facts['architecture']}/",
gpgcheck => 1,
gpgkey => [
'https://keys.datadoghq.com/DATADOG_RPM_KEY_CURRENT.public',
'https://keys.datadoghq.com/DATADOG_RPM_KEY_B01082D3.public',
'https://keys.datadoghq.com/DATADOG_RPM_KEY_FD4BF915.public',
],
repo_gpgcheck => 1,
}

package { 'vector':
ensure => 'installed',
require => [Yumrepo['vector']],
}

service { 'vector':
ensure => running,
enable => true,
require => [Package['vector']],
}

file { '/etc/vector/vector.yaml':
notify => Service['vector'],
content => $config,
require => [Package['vector']],
}
}
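
A hedged override sketch: replacing the shipped default with an inline pipeline. The source/sink names are invented; any valid vector.yaml string works since the class writes it verbatim to /etc/vector/vector.yaml:

```puppet
class { 'profile::vector':
  config => @(EOT),
    sources:
      in_journald:
        type: journald
    sinks:
      out_console:
        type: console
        inputs: ['in_journald']
        encoding:
          codec: json
    | EOT
}
```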

178 changes: 178 additions & 0 deletions site/profile/manifests/volumes.pp
@@ -0,0 +1,178 @@
# lookup_options:
# profile::volumes::devices:
# merge: 'deep'

## common.yaml
# profile::volumes::devices: %{alias('terraform.self.volumes')}

## Provided by the user
# profile::volumes::devices:
# nfs:
# home:
# mode: '0600'
# owner: 'root'
# group: 'root'
# quota: '5g'

class profile::volumes (
Hash[String, Hash[String, Hash]] $devices,
) {
package { 'lvm2':
ensure => installed,
}
$devices.each | String $volume_tag, $device_map | {
ensure_resource('file', "/mnt/${volume_tag}", { 'ensure' => 'directory' })
$device_map.each | String $key, $values | {
profile::volumes::volume { "${volume_tag}-${key}":
volume_name => $key,
volume_tag => $volume_tag,
* => $values,
}
}
}
}

define profile::volumes::volume (
String[1] $volume_name,
String[1] $volume_tag,
String[1] $glob,
Integer[1] $size,
String[1] $owner = 'root',
String[1] $group = 'root',
String[3,4] $mode = '0755',
String[1] $seltype = 'home_root_t',
Boolean $bind_mount = true,
Boolean $enable_resize = false,
Enum['xfs', 'ext4'] $filesystem = 'xfs',
Optional[String[1]] $bind_target = undef,
Optional[String[1]] $type = undef,
Optional[String[1]] $quota = undef,
Optional[String[1]] $mkfs_options = undef,
) {
$regex = Regexp(regsubst($glob, /[?*]/, { '?' => '.', '*' => '.*' }))
$bind_target_ = pick($bind_target, "/${volume_name}")

file { "/mnt/${volume_tag}/${volume_name}":
ensure => 'directory',
owner => $owner,
group => $group,
mode => $mode,
seltype => $seltype,
}

$device = (values($::facts['/dev/disk'].filter |$k, $v| { $k =~ $regex }).unique)[0]
$dev_mapper_id = "/dev/mapper/${volume_tag}--${volume_name}_vg-${volume_tag}--${volume_name}"

exec { "vgchange-${name}_vg":
command => "vgchange -ay ${name}_vg",
onlyif => ["test ! -d /dev/${name}_vg", "vgscan -t | grep -q '${name}_vg'"],
require => [Package['lvm2']],
path => ['/bin', '/usr/bin', '/sbin', '/usr/sbin'],
}

physical_volume { $device:
ensure => present,
}

volume_group { "${name}_vg":
ensure => present,
physical_volumes => $device,
createonly => true,
followsymlinks => true,
}

if $filesystem == 'xfs' {
$options = 'defaults,usrquota'
} else {
$options = 'defaults'
}

lvm::logical_volume { $name:
ensure => present,
volume_group => "${name}_vg",
fs_type => $filesystem,
mkfs_options => $mkfs_options,
mountpath => "/mnt/${volume_tag}/${volume_name}",
mountpath_require => true,
options => $options,
}

exec { "chown ${owner}:${group} /mnt/${volume_tag}/${volume_name}":
onlyif => "test \"$(stat -c%U:%G /mnt/${volume_tag}/${volume_name})\" != \"${owner}:${group}\"",
refreshonly => true,
subscribe => Lvm::Logical_volume[$name],
path => ['/bin'],
}

exec { "chmod ${mode} /mnt/${volume_tag}/${volume_name}":
onlyif => "test \"$(stat -c0%a /mnt/${volume_tag}/${volume_name})\" != \"${mode}\"",
refreshonly => true,
subscribe => Lvm::Logical_volume[$name],
path => ['/bin'],
}

if $enable_resize {
$logical_volume_size_cmd = "pvs --noheadings -o pv_size ${device} | sed -nr 's/^.*[ <]([0-9]+)\\..*g$/\\1/p'"
$physical_volume_size_cmd = "pvs --noheadings -o dev_size ${device} | sed -nr 's/^ *([0-9]+)\\..*g/\\1/p'"
exec { "pvresize ${device}":
onlyif => "test `${logical_volume_size_cmd}` -lt `${physical_volume_size_cmd}`",
path => ['/usr/bin', '/bin', '/usr/sbin'],
require => Lvm::Logical_volume[$name],
}

$pv_freespace_cmd = "pvs --noheading -o pv_free ${device} | sed -nr 's/^ *([0-9]*)\\..*g/\\1/p'"
exec { "lvextend -l '+100%FREE' -r /dev/${name}_vg/${name}":
onlyif => "test `${pv_freespace_cmd}` -gt 0",
path => ['/usr/bin', '/bin', '/usr/sbin'],
require => Exec["pvresize ${device}"],
}
}

selinux::fcontext::equivalence { "/mnt/${volume_tag}/${volume_name}":
ensure => 'present',
target => '/home',
require => Mount["/mnt/${volume_tag}/${volume_name}"],
notify => Selinux::Exec_restorecon["/mnt/${volume_tag}/${volume_name}"],
}

selinux::exec_restorecon { "/mnt/${volume_tag}/${volume_name}": }

if $bind_mount {
ensure_resource('file', $bind_target_, { 'ensure' => 'directory', 'seltype' => $seltype })
mount { $bind_target_:
ensure => mounted,
device => "/mnt/${volume_tag}/${volume_name}",
fstype => none,
options => 'rw,bind',
require => [
File[$bind_target_],
Lvm::Logical_volume[$name],
],
}
} elsif (
$facts['mountpoints'][$bind_target_] != undef and
$facts['mountpoints'][$bind_target_]['device'] == $dev_mapper_id
) {
mount { $bind_target_:
ensure => absent,
}
}

if $quota and $filesystem == 'xfs' {
ensure_resource('file', '/etc/xfs_quota', { 'ensure' => 'directory' })
# Save the xfs quota setting to avoid applying at every iteration
file { "/etc/xfs_quota/${volume_tag}-${volume_name}":
ensure => 'file',
content => "#FILE TRACKED BY PUPPET DO NOT EDIT MANUALLY\n${quota}",
require => File['/etc/xfs_quota'],
}

exec { "apply-quota-${name}":
command => "xfs_quota -x -c 'limit bsoft=${quota} bhard=${quota} -d' /mnt/${volume_tag}/${volume_name}",
require => Mount["/mnt/${volume_tag}/${volume_name}"],
path => ['/bin', '/usr/bin', '/sbin', '/usr/sbin'],
refreshonly => true,
subscribe => [File["/etc/xfs_quota/${volume_tag}-${volume_name}"]],
}
}
}
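
A standalone usage sketch of the define outside of `profile::volumes`, with an invented device glob and sizes:

```puppet
# Hypothetical 100 GB XFS volume exported as /project with a 10g user quota.
profile::volumes::volume { 'nfs-project':
  volume_name => 'project',
  volume_tag  => 'nfs',
  glob        => '/dev/disk/by-id/*volume-project*',
  size        => 100,
  quota       => '10g',
  bind_target => '/project',
}
```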
10 changes: 5 additions & 5 deletions site/profile/templates/accounts/mkhome.sh.epp
@@ -17,24 +17,24 @@ if [[ ! -p ${MKHOME_MODPROJECT_PIPE} ]]; then
fi

(
tail -F /var/log/dirsrv/slapd-*/access | grep --line-buffered -oP 'ADD dn=\"uid=\K([a-z0-9A-Z_]*)(?=,cn=users)' &
tail -F /var/log/dirsrv/slapd-*/access | grep --line-buffered -oP 'ADD dn=\"uid=\K([a-z0-9A-Z-_]*)(?=,cn=users)' &
tail -F ${MKHOME_RETRY_PIPE}
) |
while read USERNAME; do
<% if $with_home { -%>
<% if $manage_home { -%>
if ! mkhome $USERNAME; then
echo $USERNAME > ${MKHOME_RETRY_PIPE} &
continue
fi
<% } -%>
<% if $with_scratch { -%>
if ! mkscratch $USERNAME <%= $with_home %>; then
<% if $manage_scratch { -%>
if ! mkscratch $USERNAME <%= $manage_home %>; then
echo $USERNAME > ${MKHOME_RETRY_PIPE} &
continue
fi
<% } -%>

for PROJECT in $((id -Gn ${USERNAME} 2> /dev/null || kexec ipa user-show ${USERNAME} | grep 'Member of groups:') | grep -P -o "${PROJECT_REGEX}"); do
echo 0 ${PROJECT} <%= $with_project %> ${USERNAME} > ${MKHOME_MODPROJECT_PIPE} &
echo 0 ${PROJECT} <%= $manage_project %> ${USERNAME} > ${MKHOME_MODPROJECT_PIPE} &
done
done
2 changes: 1 addition & 1 deletion site/profile/templates/accounts/mkproject.sh.epp
@@ -17,7 +17,7 @@


PROJECT_REGEX="<%= $project_regex %>"
WITH_FOLDER="<%= $with_folder %>"
WITH_FOLDER="<%= $manage_folder %>"
PREV_CONN=""

source /sbin/account_functions.sh
4 changes: 3 additions & 1 deletion site/profile/templates/cvmfs/default.local.epp
@@ -1,5 +1,7 @@
<% if ! $repositories.empty { -%>
CVMFS_REPOSITORIES="<%= $repositories.join(',') %>"
CVMFS_STRICT_MOUNT="yes"
<% } -%>
CVMFS_STRICT_MOUNT="<%= $strict_mount %>"
CVMFS_QUOTA_LIMIT=<%= $quota_limit %>
{{ if service "squid" -}}
CVMFS_HTTP_PROXY='{{ range $i, $s := service "squid" }}{{if $i}}|{{end}}http://{{.Address}}:{{.Port}}{{end}};DIRECT'
23 changes: 23 additions & 0 deletions site/profile/templates/freeipa/hbac_rules.py.epp
@@ -0,0 +1,23 @@
#!/bin/bash
# 1. Create a hostgroup for each tag
# 2. Create an automember rule for each hostgroup
# 3. Add a condition to the automember rule for each prefix with that tag
# 4. Rebuild the automember rules
api.Command.batch(
<% $hbac_services.each |$service| { -%>
{ 'method': 'hbacsvc_add', 'params': [['<%= $service %>'], {}] },
<% } -%>
<% $tags.each |$tag| { -%>
{ 'method': 'hostgroup_add', 'params': [['<%= $tag %>'], {}] },
{ 'method': 'automember_add', 'params': [['<%= $tag %>'], {'type': 'hostgroup'}] },
<% $hbac_services.each |$service| { -%>
{ 'method': 'hbacrule_add', 'params': [['<%= $tag %>:<%= $service %>'], {'accessruletype': 'allow'}] },
{ 'method': 'hbacrule_add_host', 'params': [['<%= $tag %>:<%= $service %>'], {'hostgroup': '<%= $tag %>'}] },
{ 'method': 'hbacrule_add_service', 'params': [['<%= $tag %>:<%= $service %>'], {'hbacsvc': '<%= $service %>'}] },
<% }} -%>
<% $prefixes_tags.each |$prefix, $tags| { -%>
<% $tags.each |$tag| { -%>
{ 'method': 'automember_add_condition', 'params': [['<%= $tag %>'], {'type': 'hostgroup', 'key': 'fqdn', 'automemberinclusiveregex': "^<%= $prefix %>\d+.<%= $ipa_domain %>$"}] },
<% }} -%>
{ 'method': 'automember_rebuild', 'params': [[], {'type': 'hostgroup'}] },
)
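
A hedged sketch of rendering this template; the parameter names come straight from the variables it references, and every value is invented:

```puppet
$script = epp('profile/freeipa/hbac_rules.py.epp', {
  'hbac_services' => ['sshd'],
  'tags'          => ['login', 'node'],
  'prefixes_tags' => { 'login' => ['login'], 'node' => ['node'] },
  'ipa_domain'    => 'int.example.com',
})
```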
29 changes: 0 additions & 29 deletions site/profile/templates/freeipa/hbac_rules.sh.epp

This file was deleted.

4 changes: 2 additions & 2 deletions site/profile/templates/freeipa/ipa-rewrite.conf.epp
@@ -9,5 +9,5 @@ RewriteRule ^/$ /ipa/ui [L,NC,R=301]
# Rewrite for plugin index, make it like it's a static file
RewriteRule ^/ipa/ui/js/freeipa/plugins.js$ /ipa/wsgi/plugins.py [PT]

RequestHeader edit Referer ^https://<%= regsubst("${referer}", '\.', '\.', 'G') %> https://<%= $referee %>
RequestHeader edit Referer ^https://<%= regsubst("${referer_int}", '\.', '\.', 'G') %> https://<%= $referee %>
RequestHeader edit Referer ^https://<%= regsubst("${external_hostname}", '\.', '\.', 'G') %> https://<%= $referee %>
RequestHeader edit Referer ^https://<%= regsubst("${internal_hostname}", '\.', '\.', 'G') %> https://<%= $referee %>
2 changes: 1 addition & 1 deletion site/profile/templates/freeipa/mokey.yaml.epp
@@ -68,7 +68,7 @@ templates: /usr/share/mokey/templates
# FreeIPA)
#------------------------------------------------------------------------------
keytab: "/etc/mokey/keytab/mokeyapp.keytab"
ktuser: "mokeyapp"
ktuser: "mokey/mokey"

#------------------------------------------------------------------------------
# Enable rate limiting based on remote ip (requires redis)
8 changes: 0 additions & 8 deletions site/profile/templates/slurm/cgroup.conf.epp
Original file line number Diff line number Diff line change
@@ -1,14 +1,6 @@
CgroupMountpoint="/sys/fs/cgroup"
CgroupAutomount=no
<% if versioncmp($slurm_version, '22.05') < 0 { -%>
AllowedDevicesFile="/etc/slurm/cgroup_allowed_devices_file.conf"
<% } -%>
ConstrainCores=yes
<% if $slurm_version == '20.11' { -%>
TaskAffinity=yes
<% } -%>
ConstrainRAMSpace=yes
ConstrainKmemSpace=no
ConstrainSwapSpace=yes
ConstrainDevices=yes
AllowedRamSpace=100
34 changes: 20 additions & 14 deletions site/profile/templates/slurm/slurm.conf.epp
@@ -1,6 +1,21 @@
include /etc/slurm/slurm-consul.conf
include /etc/slurm/nodes.conf

<% if ! $slurmctl.empty { -%>
SlurmctldHost=<%= join($slurmctl, ',') %>
<% } -%>
SlurmctldPort=6817

## Accounting
<% if ! $slurmdb.empty { -%>
AccountingStorageHost=<%= join($slurmdb, ',') %>
AccountingStorageType=accounting_storage/slurmdbd
AccountingStorageTRES=gres/gpu,cpu,mem
AccountingStorageEnforce=associations
JobAcctGatherType=jobacct_gather/cgroup
JobAcctGatherFrequency=task=30
JobAcctGatherParams=NoOverMemoryKill,UsePSS
<% } -%>

# MANAGEMENT POLICIES
ClusterName=<%= $cluster_name %>
AuthType=auth/munge
@@ -12,7 +27,7 @@ SelectType=select/cons_tres
SelectTypeParameters=CR_Core_Memory

# NODE CONFIGURATIONS
GresTypes=gpu
GresTypes=gpu,shard

TreeWidth=<%= $nb_nodes %>
ReturnToService=2 # A DOWN node will become available for use upon registration with a valid configuration.
@@ -58,30 +73,21 @@ X11Parameters=home_xauthority
<% } else { -%>
PrologFlags=alloc,contain
<% } -%>
<% if $enable_scrontab { -%>
ScronParameters=enable
<% } -%>
# Prolog=/etc/slurm/prolog
Epilog=/etc/slurm/epilog
PlugStackConfig=/etc/slurm/plugstack.conf
MpiDefault=pmi2
ProctrackType=proctrack/cgroup
<% if versioncmp($slurm_version, '21.08') >= 0 { -%>
TaskPlugin=task/affinity,task/cgroup
<% } else { -%>
TaskPlugin=task/cgroup
<% } -%>
PropagateResourceLimits=NONE
MailProg=/usr/sbin/slurm_mail

StateSaveLocation=/var/spool/slurm
InteractiveStepOptions="--interactive --mem-per-cpu=0 --preserve-env --pty $SHELL"
LaunchParameters=use_interactive_step,disable_send_gids
<% if versioncmp($slurm_version, '21.08') >= 0 { -%>
JobSubmitPlugins=lua
<% } -%>

<% if versioncmp($slurm_version, '23.02') < 0 { -%>
# The autoscaling compute nodes are not showed by sinfo unless we set PrivateData=cloud
# Not needed for Slurm >= 23.02
PrivateData=cloud
<% } -%>

include /etc/slurm/slurm-addendum.conf
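
For testing, the template can be rendered with the same parameter set `profile::slurm::base` passes (visible in the `epp` call earlier in this diff); every value below is invented, and the shape of `partitions` is an assumption:

```puppet
$conf = epp('profile/slurm/slurm.conf.epp', {
  'cluster_name'          => 'demo',
  'slurm_version'         => '24.05',
  'enable_x11_forwarding' => true,
  'enable_scrontab'       => false,
  'nb_nodes'              => 2,
  'suspend_exc_nodes'     => '',
  'resume_timeout'        => 3600,
  'suspend_time'          => 3600,
  'memlimit'              => 512,
  'partitions'            => {},  # assumed empty for this sketch
  'slurmctl'              => ['mgmt1'],
  'slurmdb'               => ['mgmt1'],
})
```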
