[sophora-importer] support for native s3 connection #142

Merged: 10 commits, Jan 23, 2025
charts/sophora-importer/Chart.yaml: 4 changes (2 additions, 2 deletions)
@@ -15,9 +15,9 @@ type: application
# This is the chart version. This version number should be incremented each time you make changes
# to the chart and its templates, including the app version.
# Versions are expected to follow Semantic Versioning (https://semver.org/)
version: 1.3.3
version: 2.0.0

# This is the version number of the application being deployed. This version number should be
# incremented each time you make changes to the application. Versions are not expected to
# follow Semantic Versioning. They should reflect the version the application is using.
appVersion: 4.11.0
appVersion: 5.0.0
charts/sophora-importer/README.md: 28 changes (5 additions, 23 deletions)
@@ -6,7 +6,7 @@ Additional environment variables are supported via `sophora.importer.extraEnv`.

## Importer without s3 bucket

If you don't need an S3 bucket for incoming Sophora documents, you can set `sophora.importer.s3Bucket.enabled` to `false`. This might be useful
If you don't need an S3 bucket for incoming Sophora documents, you can omit the `sophora.importer.s3Bucket.name` configuration. This might be useful
if you only need the SOAP API. The following directories can be referenced in your `application.yaml`:

* success: /import/<instance>/success
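
A minimal sketch of such a bucket-less instance configuration, using the directory layout listed above (the folder keys and the `<instance>` placeholder are illustrative; switch to `/import-local/` paths if the data must survive restarts):

```yaml
sophora:
  importer:
    configuration:
      importer:
        instances:
          - folders:
              watch: /import/<instance>/incoming   # no s3:// prefix, so the folder is created on startup
              temp: /import/<instance>/temp
              success: /import/<instance>/success
              failure: /import/<instance>/failure
```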
@@ -16,28 +16,10 @@ if you only need the SOAP API. The following directories can be referenced in your `application.yaml`:

## Importer directory paths

On startup, the Sophora Importer assumes that all directories you defined in your `application.yaml` under `folders` already exist.
These directories will be created automatically by Helm according to your configuration in `sophora.importer.createImportFolders`.
Use `s3` to create the directory for s3 bucket (`/import/`) or `local` if you don't want to share it (`/import-local/`).

The following example creates directories:

```yaml
sophora:
importer:
createImportFolders:
temp: local
failure: s3
incoming: s3
success: s3
```

```
/import-local/<instance>/temp
/import/<instance>/failure
/import/<instance>/incoming
/import/<instance>/success
```
On startup, the Sophora Importer assumes that all directories defined in your `application.yaml` under `importer.instances[].folders` already exist.
Helm creates these directories automatically for every path that does not start with `s3://`.
Paths starting with `/import-local/` are persisted, so their contents survive a restart.
Use the `s3://` prefix for folders that should be stored in the S3 bucket configured via the `sophora.importer.s3Bucket` options.
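
For illustration, a combined configuration under this convention might look like the following sketch; the bucket, URL, region, and instance name (`common`) mirror the chart's test values and stand in for your own setup:

```yaml
sophora:
  importer:
    s3Bucket:
      name: "my-bucket"
      url: "https://storage.googleapis.com"
      region: "eu-west-3"
    configuration:
      importer:
        instances:
          - folders:
              watch: s3://common/incoming       # lives in the configured S3 bucket
              temp: /import-local/common/temp   # created by Helm, persisted across restarts
              success: s3://common/success
              failure: s3://common/failure
```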

## Import transformation files via S3 or HTTP

charts/sophora-importer/templates/statefulset.yaml: 97 changes (28 additions, 69 deletions)
@@ -33,59 +33,6 @@ spec:
hostAliases: {{- toYaml . | nindent 8 }}
{{- end }}
containers:
{{- if .Values.sophora.importer.s3Bucket.enabled }}
- name: bucket-mount
image: "{{ .Values.s3fsImage.repository }}:{{ .Values.s3fsImage.tag }}"
env:
- name: AWS_S3_BUCKET
value: {{ .Values.sophora.importer.s3Bucket.name }}
- name: AWS_S3_URL
value: {{ .Values.sophora.importer.s3Bucket.url }}
- name: S3FS_DEBUG
value: "0"
- name: AWS_S3_SECRET_ACCESS_KEY
valueFrom:
secretKeyRef:
key: {{ .Values.sophora.importer.s3Bucket.secret.secretAccessKeyKey }}
name: {{ .Values.sophora.importer.s3Bucket.secret.name }}
optional: false
- name: AWS_S3_ACCESS_KEY_ID
valueFrom:
secretKeyRef:
key: {{ .Values.sophora.importer.s3Bucket.secret.accessKeyIdKey }}
name: {{ .Values.sophora.importer.s3Bucket.secret.name }}
optional: false
{{- if .Values.sophora.importer.s3Bucket.extraEnv -}}
{{- toYaml .Values.sophora.importer.s3Bucket.extraEnv | nindent 10 }}
{{- end }}
imagePullPolicy: {{ .Values.s3fsImage.pullPolicy }}
lifecycle:
postStart:
exec:
# workaround because the importer can't create the folder by itself.
command:
- sh
- '-c'
- |
echo Creating import folders for importer:
{{- range $instance := .Values.sophora.importer.instances }}
{{/* Ensure that every instance folder is available on s3 */}}
mkdir -pv "/opt/s3fs/bucket/{{ $instance }}"
{{- range $folder, $location := $.Values.sophora.importer.createImportFolders }}
mkdir -pv "{{ ((eq $location "s3") | ternary "/opt/s3fs/bucket" "/import") }}/{{ $instance }}/{{ $folder }}"
{{- end }}
{{- end }}
resources:
{{- toYaml .Values.sophora.importer.s3Bucket.resources | nindent 12 }}
volumeMounts:
- name: shared-imports
mountPath: /opt/s3fs/bucket
mountPropagation: Bidirectional
- name: local-import-folders
mountPath: /import
securityContext:
privileged: true
{{- end }}
- name: importer
image: "{{ .Values.image.repository }}:{{ .Values.image.tag }}"
imagePullPolicy: {{ .Values.image.pullPolicy }}
@@ -108,6 +55,26 @@ spec:
{{- else }}
value: {{ include "sophora-importer.transformationLibsPath" . }}
{{- end }}
{{- if .Values.sophora.importer.s3Bucket.name }}
- name: IMPORTER_S3_SECRETACCESSKEY
valueFrom:
secretKeyRef:
key: {{ .Values.sophora.importer.s3Bucket.secret.secretAccessKeyKey }}
name: {{ .Values.sophora.importer.s3Bucket.secret.name }}
optional: false
- name: IMPORTER_S3_ACCESSKEYID
valueFrom:
secretKeyRef:
key: {{ .Values.sophora.importer.s3Bucket.secret.accessKeyIdKey }}
name: {{ .Values.sophora.importer.s3Bucket.secret.name }}
optional: false
- name: IMPORTER_S3_BUCKETNAME
value: {{ .Values.sophora.importer.s3Bucket.name }}
- name: IMPORTER_S3_HOST
value: {{ .Values.sophora.importer.s3Bucket.url }}
- name: IMPORTER_S3_REGION
value: {{ .Values.sophora.importer.s3Bucket.region }}
{{- end }}
{{ if .Values.sophora.importer.extraEnv -}}
{{- toYaml .Values.sophora.importer.extraEnv | nindent 10 }}
{{- end }}
@@ -123,9 +90,6 @@ spec:
- name: importer-config
mountPath: /sophora/logback-spring.xml
subPath: logback-spring.xml
- name: shared-imports
mountPath: /import/
mountPropagation: Bidirectional
- name: local-import-folders
mountPath: /import-local/
{{- if .Values.transformation.data.useSaxon }}
@@ -143,24 +107,21 @@
{{- end }}
securityContext:
privileged: true
{{- if not .Values.sophora.importer.s3Bucket.enabled }}
lifecycle:
postStart:
exec:
command:
- sh
- '-c'
- |
echo "Creating import folders for importer:"
{{- range $folder := tuple "import" "import-local" }}
{{- range $instance := $.Values.sophora.importer.instances }}
mkdir -p /{{$folder}}/{{$instance}}/success
mkdir -p /{{$folder}}/{{$instance}}/temp
mkdir -p /{{$folder}}/{{$instance}}/failure
mkdir -p /{{$folder}}/{{$instance}}/incoming
{{ end }}
{{ end }}
{{- end }}
echo "Creating local import folders for importer:"
{{- range $.Values.sophora.importer.configuration.importer.instances }}
{{- range $folderType, $folderPath := .folders }}
{{- if and (not (hasPrefix "s3://" $folderPath)) (hasKey (dict "watch" 1 "temp" 1 "success" 1 "failure" 1) $folderType) }}
mkdir -pv {{ $folderPath }}
{{- end }}
{{- end }}
{{- end }}
initContainers:
{{/* Transformations Download */}}
{{- with .Values.transformation }}
@@ -227,8 +188,6 @@ spec:
- name: importer-config
configMap:
name: {{ include "sophora-importer.fullname" . }}
- name: shared-imports
emptyDir: {}
- name: local-import-folders
{{- if not .Values.importPvcSpec }}
emptyDir: {}
charts/sophora-importer/test-values.yaml: 11 changes (5 additions, 6 deletions)
@@ -37,14 +37,13 @@ sophora:
- name: BBB
value: b-value
s3Bucket:
name: "sophora-test-importer"
name: "my-bucket"
url: "https://storage.googleapis.com"
region: "eu-west-3"
secret:
name: "sophora-importer-bucket-credentials"
secretAccessKeyKey: "secretAccessKey"
accessKeyIdKey: "accessKeyId"
instances:
- common
configuration:
sophora:
client:
@@ -80,10 +79,10 @@
key: common
transform: skipTransform
folders:
watch: /import/common/incoming
watch: s3://common/incoming
temp: /import-local/common/temp
success: /import-local/common/success
failure: /import/common/failure
success: s3://common/success
failure: s3://common/failure
xsl: /xsl
defaultStructureNode: /import
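
Because the chart references these credentials with `optional: false`, the Secret has to exist in the cluster before the importer starts. A minimal sketch of such a Secret, assuming the name and key names used above (the values are placeholders):

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: sophora-importer-bucket-credentials
type: Opaque
stringData:
  accessKeyId: "<your-access-key-id>"           # matched by accessKeyIdKey
  secretAccessKey: "<your-secret-access-key>"   # matched by secretAccessKeyKey
```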

charts/sophora-importer/values.yaml: 27 changes (3 additions, 24 deletions)
@@ -14,11 +14,6 @@ downloadViaS3Image:
tag: "0.0.2"
pullPolicy: IfNotPresent

s3fsImage:
repository: efrecon/s3fs
tag: "1.91"
pullPolicy: IfNotPresent

nodeSelector: {}
imagePullSecrets: []
nameOverride: ""
@@ -45,31 +40,15 @@ sophora:
usernameKey: "username"
importer:
s3Bucket:
# If enabled, the importer uses a s3 bucket for incoming imports
enabled: true
name: ""
url: "https://storage.googleapis.com"
name:
url:
region:
secret:
name: ""
secretAccessKeyKey: "secretAccessKey"
accessKeyIdKey: "accessKeyId"
extraEnv:
- name: S3FS_ARGS
value: nonempty
resources:
requests:
cpu: 200m
memory: 256Mi
limits:
memory: 256Mi
extraEnv:
loaderPath:
instances: []
createImportFolders:
incoming: s3
failure: s3
success: local
temp: local
configuration: {}

logbackXml: |