Replace clamd with clamav-client #1991

Open
wants to merge 3 commits into
base: qa/1.x
2 changes: 1 addition & 1 deletion requirements-dev.txt
@@ -41,7 +41,7 @@ charset-normalizer==3.3.2
     # via
     #   -r requirements.txt
     #   requests
-clamd==1.0.2
+clamav-client==0.6.0
     # via -r requirements.txt
 click==8.1.7
     # via pip-tools
2 changes: 1 addition & 1 deletion requirements.in
@@ -5,7 +5,7 @@ amclient
 ammcpc
 git+https://github.com/artefactual-labs/bagit-python.git@902051d8410219f6c5f4ce6d43e5b272cf29e89b#egg=bagit
 brotli
-clamd
+clamav-client
 django-autoslug
 django-csp
 django-forms-bootstrap
2 changes: 1 addition & 1 deletion requirements.txt
@@ -26,7 +26,7 @@ cffi==1.17.1
     # via cryptography
 charset-normalizer==3.3.2
     # via requests
-clamd==1.0.2
+clamav-client==0.6.0
     # via -r requirements.in
 cryptography==43.0.1
     # via
2 changes: 1 addition & 1 deletion src/MCPClient/lib/archivematicaClientModules
@@ -23,7 +23,7 @@
 assignfileuuids_v0.0 = assign_file_uuids
 bindpid_v0.0 = bind_pid
 removeunneededfiles_v0.0 = remove_unneeded_files
-archivematicaclamscan_v0.0 = archivematica_clamscan
+antivirus_v0.0 = antivirus
 createevent_v0.0 = create_event
 examinecontents_v0.0 = examine_contents
 identifydspacefiles_v0.0 = identify_dspace_files
219 changes: 219 additions & 0 deletions src/MCPClient/lib/clientScripts/antivirus.py
@@ -0,0 +1,219 @@
#!/usr/bin/env python
# This file is part of Archivematica.
#
# Copyright 2010-2017 Artefactual Systems Inc. <http://artefactual.com>
#
# Archivematica is free software: you can redistribute it and/or modify
# it under the terms of the GNU Affero General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# Archivematica is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with Archivematica. If not, see <http://www.gnu.org/licenses/>.
import argparse
import multiprocessing
import os
import uuid

import django

django.setup()
from clamav_client.scanner import get_scanner
from custom_handlers import get_script_logger
from databaseFunctions import insertIntoEvents
from django.conf import settings as mcpclient_settings
from django.core.exceptions import ValidationError
from django.db import transaction
from main.models import Event
from main.models import File

logger = get_script_logger("archivematica.mcp.client.clamscan")


def concurrent_instances():
    return multiprocessing.cpu_count()


def file_already_scanned(file_uuid):
    return (
        file_uuid != "None"
        and Event.objects.filter(
            file_uuid_id=file_uuid, event_type="virus check"
        ).exists()
    )


def queue_event(file_uuid, date, scanner, passed, queue):
    if passed is None or file_uuid == "None":
        return

    event_detail = ""
    if scanner is not None:
        info = scanner.info()  # This is cached.
        event_detail = f'program="{info.name}"; version="{info.version}"; virusDefinitions="{info.virus_definitions}"'

    outcome = "Pass" if passed else "Fail"
    logger.info("Recording new event for file %s (outcome: %s)", file_uuid, outcome)

    queue.append(
        {
            "fileUUID": file_uuid,
            "eventIdentifierUUID": str(uuid.uuid4()),
            "eventType": "virus check",
            "eventDateTime": date,
            "eventDetail": event_detail,
            "eventOutcome": outcome,
        }
    )


def get_parser():
    """Return a ``Namespace`` with the parsed arguments."""
    parser = argparse.ArgumentParser()
    parser.add_argument("file_uuid", metavar="fileUUID")
    parser.add_argument("path", metavar="PATH", help="File or directory location")
    parser.add_argument("date", metavar="DATE")
    parser.add_argument(
        "task_uuid", metavar="taskUUID", help="Currently unused, feel free to ignore."
    )
    return parser

Review comment (Member) on lines +75 to +84:

Should we remove this helper function if we're not using it?


# Map user-provided backend names from configuration to the corresponding
# internal values used in the clamav_client package. Default to "clamscanner" if
Review comment (Member), suggested change:

- # internal values used in the clamav_client package. Default to "clamscanner" if
+ # internal values used in the clamav_client package. Default to "clamdscanner" if

# no valid backend is specified in the configuration.
SCANNERS = {"clamscanner": "clamscan", "clamdscanner": "clamd"}
DEFAULT_SCANNER = "clamdscanner"


def create_scanner():
    """Return the ClamAV client configured by the user and found in the
    installation's environment variables. Clamdscanner may perform quicker
    than Clamscanner given a larger number of objects. Return clamdscanner
    object as a default if no other, or an incorrect value is specified.
    """
    choice = str(mcpclient_settings.CLAMAV_CLIENT_BACKEND).lower()
    backend = SCANNERS.get(choice)
    if backend is None:
        logger.warning(
            'Unexpected antivirus scanner (CLAMAV_CLIENT_BACKEND): "%s"; using "%s".',
            choice,
            DEFAULT_SCANNER,
        )
        backend = SCANNERS[DEFAULT_SCANNER]
    if backend == "clamd":
        return get_scanner(
            {
                "backend": "clamd",
                "address": str(mcpclient_settings.CLAMAV_SERVER),
                "timeout": int(mcpclient_settings.CLAMAV_CLIENT_TIMEOUT),
                "stream": bool(mcpclient_settings.CLAMAV_PASS_BY_STREAM),
            }
        )
    if backend == "clamscan":
        return get_scanner(
            {
                "backend": "clamscan",
                "max_file_size": float(mcpclient_settings.CLAMAV_CLIENT_MAX_FILE_SIZE),
                "max_scan_size": float(mcpclient_settings.CLAMAV_CLIENT_MAX_SCAN_SIZE),
            }
        )
    raise ValueError("Unexpected backend configuration.")
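
The fallback logic in `create_scanner` can be exercised without ClamAV or Django. The following is a minimal sketch, not code from this PR: `resolve_backend` is a hypothetical helper that reproduces only the name lookup and default, taking the configured value as a plain argument instead of reading `CLAMAV_CLIENT_BACKEND` from settings.

```python
SCANNERS = {"clamscanner": "clamscan", "clamdscanner": "clamd"}
DEFAULT_SCANNER = "clamdscanner"


def resolve_backend(choice):
    """Hypothetical helper: map a configured name to a clamav_client backend."""
    backend = SCANNERS.get(str(choice).lower())
    if backend is None:
        # Unknown or missing value: fall back to the default scanner.
        backend = SCANNERS[DEFAULT_SCANNER]
    return backend


if __name__ == "__main__":
    for value in ("ClamScanner", "clamdscanner", "unknown", None):
        print(value, "->", resolve_backend(value))
```

Any value outside the two known names, including an unset setting that stringifies to `"None"`, resolves to the `clamd` backend in this sketch.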


def get_size(file_uuid, path):
    # We're going to see this happening when files are not part of `objects/`.
    if file_uuid != "None":
        try:
            return File.objects.get(uuid=file_uuid).size
        except (File.DoesNotExist, ValidationError):
            pass
    # Our fallback.
    try:
        return os.path.getsize(path)
    except Exception:
        return None


def scan_file(event_queue, file_uuid, path, date):
    if file_already_scanned(file_uuid):
        logger.info("Virus scan already performed, not running scan again")
        return 0

    scanner, passed = None, False

    try:
        size = get_size(file_uuid, path)
        if size is None:
            logger.error("Getting file size returned: %s", size)
            return 1

        max_file_size = mcpclient_settings.CLAMAV_CLIENT_MAX_FILE_SIZE * 1024 * 1024
        max_scan_size = mcpclient_settings.CLAMAV_CLIENT_MAX_SCAN_SIZE * 1024 * 1024

        valid_scan = True

        if size > max_file_size:
            logger.info(
                "File will not be scanned. Size %s bytes greater than scanner "
                "max file size %s bytes",
                size,
                max_file_size,
            )
            valid_scan = False
        elif size > max_scan_size:
            logger.info(
                "File will not be scanned. Size %s bytes greater than scanner "
                "max scan size %s bytes",
                size,
                max_scan_size,
            )
            valid_scan = False

        if valid_scan:
            scanner = create_scanner()
            info = scanner.info()
            logger.info(
                "Using scanner %s (%s - %s)",
                info.name,
                info.version,
                info.virus_definitions,
            )

            result = scanner.scan(path)
            passed, state, details = result.passed, result.state, result.details
        else:
            passed, state, details = None, None, None

    except Exception:
        logger.error("Unexpected error scanning file %s", path, exc_info=True)
        return 1
    else:
        # record pass or fail, but not None if the file hasn't
        # been scanned, e.g. Max File Size thresholds being too low.
        if passed is not None:
            logger.info("File %s scanned!", path)
            logger.debug("passed=%s state=%s details=%s", passed, state, details)
    finally:
        queue_event(file_uuid, date, scanner, passed, event_queue)

    # If True or None, then we have no error, the file can move through the
    # process as expected...
    return 1 if passed is False else 0
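
The size gate and the exit status in `scan_file` come down to two small decisions. The sketch below is illustrative only: `exceeds_limits` and `exit_status` are hypothetical names that do not appear in this PR, and the limits are assumed to be configured in megabytes, as the `* 1024 * 1024` conversion suggests.

```python
MIB = 1024 * 1024


def exceeds_limits(size, max_file_size_mb, max_scan_size_mb):
    """Return True when a file of `size` bytes is too large to be scanned."""
    return size > max_file_size_mb * MIB or size > max_scan_size_mb * MIB


def exit_status(passed):
    """Map a scan result to a job status.

    Only an explicit failure (False) is an error. True (clean) and
    None (scan skipped, e.g. file too large) both map to 0.
    """
    return 1 if passed is False else 0


if __name__ == "__main__":
    print(exceeds_limits(10 * MIB, 42, 42))
    print(exceeds_limits(43 * MIB, 42, 42))
    print([exit_status(p) for p in (True, False, None)])
```

Comparing `passed is False` rather than `not passed` is what lets a skipped file (`None`) continue through the workflow without being reported as infected.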


def call(jobs):
    event_queue = []

    for job in jobs:
        with job.JobContext(logger=logger):
            job.set_status(scan_file(event_queue, *job.args[1:]))

    with transaction.atomic():
        for e in event_queue:
            insertIntoEvents(**e)
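
Codecov reports the body of `call` as uncovered. One way to exercise the loop without a database is a stand-in job object. This is a rough sketch under assumptions: `FakeJob`, `fake_scan_file` and `run` are hypothetical, the stub exposes only the attributes that `call` touches in this diff (`args`, `JobContext`, `set_status`), and the final database write is replaced by returning the queued events.

```python
import contextlib


class FakeJob:
    """Hypothetical stand-in exposing only what `call` uses."""

    def __init__(self, args):
        self.args = args
        self.status = None

    @contextlib.contextmanager
    def JobContext(self, logger=None):
        # The real context presumably does more; here it is a no-op.
        yield

    def set_status(self, status):
        self.status = status


def fake_scan_file(event_queue, file_uuid, path, date):
    """Hypothetical scan: queue a passing event and report success."""
    event_queue.append({"fileUUID": file_uuid, "eventOutcome": "Pass"})
    return 0


def run(jobs, scan=fake_scan_file):
    """Mirror the loop in `call`, returning events instead of saving them."""
    event_queue = []
    for job in jobs:
        with job.JobContext(logger=None):
            # args[0] is skipped, as in `call`; the rest feed the scan.
            job.set_status(scan(event_queue, *job.args[1:]))
    return event_queue


if __name__ == "__main__":
    jobs = [FakeJob(["antivirus", "uuid-1", "/tmp/a", "2024-01-01"])]
    print(run(jobs), jobs[0].status)
```

Because `scan_file` takes exactly three values after the queue, the sketch passes three items after the first element of `args`.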