OptimizationDataset doesn't inherit GeometricProcedure from OptimizationDatasetFactory #312

Open
amcisaac opened this issue Jan 30, 2025 · 1 comment

amcisaac commented Jan 30, 2025

When you create an OptimizationDataset with OptimizationDatasetFactory.create_dataset(), the dataset doesn't inherit the optimization settings specified in the factory's GeometricProcedure; it falls back to the default GeometricProcedure instead.

Since the GeometricProcedure can be specified at the factory level, I would expect the factory to pass it on to the dataset. I think it doesn't because the BaseDataset and BasicDataset don't have this property.

Minimal example:

from openff.qcsubmit.factories import OptimizationDatasetFactory
from openff.qcsubmit.procedures import GeometricProcedure
from openff.toolkit.topology import Molecule

opt_procedure_notdefault = GeometricProcedure(
    program='geometric',
    coordsys='cart',
    convergence_set='TURBOMOLE',
    constraints={},
)

dataset_factory = OptimizationDatasetFactory(optimization_program=opt_procedure_notdefault)

dataset_factory.optimization_program == opt_procedure_notdefault # Returns True

mols = [
    Molecule.from_smiles(smiles)
    for smiles in [
        "CN(C)O",
        "CC",
    ]
]

dataset = dataset_factory.create_dataset(
    dataset_name="Test",
    tagline="testing geometric procedure",
    description=(
        "A dataset to test geometric procedure"
    ),
    molecules=mols
)

dataset.optimization_procedure == opt_procedure_notdefault # Returns False

dataset.optimization_procedure # Returns the default

# GeometricProcedure(program='geometric', coordsys='dlc', enforce=0.0, epsilon=1e-05, reset=True, 
# qccnv=False, molcnv=False, check=0, trust=0.1, tmax=0.3, maxiter=300, convergence_set='GAU', constraints={})

dataset._get_specifications() # Returns the default; see `keywords` at the end for the optimization part

# {'default': OptimizationSpecification(program='geometric', qc_specification=QCSpecification(program='psi4', 
# driver=<SinglepointDriver.deferred: 'deferred'>, method='b3lyp-d3bj', basis='dzvp', keywords={'maxiter': 200, 'scf_properties': 
# [<SCFProperties.Dipole: 'dipole'>, <SCFProperties.Quadrupole: 'quadrupole'>, <SCFProperties.WibergLowdinIndices: 
# 'wiberg_lowdin_indices'>, <SCFProperties.MayerIndices: 'mayer_indices'>]}, protocols=AtomicResultProtocols(wavefunction=
# <WavefunctionProtocolEnum.none: 'none'>, stdout=True, error_correction=ErrorCorrectionProtocol(default_policy=True, 
# policies=None), native_files=<NativeFilesProtocolEnum.none: 'none'>)), keywords={'coordsys': 'dlc', 'enforce': 0.0, 'epsilon': 1e-05, 
# 'reset': True, 'qccnv': False, 'molcnv': False, 'check': 0, 'trust': 0.1, 'tmax': 0.3, 'maxiter': 300, 'convergence_set': 'GAU'}, 
# protocols=OptimizationProtocols(trajectory=<TrajectoryProtocolEnum.all: 'all'>))}
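
The mismatch is easier to see in a field-by-field comparison than in the full repr. The sketch below uses a hypothetical helper, `diff_settings`, and two dictionaries copied by hand from the values above rather than read from the QCSubmit objects, so it only illustrates the difference and does not query the library:

```python
def diff_settings(requested, actual):
    """Return {key: (requested_value, actual_value)} for every mismatch."""
    return {
        key: (value, actual.get(key))
        for key, value in requested.items()
        if actual.get(key) != value
    }


# Settings passed to the factory in the example above.
requested = {"coordsys": "cart", "convergence_set": "TURBOMOLE", "constraints": {}}
# Matching fields of the default procedure that the dataset reports instead.
actual = {"coordsys": "dlc", "convergence_set": "GAU", "constraints": {}}

print(diff_settings(requested, actual))
# {'coordsys': ('cart', 'dlc'), 'convergence_set': ('TURBOMOLE', 'GAU')}
```

Both `coordsys` and `convergence_set` come back as the defaults, so none of the non-default settings given to the factory reach the dataset.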

As a workaround, the user can set the optimization settings manually after creating the dataset:

dataset.optimization_procedure = opt_procedure_notdefault
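
Until this is fixed, the workaround could be wrapped in a small helper. This is a minimal sketch that assumes only the two attribute names used above (`optimization_program` on the factory, `optimization_procedure` on the dataset). The function name `create_dataset_with_procedure` is hypothetical and not part of QCSubmit; it relies on duck typing, so it does not import the library itself:

```python
def create_dataset_with_procedure(factory, **create_kwargs):
    """Create a dataset and copy the factory's optimization settings onto it.

    Hypothetical helper, not part of QCSubmit. It assumes the factory exposes
    `optimization_program` and the dataset exposes a settable
    `optimization_procedure`, as in the example above.
    """
    dataset = factory.create_dataset(**create_kwargs)
    # Workaround: the dataset currently comes back with the default procedure,
    # so overwrite it with the one configured on the factory.
    dataset.optimization_procedure = factory.optimization_program
    return dataset
```

With the objects from the minimal example, this would be called as `create_dataset_with_procedure(dataset_factory, dataset_name="Test", tagline="testing geometric procedure", description="A dataset to test geometric procedure", molecules=mols)`.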

@amcisaac (Author) commented:

I can't upload the YAML file with my full environment, but I'm using QCSubmit 0.54.0, installed today, so everything should be up to date. I'm pretty sure the environment isn't the problem.

name: openff_min
channels:
  - conda-forge
dependencies:
  - python
  - openff-toolkit
  - openff-qcsubmit
  - jupyter
