Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Simplify init #82

Merged
merged 79 commits into from
Jan 13, 2025
Merged

Simplify init #82

merged 79 commits into from
Jan 13, 2025

Conversation

mcocdawc
Copy link
Contributor

@mcocdawc mcocdawc commented Jan 9, 2025

  • moved the saving operation out of the init method.

  • don't redundantly store information from fobj, but use fobj itself directly.

- this fixed a likely bug in optimize.
previously we had
    J0 = [[J0[-1, -1]]]
now it is
    J0 = [[J0[-1][-1]]]
since it is a list[list[float]]
finished BEOPT attrs refactoring
call self.scratch_dir.cleanup in both `optimize` and `oneshot`
@mcocdawc mcocdawc mentioned this pull request Jan 10, 2025
@mcocdawc mcocdawc requested a review from mscho527 January 10, 2025 21:50
@mcocdawc mcocdawc marked this pull request as ready for review January 10, 2025 21:50
@mscho527
Copy link
Member

I'll have to think a little bit more, but I am a bit wary of directly using self.fobj. If we are concerned about the redundant information we are retaining here, is there a reason why we wouldn't take the information from the fobj in the init and not retain self.fobj instead? I'll think about this a bit longer and update lol; just my very first thoughts :)

@mcocdawc
Copy link
Contributor Author

I'll have to think a little bit more, but I am a bit wary of directly using self.fobj. If we are concerned about the redundant information we are retaining here, is there a reason why we wouldn't take the information from the fobj in the init and not retain self.fobj instead? I'll think about this a bit longer and update lol; just my very first thoughts :)

The reason why the fragpart (it's the type of self.fobj) class exists, is that its components Nfrag or fsites belong together and it makes sense to collect them into a class. If we were using only one or two components of self.fobj, then I could live with extracting them from self.fobj, but the way it is written currently we extract all components of self.fobj and unnecessarily break encapsulation.

This is:

  1. Unnecessarily verbose
  2. Hard to maintain.
    2.1. What do you do if you add a component to fragpart? Do you always have to remember to add it also to BE?¹
    2.2. What do you do if you change the representation of fragments altogether? At the moment you have to change unnecessarily much in BE.
  3. Redundant. Do you use self.fobj.Nfrag or self.Nfrag? If we have self.Nfrag, then we should not also store self.fobj, or the other way round.

¹ That's not hypothetical. I need to store some more lookup information per fragment for the faster integral transformation.


Just to give a small example where I think it becomes obvious:

Let's say you had a 2D Rectangle:

class Rectangle:
    def __init__(self, a: float, b: float) -> None:
        self.a = a
        self.b = b

    def area(self) -> float:
        return self.a * self.b        

then you can easily implement a Prism on top

class Prism:
    def __init__(self, base: Rectangle, h: float) -> None:
        self.base = base
        self.h = h

    def volume(self) -> float:
        return self.base.area() * self.h

It is easy to see how this code works even with general 2D shapes without any changes (apart from the type hinting)

But you can also unnecessarily couple the Prism to the rectangle:

class CoupledPrism:
    def __init__(self, rectangle: Rectangle, h: float) -> None:
        self.base = rectangle
        self.a = rectangle.a
        self.b = rectangle.b
        self.h = h

    def volume(self) -> float:
        return self.a * self.b * self.h

now you cannot just pass in another 2D shape with an area function, and the code looks more complicated.

Last but not least, writing the uncoupled version is super easy with attrs.define, while the coupled version is a bit more complicated. IMHO it is reliable rule of thumb, that if it is hard to write with attrs.define, then it is probably worse class design.

attrs example
from attrs import define, field

@define
class AttrsPrism:
    base: Rectangle
    h: float

    def volume(self) -> float:
        return self.base.area() * self.h

@define
class CoupledAttrsPrism:
    base: Rectangle
    h: float
    a: float = field(init=False)
    b: float = field(init=False)

     def __attrs_post_init__(self):
        self.a = self.base.a
        self.b = self.base.b

    def volume(self) -> float:
        return self.a * self.b * self.h

Copy link
Member

@mscho527 mscho527 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cool cool nice

@mcocdawc mcocdawc merged commit 7ca8d14 into main Jan 13, 2025
4 checks passed
@mcocdawc mcocdawc deleted the simplify_init branch January 13, 2025 15:15
ShaunWeatherly added a commit that referenced this pull request Jan 14, 2025
* Automatic generation of API documentation (#70)

- Create API reference automatically for all public members of quemb
Before this change one had to manually create rst files and manage them. Now they are recursively created automatically.
The layout of the documentation matches the layout of the namespace

- This change automatically ensures that all (public) docstrings are actually parsed by Sphinx and it is ensured that they are properly written. Had to fix some docstrings

- Type-hinting is now picked up by sphinx.

* H5py use contextmanager (#73)

* put if __main__ guard to test

* use h5py contextmanager nearly everywhere

* fixed a bug from wrong close statement

* Fix parallelization and add parallel tests (#75)

* fix _opt to work with be_parallel

* modify oneshot to work with be_func_parallel

* modify two tests to use nproc > 1

* fix ruff errors

* modify octane test

* Remove CCSD reference from octane test

* Final scratch dir attempt (#74)

# Mayor changes

This PR unifies the use of the ScratchDir. It is now guaranteed, that files are cleaned up if the calculation exits with no error. (If the user passes `WorkDir(cleanup_at_end=False)` the files are retained).
A `WorkDir` is passed from the top of the call stack to the bottom. This allows the user to change the WorkDir if necessary. (Fixes #19)

Got rid of some special cased keyword arguments for DMRG and moved them into an explicit `DMRG_solver_kwargs`.
This way we can catch arguments that are supplied and do nothing.
It is probably a good idea to do the same for SHCI specific keywords @lweisburn .
It is probably a good idea to get rid of the dictionary altogether and to replace it with a dataclass.

# Minor changes

If the `__init__` of a class was just populating the attributes, then I replaced the class definition with `attrs.define`.

On the way I did a lot of type hinting. It is probably a good idea to double check my assumptions in review.

Additionally, I did some refactoring, where explicit open-close pairs where replaced with context managers.
The same was true for multiprocessing Pools.

Here and there I encountered constructs that could be simplified a lot
```python3
veffs = []
[veffs.append(result.get()) for result in results]
# becomes
veffs = [result.get() for result in results]
```
or
```python3
self.frozen_core = False if not fobj.frozen_core else True
# becomes
self.frozen_core = fobj.frozen_core
```
or
```python3
rdm_return = False
if relax_density:
    rdm_return = True
# becomes
rdm_return = relax_density
```

* Update fragment energies (#79)

* Untangling eeval and frag_energy energy evaluation options

* small changes for frag_energy consistency

* start changing eeval and ereturn in parallel functions, making serial and parallel consistent

* put if __main__ guard to test

* use h5py contextmanager nearly everywhere

* changed mbe.StoreBE into attrs defined class

* ignore broken link to numpy.float64 in documentation

* fixed a bug from wrong close statement

* testsuite don't wait for analysis

* added types for molbe BE object

* fixed wrong close statements

* fixed scratch-dir for molbe-be

* added typing for molbe.BE.optimize

- this fixed a likely bug in optimize.
previously we had
    J0 = [[J0[-1, -1]]]
now it is
    J0 = [[J0[-1][-1]]]
since it is a list[list[float]]

* renamed _opt.py to opt.py

finished BEOPT attrs refactoring

* pass on solver_kwargs to be_func

* added types to be_func

* added delete multiple_files function

* use the new scratch dir in be_func

* added types to molbe.BE.optimize + call self.scratch_dir.cleanup

call self.scratch_dir.cleanup in both `optimize` and `oneshot`

* moved schmidt_decomposition to prevent circular import

makes much more sense like this

* added type hints to be_func_parallel

* fixed typo

* fixed several small errors in the code

* fixed be_func_parallel calls

* simplified be_func_parallel

* added types to run_solver

* use frag_scratch in be_func_parallel

* use frag_scratch in run_solver

* added typehints to kbe BE class

* simplified a few boolean expressions

* the tests should be working now

* ensure scratch cleanup for kbe

* removed call to self.compute_energy_full (related to #35)

* write nicer Pool statements

- used proper list comprehensions
- use contextmanager for Pool

* fixed naming error

* use more explicit way of freeing memory

(it is still ugly... )

* refactored expression

* use Tuple[int, ...] for numpy shapes :-(

* refactored WorkDir to use atexit.register for cleanup

* added better docstring for config

* require static analysis to be ok for running test suite

* renamed DMRG specific solver kwargs to its proper name

* better naming for DMRG stuff

* added types to block2 DMRG function

* refactor some DMRG arguments

* change behaviour of scratch contextmanager

now it is ensured, that files are deleted even after an exception when
using it as context manager and cleanup_at_end=True

* fixed the deadlock in the test requirements

* added new scratch dir also to ube

* avoid list[list[float]]; use consistently array instead

* Update energy keywords and logic in restricted BE, serial and parallel, oneshot and parallel.

Update and rearrange some documentation for the keywords

TODO: add non-cumulant energy, update unrestricted BE

* Update kbe/pbe and kbe/misc for consistency

* Fix kbe/pbe and kbe/misc to raise error for non-cumulant

* Add non-cumulant energy option for molecular code

* remove mypy attempt from get_energy_frag for now

* fix be2puffin call of oneshot and update ube oneshot default

* Update get_frag_energy function to work for periodic calculations

* Remove double del line

* remove redundant frag_energy keyword

* Update src/quemb/molbe/helper.py, eri_file to be optional

Co-authored-by: Minsik <[email protected]>

* Update src/quemb/molbe/mbe.py: ebe_tot readability

Co-authored-by: Minsik <[email protected]>

* move use_cumulant to optimize and oneshot, not BE

---------

Co-authored-by: Leah Weisburn <[email protected]>
Co-authored-by: Oskar Weser <[email protected]>
Co-authored-by: Minsik <[email protected]>

* Adding back veff0 for molecular code, removing the now-unnecessary hf_veff from be_func (#81)

* More type annotations + simpler boolean expressions (#76)

Simplifications in the code

- A couple more type annotations

- simpler boolean expressions

- removed keyword arguments if they are always true in the code

* Dataclass for solver kwargs (#77)

- Introduced a dataclass for solver kwargs. This is makes it much more explicit what arguments are supported by DMRG and SHCI

- Could simplify the interface of BE by removing a couple of arguments

* Simplify init (#82)

- moved the saving operation out of the __init__ method.

- don't redundantly store information from `fobj`, but use `fobj` itself directly.

* Make numpy shorter (#83)

- replacing `np.dot` with `@`
- replacing the `numpy.something` calls with `from numpy import something` if functions appear several times
- replacing the `numpy.something` call with `np.something` if a function appears rarely

---------

Co-authored-by: Oskar Weser <[email protected]>
Co-authored-by: Leah Weisburn <[email protected]>
Co-authored-by: Leah Weisburn <[email protected]>
Co-authored-by: Minsik <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants