General Refactoring #11

Open
2 of 11 tasks
marco-biasion opened this issue Mar 11, 2024 · 0 comments
Labels
note > help wanted Extra attention is needed type > enhancement New feature or request type > meta-issue Represent the topic of other issues

marco-biasion commented Mar 11, 2024

This issue is related to the topic of refactoring, in particular why we do things this way and if/how we should improve them.

We are writing this issue in preparation for a meeting (not yet scheduled) about improving the codebase.
The following is a non-exhaustive list of topics. Everyone is free to comment on these points or to add ideas of their own:

  • Better typing #12
    • Even though Python is a dynamically typed language, we should always try to annotate variables, parameters and return values.
    • This is very useful because IDEs can then help with autocompletion and warn us when types are mismatched.
  • Using a printing utility for colors
    • Currently, to color the output we manually add color tags (using the Colorama library) before and after each printed string. We should avoid this and instead use a unified system that makes colored printing easier and less verbose.
    • We should also decide on a set of colors (I suggest the standard ANSI color set) and stick to it, wrapping each color in an easy-to-access function named after its purpose. For example, .warning(...) would print in yellow, and there would be no .yellow(...) function.
  • Creating a logger class
    • Related to the previous point, we could have a logger class that includes the features described above and also automatically saves the printed text to a log file (possibly without color codes).
    • This could be implemented as a singleton accessible from all other modules.
  • Using a file-system utility
    • Currently we duplicate a lot of code to do very similar things in different places. We could unify this in a single module.
    • Some examples of features: emptying a folder, creating a folder if missing, ...
  • Improving data objects
    • In the codebase we have many objects whose only purpose is storing data (immutable objects with little to no logic), for example Arguments (this class has only one static method). Currently they are implemented with 'private' fields and properties; we could simply use dataclasses or attrs.
  • Output file naming
    • We need a name that identifies the file of a specific experiment (and iteration ...).
      Currently this is inconsistent: any time we need to create a file, we generate its name locally.
      The clear improvement is to unify this in one place, which gives better consistency and makes naming easier, without the risk of forgetting some attribute.
  • Overall architecture #81
    • I believe we can improve the code in a more fundamental way, for example by adopting a dependency injection architecture.
    • I think this architecture fits the type of software we are developing well, as we have many 'variations' of the same type of action.
  • Subgraph extraction unification
    • Currently the subgraph extraction functions share 95+% of their code, with only slight changes in some of the constraints. This causes a lot of code duplication; we could refactor them into a single function with optional constraints.
  • One-to-One mappings
    • In the code, mainly in the class AnnotatedGraph (and its superclass Graph), we have many places where the data is structured as a 1-to-1 mapping.
    • For these mappings we use dicts, but since we need both key->value and value->key access, it would be better to have a data structure that natively supports both kinds of lookup (dict does not, which is why many pieces of code are overcomplicated).
    • One such data structure is bidict, offered by the bidict library.
  • TemplateCreator hierarchy #56
    • Currently the TemplateCreator classes are not well modularized, which means we have a lot of code duplication (e.g. Template_SOP1 and Template_SOP1sharelogic are siblings whose only common ancestor is an almost empty class).
  • Arguments/Specs refactoring and standardization #71
    • Currently we have an arguments class that is only used to generate a specs class. We could instead make it part of the specs class to avoid duplication.
    • We could also standardize the argument/option naming: the correct number of dashes for options, default values, etc.
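
To make the colored printing and logger points more concrete, here is a minimal sketch of what such a utility might look like. Everything in it is an assumption for discussion, not existing code: the class name `Logger`, the level names, the palette, and the log line format are all hypothetical. It uses raw ANSI escape codes instead of Colorama only to stay self-contained.

```python
import re
import sys
from pathlib import Path
from typing import Optional, TextIO

# ANSI escape codes for a small, fixed palette (the level-to-color choice is an assumption).
_RESET = "\033[0m"
_COLORS = {
    "info": "\033[36m",     # cyan
    "success": "\033[32m",  # green
    "warning": "\033[33m",  # yellow
    "error": "\033[31m",    # red
}
_ANSI_RE = re.compile(r"\x1b\[[0-9;]*m")


def strip_ansi(text: str) -> str:
    """Remove ANSI color codes so the log file stays plain text."""
    return _ANSI_RE.sub("", text)


class Logger:
    """Hypothetical logger: semantic methods only, no .yellow(...) style helpers."""

    _instance: Optional["Logger"] = None

    def __init__(self, log_file: Optional[Path] = None, stream: TextIO = sys.stdout) -> None:
        self._log_file = log_file
        self._stream = stream

    @classmethod
    def get(cls) -> "Logger":
        """Return a shared instance (a simple take on the singleton idea)."""
        if cls._instance is None:
            cls._instance = cls()
        return cls._instance

    def _emit(self, level: str, message: str) -> str:
        colored = f"{_COLORS[level]}{message}{_RESET}"
        print(colored, file=self._stream)
        if self._log_file is not None:
            # Append the uncolored text, prefixed with the level name.
            with self._log_file.open("a", encoding="utf-8") as handle:
                handle.write(f"[{level.upper()}] {strip_ansi(message)}\n")
        return colored

    def info(self, message: str) -> str:
        return self._emit("info", message)

    def success(self, message: str) -> str:
        return self._emit("success", message)

    def warning(self, message: str) -> str:
        return self._emit("warning", message)

    def error(self, message: str) -> str:
        return self._emit("error", message)
```

With something like this, call sites would only say what they mean (`Logger.get().warning("...")`), and the color choice would live in one place. Whether a singleton or an injected instance is preferable probably depends on the outcome of the architecture discussion.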

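For the data objects and output file naming points, a frozen dataclass could cover both at once. The sketch below is only an illustration: `ExperimentId`, its fields, and the name pattern are hypothetical placeholders, since the real set of attributes still has to be agreed on.

```python
from dataclasses import dataclass
from pathlib import Path


@dataclass(frozen=True)
class ExperimentId:
    """Hypothetical immutable identifier of one experiment run (fields are placeholders)."""

    benchmark: str
    method: str
    iteration: int

    def file_name(self, kind: str, extension: str) -> str:
        # Single place where the naming scheme is defined (the pattern itself is an assumption).
        return f"{self.benchmark}_{self.method}_it{self.iteration}_{kind}.{extension}"

    def path_in(self, folder: Path, kind: str, extension: str) -> Path:
        # Build the full path of an output file inside the given folder.
        return folder / self.file_name(kind, extension)
```

Because the dataclass is frozen, instances cannot be modified after creation, and we get `__init__`, `__repr__` and `__eq__` without writing 'private' fields and properties by hand. Every output file would then get its name from `file_name(...)`, so no attribute can be forgotten at a call site.
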
I have other ideas, but I will leave this list as is for now.
As said before, feel free to comment with other points or anything you think would help.
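
As an illustration of the one-to-one mappings point, this is roughly the behavior we would want from such a data structure. The class `OneToOne` and its method names are hypothetical and written with the standard library only, to show the idea; in practice the bidict library would likely replace a hand-written class like this.

```python
from typing import Dict, Generic, Hashable, TypeVar

K = TypeVar("K", bound=Hashable)
V = TypeVar("V", bound=Hashable)


class OneToOne(Generic[K, V]):
    """Hypothetical one-to-one mapping with lookups in both directions."""

    def __init__(self) -> None:
        self._forward: Dict[K, V] = {}
        self._backward: Dict[V, K] = {}

    def put(self, key: K, value: V) -> None:
        # Refuse a value that already belongs to a different key.
        if value in self._backward and self._backward[value] != key:
            raise ValueError(f"value {value!r} is already mapped to another key")
        # If the key is being re-bound, drop its stale reverse entry first.
        if key in self._forward:
            del self._backward[self._forward[key]]
        self._forward[key] = value
        self._backward[value] = key

    def value_of(self, key: K) -> V:
        return self._forward[key]

    def key_of(self, value: V) -> K:
        return self._backward[value]

    def __len__(self) -> int:
        return len(self._forward)
```

The point is that both directions are kept consistent in one place, instead of each caller searching the dict values or maintaining a second dict by hand.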

@marco-biasion marco-biasion added type > enhancement New feature or request note > help wanted Extra attention is needed planning > todo The development of this issue is paused or still has to start labels Mar 11, 2024
@marco-biasion marco-biasion added type > meta-issue Represent the topic of other issues and removed planning > todo The development of this issue is paused or still has to start labels Apr 29, 2024