Dev/mip 705/remove create_primary_data_views method from algorithm execution interface (!346) · Merge requests · HBP Medical Informatics Platform / Exareme2

Kostas FILIPPOPOLITIS requested to merge dev/MIP-705/remove-create_primary_data_views-method_from_AlgorithmExecutionInterface into master Dec 08, 2022

Created by: apmariglis

changelog:

All algorithm flow logic must now be implemented as a class that extends the abstract class mipengine/algorithms/algorithm.py::Algorithm.
- All algorithm classes must define the class attribute algname in their class declaration. The algorithm execution infrastructure uses algname to find the relevant class, so the algorithm class itself can be named arbitrarily.
- All algorithm classes must implement the following abstract methods:
  - get_variable_groups: Must return a List[List[str]] containing the variable names used to create the data model views. This is the same list that was previously passed as the variable_groups input parameter to the (now obsolete) create_primary_data_views method.
  - run: Contains the implementation of the algorithm flow logic.
- Algorithms can override the following methods:
  - get_dropna: Algorithms that need to keep the "Not Available" values in their data model views must override this method to return False. By default, "Not Available" values are removed from the data model views.
  - get_check_min_rows: Algorithms that need to ignore the "minimum row count" threshold for their data model view tables must override this method to return False. By default, any node that would generate at least one data model view table with a row count below the "minimum row count" threshold is excluded from the algorithm execution.
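
As an illustration of the requirements above, here is a minimal sketch of what an algorithm class could look like. The Algorithm base class below is a local stand-in so that the snippet runs on its own. The real class lives in mipengine/algorithms/algorithm.py and its constructor and signatures may differ. The class name MyDescriptiveStats, the algname value, the variable names and the fake executor are all hypothetical.

```python
from abc import ABC, abstractmethod
from types import SimpleNamespace
from typing import List


class Algorithm(ABC):
    """Local stand-in for mipengine/algorithms/algorithm.py::Algorithm (assumed shape)."""

    def __init__(self, executor):
        # Assumption: the infrastructure hands the algorithm an executor object.
        self.executor = executor

    @abstractmethod
    def get_variable_groups(self) -> List[List[str]]:
        """Variable names for the creation of the data model views."""

    @abstractmethod
    def run(self):
        """The algorithm flow logic."""

    def get_dropna(self) -> bool:
        # Default: "Not Available" values are removed from the views.
        return True

    def get_check_min_rows(self) -> bool:
        # Default: the "minimum row count" threshold is enforced.
        return True


class MyDescriptiveStats(Algorithm):
    # Used by the infrastructure to find this class; the class name is arbitrary.
    algname = "my_descriptive_stats"  # hypothetical name

    def get_variable_groups(self) -> List[List[str]]:
        # Same list that used to be passed to create_primary_data_views.
        return [["age", "weight"], ["diagnosis"]]  # hypothetical variables

    def get_dropna(self) -> bool:
        # This algorithm wants to keep "Not Available" values.
        return False

    def run(self):
        # Views already exist when run() starts; no create_primary_data_views call.
        x_view, y_view = self.executor.data_model_view
        return {"x": x_view, "y": y_view}


# Fake executor with placeholder views, only to make the sketch executable.
executor = SimpleNamespace(data_model_view=["view_x", "view_y"])
algorithm = MyDescriptiveStats(executor)
result = algorithm.run()
```

Only get_dropna is overridden here, so get_check_min_rows keeps its default and the minimum row count check still applies.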
Data model views are now created before the algorithm flow logic starts executing and are accessible in the algorithm flow code through the self.executor.data_model_view list. The create_primary_data_views method, which was previously called from inside the algorithm flow code, is now obsolete.
To create these views, the algorithm execution infrastructure calls the three methods mentioned above (get_variable_groups, get_dropna and get_check_min_rows) before the algorithm flow logic starts executing.
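
The ordering described above can be sketched roughly as follows. This is not the actual infrastructure code: the Algorithm stand-in, the helper functions find_algorithm_class, create_data_model_views and execute, and the RowCounter example class are all hypothetical. The lookup through direct subclasses is only one possible way to resolve algname.

```python
from abc import ABC, abstractmethod
from types import SimpleNamespace
from typing import List, Type


class Algorithm(ABC):
    """Local stand-in for the real abstract class (assumed shape)."""

    def __init__(self, executor):
        self.executor = executor

    @abstractmethod
    def get_variable_groups(self) -> List[List[str]]: ...

    @abstractmethod
    def run(self): ...

    def get_dropna(self) -> bool:
        return True

    def get_check_min_rows(self) -> bool:
        return True


class RowCounter(Algorithm):
    algname = "row_counter"  # hypothetical name

    def get_variable_groups(self) -> List[List[str]]:
        return [["age"], ["gender"]]  # hypothetical variables

    def get_check_min_rows(self) -> bool:
        return False  # ignore the "minimum row count" threshold

    def run(self):
        # The views are already available when the flow logic starts.
        return list(self.executor.data_model_view)


def find_algorithm_class(algname: str) -> Type[Algorithm]:
    # Hypothetical lookup: match algname against direct subclasses only.
    by_name = {cls.algname: cls for cls in Algorithm.__subclasses__()}
    return by_name[algname]


def create_data_model_views(variable_groups, dropna, check_min_rows):
    # Hypothetical stand-in: one placeholder "view" per variable group.
    return [
        {"variables": group, "dropna": dropna, "check_min_rows": check_min_rows}
        for group in variable_groups
    ]


def execute(algname: str):
    executor = SimpleNamespace(data_model_view=[])
    algorithm = find_algorithm_class(algname)(executor)
    # Step 1: query the three methods and build the views.
    executor.data_model_view = create_data_model_views(
        algorithm.get_variable_groups(),
        algorithm.get_dropna(),
        algorithm.get_check_min_rows(),
    )
    # Step 2: only now run the algorithm flow logic.
    return algorithm.run()


views = execute("row_counter")
```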
