
nodes generating empty view tables are excluded from the algorithm execution

Kostas FILIPPOPOLITIS requested to merge check_nodes_at_least_one_row into master

Created by: apmariglis

https://team-1617704806227.atlassian.net/browse/MIP-698

changelog:

  • mipengine/node/monetdb_interface/views.py::create_view(..) now raises an InsufficientDataError when the view table it creates has "insufficient data". The data in a view table is considered insufficient when the table has zero rows or fewer rows than .deployment.toml::minimum_row_count. The "insufficient data" view table is dropped before the method returns.

  • mipengine/controller/algorithm_executor::_AlgorithmExecutionInterface.create_primary_data_views(...) Because create_primary_data_views(...) is called from within the algorithm flow module, the nodes that will participate in the algorithm execution have already been selected by that point. Nevertheless, depending on the variables, filters etc., some of the view tables created on those nodes may have "insufficient data". create_primary_data_views(...) now removes from the execution any node that creates at least one view table with "insufficient data".

  • mipengine/node/node.py A check was added at node startup to make sure that node_config.privacy.minimum_row_count is at least 1.

  • tests/algorithm_validation_tests/test_linearregression_cv_validation.py All tests that were previously skipped are re-enabled.

  • tests/algorithm_validation_tests/test_logisticregression_cv_validation.py Some tests are still skipped, as they seem to fail for a different reason.
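
For reviewers, the behaviour described in the first three changelog entries can be sketched roughly as follows. This is a simplified, in-memory illustration and not the code in this merge request: FakeNode, validate_minimum_row_count and the free function create_primary_data_views are hypothetical stand-ins, row counts are supplied by hand instead of being queried from MonetDB, and only the InsufficientDataError name is taken from the changelog.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


class InsufficientDataError(Exception):
    """Raised when a view table has too few rows (name taken from the changelog)."""


def validate_minimum_row_count(minimum_row_count: int) -> None:
    # Hypothetical startup check: the threshold must be at least 1.
    if minimum_row_count < 1:
        raise ValueError("minimum_row_count must be at least 1")


@dataclass
class FakeNode:
    # Hypothetical stand-in for a worker node; no database is involved.
    node_id: str
    view_row_counts: Dict[str, int]  # rows each view would contain
    minimum_row_count: int = 10
    views: Set[str] = field(default_factory=set)

    def create_view(self, view_name: str) -> str:
        row_count = self.view_row_counts[view_name]
        self.views.add(view_name)  # the view is created first
        if row_count == 0 or row_count < self.minimum_row_count:
            # Drop the "insufficient data" view before raising.
            self.views.discard(view_name)
            raise InsufficientDataError(
                f"{self.node_id}: view '{view_name}' has {row_count} rows"
            )
        return view_name


def create_primary_data_views(
    nodes: List[FakeNode], view_names: List[str]
) -> List[FakeNode]:
    # Keep only the nodes on which every requested view has enough rows.
    participating = []
    for node in nodes:
        try:
            for view_name in view_names:
                node.create_view(view_name)
        except InsufficientDataError:
            continue  # exclude this node from the algorithm execution
        participating.append(node)
    return participating


nodes = [
    FakeNode("node_a", {"x": 50, "y": 50}),
    FakeNode("node_b", {"x": 0, "y": 50}),
    FakeNode("node_c", {"x": 50, "y": 3}),
]
remaining = create_primary_data_views(nodes, ["x", "y"])
remaining_ids = [node.node_id for node in remaining]
```

In this sketch only node_a remains: node_b fails on an empty view and node_c on a view below the threshold. The sketch does not clean up views that were created successfully on a node before it was excluded; whether the actual change does so is not stated in the changelog.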
