Separation WebAPI/Controller/AlgorithmExecutor

Created by: apmariglis

Main changes:

  • mipengine/controller/api/algorithms_endpoint.py

    Removed the dependencies on AlgorithmExecutor, node_registry and validate_algorithm_request. The endpoint now holds an instance of Controller, which provides the functionality for executing an algorithm and calls the validator to validate the execution request.

  • mipengine/controller/controller.py

    The request to execute an algorithm is propagated from algorithms_endpoint to the Controller. The Controller finds the nodes relevant to the execution, based on the requested schemas (pathologies) and datasets, and generates a list of node tasks handlers, one per node. It also generates a context_id for the current algorithm execution. It then instantiates an AlgorithmExecutor, passing it the node tasks handlers and the context_id along with the algorithm_request_dto, and the AlgorithmExecutor starts the execution of the algorithm.

  • mipengine/controller/algorithm_executor.py

    • moved from mipengine/controller/algorithm_executor to mipengine/controller/
    • AlgorithmExecutor's init method now takes 2 DTOs as input:
      1. An AlgorithmExecutionDTO, containing the context_id (generated by the Controller), the algorithm name and the AlgorithmRequestDTO (mipengine/controller/api/algorithm_request_dto.py).
      2. A NodesTaskHandlersDTO, containing an instance of INodeTasksHandler for each of the nodes.
      This dependency inversion makes unit testing the AlgorithmExecutor much easier, since it no longer requires complex mocking of the celery functionality.
  • mipengine/controller/node_tasks_handler_interface.py

    an interface for the tasks one can request from a node, e.g. get_tables, get_table_schema, create_merge_table, queue_run_udf etc.

  • mipengine/controller/node_tasks_handler_celery.py

    a concrete implementation of the node_tasks_handler_interface for the celery task queue

  • mipengine/controller/node_registry.py

    added functionality to get all available datasets and schemas, without duplicates

  • mipengine/controller/celery_app.py

    max_retries, interval_start, interval_step and interval_max are now read from the controller_config
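
The dependency inversion described for AlgorithmExecutor above could be exercised in a unit test roughly as follows, with no celery involved. This is only a minimal sketch: the reduced method set on INodeTasksHandler, the `handlers` field, the `collect_tables` method and the `InMemoryNodeTasksHandler` class are assumptions made for illustration and are not taken from the actual code in this merge request.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Dict, List


class INodeTasksHandler(ABC):
    """Reduced, hypothetical version of the node tasks interface."""

    @property
    @abstractmethod
    def node_id(self) -> str:
        ...

    @abstractmethod
    def get_tables(self, context_id: str) -> List[str]:
        ...


@dataclass
class AlgorithmExecutionDTO:
    context_id: str
    algorithm_name: str
    algorithm_request_dto: dict  # stand-in for the real AlgorithmRequestDTO


@dataclass
class NodesTaskHandlersDTO:
    handlers: List[INodeTasksHandler]  # assumed field name


class AlgorithmExecutor:
    """Depends only on the interface, never on a concrete task queue."""

    def __init__(
        self,
        execution_dto: AlgorithmExecutionDTO,
        handlers_dto: NodesTaskHandlersDTO,
    ):
        self._context_id = execution_dto.context_id
        self._handlers = handlers_dto.handlers

    def collect_tables(self) -> Dict[str, List[str]]:
        # Hypothetical helper: ask every node for its tables in this context.
        return {h.node_id: h.get_tables(self._context_id) for h in self._handlers}


class InMemoryNodeTasksHandler(INodeTasksHandler):
    """Test double that replaces the celery-backed handler."""

    def __init__(self, node_id: str, tables: List[str]):
        self._node_id = node_id
        self._tables = list(tables)

    @property
    def node_id(self) -> str:
        return self._node_id

    def get_tables(self, context_id: str) -> List[str]:
        # Returns the canned tables; context_id is accepted but ignored here.
        return list(self._tables)


executor = AlgorithmExecutor(
    AlgorithmExecutionDTO(
        context_id="ctx1", algorithm_name="demo", algorithm_request_dto={}
    ),
    NodesTaskHandlersDTO(
        handlers=[
            InMemoryNodeTasksHandler("localnode1", ["table_a"]),
            InMemoryNodeTasksHandler("localnode2", []),
        ]
    ),
)
tables_per_node = executor.collect_tables()
```

Because the executor only sees INodeTasksHandler, swapping the in-memory double for the celery implementation should not require changes to the executor itself.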


Known issues:

  • AlgorithmExecutor and Controller unit/integration tests - MIP-237
  • integration test for NodeTasksHandlerCelery - MIP-234
  • codeclimate issues - MIP-250
