
ReduceOperations

    Overview
    Reduce operations are those that scan through many objects, condensing some
    attribute into a reduced form. For example, we might use a reduce operation to
    compute statistics on Vm across neurons in a population in a model
    spread across multiple nodes. Another common use is to keep track of the
    maximum field dimension of a variable-size field, such as the number of
    synapses on a channel or an IntFire neuron.
    
    
    There are two modes of operation: 
    1. Through a regular ReduceMsg, originating from a ReduceFinfo on a regular
    object and terminating on any 'get' DestFinfo. ReduceFinfos are derived from
    SrcFinfos. They are templated on the type of the 'get' function, and on
    the type of the reduce class (for example, a triad of mean, variance and count).
    The ReduceFinfo constructor takes as an argument a 'digest' function. The
    job of the digest function is to take an argument of the reduce class
    (which holds the result of the entire reduction operation)
    and do something with it (such as saving values into the originating object).
    
    2. Through Shell::doSyncDataHandler. This takes the synced Elm and its
    FieldElement as Ids, and a string naming the field to be reduced, assumed to be an
    unsigned int. It creates a temporary ReduceMsg from the Shell to the Elm with 
    the field to be reduced. Here the digest function just takes the returned
    ReduceMax< uint > and puts the max value in Shell::maxIndex. It then posts 
    the ack. The calling Shell::doSyncDataHandler waits for the ack, and when it
    comes, it calls a 'set' function to put the returned value into the
    FieldDataHandler::fieldDimension_.
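
    For concreteness, here is a minimal stand-alone sketch of the digest step in
    mode 2. ReduceMaxU and ShellStandIn are stand-ins invented for this note, not
    the real MOOSE classes; the Eref argument and the ack call of the real digest
    function are only indicated in comments.

    	// Illustrative sketch, not the actual MOOSE code.
    	// ReduceMaxU stands in for ReduceMax< unsigned int >.
    	struct ReduceMaxU
    	{
    		unsigned int max_;                        // the reduced maximum
    		unsigned int max() const { return max_; }
    	};

    	struct ShellStandIn
    	{
    		unsigned int maxIndex_; // where doSyncDataHandler looks for the result

    		void digestReduceMax( const ReduceMaxU* arg )
    		{
    			// 'arg' holds the completed reduction: the maximum of the
    			// requested field over all threads and all nodes.
    			maxIndex_ = arg->max();
    			// The real function then posts the ack, so the blocked
    			// Shell::doSyncDataHandler can proceed and 'set'
    			// FieldDataHandler::fieldDimension_.
    		}
    	};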
    
    At some point I may want to embed the doSyncDataHandler into any of the
    'set' functions that invalidate fieldDimension. The problem is race conditions:
    a set function would call the doSync machinery, which internally makes its
    own calls to ack-protected functions such as 'set'. Must fix.
    
    
    
    The setup of the Reduce functionality is like this:
    
    - Create and define the ReduceFinfo.
    	ReduceFinfo< T, F, R >( const string& name, const string& doc,
    		void ( T::*digestFunc )( const Eref& er, const R* arg ) )
    	Here T is the type of the object that is the src of the ReduceMsg,
    	F is the type of the returned reduced field, and
    	R is the Reduce class.
    
    - Create and define the ReduceClass. This does two things:
    	- Hold the data being reduced
    	- Provide three functions, for primaryReduce, secondaryReduce, and
    	tertiaryReduce. We'll come to these in a little while. (A sketch of such
    	a class follows this list.)
    
    - 
    - 
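
    As a concrete illustration of the ReduceClass step, here is a stand-alone
    sketch of a reduce class for the mean/variance/count triad mentioned earlier.
    It is not the actual MOOSE ReduceBase hierarchy; the class name and the exact
    division of work between the three reduce functions are assumptions made for
    this note.

    	#include <cstddef>

    	// Illustrative analogue of a Reduce class: accumulates mean, variance
    	// and count. Not the real MOOSE ReduceBase/ReduceStats classes.
    	class ReduceStats
    	{
    	public:
    		ReduceStats() : sum_( 0.0 ), sumsq_( 0.0 ), count_( 0 ) {}

    		// primaryReduce: fold in one 'get' value from one target object
    		// on the current thread.
    		void primaryReduce( double x )
    		{
    			sum_ += x;
    			sumsq_ += x * x;
    			++count_;
    		}

    		// secondaryReduce: merge the partial result accumulated by
    		// another thread on this node.
    		void secondaryReduce( const ReduceStats& other )
    		{
    			sum_ += other.sum_;
    			sumsq_ += other.sumsq_;
    			count_ += other.count_;
    		}

    		// tertiaryReduce: merge a partial result received from another
    		// node, e.g. unpacked from an MPI_Allgather buffer.
    		void tertiaryReduce( const ReduceStats& other )
    		{
    			secondaryReduce( other );  // same arithmetic in this case
    		}

    		double mean() const { return count_ ? sum_ / count_ : 0.0; }
    		double variance() const
    		{
    			if ( count_ == 0 )
    				return 0.0;
    			double m = mean();
    			return sumsq_ / count_ - m * m;
    		}
    		std::size_t count() const { return count_; }

    	private:
    		double sum_;
    		double sumsq_;
    		std::size_t count_;
    	};

    With such a class, the ReduceFinfo of the first step would be declared
    roughly as ReduceFinfo< MyCell, double, ReduceStats >( "reduceVm", "doc",
    &MyCell::digestVmStats ), where MyCell and digestVmStats are hypothetical
    names for the originating class and its digest function.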
    
    
    When executing reduce operations from the Shell::doSyncDataHandler, this
    	is what happens:
    - Shell::doSyncDataHandler does some checking, then 
    	requestSync.send launches the request, and waits for ack
      	- Shell::handleSync handles the request on each node
      		- Creates a temporary ReduceMsg from Shell to target elm
    		- The ReduceMsg includes a pointer to a const ReduceFinfoBase*
    			which provides two functions:
    			- makeReduce, which makes the ReduceBase object
    			- digestReduce, which refers to the digestFunc of
    			the calling object.
    		- Sends a call with a zero arg on this Msg.
    	- Msg is handled by ReduceMsg::exec. 
    		- This extracts the fid, which points to a getOpFunc.
    		- It creates the derived ReduceBase object.
    		- It adds the ReduceBase object into the ReduceQ
    			- indexed by thread# so no data overwrites.
    		- It scans through all targets on current thread and uses the
    			derived virtual function for ReduceBase::primaryReduce
    			on each.
    	  Overall, for each thread, the 'get' values get reduced and stored
    	  into the ReduceBase derived class in the queue. There is such an
    	  object for each thread.
    
    	- Nasty scheduling ensues for clearing the ReduceQ.
    		- in Barrier 3, we call Clock::checkProcState
    		- this calls Qinfo::clearReduceQ
    		- This marches through each thread on each reduceQ entry
    			- Uses the ReduceBase entry from the zeroth thread
    				as a handle.
    			- Calls ReduceBase::secondaryReduce on each entry for 
    				each thread.
    			- This is done in an ugly way using 
    				findMatchingReduceEntry, could be cleaned up. 
    		- Calls ReduceBase::reduceNodes
    			- If MPI is running this does an instantaneous
    				MPI_Allgather with the contents of the 
    				ReduceBase data.
    			- Does ReduceBase::tertiaryReduce on the received data
    /// New version
    			- calls Element::setFieldDimension directly using ptr.
    /// End of new stuff
    			- returns isDataHere.
    			
    
    		- If reduceNodes returns true, calls ReduceBase::assignResult
    			- This calls the digestReduce function, which is
    				Shell::digestReduceMax
    		
    ////////////////////////////////////////////////////////////////////////
    // Old version
    		- Shell::digestReduceMax assigns maxIndex_ and sends ack
    	- ack allows doSyncDataHandler to proceed
    	- calls Field::set on all tgts to set "fieldDimension" to maxIndex_.
    ////////////////////////////////////////////////////////////////////////
    		- Should really do a direct ptr assignment within the
    		assignResult function; assume here that we want the assignment
    		to be reflected on all nodes.
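
    To summarize the clearReduceQ sequence in one place, here is a stand-alone
    sketch of what a single ReduceQ entry goes through, re-using the illustrative
    ReduceStats class from the earlier sketch. The real code works on per-thread
    ReduceBase pointers and hides the MPI exchange inside reduceNodes; the
    function name below is an assumption.

    	#include <cstddef>
    	#include <vector>

    	// Illustrative sketch of clearing one ReduceQ entry, not the real
    	// Qinfo::clearReduceQ. perThread[ i ] is the partial result built up
    	// by thread i during primaryReduce.
    	ReduceStats clearOneReduceEntry( std::vector< ReduceStats >& perThread )
    	{
    		// Use the zeroth thread's entry as the handle and fold the other
    		// threads' partials into it: secondaryReduce.
    		ReduceStats& handle = perThread[ 0 ];
    		for ( std::size_t i = 1; i < perThread.size(); ++i )
    			handle.secondaryReduce( perThread[ i ] );

    		// reduceNodes: with MPI running, the contents of 'handle' would be
    		// exchanged between nodes with MPI_Allgather, and tertiaryReduce
    		// would fold in each received block. Single-node case shown here.

    		// assignResult then hands the finished reduction to the digest
    		// function on the originating object (or, in the doSync case,
    		// writes fieldDimension directly).
    		return handle;
    	}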
    
    The current code is grotty on several fronts:
    	- The findMatchingReduceEntry stuff could be fixed using a more 
    		sensible indexing.
    	- Should use direct ptr assignment for fieldDimension within 
    		assignResult.
    	- I probably don't need to pass in both the FieldElement and its parent.
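
    A sketch of the direct pointer assignment proposed above: instead of the
    ack / Field::set round trip, assignResult would write the reduced maximum
    straight into the field Element. ElementStandIn and the setFieldDimension
    signature are assumptions for this note.

    	// Stand-in for the relevant part of an Element; illustrative only.
    	struct ElementStandIn
    	{
    		unsigned int fieldDimension_;
    		void setFieldDimension( unsigned int n ) { fieldDimension_ = n; }
    	};

    	// Since the reduced value is identical on all nodes after
    	// MPI_Allgather, the same assignment happens on every node.
    	void assignFieldDimensionDirect( ElementStandIn* fieldElm, unsigned int maxIndex )
    	{
    		fieldElm->setFieldDimension( maxIndex );
    	}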
    
    When executing reduce operations from messages, this is what happens:
    - ReduceFinfo.send launches the request. No args.
    	- The ReduceMsg::exec function (which is called per thread):
    		- This extracts the fid, which points to a getOpFunc.
    		- It creates the derived ReduceBase object.
    		- It adds the ReduceBase object into the ReduceQ
    			- indexed by thread# so no data overwrites.
    		- It scans through all targets on current thread and uses the
    			derived virtual function for ReduceBase::primaryReduce
    			on each.
    	- Now we go again to the scheduling system to clear ReduceQ.
    		- in Barrier 3, we call Clock::checkProcState
    		- this calls Qinfo::clearReduceQ
    		- This marches through each thread on each reduceQ entry
    			- Uses the ReduceBase entry from the zeroth thread
    				as a handle.
    			- Calls ReduceBase::secondaryReduce on each entry for 
    				each thread.
    			- This is done in an ugly way using 
    				findMatchingReduceEntry, could be cleaned up. 
    		- Calls ReduceBase::reduceNodes
    			- If MPI is running this does an instantaneous
    				MPI_Allgather with the contents of the 
    				ReduceBase data.
    			- Does ReduceBase::tertiaryReduce on the received data
    			- returns isDataHere.
    
    		- If reduceNodes returns true, calls ReduceBase::assignResult
    			on the originating Element.
    			- This calls the digestReduce function, which is
    			what was given to the ReduceFinfo when it was created.
    			- This does whatever field assignments are needed,
    			internally to the originating element which asked for
    			the field values.
    	Note that there are no acks; it all happens in Barrier 3.
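
    Finally, a sketch of the message-driven mode from the originating object's
    point of view, re-using the illustrative ReduceStats class from above. The
    class PopulationStats and its members are hypothetical; the digest function
    simply follows the digestFunc shape given in the setup section, and needs no
    acks because it runs in Barrier 3 after reduceNodes has made the result
    identical on every node.

    	#include <cstddef>

    	class Eref;  // placeholder declaration; the real Eref is a MOOSE type.

    	// Hypothetical originating object for the message-driven mode. The
    	// digest function handed to the ReduceFinfo at creation time stores
    	// the finished statistics back into the object.
    	class PopulationStats
    	{
    	public:
    		PopulationStats() : meanVm_( 0.0 ), varVm_( 0.0 ), numCells_( 0 ) {}

    		// Matches the digestFunc shape from the setup section:
    		//   void ( T::*digestFunc )( const Eref& er, const R* arg )
    		void digestVmStats( const Eref& er, const ReduceStats* arg )
    		{
    			meanVm_ = arg->mean();
    			varVm_ = arg->variance();
    			numCells_ = arg->count();
    		}

    	private:
    		double meanVm_;
    		double varVm_;
    		std::size_t numCells_;
    	};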