Ivan Martinez authored
* first version of openmp threading back end

* adding openmp parallel sort implementation

* OpenMP sort working

* Support for units syntax within state block.

* Add soma-less cable cell to test cells.

Also:
* Ensure intrinsic and passive properties properly set on test cells.

* Change bulk resistivity default.

* Align defaults with values used in most of the NEURON
  validation scripts.
* Use consistent 100 Ω·m bulk resistivity across both
  NEURON test models and basic validation cells.

* OpenMP back end working

* Add Extrae+Paraver support; compilation warnings still need to be fixed

* Reorganize validation data generation

* Move generation and data to top-level validation directory.
* Make BUILD_VALIDATION_DATA and VALIDATION_DATA_DIR cache vars.
* Add helper CMake functions for data generation.

Note `validation/ref/numeric/foo.sh` is just a placeholder.

* Bugfix: hh_soma.jl

* Use consistent scaling for y[1] scalar voltage in hh_soma.jl
* Also: add more reserved target names to CMakeLists.txt
  helper function.

* Refactor convergence tests; add numeric soma ref.

* Amend data directory path in validation tests.
* Turn `hh_soma.jl` into a module.
* Add HH channel reference data generation script.
* Switch `validate_soma.cpp` to numeric reference data.
* Consolidate common code in `validate_ball_and_stick.cpp`
* Add (nearly) Rallpack1 validation test (see below).
* Gentle failure on absence of reference data in
  `validate_ball_and_stick.cpp`

Can't yet override mechanism default parameter values,
so the cable cell model added to `test_common_cells.hpp`
lets the default stand; validation script will have
to use the default membrane conductance rather than that
given by Rallpack1.

* Add Rallpack1 validation, plus bugfix, clean

* Implement Rallpack1 validation test (with a workaround
  for inability to set membrane conductance).
* Fix bug in L≠1 case in PassiveCable.jl (this may still be
  wrong).
* Fix bug in peak delta computation in trace analysis when
  both traces have no local maxima.
* Gentle failure on missing `numeric_soma.json`
* Allow multiple `-s` selection operations for `tsplot`,
  acting disjunctively.

* Remove errant test file.

* File cleanup

* Remove tabs

* Use correct routine in numeric_rallpack1.jl x0.3

* Configure-time test for julia

* `math::infinity<>()` wrapper for infinity

* Use name `i_e` for Stim current density

* Use `math::infinity<>()` for infinite value

* Adds unit tests for the STATE block.

* Add "lib" to search prefixes for libtbb

* Fix quoting error in library search.
* Add "lib" to prefixes when system is "Linux".

* Address deprecated use of 'symbol' warning.

Julia 0.5 deprecates `symbol` in favour of
`Symbol`. This patch just substitutes the
correct call.

* Addresses PR comments.

* Unit tests for math.hpp

* Tests for `math::pi`, `math::lerp`, `math::area_frustrum`
  and `math::volume_frustrum`
* Fix `math::pi<long double>()`.

* Extend range, view functionality.

* New `filter` view: lazily selects based on predicate.
* Generic `front` and `back` for sequences.
* New rangeutil STL wrappers `stable_sort_by`, `all_of`, `any_of`.
* Consolidate common utility unit testing structures into
  `tests/unit/common.hpp`

* Add `ball_and_squiggle` model; fix `ball_and_taper`.

* Make `test_common_cells.hpp` and `ball_and_taper.py` agree.
* Add `ball_and_squiggle` model that has a tapering undulating
  profile.

* Address PR#46 review comments.

* Add documentation of template parameters for `filter_iterator`.
* Document use of `uninitalized<F>` for holding functional objects
  in `filter_iterator` and `transform_iterator`

* Consolidate validation test code (issue #41)

* Simplify trace analysis and reporting code in
  `trace_analysis.hpp`
* Consolidate convergence test run procedures into
  new class `convergence_test_runner`.

* New compartment info structure for FVM.

* Make `algorithm::sum`, `algorithm::mean` more generic,
  allowing use with array types.
* Add `div_compartment` compartment representation, that
  holds geometric information for each half of a compartment
  that will then be used in calculating control volumes.
* Add three compartmentalisation schemes/policies that
  discretize a segment into `div_compartment` objects:
    * `div_compartment_by_ends` divides based only on the
      segment end points and radii.
    * `div_compartment_sampler` forms frusta by sampling
      the segment radius at each compartment boundary.
    * `div_compartment_integrator` computes the compartment
      areas and volumes exactly by summing all frusta
      in the intersection of the segment and the compartment
      span.

* Extrae linked at execution time

* Clean up project

* Complex compartments

* Use divided compartments to determine FVM coefficients.
* Pick correct control volume in FVM from segment position (avoids
  off-by-half error).
* Add colour override functionality to tsplot: `--colour` option.
* Add const accessor for cell soma.
* Source formatting, comments in `math.hpp`
* Fix `range_view`: was using incorrectly named type trait.
* Add unit test for `range_view`.
* Allow points of discontinuity to be omitted from L-infinity norm
  calculations.
* Add `-d, --min-dt` option to `validate.exe` to control time
  step in validation convergence tests.
* Add validation test: confirm divided compartment policy does
  not affect results on simple frustum dendrites.
* Change default max compartments on validation tests to 100
  (ad hoc observed convergence limit at dt circa 0.001 ms;
  finer spatial division would require much finer dt).
* Make NEURON validation data generation scripts use CVODE by
  default, and with `secondorder=2` when non-zero `dt` is given.

* Remove division policy type parameter.

* Use only `div_compartment_integrator` for compartmentalization in
  `fvm_multicell`. The policy will later be moved to a backend
  policy class.
* For now, disable validation tests that test different division
  policies (see above).
* Tweak comments and remove redundant `using`, following comments
  on PR#54.

* Minor tweaks and corrections
0ded25a6

NestMC Prototype

This is the repository for the NestMC prototype code. Unfortunately we do not yet have thorough documentation or how-to guides. Below are some guides for building the project and running the miniapp. Contact us or submit a ticket if you have any questions or want help. https://github.com/eth-cscs/nestmc-proto

  1. Basic installation
  2. MPI
  3. TBB
  4. TBB on Cray systems
  5. Targeting KNL
  6. Examples of environment configuration
    • Julia

Basic installation

# clone repository
git clone git@github.com:eth-cscs/nestmc-proto.git
cd nestmc-proto/

# setup environment
# on a desktop system this is probably not required
# on a cluster this is usually required to make sure that an appropriate
# compiler is chosen.
module load gcc
module load cmake
export CC=`which gcc`
export CXX=`which g++`

# build main project (out-of-tree)
mkdir build
cd build
cmake <path to CMakeLists.txt>
make -j

# test
cd tests
./test.exe

MPI

Set the WITH_MPI option either via the ccmake interface, or via the command line as shown below. To ensure that CMake detects MPI correctly, you should specify the MPI wrapper for the compiler by setting the CXX and CC environment variables.

export CXX=mpicxx
export CC=mpicc
cmake <path to CMakeLists.txt> -DWITH_MPI=ON

TBB

Support for multi-threading requires Intel Threading Building Blocks (TBB). When TBB is installed, it comes with some scripts that can be run to set up the user environment. The scripts set the TBB_ROOT environment variable, which is used by the CMake configuration to find TBB.

source <path to TBB installation>/tbbvars.sh
cmake <path to CMakeLists.txt> -DWITH_TBB=ON

TBB on Cray systems

To compile with TBB on Cray systems, load the intel module, which will automatically configure the environment. The guide below shows how to use the version of TBB that is installed as part of the Intel compiler toolchain. It is recommended that you install the most recent version of TBB yourself, and link against this, because older versions of TBB don't work with recent versions of GCC.

# load the gnu environment for compiling the application
module load PrgEnv-gnu
# gcc 5.x does not work with the version of TBB installed on Cray
# requires at least version 4.4 of TBB
module swap gcc/4.9.3
# load the intel programming module
# on Cray systems this automatically sets `TBB_ROOT` environment variable
module load intel
module load cmake
export CXX=`which CC`
export CC=`which cc`

# multithreading only
cmake <path to CMakeLists.txt> -DWITH_TBB=ON -DSYSTEM_CRAY=ON

# multithreading and MPI
cmake <path to CMakeLists.txt> -DWITH_TBB=ON -DWITH_MPI=ON -DSYSTEM_CRAY=ON

Targeting KNL

build modparser without KNL environment

The source to source compiler "modparser" that generates the C++/CUDA kernels for the ion channels and synapses is in a separate repository. By default it will be built with the same compiler and flags that are used to build the miniapp and tests.

This can cause problems if we are cross compiling, e.g. for KNL, because the modparser compiler might not be runnable on the compilation node. You are probably best off building the software twice: once without KNL support to create the modcc parser, and then the KNL version using the previously compiled executable.

Modparser requires a C++11 compiler, and has been tested with GCC, Intel, and Clang compilers.

  • if the default compiler on your system is an old version of gcc, you might need to load a module or set the CC and CXX environment variables.

CMake will look for the source to source compiler executable, modcc, in the PATH environment variable, and will use the version it finds instead of building its own. So add the g++-compiled modcc to your path, e.g.:

# First build a 'normal' non KNL version of the software

# Load your environment (see section 6 for detailed example)
export CC=`which gcc`; export CXX=`which g++`

# make the build directory, configure and build
mkdir build
cd build
cmake <path to CMakeLists.txt> -DCMAKE_BUILD_TYPE=release
make -j8

# set path and test that you can see modcc
export PATH=`pwd`/bin:$PATH
which modcc

set up environment

  • source the Intel compiler environment
  • source the TBB environment variables (a sketch of a typical setup is shown below)
  • I have only tested with the latest stable version downloaded from the TBB website, not the version that sometimes comes installed with the Intel compilers.
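The exact commands depend on where the Intel compilers and TBB live on your system; the following is only a sketch, assuming a typical Intel installation under /opt/intel and a separately downloaded TBB release:

# hypothetical paths; adjust to your own Intel and TBB installations
source /opt/intel/bin/compilervars.sh intel64
source <path to TBB installation>/tbbvars.sh intel64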

build miniapp

# clone the repository and set up the submodules
git clone https://github.com/eth-cscs/nestmc-proto.git
cd nestmc-proto

# make a path for out of source build
mkdir build_knl
cd build_knl

# run cmake with all the magic flags
export CC=`which icc`
export CXX=`which icpc`
cmake <path to CMakeLists.txt> -DCMAKE_BUILD_TYPE=release -DWITH_TBB=ON -DWITH_PROFILING=ON -DVECTORIZE_TARGET=KNL -DUSE_OPTIMIZED_KERNELS=ON
make -j

The flags passed to cmake are described below:

  • -DCMAKE_BUILD_TYPE=release : build in release mode with -O3.
  • -DWITH_TBB=ON : use TBB for threading on multi-core
  • -DWITH_PROFILING=ON : use internal profilers that print a profiling report at the end of the run
  • -DVECTORIZE_TARGET=KNL : generate AVX512 instructions; alternatively you can use (see the example after this list):
    • AVX2 for Haswell & Broadwell
    • AVX for Sandy Bridge and Ivy Bridge
  • -DUSE_OPTIMIZED_KERNELS=ON : tell the source to source compiler to generate optimized kernels that use Intel extensions
    • without this, vectorized code will not be generated
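For example, to build for a Haswell or Broadwell node instead of KNL, only the vectorization target changes; this is just a sketch reusing the same options shown above:

# same configuration as above, but targeting AVX2 (Haswell/Broadwell) instead of KNL
cmake <path to CMakeLists.txt> -DCMAKE_BUILD_TYPE=release -DWITH_TBB=ON -DWITH_PROFILING=ON -DVECTORIZE_TARGET=AVX2 -DUSE_OPTIMIZED_KERNELS=ON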

run tests

Run some unit tests

cd tests
./test.exe
cd ..

run miniapp

The miniapp is the target for benchmarking. First, we can run a small problem to check the build. For the small test run, the parameters have the following meaning:

  • -n 1000 : 1000 cells
  • -s 200 : 200 synapses per cell
  • -t 20 : simulated for 20ms
  • -p 0 : no file output of voltage traces

The number of cells is the number of discrete tasks that are distributed to the threads in each large time-integration period. The number of synapses per cell determines the amount of computational work per cell/task. Realistic cells have anywhere from 1,000 to 10,000 synapses per cell.

cd miniapp

# a small run to check that everything works
./miniapp.exe -n 1000 -s 200 -t 20 -p 0

# a larger run for generating meaningful benchmarks
./miniapp.exe -n 2000 -s 2000 -t 100 -p 0

This generates the following profiler output (some reformatting to make the table work):

              ---------------------------------------
             |       small       |       large       |
             | time (s)      %   | time (s)      %   |
-------------|-------------------|-------------------|
total        |  0.791     100.0  | 38.593     100.0  |
  stepping   |  0.738      93.3  | 36.978      95.8  |
    matrix   |  0.406      51.3  |  6.034      15.6  |
      solve  |  0.308      38.9  |  4.534      11.7  |
      setup  |  0.082      10.4  |  1.260       3.3  |
      other  |  0.016       2.0  |  0.240       0.6  |
    state    |  0.194      24.5  | 23.235      60.2  |
      expsyn |  0.158      20.0  | 22.679      58.8  |
      hh     |  0.014       1.7  |  0.215       0.6  |
      pas    |  0.003       0.4  |  0.053       0.1  |
      other  |  0.019       2.4  |  0.287       0.7  |
    current  |  0.107      13.5  |  7.106      18.4  |
      expsyn |  0.047       5.9  |  6.118      15.9  |
      pas    |  0.028       3.5  |  0.476       1.2  |
      hh     |  0.006       0.7  |  0.096       0.2  |
      other  |  0.026       3.3  |  0.415       1.1  |
    events   |  0.005       0.6  |  0.125       0.3  |
    sampling |  0.003       0.4  |  0.051       0.1  |
    other    |  0.024       3.0  |  0.428       1.1  |
  other      |  0.053       6.7  |  1.614       4.2  |
-----------------------------------------------------

Examples of environment configuration

Julia (HBP PCP system)

module load cmake
module load intel-ics
module load openmpi_ics/2.0.0
module load gcc/6.1.0