  1. Mar 15, 2017
    • Simple `net_receive` device kernels (#193) · 9df68703
      Sam Yates authored
      Fixes #183
      
      Use a device kernel for net_receive state updates.
      Note: very naive, but gives about a 30% speed up on the 1000 cell miniapp test. All the fun optimization will end up under issue #184.
      
      This also incorporates PR #192, so this PR will be amended if that one is rejected.
    • Allow explicit make of GPU mechanisms. (#192) · 195befaa
      Sam Yates authored
      When NMC_WITH_CUDA is off, still allow the generation
      of CUDA mechanisms with modcc by exposing the
      `build_all_gpu_mods` target; this can be performed
      independently of the presence (or otherwise) of a
      CUDA development environment.
      
      This eases development of GPU-related modcc tasks,
      as preliminary work can be performed and checked on
      machines without a CUDA environment.
    • Base report generation `makefile` on `latexmk`. (#191) · c8428889
      Sam Yates authored
      Using latexmk simplifies the makefile structure and avoids many of the problems with redundant pdflatex invocation.
      
      Rewrite docs/model/makefile and docs/model/images/makefile to use latexmk for building and cleaning.
      Remove generated report.bbl from repo.
  2. Mar 10, 2017
  3. Mar 09, 2017
    • Unify matrix assembly+solve in backends (#179) · e9b6fc1c
      Ben Cumming authored and Sam Yates committed
      The storage of matrix data, and the operations on matrices (i.e. matrix assembly and matrix solution), have not been implemented in a consistent manner.
      
      The main problem was that matrix assembly was managed by a `matrix_assembler` type provided by the back end, which had views on information that it required to perform assembly. Specifically, the views were on properties like `face_conductance`, model state like `voltage` and `current`, and on the underlying matrix storage `d`, `u` and `p`.
      
      This was not a good solution because 
        * there was a hidden dependence of the assembly on model data. e.g. if the voltage storage was reallocated, the reference in the `matrix_assembler` would become stale.
        * if we want back end specific optimizations that require a different data layout to that used elsewhere in the back end, this layout should be shared with the solver, but there is no obvious mechanism for doing that.
      
      This patch addresses this by adding a `matrix_state` type to the back end, which:
        * stores the matrix state opaquely, allowing back-end specific optimizations
        * provides interface for performing operations on the state, namely `assemble`, `solve` and `get_solution`.
        * stores fields such as `face_conductance` and `cv_capacitance` that were stored in the `fvm_multicell`, despite being used only in matrix assembly.
        * takes `voltage` and `current` as parameters to the `assemble` interface, removing the hidden reference to model state.
      
      The actual data layout has not been changed in this PR. Instead, the interface has been refactored and hidden references removed so that it is now possible to implement back-end specific optimizations cleanly.
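
      The interface described above can be sketched as follows. This is a minimal illustration, not the code from the PR: the constructor arguments, the simplified assembly formula, and the assumption that CVs form a tree with `p[i] < i` (so a Hines-style elimination applies) are all hypothetical. What it does show is the key point of the refactoring: `voltage` and `current` are passed to `assemble` as parameters, so the matrix state keeps no hidden references to model state.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical, simplified stand-in for a back-end `matrix_state`.
// CVs form a tree: p[i] is the parent of CV i, with p[0] == 0 and p[i] < i.
class matrix_state {
public:
    matrix_state(std::vector<std::size_t> p,
                 std::vector<double> cv_capacitance,
                 std::vector<double> face_conductance):
        p_(std::move(p)),
        cv_capacitance_(std::move(cv_capacitance)),
        face_conductance_(std::move(face_conductance)),
        d_(p_.size()), u_(p_.size()), rhs_(p_.size())
    {
        assert(cv_capacitance_.size() == p_.size());
        assert(face_conductance_.size() == p_.size());
    }

    // Build the system from state passed in explicitly; no references are kept.
    // The formula used here is an illustrative assumption, not the PR's scheme.
    void assemble(double dt, const std::vector<double>& voltage,
                  const std::vector<double>& current) {
        const std::size_t n = p_.size();
        assert(voltage.size() == n && current.size() == n);
        for (std::size_t i = 0; i < n; ++i) {
            const double c_dt = cv_capacitance_[i] / dt;
            d_[i] = c_dt;
            u_[i] = 0;
            rhs_[i] = c_dt * voltage[i] - current[i];
        }
        // face_conductance_[i] couples CV i to its parent (entry 0 is unused).
        for (std::size_t i = 1; i < n; ++i) {
            const double g = face_conductance_[i];
            u_[i] = -g;
            d_[i] += g;
            d_[p_[i]] += g;
        }
    }

    // Leaves-to-root elimination, then root-to-leaves substitution.
    void solve() {
        const std::size_t n = p_.size();
        if (n == 0) return;
        for (std::size_t i = n - 1; i > 0; --i) {
            const double factor = u_[i] / d_[i];
            d_[p_[i]] -= factor * u_[i];
            rhs_[p_[i]] -= factor * rhs_[i];
        }
        rhs_[0] /= d_[0];
        for (std::size_t i = 1; i < n; ++i) {
            rhs_[i] = (rhs_[i] - u_[i] * rhs_[p_[i]]) / d_[i];
        }
    }

    // After solve(), the right-hand side storage holds the solution.
    const std::vector<double>& get_solution() const { return rhs_; }

private:
    std::vector<std::size_t> p_;
    std::vector<double> cv_capacitance_;
    std::vector<double> face_conductance_;
    std::vector<double> d_;    // diagonal
    std::vector<double> u_;    // off-diagonal entry linking CV i to p[i]
    std::vector<double> rhs_;  // right-hand side, overwritten by the solution
};
```

      Because the storage is private, a back end would be free to change the layout of `d_`, `u_` and `rhs_` without touching callers.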
    • Morphologies in miniapp (#178) · f91d0b3c
      Sam Yates authored
      Fix morphology section ctor bug.
      Add morphology pools for miniapp from which morphologies are drawn in the recipe.
      Add command-line options to expose above.
      Add option --report-compartments to (slowly) check the min, mean, and max number of compartments in the generated cells across the simulation.
      Use morphology::add_section to do all the heavy lifting; no need to make section_geometry objects by hand, unless you really, really want to.
    • Add flat morphology representation to `nestmc` lib. (#176) · 88dfb499
      Sam Yates authored
      This PR is a prelude to closer integration of the random morphology generation with the miniapp, with the first step being support for recipes that create cells from morphologies generated off-line. It aims to use nest::mc::morphology as the flat morphology-only representation that can be used to construct nest::mc::cell objects and which can exist as a target for SWC conversion and random morphology generation.
      
      * Simplify swc io implementation:
          * Avoid throwing exceptions in istream parsing and swc_record constructors; only throw when explicitly checking consistency, or when parsing a full sequence of records.
          * Allow direct access to record members.
          * Separate parsing considerations from canonicalization (renumbering, sorting) of a sequence of records.
      * Move lmorpho morphology classes into src/
      * Add invariant check procedure for morphology.
      * Make cells via swc -> morphology -> cell building, rather than direct swc -> cell.
      * Allow option to use 'natural' discretization in morphology to determine number of compartments in built cell.
      * Update test_swcio.cpp to accommodate new API.
  4. Mar 08, 2017
    • remove small cudaMemcpys (#175) · d9839df0
      Ben Cumming authored and Sam Yates committed
      This patch removes the many small `cudaMemcpy` calls for single values, except for those from calling `net_receive` in event delivery.
      
      The small copies during initialization were from when the upper diagonal and time invariant component of the diagonal were computed on the host. There were many small reads/writes to device memory accessing the `p` and `u` vectors.
      
      * Remove many small device copies in matrix setup by copying required data to host, computing, and then copying back in one copy.
      * Add `constexpr` test `is_debug_mode()` for having been compiled in debug mode (tests `NDEBUG`).
      * Only perform `is_physical_solution` test if `is_debug_mode()` is true. (The `is_physical_solution` test triggers a single copy from device to host on each time step to test whether the voltage has exceeded some "reasonable" physical bounds.)
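
      A minimal sketch of the debug-mode gate described above. Only the names `is_debug_mode` and `is_physical_solution` come from the commit message; the bounds (plus or minus 1000 mV), the host-side vector, and the `step_is_acceptable` helper are assumptions made for illustration.

```cpp
#include <cassert>  // assert() is controlled by the same NDEBUG macro
#include <vector>

// True when compiled without NDEBUG, i.e. in a debug build.
constexpr bool is_debug_mode() {
#ifndef NDEBUG
    return true;
#else
    return false;
#endif
}

// Hypothetical bounds check; the +/-1000 mV limit is an illustrative assumption.
// Written so that NaN values are also rejected.
inline bool is_physical_solution(const std::vector<double>& voltage) {
    for (double v : voltage) {
        if (!(v >= -1000.0 && v <= 1000.0)) return false;
    }
    return true;
}

// Hypothetical per-step hook: in release builds the check is skipped entirely,
// which on a GPU back end would also avoid the device-to-host copy it needs.
inline bool step_is_acceptable(const std::vector<double>& voltage) {
    return !is_debug_mode() || is_physical_solution(voltage);
}
```

      Since `is_debug_mode()` is `constexpr`, a compiler can drop the check entirely in release builds.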
    • Use native cuda atomicAdd on Pascal (#174) · 0e0bcd8f
      Ben Cumming authored and Sam Yates committed
      Fixes #125
      
      * Add `cuda_atomic_add` and `cuda_atomic_sub` wrappers for atomic addition.
      * Choose native atomic add for Pascal and later architectures.
      * Choose CAS workaround for devices earlier than Pascal.
      * Add unit test for wrappers.
      * Change default CUDA architecture target to `sm_60` in `CMakeLists.txt`.
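
      The compare-and-swap (CAS) workaround mentioned above follows a general pattern: read the old value, compute the new one, and retry until no other thread has changed the value in between. The sketch below shows that pattern in standard C++ with `std::atomic<double>` as a host-side analogue. It is not the CUDA device code from the PR, and the function names are hypothetical.

```cpp
#include <atomic>
#include <cassert>

// Host-side analogue of a CAS-based atomic add (hypothetical name).
// Returns the value held before the addition, as CUDA's atomicAdd does.
inline double atomic_add_cas(std::atomic<double>& target, double value) {
    double expected = target.load();
    // On failure, compare_exchange_weak reloads `expected` with the current
    // value, so each retry recomputes the sum from fresh data.
    while (!target.compare_exchange_weak(expected, expected + value)) {
    }
    return expected;
}

// Subtraction expressed as addition of the negated value.
inline double atomic_sub_cas(std::atomic<double>& target, double value) {
    return atomic_add_cas(target, -value);
}
```

      On devices with a native double precision `atomicAdd`, the retry loop is unnecessary, which is why the wrappers select the implementation by architecture.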
  5. Mar 07, 2017
    • Fix compilation with Clang 3.9 (#172) · 2ca1d47f
      Vasileios Karakasis authored and Sam Yates committed
      * Add missing `<string>` header to `modcc/msparse.hpp`
    • move spike detection to back end (#167) · 86384267
      Ben Cumming authored and Sam Yates committed
      Moves spike detection to the back end, which is required to have fast spike detection on the gpu.
      
      Fixes #106.
      
      Backend:
      * Move spike detection into the backend.
      * Add a `threshold_watcher` type to each backend that is initialized with a reference to the field that it is to watch, along with indexes and thresholds for each variable in the field that is being watched. This class presents a simple host-side interface:
         * `test(t)` tests for crossings at time `t`.
         * `crossings()` returns all threshold crossings.
         * `clear_crossings()` resets (clears) the collected crossing events.
      * Implement the multicore back end detector directly from the original code.
      * Implement a `gpu_stack` for use with the gpu back end, that lets threads in a kernel conditionally push back into a flat array.
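
      A host-side sketch of the `threshold_watcher` interface listed above. The three member functions `test`, `crossings` and `clear_crossings` are named in the commit message; the `crossing` struct, the constructor signature, the restriction to upward crossings, and the linear interpolation of the crossing time are assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// One detected crossing: which watched entry, and the estimated time.
struct crossing {
    std::size_t index;
    double time;
};

// Hypothetical host-side watcher over a field of values (e.g. voltages).
// The watcher holds a reference, so `field` must outlive it.
class threshold_watcher {
public:
    threshold_watcher(const std::vector<double>& field,
                      std::vector<std::size_t> indexes,
                      std::vector<double> thresholds,
                      double t0):
        field_(field),
        indexes_(std::move(indexes)),
        thresholds_(std::move(thresholds)),
        t_prev_(t0)
    {
        assert(indexes_.size() == thresholds_.size());
        for (std::size_t k = 0; k < indexes_.size(); ++k) {
            const double v = field_[indexes_[k]];
            v_prev_.push_back(v);
            is_above_.push_back(v >= thresholds_[k]);
        }
    }

    // Test for upward crossings since the previous call (or since t0).
    void test(double t) {
        for (std::size_t k = 0; k < indexes_.size(); ++k) {
            const double v = field_[indexes_[k]];
            if (!is_above_[k] && v >= thresholds_[k]) {
                // Linearly interpolate the crossing time between the samples.
                const double frac = (thresholds_[k] - v_prev_[k]) / (v - v_prev_[k]);
                crossings_.push_back({k, t_prev_ + frac * (t - t_prev_)});
                is_above_[k] = true;
            }
            else if (is_above_[k] && v < thresholds_[k]) {
                is_above_[k] = false;
            }
            v_prev_[k] = v;
        }
        t_prev_ = t;
    }

    const std::vector<crossing>& crossings() const { return crossings_; }
    void clear_crossings() { crossings_.clear(); }

private:
    const std::vector<double>& field_;
    std::vector<std::size_t> indexes_;
    std::vector<double> thresholds_;
    double t_prev_;
    std::vector<double> v_prev_;
    std::vector<bool> is_above_;
    std::vector<crossing> crossings_;
};
```

      A GPU version would run the per-entry test in a kernel and push results into something like the `gpu_stack` mentioned above, since the number of crossings per step is not known in advance.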
      
      Miniapp:
      * Run a single-step dummy run of model before starting the profiler, when profiling is enabled.
      * Initialize the spike output callback functions _after_ the dummy run so that spikes from the dummy step are not output.
      
      `cell_group`:
      * Pass responsibility for spike detection to the lowered cells (`fvm_multicell`) and associated back ends.
      
      `memory`:
      * Add a new allocator for CUDA managed memory.
      * Implement `managed_ptr` and `make_managed_ptr`, which are managed memory equivalents of `std::unique_ptr` and `std::make_unique`.
      
      Tests:
      * Improve host-side spike detection unit tests.
      * Add device-side spike detection unit tests.
      * Add unit tests for `gpu_stack`.
      
      Building:
      * Suppress CMP0023 CMake warning.
    • Bugfixes for `lmorpho` (#170) · 216ae6ed
      Sam Yates authored
      * Ignore dendrite branches with negative radii arising from correlated child diameter distribution.
      * Fix fencepost errors in morphology discretization.
      * Rename `tip.p` to `tip.point`.
  6. Mar 06, 2017
    • Morphology generation with L-systems (#162) · 7f9288fb
      Sam Yates authored
      Adds a stand-alone program for the generation of random morphologies from an L-system description. The algorithm is that of Burke (1992), with some of the extensions provided by Ascoli et al. (2001).
      
      Two sets of L-system parameters have been included, corresponding to alpha motoneurons and Purkinje cells, but there is certainly something wrong with the data for the latter, and more correct numbers will probably need to be synthesized from existing Purkinje cell morphological information.
      
      Documentation for `lmorpho` is incomplete, but the command line help (`--help`) goes some way to explain the usage. In order to get output, one must specify `--swc` or `--pvec` (or both) to emit SWC files or the structural parent vectors. Coarser discretization can be obtained with the `--segment` option.
      
      Some minor modifications have been included in other parts of the source repo:
      * Added copy constructor for `TextBuffer` in `modcc/textbuffer.hpp`, required to keep clang++ 3.5 happy.
      * Disabled 'maybe-uninitialized' warnings from gcc, as these get raised improperly with some uses of `util::optional`. (Seems that this is also an issue for `boost::optional`.)
      * Moved `tinyopt.hpp` into the `src/` directory, and extended the functionality a bit to support keyword arguments.
      * `validate.cpp` modified slightly to accommodate new `tinyopt`.
    • Merge pull request #165 from bcumming/bug/modcc · ab7db0d5
      w-klijn authored
      fix #164
    • Fix issue #164 · 7197a976
      Benjamin Cumming authored
      Disambiguate e symbol in statements like the following
        for (auto& e: e->terms())
      This caused GCC 5 to give an error.
    • Merge github.com:eth-cscs/nestmc-proto · 0d1e0c00
      Benjamin Cumming authored
  7. Mar 05, 2017
    • dependencies in docs/model/makefile to building images (#149) · 15041be0
      Alexander Peyser authored
      Build images.dir when building report.pdf
      Add outputs to .gitignore
    • Add linear kinetic schemes to modcc. (#145) · 5846f90b
      Sam Yates authored
      Incorporate symbolic GE code from prototype (with some simplifications) in msparse.hpp, symge.hpp and symge.cpp, together with unit tests.
      
      Add two kinetic scheme test cases for validation: test_kin1 (simple exponential scheme) and test_kinlva (combination of exponential gate and a three-species kinetic scheme, modelling a low voltage-activated Calcium channel from Wang, X. J. et al., J. Neurophys. 1991).
      
      Adapt numeric HH validation data generation to LVA Ca channel, with explicit stopping at stimulus discontinuities.
      
      Add two new validation tests based on above: kinetic.kin1_numeric_ref and kinetic.kinlva_numeric_ref (multicore backend only).
      
      Introduce a BlockRewriterBase visitor base class, as an aid for visitors that transform/rewrite procedure bodies; refactor KineticRewriter over this class.
      
      Introduce common error_stack mixin class for common functionality across Module and the various procedure rewriters.
      
      Implement visitors and public-facing convenience wrappers in symdiff.hpp and symdiff.cpp:

      * involves_identifer for testing if an expression contains given identifiers.
      * constant_simplify for constant folding with removal of trivial terms arising from a NumberExpression of zero or one.
      * expr_value to extract the numerical value of a NumberExpression, or NaN otherwise.
      * is_zero to test if an expression is numerically zero.
      * symbolic_pdiff to perform symbolic partial differentiation; this adds a new (not parseable) expression subclass to represent opaque partial differential terms.
      * substitute to substitute identifiers for other expressions within an expression.
      * linear_test for linearity, diagonality and homogeneity testing (this is probably redundant, given ExpressionClassifier already exists).

      Simplify unnecessary uses of make_unique with Visitor subclasses.
      
      Make SOLVE statement rewriting more generic, through the use of solve-rewriter visitors CnexpSolverVisitor, SparseSolverVisitor, and DirectSolverVisitor; implementations in solvers.hpp and solvers.cpp. Supports multiple SOLVE statements for independent subsets of state variables within the BREAKPOINT block.
      
      Add block rewriter for the removal of unused local variables, with convenience wrapper remove_unused_locals.
      
      Generalize is_in utility in modccutil.hpp.
      
      Simplify expression comparison in modcc unit tests with EXPECT_EXPR_EQ macro added to tests/modcc/test.hpp, that operates by comparing expression text representations.
      
      Simplify and consolidate verbose printing in modcc unit tests with verbose_print function that tests the global verbose flag and handles expression_ptr and similar which have to_string methods.
  8. Feb 21, 2017
  9. Feb 20, 2017
    • Fix pointer to view conversion for optimized intel kernels (#153) · 000ee422
      Ben Cumming authored and Sam Yates committed
      Bug: `modcc` was generating invalid code when generating optimized kernels.
      
      The optimized kernels use raw pointers instead of views, and the generated code was using view semantics.
      
      * Use appropriate `memory::copy` invocation for the optimized kernel case.
    • Add dry run feature (#151) · 61d6b21d
      Ben Cumming authored
      Add a dry run mode, inspired by the dry run mode implemented in NEST. A dry run
      of a model simulates running a large distributed model by running only the work
      of one of the ranks, with artificial spike input from the other "dummy" ranks.
      
      This is implemented as a new global communication back end, dryrun_global_policy,
      the implementation of which is straightforward:

      * a new implementation of gather_spikes that takes the local spikes and
        replicates them n times, where n is the total number of simulated
        ranks.
      * the global_policy::size() method returns the number of ranks in the
        simulated run.
      * the new back end has to store some state that records the number
        of simulated ranks and cells per rank, which are set using the new
        global_policy::set_sizes() method.

      Some CMake modifications were required:

      * make the selection of the global communication backend have the same
        interface as that for selecting the threading back end.
      * small improvements to the selection of the threading back end to
        make the cthread option visible in ccmake, and have consistent
        CMake variable naming.

      Command line options were also extended:

      * a --dry-run-size or -D option can be used to supply the number
        of dry run ranks on the command line.
      * the miniapp driver was updated to set the dry run size and cell
        count via the new global_policy::set_sizes() interface.
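
      The spike replication performed by the dry run `gather_spikes` can be sketched as below. The `spike` and `dry_run_state` types are hypothetical stand-ins, and offsetting each replicated spike's source id by `rank * cells_per_rank` is an assumption about how the dummy ranks are distinguished; the commit message only states that the local spikes are replicated n times.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical minimal spike record.
struct spike {
    std::size_t source_gid;
    double time;
};

// Hypothetical stand-in for the state set via global_policy::set_sizes().
struct dry_run_state {
    std::size_t num_ranks;
    std::size_t cells_per_rank;
};

// Replicate the local spikes once per simulated rank.
// Assumption: each copy has its source id shifted into that rank's gid range.
inline std::vector<spike> gather_spikes(const std::vector<spike>& local,
                                        const dry_run_state& state) {
    std::vector<spike> global;
    global.reserve(local.size() * state.num_ranks);
    for (std::size_t rank = 0; rank < state.num_ranks; ++rank) {
        for (spike s : local) {
            s.source_gid += rank * state.cells_per_rank;
            global.push_back(s);
        }
    }
    return global;
}
```

      With three simulated ranks of four cells each, a local spike from cell 3 appears in the gathered list with source ids 3, 7 and 11.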
       
  10. Feb 08, 2017
  11. Feb 07, 2017
  12. Feb 01, 2017
    • Threading pool (#144) · 6c98c1fc
      Alexander Peyser authored and Sam Yates committed
      Add threading pool built on `std::thread`
      
      * Provide new threading model 'cthread' for nestmc based on a pool of `std::thread` objects.
      * Unify duplicated timer class provided by `serial`, `omp` and now `cthread` threading models.
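
      For readers unfamiliar with the idea, a pool of `std::thread` workers draining a shared task queue can be sketched as below. This is a generic illustration of the technique, not the `cthread` implementation from the PR; the class name, the `submit` interface and the use of `std::future` are all assumptions.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <functional>
#include <future>
#include <memory>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical minimal thread pool: fixed workers, one shared FIFO queue.
class thread_pool {
public:
    explicit thread_pool(std::size_t num_threads) {
        assert(num_threads > 0);
        for (std::size_t i = 0; i < num_threads; ++i) {
            workers_.emplace_back([this] { run(); });
        }
    }

    thread_pool(const thread_pool&) = delete;
    thread_pool& operator=(const thread_pool&) = delete;

    // Signals shutdown, lets the workers drain the queue, then joins them.
    ~thread_pool() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            stopping_ = true;
        }
        ready_.notify_all();
        for (auto& w : workers_) w.join();
    }

    // Queue a task; the returned future becomes ready when it has run.
    std::future<void> submit(std::function<void()> task) {
        auto job = std::make_shared<std::packaged_task<void()>>(std::move(task));
        std::future<void> done = job->get_future();
        {
            std::lock_guard<std::mutex> lock(mutex_);
            tasks_.push([job] { (*job)(); });
        }
        ready_.notify_one();
        return done;
    }

private:
    // Worker loop: sleep until there is work or the pool is shutting down.
    void run() {
        for (;;) {
            std::function<void()> task;
            {
                std::unique_lock<std::mutex> lock(mutex_);
                ready_.wait(lock, [this] { return stopping_ || !tasks_.empty(); });
                if (tasks_.empty()) return;  // stopping, and nothing left to do
                task = std::move(tasks_.front());
                tasks_.pop();
            }
            task();
        }
    }

    std::mutex mutex_;
    std::condition_variable ready_;
    std::queue<std::function<void()>> tasks_;
    bool stopping_ = false;
    std::vector<std::thread> workers_;
};
```

      Reusing a fixed set of threads avoids paying thread creation cost for every parallel region.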
  13. Jan 21, 2017
  14. Jan 12, 2017
    • CMake fixes (#137) · 71aa4b18
      John Biddiscombe authored
      * Fix CMakeLists to handle build as a subproject
      
      When several CMake generated projects are built together, it is common
      practice to have a 'superproject' CMakeLists that uses
        add_subdirectory(proj1)
        add_subdirectory(proj2)
        ...
      where each subproject is a self contained CMake based project
      (Example proj1=HPX, proj2=nestmc, proj3=another, ...)

      CMAKE_SOURCE_DIR always points to the top level directory which
      is the superproject dir in this case, whereas PROJECT_SOURCE_DIR
      always points to the root of the current project() in the CMakeLists
      so one should use PROJECT_SOURCE_DIR as this gets the relative paths
      correct.
      
      * Add option to turn off auto generation from *.mod files
      
      * Fix #134 : Change CMake WITH_OPTION to NMC_WITH_OPTION, compiler #define to NMC_HAVE_OPTION
      
      1) The user may select an option by saying NMC_WITH_XXX
      
      2) This may trigger CMake to use find_package(...) or set up some
      other variables. CMake can then set variable NMC_HAVE_XXX and add a
      compiler #define for what has actually been used.
      
      3) Code should use #ifdef NMC_HAVE_XXX to check for a feature
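
      On the code side, step 3 might look like the sketch below. The `NMC_HAVE_TBB` and `NMC_HAVE_OMP` macros are taken from the renaming table in this commit message; the helper function itself is hypothetical and exists only to show the feature test.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: report which threading back end was compiled in,
// based on the NMC_HAVE_* definitions that the build system provides.
inline std::string threading_backend_name() {
#if defined(NMC_HAVE_TBB)
    return "tbb";
#elif defined(NMC_HAVE_OMP)
    return "omp";
#else
    return "serial";
#endif
}
```

      The point of the split is that code tests `NMC_HAVE_XXX` (what the build actually found and enabled), never `NMC_WITH_XXX` (what the user asked for).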
      
      Old CMake/define      New CMake                 Compiler #define
      ----------------      ---------                 ----------------
      THREADING_MODEL       NMC_THREADING_MODEL
          WITH_TBB          NMC_WITH_TBB              NMC_HAVE_TBB
          WITH_OMP          NMC_WITH_OMP              NMC_HAVE_OMP
          WITH_SERIAL       NMC_WITH_SERIAL           NMC_HAVE_SERIAL
      
      WITH_MPI              NMC_WITH_MPI              NMC_HAVE_MPI
      WITH_CUDA             NMC_WITH_CUDA             NMC_HAVE_CUDA
      WITH_GPU                                        NMC_HAVE_GPU
      WITH_ASSERTIONS       NMC_WITH_ASSERTIONS       NMC_HAVE_ASSERTIONS
      WITH_TRACE            NMC_WITH_TRACE            NMC_HAVE_TRACE
      WITH_PROFILING        NMC_WITH_PROFILING        NMC_HAVE_PROFILING
      
      Other user visible CMake vars
      -----------------------------
      VECTORIZE_TARGET            -> NMC_VECTORIZE_TARGET
      USE_OPTIMIZED_KERNELS       -> NMC_USE_OPTIMIZED_KERNELS
      BUILD_VALIDATION_DATA       -> NMC_BUILD_VALIDATION_DATA
      BUILD_JULIA_VALIDATION_DATA -> NMC_BUILD_JULIA_VALIDATION_DATA
      BUILD_NRN_VALIDATION_DATA   -> NMC_BUILD_NRN_VALIDATION_DATA
      VALIDATION_DATA_DIR         -> NMC_VALIDATION_DATA_DIR
      
      Variables such as NMC_THREADING_MODEL and NMC_VECTORIZE_TARGET now use
      enumerated CMake values, so you can toggle between them in the ccmake GUI.
      SYSTEM_TYPE_CRAY/BGQ        -> NMC_SYSTEM_TYPE (Generic/Cray/BGQ)
      
      * Use generator expression for modcc path
      
      Some IDEs (Xcode, for example) override the CMake binary paths
      and add /Debug or /Release etc., so rules that have hard-coded paths
      to binaries will fail.
    • Fix Clang 3.9 compilation (#138) · d2c05fb0
      Vasileios Karakasis authored
  15. Dec 22, 2016
  16. Dec 21, 2016
  17. Dec 20, 2016
    • generalised stimulus (#126) · 8b30c273
      Ben Cumming authored and Sam Yates committed
      feature: #67 (GPU Support)
      
      Implement stimuli as part of the mechanism framework, as described here: #87 (comment)

      * A hand-written stimulus point process derived from `mechanisms::mechanism` was written for each back end.
      * The lowered `fvm_multicell` type explicitly constructs a stimulus mechanism if there are any stimuli attached to its cells.
      * This mechanism is added to the other mechanisms in the lowered cell, so that the update of current is performed in the current update loop (i.e. via the `nrn_current()` method).

      This isn't an ideal solution: we still have a hard-coded stimulus type in the lowered cell. However, the stimulus is now "in the right spot", and we can refine this when we work on a better design for generalised mechanisms (i.e. when we have figured out what we are going to do).
      
      fixes #104.
    • Feature/mechanisms unit tests (#96) · fc7e2785
      Vasileios Karakasis authored and Sam Yates committed
      These tests are intended to test the sanity of the `modcc` generated code for the individual mechanisms. They don't have any physical background. Potentially optimized CPU-targeted mechanisms generated in the build are compared with unoptimized mechanisms generated from the reference modules.
      
      * Add generic unit tests for individual mechanisms.
      * Make unit tests exercise potential problems with aliased indexes (point processes).
      * Ensure unit tests correspond to multiple low level vector operations.
      * Ensure unit tests run with voltage, current and indices initialized with varying values.
      * Refactor CMake code for module compilation to reduce cut-and-paste code and build complexity.
  18. Dec 19, 2016
  19. Dec 13, 2016
    • Re-instate ball-and-taper validation tests. (#124) · 1963634e
      Sam Yates authored
      Fixes #85
    • Fix modcc precedence parsing bug (#127) · dfb32094
      Sam Yates authored
      * Modify `parse_expression` to take a controlling (parent) precedence.
      * `parse_expression` folds left over sequences of sub-expressions with decreasing operator precedence (accumulates in `lhs`).
      * Use recursion rather than accumulator for left fold in `parse_binop` to simplify code logic.
      * Extend parser unit test to cover more complicated, multi-level expression.
      * Remove (now) redundant parentheses from derivative check block in kinetic rewriter test.
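
      The idea of passing a controlling (parent) precedence into `parse_expression` can be illustrated with a toy parser. The sketch below is not modcc's parser: it works on strings rather than expression objects, the operator table is an assumption, and it returns a fully parenthesized string so that the grouping is visible. It folds left into `lhs` while operators bind more tightly than the parent, and recurses for each right-hand side.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical toy parser over identifiers, numbers, parentheses and + - * / ^.
struct toy_parser {
    std::string src;
    std::size_t pos = 0;

    // Assumed operator table; 0 means "not a binary operator".
    static int precedence(char op) {
        switch (op) {
        case '+': case '-': return 1;
        case '*': case '/': return 2;
        case '^':           return 3;
        default:            return 0;
        }
    }
    static bool right_associative(char op) { return op == '^'; }

    // Skip whitespace and return the next character ('\0' at end of input).
    char peek() {
        while (pos < src.size() && std::isspace(static_cast<unsigned char>(src[pos]))) ++pos;
        return pos < src.size() ? src[pos] : '\0';
    }

    std::string parse_primary() {
        if (peek() == '(') {
            ++pos;
            std::string inner = parse_expression(0);
            if (peek() == ')') ++pos;
            return inner;
        }
        std::string token;
        while (pos < src.size() && std::isalnum(static_cast<unsigned char>(src[pos]))) {
            token += src[pos++];
        }
        return token;
    }

    // Consume only operators that bind more tightly than parent_precedence.
    std::string parse_expression(int parent_precedence) {
        std::string lhs = parse_primary();
        for (;;) {
            const char op = peek();
            const int prec = precedence(op);
            if (prec == 0 || prec <= parent_precedence) return lhs;
            ++pos;
            // A right-associative operator lets an equal-precedence operator
            // continue on its right-hand side.
            std::string rhs = parse_expression(right_associative(op) ? prec - 1 : prec);
            lhs = "(" + lhs + " " + std::string(1, op) + " " + rhs + ")";
        }
    }
};

inline std::string parenthesize(const std::string& text) {
    toy_parser p;
    p.src = text;
    return p.parse_expression(0);
}
```

      For example, `parenthesize("a + b * c - d")` yields `((a + (b * c)) - d)`: the `*` is consumed inside the recursive call for the right-hand side of `+`, and the trailing `-` returns control to the outer fold.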
      
      Fixes #94
    • Bug/issue#20 (#123) · fbe3f45a
      Ben Cumming authored and Sam Yates committed
      This PR addresses two issues that were closely related:
      * correctly accounting for the current contribution of density mechanisms to CVs at branch points, where the density mechanism is not present on all branches. This was discussed in issue #20.
      * adding support for weighting of current densities calculated from density mechanisms. This is required to weight the current contribution to CVs in issue #20.
      
      ## small updates
      
      * update CMake rules for finding libunwind, because it broke for some reason.
      * add `binary_find` and unit tests to the algorithms library
          * returns an iterator, as opposed to `std::binary_search`, which returns a boolean.
          * works with ranges.
      * added `subrange_view` specialization that takes a subrange specified by a pair of indexes
      * added `assign_from` to range utils
          * a helper function that returns a proxy type that can be copied into a container
          * evaluate a range and store contents in a container, with minimal verbosity in user code
          * simple syntax for initializing a container where it is declared, e.g. `std::vector<int> vec = util::assign_from(...);`
      * update the LaTeX documentation for the FVM scheme
      * fix bug when an event has to be delivered exactly 0 ms in the future in `cell_group`
          * avoid divide by zero on the diagonal of the linear system in the new formulation.
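
      The `binary_find` helper mentioned in the list above can be sketched with `std::lower_bound`. This is an illustrative version under assumed signatures, not the implementation added to the algorithms library; in particular the container overload stands in for the range support described in the commit message.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <vector>

// Binary search over a sorted sequence that returns an iterator to the
// matching element, or `last` if no element compares equal to `value`.
template <typename It, typename T>
It binary_find(It first, It last, const T& value) {
    It pos = std::lower_bound(first, last, value);
    return (pos != last && !(value < *pos)) ? pos : last;
}

// Convenience overload for containers and other begin/end ranges.
template <typename Range, typename T>
auto binary_find(Range& range, const T& value) -> decltype(std::begin(range)) {
    return binary_find(std::begin(range), std::end(range), value);
}
```

      Unlike `std::binary_search`, the result can be used directly to read or update the element that was found.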
      
      ## updated FVM formulation
      
      Most of the changes were in the `fvm_multicell` type.
      * The FVM formulation was changed slightly, moving parameters (e.g. dividing both sides of the equation by dt)
          * to ensure the symmetric positive definite matrix property with the new partial weights
          * to give the terms in the linear system, i.e. the matrix, solution and rhs vectors, more natural units
          * the system is now `G*v = i`, where `G` is the conductance matrix with units [uS], `v` is voltage [mV] and `i` is current [nA].
          * change names of fields from non-descriptive things like `face_alpha` to `face_conductance`
          * add more comments that explicitly give the expected units of fields inside the back end (to help future generations trying to understand the code... and to help me understand it three weeks from now)
      * keep additional information about the surface area of sub-control-volumes at the start and end of each segment when calculating CV areas, capacitances and face conductances
      * use this information to ensure that current contributions from density channels on branching points are properly accounted for by weighting
      * remove weighting from point process currents, because they are calculated with the correct units of nA
      * plumbing work to add support for user-supplied weights
          * the backend code for multicore and gpu now supports weights for mechanism generation
          * update `cprinter` and `cudaprinter` to generate kernels that use user-supplied weights for density channels
      * backend modifications to generate matrices and RHS vectors according to the new formulation
      
      fixes #20
      fixes #120