  1. Apr 21, 2017
    • Try more places for validation data; workaround CMake FindCUDA bug. (#233) · 5f85bd7d
      Sam Yates authored
      Fixes #232.
      
      * Try `NMC_DATADIR` environment variable for validation data path, or else if the #defined `NMC_DATADIR` does not point to a directory, try `./validation/data` and `../validation/data`.
      * Don't define `NMC_DATADIR` if the CMake version is 3.7 or 3.8.
      * Extend C++17 filesystem emulation with (POSIX) implementation of `is_directory` and supporting classes, functions and enums `filesystem_error`, `file_status`, `status(..)`, `file_type` and `perms`.
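
      The lookup order described above can be sketched as follows. This is a minimal illustration, not the committed code: the namespace `sketch` and the helper `find_data_dir` are hypothetical names, the `compiled_default` argument stands in for the #defined `NMC_DATADIR`, and `is_directory` is a plain POSIX `stat()` wrapper rather than the project's filesystem emulation. It also assumes the environment variable, when set, is used without further checks.

```cpp
#include <cassert>
#include <cstdlib>
#include <string>
#include <vector>
#include <sys/stat.h>  // POSIX stat() and S_ISDIR

namespace sketch {

// POSIX stand-in for C++17 std::filesystem::is_directory.
inline bool is_directory(const std::string& path) {
    struct stat st;
    return ::stat(path.c_str(), &st) == 0 && S_ISDIR(st.st_mode);
}

// Hypothetical helper: choose a validation data directory.
// `compiled_default` plays the role of the #defined NMC_DATADIR and may be
// empty when the macro was not defined at build time.
inline std::string find_data_dir(const std::string& compiled_default) {
    // 1. An explicit environment setting takes priority (assumption: unchecked).
    const char* env = std::getenv("NMC_DATADIR");
    if (env && *env) return env;

    // 2. Otherwise take the first candidate that is an existing directory.
    std::vector<std::string> candidates;
    if (!compiled_default.empty()) candidates.push_back(compiled_default);
    candidates.push_back("./validation/data");
    candidates.push_back("../validation/data");
    for (const auto& c : candidates) {
        if (is_directory(c)) return c;
    }
    return std::string();  // nothing usable found
}

} // namespace sketch
```

      An empty return value signals that no candidate was usable, leaving the caller to decide whether to skip validation or report an error.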
  2. Apr 18, 2017
    • Fail gracefully if git missing or broken. (#235) · 2b1ba7b7
      Sam Yates authored
      Fixes #234.
      
      * Adds dummy `DOWNLOAD_COMMAND` to `ExternalProject_Add` invocation.
      * More thorough status/warning messages regarding git submodule update process.
    • Add power meter and refactor meter interfaces. · 99a0b1c8
      Ben Cumming authored and Sam Yates committed
      Fixes #190.
      
      The final piece in the metering features.
      
      * Add a `power_meter`, which currently records the energy used on each node of Cray XC{30,40,50} systems, all of which have a built-in `pm_counters` interface to power measurement.
      * Add information about which node each MPI rank runs on to the metering output in `meters.json`, which is needed to analyse energy recordings, which are per node, not per MPI rank.
      * Refactor collation of measurements: now the responsibility of the meter manager.
      * Add support for `gather` with `std::string` to the global communication policy, which required a back end MPI implementation and corresponding unit test.
      * Add `src/util/config.hpp`, which populates the `nest::mc::config` namespace with `constexpr bool` flags describing system or environment capabilities.
  3. Apr 13, 2017
    • Remove generic event interface on cell_group · a0640a11
      Ben Cumming authored and Sam Yates committed
      `cell_group` had a template method `cell_group::enqueue_events()`, which was parameterized on the type of container used to pass the set of events to enqueue. This PR removes the template, and makes `time_type` a globally defined type in `common_types.hpp`.
      
      The `time_type` that permeated the code is taken from `spike`, which is itself a specialized type alias of `basic_spike`. This is not an intuitive location to define the `time_type`, and hides the fact that as implemented it was effectively a global typedef.
      
        * Define the default time type in `common_types.hpp`: `using time_type = float`.
        * Use this global `time_type` in the definition of `spike` and `postsynaptic_spike_event`.
        * Replace generic `cell_group::enqueue_events` method with concrete `cell_group::enqueue_events(const std::vector<post_synaptic_event>&)`.
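
      A simplified sketch of the resulting type relationships follows. The struct layouts are stand-ins chosen for illustration (assumptions, not the project's definitions), the event element type uses the `postsynaptic_spike_event` name from the list above, and `num_pending` is a hypothetical accessor added only so the example can be checked.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <type_traits>
#include <vector>

// The single, globally visible time type (as stated in the commit message).
using time_type = float;

// Simplified stand-in for the cell member identifier.
struct cell_member_type {
    std::uint32_t gid;
    std::uint32_t index;
};

// Generic spike, parameterized on source and time types.
template <typename Source, typename Time>
struct basic_spike {
    using time_type = Time;
    Source source;
    Time time;
};

// The common spike type now takes its time type from the global alias.
using spike = basic_spike<cell_member_type, time_type>;
static_assert(std::is_same<spike::time_type, time_type>::value,
              "spike uses the global time_type");

// Simplified stand-in for the event delivered to a cell.
struct postsynaptic_spike_event {
    cell_member_type target;
    time_type time;
    float weight;
};

class cell_group {
public:
    // Concrete signature: no template parameter for the container type.
    void enqueue_events(const std::vector<postsynaptic_spike_event>& events) {
        pending_.insert(pending_.end(), events.begin(), events.end());
    }

    // Hypothetical accessor, for illustration only.
    std::size_t num_pending() const { return pending_.size(); }

private:
    std::vector<postsynaptic_spike_event> pending_;
};
```

      With a concrete signature, `enqueue_events` can later be declared `virtual` in an abstract interface, which a member function template cannot be.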
    • Simplify cell_group interface · 2e57545c
      Ben Cumming authored and Sam Yates committed
      Preparatory work for abstract cell group interface.
      
      * Remove `cell_group` public member functions that are not needed as part of the interface: `clear_events`, `remove_samplers`, `probe` and `reset_samplers`.
      * Remove value_type from model, and declare explicitly that samplers receive values that are doubles.
  4. Apr 12, 2017
    • Move string-printf utility from lmorpho into util. (#228) · e2f48e19
      Sam Yates authored
      Move string-printf utility from lmorpho into util.
      
      * Add `util::strprintf` printf-alike function that returns a `std::string` result.
      * Include simple adaptors for `std::string` and standard smart pointer arguments for `util::strprintf` (i.e. `std::string` arguments can be used with `%s`, smart pointers with `%p`).
      * Add unit tests to suit.
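
      One way such a function could be put together is sketched below. It is not the committed implementation: the namespace `sketch` and the adaptor name `sprintf_arg` are assumptions made for this example, and the two-pass `std::snprintf` sizing is simply one common technique.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <memory>
#include <string>
#include <vector>

namespace sketch {

// Adaptors: most arguments pass through unchanged...
template <typename T>
T sprintf_arg(T value) { return value; }

// ...std::string becomes a C string, suitable for %s...
inline const char* sprintf_arg(const std::string& s) { return s.c_str(); }

// ...and standard smart pointers become raw pointers, suitable for %p.
template <typename T>
const void* sprintf_arg(const std::unique_ptr<T>& p) { return p.get(); }

template <typename T>
const void* sprintf_arg(const std::shared_ptr<T>& p) { return p.get(); }

// printf-alike that returns its result as a std::string.
template <typename... Args>
std::string strprintf(const char* fmt, const Args&... args) {
    // First pass: ask snprintf how many characters the result needs.
    const int n = std::snprintf(nullptr, 0, fmt, sprintf_arg(args)...);
    if (n < 0) return std::string();  // encoding error

    // Second pass: format into a buffer with room for the terminating NUL.
    std::vector<char> buf(static_cast<std::size_t>(n) + 1);
    std::snprintf(buf.data(), buf.size(), fmt, sprintf_arg(args)...);
    return std::string(buf.data(), static_cast<std::size_t>(n));
}

} // namespace sketch
```

      For example, `sketch::strprintf("%d-%s", 42, std::string("abc"))` yields `"42-abc"`. As with `printf` itself, the format string is not checked against the argument types.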
    • Move event_binner out of cell_group.hpp (#229) · 6f6a2db6
      Ben Cumming authored and Sam Yates committed
      * Move `event_binner` class to own header file `event_binner.hpp` with implementation in `event_binner.cpp`.
      * Move `event_binner` unit tests to own source file `test_event_binner.cpp`.
    • Initial microbenchmark builds. (#227) · c579fa19
      Sam Yates authored
      * Use git submodule for incorporating Google benchmark library.
      * Add one microbenchmark for comparing `util::transform_view` performance.
      
      Note that the microbenchmarks are not built by default; they can be built with `make ubenches`, and then run individually. The microbenchmarks will be built in `tests/ubench/`, relative to the build directory.
  5. Apr 07, 2017
    • Feature/comm tests (#201) · 0443c271
      Ben Cumming authored
      Add unit tests for communicator
      
      fixes #200
      
      * Update global_communication test driver to initialize correctly in dry run mode.
      * Add unit tests that test the `communication::global_policy`:
        * basic initialization;
        * global spike exchange (just the spike gather step, not the event delivery).
      * Improve the formatting of the reporting from the MPI GTest wrapper to make it easy to see if tests have failed.
    • Updated Readme (#226) · 5b297b29
      Vasileios Karakasis authored and Sam Yates committed
      * Update `README.md` to reflect the CMake options for vectorization.
    • Fixes for the Intel compiler version 16 and 17 (#223) · aa36e317
      Vasileios Karakasis authored and Sam Yates committed
      Proposed patch to master branch:
      * Qualify template in `indirect_view` to accommodate incorrect function template specialization determination (icpc ignores §14.8.2.4/9 rule on lvalue versus rvalue template arguments).
      * Use parenthesis constructors for type parameter in `pointer_proxy`, as icpc has not adopted the corrected behaviour for DR#1467 [http://www.open-std.org/jtc1/sc22/wg21/docs/cwg_defects.html#1467].
  6. Apr 06, 2017
    • Fix issue #224 test assertion failure. (#225) · 4c5bf302
      Sam Yates authored
      `cell_tree::depth_from_root()` was incorrectly traversing only
      the first branch from the tree root, leading to uninitialized
      values in the returned depth array.
      
      This leads to an incorrect maximum leaf node in `find_minimum_root()`,
      which then returns `no_parent`. This gets passed to
      `tree::change_root(size_t)`.
      
      * Correct `cell_tree::depth_from_root()` implementation.
      * Re-enable curiously disabled test case in `cell_tree.from_parent_index`
      * Add unit tests for `cell_tree::depth_from_root`.
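
      To illustrate what a complete depth computation has to cover, here is a minimal sketch over a parent-index representation. It is not the `cell_tree` implementation: it assumes node 0 is the root and that every other node has a smaller-numbered parent, and the free function below is a hypothetical stand-in.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Depth of every node from the root, given parent[i] = parent of node i.
// Assumptions: node 0 is the root and parent[i] < i for all i > 0, so a
// parent's depth is always known before its children are visited.
inline std::vector<std::size_t> depth_from_root(const std::vector<std::size_t>& parent) {
    std::vector<std::size_t> depth(parent.size(), 0);
    for (std::size_t i = 1; i < parent.size(); ++i) {
        assert(parent[i] < i);
        depth[i] = depth[parent[i]] + 1;
    }
    return depth;
}
```

      Every node is visited exactly once, so no entry of the returned array is left unset, whichever branch of the root the node hangs from.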
    • Add `cc-filter` line-based filtering script (#221) · 3867a6c4
      Sam Yates authored
      * Add perl program `cc-filter`, a general purpose by-line text filter with built-in default rules for filtering text containing C++ types and expressions.
      * Add documentation for the tool to the scripts `README.md` file.
      * Add demonstration table `filters/massif-strip-cxx` for using `cc-filter` with valgrind massif output.
  7. Apr 05, 2017
  8. Apr 04, 2017
    • Feature/time meter (#219) · 829df8d6
      Ben Cumming authored
      Make `meter_manager` correctly detect the first checkpoint, which is necessary to ensure that all the timers are synchronized.
    • Metering support with time meter (#217) · 54f47392
      Ben Cumming authored
        * An abstract `meter` class that defines interface for taking a reading, and returning the meter results as a json object.
        * A `time_meter` implementation of the `meter` that measures wall time.
        * To generate metering reports with global information, the global communication policy interfaces were extended to support `gather` and `barrier` operations. These are trivial for serial and dry run policies, and wrap the appropriate MPI calls for the MPI policy.
        * a `meter_manager` type that stores a list of meters was created
          * will also have memory and power meters soon.
        * a meter manager was added to the miniapp and now records startup, model initialization, time stepping and final file io times.
  9. Apr 03, 2017
    • Bug fix: crash on indirect test (#218) · 5f9d4020
      Sam Yates authored
      * Bug fix: crash on indirect test
      
      * Simplify indirect overloads, add nomove/nocopy tests
    • Implementation of asynchronous integration API. (#210) · 4def111a
      Sam Yates authored
      * Stage events for next integration interval on lowered cell.
      * Use explicit binning for event coalescence.
      * Extend `event_queue` to allow checking top of queue against arbitrary predicates.
      * Add `--bin-dt` and `--bin-regular` options to miniapp (disable binning with `--bin-dt 0`).
      * Tidy up miniapp option settings class.
      
      Integration in lowered cell over multiple steps is deferred until samplers can be set up with back-end polling.
      
      Asynchronous integration itself is not yet implemented.
  10. Mar 31, 2017
    • block-interleaved gpu matrix solver (#208) · 15230c69
      Ben Cumming authored and Sam Yates committed
      Fixes #185.
      
      Add a new back end GPU Hines matrix solver that uses a block-interleaved storage pattern to improve memory coalescing during the matrix solve.
      
        * Refactor the `src/backends` path into `src/backends/gpu` and `src/backends/multicore` paths that contain `gpu` and `multicore` implementations.
        * Refactor the matrix state and threshold detection members that were declared inline in the back end specifications to separate files.
        * Add a new interleaved matrix state back end.
        * Refactor all of the GPU kernels that were originally in the one back end header file into their own header files.
        * Write more comprehensive unit tests for the GPU matrix solver back end to test the `interleave` and `reverse_interleave` operations in isolation, as well as ensure that the flat and interleaved back ends produce identical results.
        * Add the GPU versions of the kinetic scheme validation tests.
    • Add more general indirect access view. (#216) · ca328a21
      Sam Yates authored
      * Implement `indirect_view` for indexed access via `transform_view`.
      * Extend `transform_iterator` to permit non-const access to reference-returning functor results.
      * Replace use of `indexed_view` with `indirect_view`.
      * Fix missing cpu target for vectorized modcc outputs.
  11. Mar 29, 2017
  12. Mar 28, 2017
    • Bug fix for issue #196 (#211) · 1db73767
      Sam Yates authored
      Fixes #196.
      
      Correct treatment of missing coefficients in `cnexp` solver.
      
      * Extend `EXPECT_EXPR_EQ` functionality with wrapper that works with `Expression *` and `expression_ptr` arguments.
      * Replace string comparison checks in `test_symdiff.cpp` with equivalents that use `EXPECT_EXPR_EQ`.
      * Check explicitly for missing coefficient in `cnexp` solver, which should be treated equivalently to zero.
  13. Mar 23, 2017
    • Add `event_binner` class for explicit binning. (#204) · 48815fa6
      Sam Yates authored
      * Add class for managing state associated with binning event times across integration periods.
      * Include support for no or fixed 'regular' binning.
      * Add a gtest-assertion compatible test for comparing sequences of floating point numbers: `testing::seq_almost_eq` in `tests/unit/common.hpp`.
      * Rename `cell_` in `cell_group` to `lowered_`, to clarify intent (i.e. lowered cell state is very different from a `cell` object, and maintains state for many cells).
      * Reformat some comments for consistency.
      
      Note that the `event_binner` class is not used in this commit for actual binning: the original logic is still in place.
    • Remove openmp threading back end (#205) · 5a6f230e
      Ben Cumming authored and Sam Yates committed
      Fixes #152
      
      * Remove OpenMP back end implementation.
      * Remove CMake options for OpenMP.
      * Simplify the threading model selection code in CMake.
      * Remove Extrae benchmarking scripts, which require OpenMP.
  14. Mar 22, 2017
    • Debugging dryrun with 28k nodes (#160) · 393f2775
      Alexander Peyser authored
      Includes a fix to the readme which applies to any external modcc, and a way to keep from rebuilding locally an external modcc.
    • Tidy `event_queue` event class requirements. (#202) · ebf96678
      Sam Yates authored
      The requirements for event types for use in event_queue were restrictive and a bit 'special'.
      
        * Allow event times of any type which is well ordered by operator>.
        * Allow any event type with a public time member containing the time value.
        * Provide customization point event_time() via ADL for extracting the time value for event types that do not have a time member.
        * Simplify interface: push(begin, end) was only ever used in the unit test; add empty() const method.
        * Add unit test for more flexible functionality.
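
      The customization point can be illustrated with the sketch below. It is a simplified stand-in, not the project's `event_queue`: the namespaces `sketch` and `user`, the two event structs, and the `pop` method are assumptions made for this example, and `std::priority_queue` is used only to keep it short.

```cpp
#include <cassert>
#include <queue>
#include <vector>

namespace sketch {

// Default: any event type with a public `time` member.
template <typename Event>
auto event_time(const Event& ev) -> decltype(ev.time) { return ev.time; }

// Minimal queue that pops the event with the earliest time first.
template <typename Event>
class event_queue {
    struct later {
        bool operator()(const Event& a, const Event& b) const {
            // Unqualified call: overloads in the event's own namespace
            // are found by argument-dependent lookup (ADL).
            return event_time(a) > event_time(b);
        }
    };
    std::priority_queue<Event, std::vector<Event>, later> q_;

public:
    void push(const Event& e) { q_.push(e); }
    bool empty() const { return q_.empty(); }
    Event pop() { Event e = q_.top(); q_.pop(); return e; }
};

} // namespace sketch

namespace user {

// Works through the default: has a public `time` member.
struct simple_event { float time; int id; };

// No `time` member, so the type supplies its own event_time overload.
struct deadline { int id; long when_us; };
inline long event_time(const deadline& d) { return d.when_us; }

} // namespace user
```

      Only `operator>` is required of the time values here (`float` and `long` respectively), which matches the relaxed requirement in the first bullet.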
    • Tame `time_type` proliferation. (#203) · c83adfcf
      Sam Yates authored
      Differing classes had their own time_type and other classes were parameterized on it in ways that were compatible, but only by chance.
      
      With these changes, modifying the time type used in spike will propagate through to all dependent classes.
      
        * Rename generic spike<I, T> as basic_spike<I, T>. 
        * Use spike = basic_spike<cell_member_type, float> as the common spike type.
        * Replace instances of spike_type aliases with just spike.
        * time_type aliases are defined in terms of spike::time_type.
        * Remove time_type parameterization in connection.
        * Remove time_type parameterization in communicator.
        * Remove time_type parameterization in exporter classes.
  15. Mar 21, 2017
    • More robust NMC_NUM_THREADS parsing (#198) · 219b782f
      Ben Cumming authored and Sam Yates committed
      fixes #197
      
      * Add functions that can find thread affinity and the number of available cores on linux systems via `sched_getaffinity`.
      * On other systems they default to "unknown affinity" and return 0 to indicate that the number of cores is unknown.
      * Set the default number of threads according to the new functions above if no environment variable explicitly setting the number of threads is set.
      * Validate environment variable value against regex and range check; terminate if improper.
      * Terminate if no number of threads is provided and the library is unable to determine a sensible number automatically.
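
      The validation and fallback logic above might look roughly like the following. This is a sketch under stated assumptions: the function names, the regular expression, and the upper bound of 4096 are hypothetical, and errors are reported by throwing where the commit describes terminating. Core detection (via `sched_getaffinity` on Linux) is left to the caller, who passes 0 for "unknown".

```cpp
#include <cassert>
#include <cstddef>
#include <regex>
#include <stdexcept>
#include <string>

namespace sketch {

// Validate a thread-count string against a regex and a range check.
// The 9-digit limit keeps the value well inside the range of unsigned long.
inline std::size_t parse_num_threads(const std::string& text,
                                     std::size_t max_threads = 4096) {
    static const std::regex pattern(R"(\s*[0-9]{1,9}\s*)");
    if (!std::regex_match(text, pattern)) {
        throw std::runtime_error("thread count is not an integer: '" + text + "'");
    }
    const unsigned long n = std::stoul(text);
    if (n < 1 || n > max_threads) {
        throw std::runtime_error("thread count out of range: '" + text + "'");
    }
    return static_cast<std::size_t>(n);
}

// Pick the thread count: an explicit setting wins; otherwise fall back to the
// detected core count, where 0 means that detection was not possible.
inline std::size_t choose_num_threads(const char* env_value,
                                      std::size_t detected_cores) {
    if (env_value) return parse_num_threads(env_value);
    if (detected_cores == 0) {
        throw std::runtime_error("unable to determine a sensible number of threads");
    }
    return detected_cores;
}

} // namespace sketch
```

      A caller would typically pass `std::getenv("NMC_NUM_THREADS")` as the first argument.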
  16. Mar 20, 2017
    • modcc: AVX512 vectorisation backend (#154) · c55d9d66
      Vasileios Karakasis authored
      Basic features:
      
      * Compile with -t avx512
      * Automatically set up by CMake if USE_OPTIMIZED_KERNELS is on and VECTORIZE_TARGET is set to KNL 
      * Generic SIMD printer that contacts a SIMD backend for emitting the actual SIMD intrinsics
      
      Note: compilation for the avx512 target requires the Intel compiler. 
  17. Mar 15, 2017
    • Fix that ensures all spikes are saved to disk (#195) · f01e2147
      Ben Cumming authored and Sam Yates committed
      * Add an additional exchange at the end of the time stepping loop to ensure that any spikes still in spike buffers are exchanged and saved to disk.
      * Remove the parallel sort from the cthread back end to fix compilation errors with CUDA and cthreads.
      * Fix `nvcc` warning concerning initialization of `std::size_t` variable with literal `-1`.
    • Simple `net_receive` device kernels (#193) · 9df68703
      Sam Yates authored
      Fixes #183
      
      Use a device kernel for net_receive state updates.
      Note: very naive, but gives about a 30% speed up on the 1000 cell miniapp test. All the fun optimization will end up under issue #184.
      
      This also incorporates PR #192, so this PR will be amended if that one is rejected.
    • Allow explicit make of GPU mechanisms. (#192) · 195befaa
      Sam Yates authored
      When NMC_WITH_CUDA is off, still allow the generation
      of CUDA mechanisms with modcc by exposing the
      `build_all_gpu_mods` target; this can be performed
      independently of the presence (or otherwise) of a
      CUDA development environment.
      
      This eases development of GPU-related modcc tasks,
      as preliminary work can be performed and checked on
      machines without a CUDA environment.
    • Base report generation `makefile` on `latexmk`. (#191) · c8428889
      Sam Yates authored
      Using latexmk simplifies the makefile structure and avoids many of the problems with redundant pdflatex invocation.
      
      Rewrite docs/model/makefile and docs/model/images/makefile to use latexmk for building and cleaning.
      Remove generated report.bbl from repo.
  18. Mar 10, 2017
  19. Mar 09, 2017
    • Unify matrix assembly+solve in backends (#179) · e9b6fc1c
      Ben Cumming authored and Sam Yates committed
      The storage of matrix data, and the operations on matrices (i.e. matrix assembly and matrix solution), have not been implemented in a consistent manner.
      
      The main problem was that matrix assembly was managed by a `matrix_assembler` type provided by the back end, which had views on information that it required to perform assembly. Specifically, the views were on properties like `face_conductance`, model state like `voltage` and `current`, and on the underlying matrix storage `d`, `u` and `p`.
      
      This was not a good solution because:
        * there was a hidden dependence of the assembly on model data. e.g. if the voltage storage was reallocated, the reference in the `matrix_assembler` would become stale.
        * if we want back end specific optimizations that require a different data layout to that used elsewhere in the back end, this layout should be shared with the solver, but there is no obvious mechanism for doing that.
      
      This patch addresses this by adding a `matrix_state` type to the back end, which:
        * stores the matrix state opaquely, allowing back-end specific optimizations
        * provides interface for performing operations on the state, namely `assemble`, `solve` and `get_solution`.
        * stores fields such as `face_conductance` and `cv_capacitance` that were stored in the `fvm_multicell`, despite being used only in matrix assembly.
        * takes `voltage` and `current` as parameters to the `assemble` interface, removing the hidden reference to model state.
      
      The actual data layout has not been changed in this PR. Instead, the interface has been refactored and hidden references removed so that it is now possible to implement back-end specific optimizations cleanly.
    • Morphologies in miniapp (#178) · f91d0b3c
      Sam Yates authored
      * Fix morphology section ctor bug.
      * Add morphology pools for miniapp from which morphologies are drawn in the recipe.
      * Add command-line options to expose the above.
      * Add option `--report-compartments` to (slowly) check the min, mean, and max number of compartments in the generated cells across the simulation.
      * Use `morphology::add_section` to do all the heavy lifting; no need to make `section_geometry` objects by hand, unless you really, really want to.
    • Add flat morphology representation to `nestmc` lib. (#176) · 88dfb499
      Sam Yates authored
      This PR is a prelude to closer integration of the random morphology generation with the miniapp, with the first step being support for recipes that create cells from morphologies generated off-line. It aims to use nest::mc::morphology as the flat morphology-only representation that can be used to construct nest::mc::cell objects and which can exist as a target for SWC conversion and random morphology generation.
      
      * Simplify swc io implementation:
        * Avoid throwing exceptions in istream parsing and `swc_record` constructors; only throw when explicitly checking consistency, or when parsing a full sequence of records.
        * Allow direct access to record members.
        * Separate parsing considerations from canonicalization (renumbering, sorting) of a sequence of records.
      * Move lmorpho morphology classes into `src/`.
      * Add invariant check procedure for morphology.
      * Make cells via swc -> morphology -> cell building, rather than direct swc -> cell.
      * Allow option to use 'natural' discretization in morphology to determine number of compartments in built cell.
      * Update `test_swcio.cpp` to accommodate new API.
  20. Mar 08, 2017
    • remove small cudaMemcpys (#175) · d9839df0
      Ben Cumming authored and Sam Yates committed
      This patch removes the many small `cudaMemcpy` calls for single values, except for those from calling `net_receive` in event delivery.
      
      The small copies during initialization were from when the upper diagonal and time invariant component of the diagonal were computed on the host. There were many small reads/writes to device memory accessing the `p` and `u` vectors.
      
      * Remove many small device copies in matrix setup by copying required data to host, computing, and then copying back in one copy.
      * Add `constexpr` test `is_debug_mode()` for having been compiled in debug mode (tests `NDEBUG`).
      * Only perform `is_physical_solution` test if `is_debug_mode()` is true. (The `is_physical_solution` test triggers a single copy from device to host on each time step to test whether the voltage has exceeded some "reasonable" physical bounds.)
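
      A host-only sketch of the debug-mode gate is shown below. Only the name `is_debug_mode` and its dependence on `NDEBUG` come from the commit message; `voltages_in_bounds`, `step_is_acceptable`, and the ±1000 bounds are hypothetical stand-ins for the `is_physical_solution` check, which in the real code involves a device-to-host copy.

```cpp
#include <cassert>
#include <vector>

// True unless the translation unit was compiled with NDEBUG defined.
constexpr bool is_debug_mode() {
#ifdef NDEBUG
    return false;
#else
    return true;
#endif
}

// Hypothetical stand-in for the physical-solution test: every value must lie
// within [lo, hi]. A NaN fails both comparisons and is therefore rejected.
inline bool voltages_in_bounds(const std::vector<double>& v,
                               double lo = -1000.0, double hi = 1000.0) {
    for (double x : v) {
        if (!(x >= lo && x <= hi)) return false;
    }
    return true;
}

// The (potentially expensive) check runs only in debug builds; in release
// builds the condition is a compile-time constant and the call is skipped.
inline bool step_is_acceptable(const std::vector<double>& v) {
    return !is_debug_mode() || voltages_in_bounds(v);
}
```

      Because `is_debug_mode()` is `constexpr`, an optimizing compiler can drop the check entirely in release builds without any preprocessor guards at the call site.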
    • Use native cuda atomicAdd on Pascal (#174) · 0e0bcd8f
      Ben Cumming authored and Sam Yates committed
      Fixes #125
      
      * Add `cuda_atomic_add` and `cuda_atomic_sub` wrappers for atomic addition and subtraction.
      * Choose native atomic add for Pascal and later architectures.
      * Choose CAS workaround for devices earlier than Pascal.
      * Add unit test for wrappers.
      * Change default CUDA architecture target to `sm_60` in `CMakeLists.txt`.
  21. Mar 07, 2017