  1. Jun 25, 2018
    • Feature/lib install target part i (#506) · ad1c78ab
      Sam Yates authored and Benjamin Cumming committed
      CMake and build refactoring
      
      *   Use CUDA as first-class language (leading to CMake 3.9 minimum version requirement).
      
      *   Use 'modern CMake' interface libraries for compiler options, include file and library dependency tracking. Interface library targets:
          * `arbor-deps`: compiler options and library requirements for the `libarbor.a` static library, as governed by configure-time options and environment.
          * `arbor-private-headers`: include path for non-installed headers, as required by unit tests and arbor itself.
          * `arbor-aux`: helper classes and utilities used across tests and examples.
          * `ext-json`, `ext-tclap`, `ext-tbb`, `ext-benchmark`, `ext-sphinx_rtd_theme`: externally maintained software that we include (directly or via submodule) in the `ext/` subdirectory.
       
      *   Single static library `libarbor.a` includes all built-in modules and CUDA objects.
      
      *   Simplify configuration options:
          *  `ARB_WITH_TRACE`, `ARB_AUTORUN_MODCC_ON_CHA...
      ad1c78ab
  2. Jun 22, 2018
    • Benchmark cell type (#500) · 6ba39a92
      Benjamin Cumming authored
      Add a new cell type, and corresponding cell_group implementation, for benchmarking the simulator library architecture.
      
      Add a benchmark_cell_group, where each cell in the group:
      
      * generates a spike train prescribed by a time_seq;
      * takes a prescribed time interval per cell to perform the cell_group::advance method.
      
      With this cell type, one can easily build arbitrary networks with prescribed spiking and cell update overheads.
      A miniapp that uses this cell type to build a benchmark model is implemented in example/bench.
      
      Fixes #493
      Fixes #501
      6ba39a92
  3. Jun 07, 2018
    • profile multicore mechanism state and current calls individually (#492) · 5e65a939
      Benjamin Cumming authored
      The built-in profiler now generates separate timings for the state and current updates of individual multicore mechanisms.
      
      Modcc generates PE(advance_integrate_{state,current}_X) profiler calls, along with the corresponding PL() calls, around the multicore mechanism nrn_state and nrn_current API calls.
      
      No timings are made for the gpu back end, which is not properly supported by the current profiling tools.
      5e65a939
  4. Jun 04, 2018
    • Simd partition by constraint (#494) · 64171e43
      noraabiakar authored and Benjamin Cumming committed
      Changes have been made to the SIMD implementation of mechanism functions:
      
      - The node_index array (the array of indices that specifies, for each mechanism, the CVs where it is present) is now partitioned into 4 arrays according to the constraint on each simd_vector in node_index:
          1. contiguous array: contains the indices of all simd_vectors in node_index where the elements in simd_vector are contiguous
          2. constant array: contains the indices of all simd_vectors in node_index where the elements in simd_vector are identical
          3. independent array: contains the indices of all simd_vectors in node_index where the elements in simd_vector are independent (no repetitions) but not contiguous 
          4. none array: contains the indices of all simd_vectors in node_index where none of the above constraints apply
      
          When mechanism functions are executed, they loop over each of the 4 arrays separately. This allows for optimizations in every category. 
      
      - The modcc compiler was modified to generate code for the previous changes, including the optimizations per constraint:
          1. contiguous array: we use vector load/store and vector arithmetic. 
          2. constant array: we load only one element and broadcast it into a simd_vector; we use vector arithmetic; we reduce the result; we store one element.   
          3. independent array: we use vector scatter/gather and vector arithmetic. 
          4. none array: we cannot operate on the simd_vector in parallel, we loop over the elements to read, perform arithmetic and write back 
      
      - Added a mechanism benchmark for pas, hh and expsyn
      
      - Moved/modified some functions in simd.hpp to ensure that the correct implementation of a function is being called. 
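
The partitioning described above can be illustrated with a minimal sketch. It is not the code generated by modcc: the names `index_constraint_kind`, `classify` and `partition_by_constraint` are hypothetical, each partition stores the starting offset of a SIMD-width group, and `node_index` is assumed to be padded to a multiple of the SIMD width.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical names, for illustration only.
enum class index_constraint_kind { contiguous, constant, independent, none };

// Classify one group of `width` indices starting at idx.
inline index_constraint_kind classify(const int* idx, std::size_t width) {
    assert(width > 0);
    bool contiguous = true, constant = true, independent = true;
    for (std::size_t i = 1; i < width; ++i) {
        if (idx[i] != idx[i - 1] + 1) contiguous = false;
        if (idx[i] != idx[0]) constant = false;
        // Any repeated value rules out the independent case.
        for (std::size_t j = 0; j < i; ++j) {
            if (idx[i] == idx[j]) independent = false;
        }
    }
    if (contiguous) return index_constraint_kind::contiguous;
    if (constant) return index_constraint_kind::constant;
    if (independent) return index_constraint_kind::independent;
    return index_constraint_kind::none;
}

struct constraint_partition {
    // Each vector holds the starting offsets of the groups in that category.
    std::vector<std::size_t> contiguous, constant, independent, none;
};

inline constraint_partition partition_by_constraint(
    const std::vector<int>& node_index, std::size_t width)
{
    assert(width > 0 && node_index.size() % width == 0);
    constraint_partition p;
    for (std::size_t i = 0; i < node_index.size(); i += width) {
        switch (classify(node_index.data() + i, width)) {
        case index_constraint_kind::contiguous:  p.contiguous.push_back(i); break;
        case index_constraint_kind::constant:    p.constant.push_back(i); break;
        case index_constraint_kind::independent: p.independent.push_back(i); break;
        case index_constraint_kind::none:        p.none.push_back(i); break;
        }
    }
    return p;
}
```

A mechanism kernel would then loop over the four offset lists in turn, choosing plain vector loads, a broadcast, gather/scatter or a scalar loop respectively.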
      64171e43
    • generalize time sequences (#496) · 3082607f
      Benjamin Cumming authored
      Changes to libarbor
      -------------------------
      
      Time sequences were added in `src/time_sequence.hpp`:
      - added new `time_seq` type that implements a type-erasure interface for the
        concept of a time sequence generator.
      - added poisson, regular and vector-backed implementations of the time sequence
        concept.
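
As a rough illustration of a type-erased time sequence, the sketch below wraps any object providing `front()` and `pop()` behind a single value type. The interface shown (`front`, `pop`, the helper `take_until`, and the classes `regular_time_seq` and `vector_time_seq`) is an assumption made for this example and may differ from what `src/time_sequence.hpp` actually provides.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <memory>
#include <utility>
#include <vector>

// Type-erased wrapper with value semantics (illustrative interface).
class time_seq {
    struct iface {
        virtual double front() = 0;
        virtual void pop() = 0;
        virtual std::unique_ptr<iface> clone() const = 0;
        virtual ~iface() = default;
    };
    template <typename Impl>
    struct wrap: iface {
        Impl impl;
        explicit wrap(Impl i): impl(std::move(i)) {}
        double front() override { return impl.front(); }
        void pop() override { impl.pop(); }
        std::unique_ptr<iface> clone() const override {
            return std::unique_ptr<iface>(new wrap(impl));
        }
    };
    std::unique_ptr<iface> impl_;

public:
    template <typename Impl>
    time_seq(Impl impl): impl_(new wrap<Impl>(std::move(impl))) {}
    time_seq(const time_seq& other): impl_(other.impl_->clone()) {}
    time_seq(time_seq&&) = default;
    time_seq& operator=(const time_seq& other) {
        impl_ = other.impl_->clone();
        return *this;
    }
    time_seq& operator=(time_seq&&) = default;

    double front() { return impl_->front(); }  // next time in the sequence
    void pop() { impl_->pop(); }               // advance to the following time
};

// Regularly spaced times t0, t0+dt, t0+2*dt, ...
struct regular_time_seq {
    double t0, dt;
    std::size_t step;
    regular_time_seq(double t0, double dt): t0(t0), dt(dt), step(0) {}
    double front() const { return t0 + step * dt; }
    void pop() { ++step; }
};

// Times read from a sorted vector; reports "never" once exhausted.
struct vector_time_seq {
    std::vector<double> times;
    std::size_t i;
    explicit vector_time_seq(std::vector<double> t): times(std::move(t)), i(0) {}
    double front() const {
        return i < times.size() ? times[i] : std::numeric_limits<double>::max();
    }
    void pop() { if (i < times.size()) ++i; }
};

// Collect all times in the sequence that fall before t_end.
inline std::vector<double> take_until(time_seq& seq, double t_end) {
    std::vector<double> out;
    while (seq.front() < t_end) {
        out.push_back(seq.front());
        seq.pop();
    }
    return out;
}
```

Because `time_seq` is copyable and holds its implementation by value, code that consumes sequences does not need to know which generator it was built from.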
      
      Event generators:
      - The poisson, regular and vector-backed implementations of the event generator
        concept were refactored to use the new time sequences.
      
      Cell groups:
      - Removed the `dss_cell_group` and `rss_cell_group` and associated types.
      - Added a generic spike source cell that generates a sequence of spikes
        at time points specified by a `time_seq`. Using this approach, an
        additional `cell_group` specialization is not required for each type of
        sequence, and user-defined sequences can be used with minimal overhead.
      
      Unit tests
      ------------
      
      - Added unit tests for `time_seq`.
      - Simplified `event_generator` unit tests, because much of the testing
        of the sequences was moved to the `time_seq` tests.
      - Added unit tests for `spike_source_cell_group`.
      
      Changes to miniapp
      -------------------------
      
      - Simplified the miniapp by removing the command line options for using an input spike chain from file.
      - Updated the miniapp recipe to use `spike_source` cell group instead of `dss_cell_group`.
      3082607f
  5. Jun 01, 2018
    • Runtime distributed context (#485) · 5fde0b00
      Benjamin Cumming authored and Sam Yates committed
      Move the choice of distributed communication model from compile time (the old `arb::communication::communication_policy` type) to run time.
      
      * Add `arb::distributed_context` class that provides the required interface for distributed communication implementations, using type-erasure to provide value semantics.
      * Add two implementations for the distributed context: `arb::mpi_context` and `arb::local_context`.
      * Allow distribution over a user-supplied MPI communicator by providing it as an argument to `arb::mpi_context`.
      * Add `mpi_error` exception type to wrap MPI errors.
      * Move contents of the `arb::communication` namespace to the `arb` namespace.
      * Add preprocessor for-each utility `ARB_PP_FOREACH`.
      * Rewrite all examples and tests to use the new distributed context interface.
      * Add documentation for distributed context class and semantics, and update documentation for load balancer and simulation classes accordingly.
      
      Fixes #472
      5fde0b00
  6. May 15, 2018
  7. May 11, 2018
  8. May 09, 2018
    • CUDA back end for the new mechanism infrastructure (#487) · e0f0b5d7
      Benjamin Cumming authored and Sam Yates committed
      Completes CUDA printing in modcc.
      * Add CudaPrinter visitor, overriding CPrinter.
      * Add `ostream` `operator<<` overloads for `arb::gpu::shared_state` and `device_view` for debugging.
      * Fix GPU back-end bugs.
      e0f0b5d7
    • Mechanism Refactor: multicore and simd (#484) · 68135148
      Sam Yates authored
      First commit of two for mechanism refactor work (refer to PR #484 and PR #483).
      
      FVM/mechanism code:
      * Refactor mechanism data structures to decouple backend-specific implementations and mechanism metadata.
      * Add mechanism catalogue for managing mechanism metadata and concrete implementation prototypes.
      * Add fingerprint-checking to mechanism metadata and implementations to confirm they come from the same NMODL source (fingerprint is not yet computed, but tests are in place).
      * Split FVM discretization work out from FVM integrator code.
      * Use abstract base class over backend-templated FVM integrator class `fvm_lowered_cell_impl` to allow separate compilation of `mc_cell_group` and to remove the dummy backend code.
      * Add a new FVM-specific scalar type `fvm_index_type` that is an alias for `int` to replace
      `fvm_size_type` in fvm layouts and mechanisms. This was chosen as an alternative
      to making `unsigned` versions of all our SIMD implementation classes.
      * Extend `cable1d_neuron` global data to encompass: mechanism catalogue; default ion concentrations and charges; global temperature (only for Nernst); initial membrane potential.
      
      Modcc:
      * Collect printer sources in modcc under `printer/`.
      * Move common functionality across printers into `printer/printerutil.{hpp,cpp}`.
      * Add string to file I/O implemented in routines read_all and write_all in `io/bulkio.hpp`.
      * Implement indent-friendly source code generation via a `std::streambuf` filter `io::prefixbuf` defined in `io/prefixbuf.hpp`, together with manipulators and a corresponding std::ostream-derived wrapper.
      * Rewrite printers to use new infrastructure: cpu target incorporates SIMD printing options; CUDA printer at this point produces only stubs for CUDA kernel wrappers.
      * Modify SIMD printing command line options for modcc: `-s` enables explicit vectorization using the SIMD classes;  `-S <N>` allows a specific data width to be prescribed.
      * Fix problem in `test_ca.mod` with uninitialized ion current.
      * Add infrastructure support to allow future pre-computation of SIMD index conflict cases for (hopefully) faster scatters and updates.
      * Simplify `IndexedVariable` expressions in the AST, making data source explicit via a `sourceKind` enum, and leaving the indexing method and index names up to the printers.
      * Allow state variables in the AST to 'shadow' an ion concentration — these are assigned in the
      generated `write_ions` method.
      
      SIMD classes:
      * Add `simd_cast` operation between SIMD value types of the same width, and with `std::array`. (Note: this was tested and used in an early development version of the code, but not in this version. It was still a lacuna in the original SIMD wrappers, so it has been left in.)
      * Restructure SIMD gather/scatter API to use a `simd::indirect` expression,  which encapsulates a pointer and SIMD offset.
      * Add `simd::index_constraint` scoped enum to describe knowledge of contention in indirect indices, so that we can branch on this to the appropriate implementation.
      * Add SIMD concrete implementation routines `reduce_add` for horizontal reduction and `element0` for access to first lane scalar value.
      * Add SIMD value method `sum()` that exposes implementation `reduce_add`.
      * Add SIMD concrete implementation routine `compound_indexed_add` that provides the implementation for `indirect(p, simd_indices) += simd_value` construction.
      * Fix SIMD `implbase` bug where some static methods were using the `implbase` fall-back functions instead of the derived class specialized implementations.
      * Move SIMD mathematical functions into friend routines of `simd_impl` in order to resolve implicit conversions from scalars in mixed SIMD-scalar operations.
      * Use a templated `tag` class to dispatch on SIMD concrete implementation types, to avoid problems with incomplete types in method signatures.
      * Remove old SIMD intrinsics.
      
      CMake infrastructure:
      * Downcase some variables in `CMakeLists.txt` files to  distinguish them visually from CMake keywords and variables.
      * Split arbor modcc vectorization option (now `ARB_VECTORIZE`) and target-architecture optimization (now `ARB_ARCH`).
      * For `arbor` and `arbormech` targets, and in particular not the `modcc` target, use `ARB_ARCH` to generate corresponding target-appropriate binaries, including, for example, appropriate SIMD support.
      * Extend `CompilerOptions.cmake` to map as best as able between the various target architecture names (we use the gcc names) and the correct option to pass to the compiler based on the compiler and platform.
      * Add work-around for misidentification by CMake of XL C as Clang.
      * As a temporary work-around, include `arbormech` library twice on link line to resolve circular arbor–arbormech dependencies.
      
      Unit tests:
      * Extend repertoire of generic sequence equality/near equality testing support  in `common.hpp`.
      * Add warning suppression for icc for the malloc instrumentation code.
      * SIMD unit tests for indirect expressions, compound indirect add, reduction.
      * Make some exact tests into floating point 'near' tests when comparing computed areas and lengths in swc and fvm layout tests, to account for compiler (e.g. icc) performing semantically inequivalent floating point operation reordering or fusion at `-O3`.
      * Split out some of the CUDA tests into separate .cpp/.cu files for  separate-compilation purposes.
      
      Other:
      * The `padded_allocator` has been modified to propagate alignment/padding on move and copy (these semantics make their use much easier and safer in the multicore mechanism instantiation code).
      * Map/table searching utilities in `util/maputil.hpp`.
      * Fixes for correct sequence type categorization and `begin/end` ADL.
      * Fixes for type guards for range methods that take universal references.
      * Removal of some redundant code in range utilities through the use of universal references.
      * Add new range view `reverse_view` for ranges delineated by bidirectional iterators.
      * Add single argument form of `make_span` to count up from zero, and associated helper `count_along` that gives a span that indexes a supplied container.
      * Moved `prefixbuf` to `modcc` source.
      * Make sequence positive and negative tests in algorithms generic.
      * Add `private`-subverting helper code/macro to `tests/unit/common.hpp` to reduce the number of public testing-only interfaces in the library code.
      * Add virtual destructors for virtual base classes.
      * Add new arb::math:: functions: `next_pow2` for unsigned integral types, `round_up` to round a number away from zero to next largest magnitude multiple.
      * New `index_into` implementation that supports bidirectional access (moved to `util::` namespace).
      * Rework dangerous `memory::array(Iter, Iter)` constructor to be less dangerous (and do the expected thing).
      * Allow ranges to be constructed from other ranges if the iterators are compatible.
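
For illustration, the two math helpers mentioned in the list above could look roughly like the following. This is a sketch rather than the library implementation: `round_up` is restricted here to signed integers, and `next_pow2(0)` returns 0 through unsigned wrap-around.

```cpp
#include <cassert>
#include <limits>
#include <type_traits>

// Smallest power of two that is >= x, for unsigned integral types.
// In this sketch next_pow2(0) yields 0 (unsigned wrap-around).
template <typename U>
typename std::enable_if<std::is_unsigned<U>::value, U>::type
next_pow2(U x) {
    --x;
    // Smear the highest set bit into every lower bit position.
    for (int shift = 1; shift < std::numeric_limits<U>::digits; shift <<= 1) {
        x |= x >> shift;
    }
    return ++x;
}

// Round v away from zero to the next multiple of |b| (signed integers only here).
template <typename T>
typename std::enable_if<std::is_integral<T>::value && std::is_signed<T>::value, T>::type
round_up(T v, T b) {
    assert(b != 0);
    T m = v % b;                 // remainder carries the sign of v
    if (m == 0) return v;        // already a multiple
    T mag = b < 0 ? -b : b;
    return v - m + (m > 0 ? mag : -mag);
}
```

For example, `round_up(7, 4)` gives 8 and `round_up(-7, 4)` gives -8, since rounding is away from zero.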
      68135148
  9. Apr 11, 2018
    • Domain decomposition and simulation C++ API docs (#471) · 4c742a57
      Ben Cumming authored and Sam Yates committed
      Add two new documentation pages for the C++ API
      
      * Add domain decomposition page that covers `domain_decomposition`, `node_info` and `partition_load_balance`.
      * Add simulation page that describes `arb::simulation` API interface.
      * Fix some small typos elsewhere in the docs.
      * Use `std::move` when adding spike callbacks to `arb::simulation` (useful if callbacks are stateful).
      4c742a57
    • Fix support for Kepler (K20 & K80) GPUs. (#470) · 6b659a39
      Ben Cumming authored and Sam Yates committed
      Fixes issue #467 
      
      * Add GPU synchronization points where required for Kepler to coordinate CPU access of managed memory.
      * Use hand-rolled double precision atomic addition for Kepler targets.
      * Replace `ARB_WITH_CUDA` build option with `ARB_GPU_MODEL` option that takes one of 'none', 'K20', 'K80' or 'P100', and set up source-code defines accordingly.
      * Clean up of redundant compiler flags and defines no longer required now that the project uses separate compilation for CUDA sources.
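
A hand-rolled floating point atomic addition is usually built as a compare-and-swap retry loop. The sketch below shows that pattern in host-side C++ using `std::atomic<double>`. It is only an analogue for illustration, not the CUDA device code from this commit, and the function name `atomic_add` is hypothetical.

```cpp
#include <atomic>
#include <cassert>

// Add `value` to `target` atomically using a compare-and-swap retry loop.
// Returns the value held before the addition.
inline double atomic_add(std::atomic<double>& target, double value) {
    double old = target.load(std::memory_order_relaxed);
    // On failure, compare_exchange_weak reloads `old` with the current
    // contents of `target`, so the sum is recomputed on each retry.
    while (!target.compare_exchange_weak(old, old + value)) {
    }
    return old;
}
```

The loop only retries when another thread changed the value between the read and the swap, so it makes progress under contention without taking a lock.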
      6b659a39
  10. Apr 06, 2018
  11. Apr 05, 2018
    • Add C++ docs for recipe (#461) · bc6fcffd
      Ben Cumming authored and Sam Yates committed
      Add some C++ API documentation.
      
      * Create C++ API section in docs.
      * Document `arb::recipe`: both a class reference along with more explanatory text and best practices guide.
      * Add some class documentation of basic types required to understand recipe definition.
      * Some in-code comment clean up.
      * Change `arb::cell_kind` from a vanilla enum to a scoped enum.
      bc6fcffd
  12. Mar 29, 2018
    • rename class 'model' to 'simulation' (#462) · 2b2044a6
      Ben Cumming authored and Sam Yates committed
      The name `arb::model` did not clearly describe the role of the class, while `arb::simulation` better captures that this is an instantiation of a model for the purpose of running a simulation, as distinct from the description of a model represented by an `arb::recipe` instance.
      
      * Rename sources `model.{hpp,cpp}` to `simulation.{hpp,cpp}`.
      * Rename class `arb::model` to `arb::simulation`.
      * Update docs and tests to suit.
      2b2044a6
    • merge all SIMD docs into a single topic (#463) · 3d83af5b
      Ben Cumming authored and Sam Yates committed
      Put all the SIMD docs in a single topic, to simplify the documentation tree.
      3d83af5b
  13. Mar 27, 2018
    • Installation Guide (#459) · 0cf65a4c
      Ben Cumming authored
      Added an installation guide to the Read The Docs
      Removed the outdated build/install information from README.md
      Link from README to Read The Docs
      Updated the splash page for Read The Docs
      0cf65a4c
    • wrap warp intrinsics to fix deprecated warnings (#456) · 7e6ea389
      Ben Cumming authored
      CUDA 9 introduced new, fine-grained thread synchronization primitives.
      In doing so, it introduced new forms of the warp intrinsics like __shfl_up, deprecating the old symbols in the process.
      
      It will be a while before we can require CUDA 9 as the minimum version, so we have to support compilers that expect either the new or the old behavior.
      
      There are two options: wrap the intrinsics in question, or pass nvcc a flag to not issue warnings about deprecated symbols. I go for the approach of wrapping, because I would rather keep the compiler warning turned on.
      
      Fixes #379.
      7e6ea389
  14. Mar 26, 2018
    • Add padded allocator for aligned and padded vectors. (#460) · 581c4ef3
      Sam Yates authored
      Padded vectors with run-time padding/alignment guarantees will form the basis of the storage class for the new CPU and SIMD generated mechanisms.
      
      * Add `padded_allocator` that aligns and pads allocations.
      * Make microbenchmark for `default_construct_adaptor` that overrides the allocator construct() to default- instead of value-initialization on values.
      * Add `with_instrumented_malloc` class for tracking malloc, realloc, etc. calls.
      * Add unit tests for `padded_allocator`.
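
To make the idea of an aligning and padding allocator concrete, here is a minimal sketch. It assumes C++17 aligned `operator new`, and the names `padded_allocator_sketch` and `padded_size` are hypothetical. The real `padded_allocator` may use a different allocation mechanism and interface.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <new>
#include <vector>

// Round a byte count up to a multiple of `alignment`.
inline std::size_t padded_size(std::size_t bytes, std::size_t alignment) {
    return (bytes + alignment - 1) / alignment * alignment;
}

// Minimal allocator whose allocations start on an `alignment`-byte boundary
// and whose size is padded to a multiple of `alignment`.
template <typename T>
struct padded_allocator_sketch {
    using value_type = T;
    std::size_t alignment;  // in bytes; must be a power of two

    padded_allocator_sketch(): alignment(alignof(T)) {}
    explicit padded_allocator_sketch(std::size_t a):
        alignment(a < alignof(T) ? alignof(T) : a)
    {
        assert((alignment & (alignment - 1)) == 0);
    }
    template <typename U>
    padded_allocator_sketch(const padded_allocator_sketch<U>& other):
        alignment(other.alignment < alignof(T) ? alignof(T) : other.alignment) {}

    T* allocate(std::size_t n) {
        std::size_t bytes = padded_size(n * sizeof(T), alignment);
        // C++17 aligned allocation; throws std::bad_alloc on failure.
        return static_cast<T*>(::operator new(bytes, std::align_val_t(alignment)));
    }
    void deallocate(T* p, std::size_t) {
        ::operator delete(p, std::align_val_t(alignment));
    }
};

template <typename T, typename U>
bool operator==(const padded_allocator_sketch<T>& a, const padded_allocator_sketch<U>& b) {
    return a.alignment == b.alignment;
}
template <typename T, typename U>
bool operator!=(const padded_allocator_sketch<T>& a, const padded_allocator_sketch<U>& b) {
    return !(a == b);
}
```

A `std::vector<double, padded_allocator_sketch<double>>` constructed with a 64-byte alignment then has storage that starts on a 64-byte boundary and extends to a whole number of 64-byte blocks, which is what lets SIMD loops read a full vector at the tail.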
      581c4ef3
  15. Mar 20, 2018
  16. Mar 19, 2018
    • Avoid intermediate underflow in expm1 calc with ftz. (#454) · 1499bf1e
      Sam Yates authored
      Intel compiler with default options does not guarantee correct fp behaviour with subnormals; it presumably sets the fp state to flush to zero.
      
      Reordering a multiply and divide in the expm1 calculation avoids a transient subnormal value that was causing the routine to incorrectly return zero for very small, but normal, arguments.
      1499bf1e
  17. Mar 16, 2018
    • Fix broken namespace renaming in SIMD (#453) · f3be6dff
      Sam Yates authored
      f3be6dff
    • SIMD wrappers for Arbor generated mechanisms. (#450) · 2dff9c41
      Sam Yates authored
      This provides a bunch of SIMD intrinsic wrappers as a precursor to the SIMD printers.
      
      The aim is that the SIMD printer can be agnostic regarding the particular vector architecture.
      
      The design is based rather loosely on the proposal P0214R6 for C++ Parallelism TS 2. The transcendental function implementations are adapted from the existing SIMD architecture-specific code, which in turn is based on the Cephes library algorithms.
      
      The custom CSS for the HTML documentation has been tweaked.
      2dff9c41
    • 5fe81e83
    • refactor git submodule support in cmake (#448) · 4c66432f
      Ben Cumming authored
      In some places our CMake scripts were attempting to check out git submodules when required, if they have not already been checked out. The code that does this was cut and pasted, and was getting unwieldy.
      
      To minimise the responsibilities of CMake, this PR:
      
      * removes calls to git;
      * introduces a function check_git_submodule that can be used to test if a git submodule is installed, and print a helpful message that informs the user how to check it out if needed;
      * introduces a function add_error_target that makes a target that prints a message then quits with an error. This can be used to generate a proxy target when a problem is detected during CMake setup. This means that an error is only generated when building a target with a missing dependency, instead of an error during CMake setup;
      * refactors the CMake setup for the docs and ubenches targets to use these new features.
      4c66432f
  18. Mar 15, 2018
    • Improve TBB vs. CMake (#451) · 459d6562
      Ben Cumming authored
      This replaces the CMake templates provided by TBB with a much more sane alternative!
      
      The TBB CMake templates had a very strange workflow, that involved downloading the TBB source and compiling it, which made it impossible to configure the TBB build, and caused problems on systems without connection to the internet.
      
      We replace this with a fork of the TBB repository maintained by Github user @wjakob:
      https://github.com/wjakob/tbb
      This fork provides a sane CMakeLists.txt that can be configured from our CMake setup.
      It is added as a git submodule, so it can be downloaded with the rest of the repository, hence not requiring connection to the internet during CMake configuration.
      
      It could be extended to use a user-provided build of TBB to use instead of building it.
      
      fixes #332.
      459d6562
    • Multithreading friendly profiler (#447) · 1e461b1b
      Ben Cumming authored and Sam Yates committed
      Replace the profiler with a simpler design that works for nested multi-threaded regions.
      
      * Replace hierarchical profiler accounting with strictly exclusive regions.
      * Make tree grouping of profile data a presentation concern.
      * Uncouple profiler semantics from Arbor classes such as `model`.
      * Add thorough documentation for the new profiler to the library documentation.
      1e461b1b
  19. Feb 28, 2018
    • Value semantics for event_generators · aca84730
      Ben Cumming authored
      Use type erasure tricks to remove abstract base class for `event_generator`.
      
      This simplifies all the code that uses `event_generator`s (not radically, but it is simpler).
      aca84730
  20. Jan 29, 2018
  21. Jan 26, 2018
    • LIF neurons: CPU backend with Brunel Network miniapp (#441) · 2d4bd154
      kabicm authored
      Two main contributions:
      
      1) Implementation of LIF neuron model with no kernel and no external input (I_e=0)
      
      The input current to each neuron is therefore just the sum of all the weights of incoming spikes. We integrate in jumps dt = min(t_final - t, t_event - t), since we know the exact solution of the differential equation describing the membrane potential.
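
The jump integration described above can be sketched as follows. The sketch assumes the usual leaky membrane equation tau_m dV/dt = -(V - E_L), whose exact solution over a step dt is V(t+dt) = E_L + (V(t) - E_L) exp(-dt/tau_m). The parameter names, the `weight / C_m` voltage jump per event, and the threshold and reset handling are illustrative assumptions, not taken from the commit.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical parameter set for the sketch.
struct lif_params {
    double tau_m = 10.0;    // membrane time constant
    double E_L = 0.0;       // resting potential
    double V_th = 10.0;     // firing threshold
    double V_reset = 0.0;   // potential after a spike
    double C_m = 20.0;      // membrane capacitance
};

struct spike_event {
    double time;
    double weight;
};

// Advance V from time t to t_final, jumping from event to event.
// `events` must be sorted by time. Returns the times at which the cell fired.
inline std::vector<double> advance_lif(double& V, double t, double t_final,
    const std::vector<spike_event>& events, const lif_params& p)
{
    std::vector<double> spikes;
    std::size_t i = 0;
    while (i < events.size() && events[i].time < t) ++i;  // skip stale events

    while (t < t_final) {
        bool has_event = i < events.size() && events[i].time < t_final;
        double t_next = has_event ? events[i].time : t_final;
        double dt = t_next - t;  // dt = min(t_final - t, t_event - t)

        // Exact decay towards E_L over the interval.
        V = p.E_L + (V - p.E_L) * std::exp(-dt / p.tau_m);
        t = t_next;

        if (has_event) {
            V += events[i].weight / p.C_m;  // instantaneous jump from the event
            ++i;
            if (V >= p.V_th) {
                spikes.push_back(t);
                V = p.V_reset;
            }
        }
    }
    return spikes;
}
```

Because the decay is evaluated in closed form, the step size is dictated only by event times and the end of the integration interval, not by accuracy requirements.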
      
      2) Miniapp for simulating the Brunel network of LIF neurons. 
      
      The network consists of 2 main populations: excitatory and inhibitory populations. Each neuron from the network receives a fixed number (proportional to the size of the population) of incoming connections from both of these groups. In addition to the input from excitatory and inhibitory populations, each neuron receives a fixed number of connections from the Poisson neurons producing the Poisson-like input that is integrated into the LIF cell group so that the communication of Poisson events is bypassed. 
      2d4bd154
  22. Jan 25, 2018
  23. Jan 18, 2018
  24. Jan 15, 2018
    • Fix improper brace-initialization in unit test. (#437) · 81d8d0b5
      Sam Yates authored
      * Replace improper use of brace initialization of vectors of classes with a deleted move constructor with a sequence of `emplace_back` invocations.
      81d8d0b5
    • Fix event-generator bugs in model (#439) · c27757f9
      Ben Cumming authored
      There were two latent bugs in the event generation part of `model`.
      
      1. A segmentation fault when initializing the `event_generators` in the `model` constructor caused by using an index variable after it had been incremented.
      2. Events generated during the first epoch were not delivered on time.
      
      The first issue was simple to fix, by ensuring that the counting variable is incremented at the end of the loop.
      
      The second issue required refactoring the event wrangling inside `model`. Events can be introduced into a model via three sources:
      
      1. Generated by spike exchange
      2. By calling the `model::inject_events()` interface
      3. `event_generator`s attached to cells.
      
      The refactoring was required to ensure that all three sources are handled correctly. There are further opportunities for refactoring the code to make it a bit cleaner, specifically putting the wrangling code in its own type that could be tested separately, outside `model`, but that is beyond the scope of this fix.
      c27757f9
    • Make cell type copyable by default (#430) · 51d09f7d
      Ben Cumming authored and Sam Yates committed
      * Enable `cell` copy constructor, remove special tag type used to guard `cell` cloning.
      * Provide sane defaults for `recipe` methods.
      51d09f7d
  25. Dec 22, 2017
  26. Dec 21, 2017
    • Fix indirection in ion concentration write. (#425) · 17f7db98
      Sam Yates authored
      * Fix indirection in ion concentration write.
      * Remove second indirection in ion write assignment.
      * Extend ion write unit test to cover non-contiguous ion CV cases and verify correct ion concentration averaging.
      
      Fixes #424.
      17f7db98
  27. Dec 20, 2017
    • Move miniapps path to 'example/' (#423) · f94f0eab
      Ben Cumming authored and Sam Yates committed
      * Rename `miniapps` subdirectory to `example`.
      * Have all example executables be built under `example` in the build directory.
      * Update Travis CI to run miniapp from new path.
      f94f0eab