- Aug 24, 2018
-
-
Benjamin Cumming authored
Move implementation of `gpu_context` from header to `cpp` file, so that `ARB_WITH_CUDA` doesn't leak from library implementation.
-
- Aug 22, 2018
-
-
* Add gpu_context as part of execution context, containing information about GPU availability, managed_memory synchronization, and atomic double availability.
* Choose between ON and OFF for ARB_GPU in CMake. If ON, compile for K20, K80, and P100.
Note that we still need compile-time information about the GPU in cuda_atomic.hpp for atomicAdd(double*, double), because the function is only defined when the program is compiled for sm_60 or higher.
-
Fixes #568.
-
Fixes #564
-
Use a compat::fma wrapper for std::fma to avoid a bug in the tree optimizer in GCC version < 8.2. See: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87046 Fixes #568.
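A minimal sketch of what such a wrapper could look like, not the code from this commit. The `COMPAT_FMA_ATTR` macro name is hypothetical, and the choice of disabling tree vectorization for the affected GCC versions is an assumption about how the bug might be sidestepped; only the version test macros and `std::fma` are standard.

```cpp
#include <cassert>
#include <cmath>

namespace compat {

// Hypothetical sketch: route every fused multiply-add through one function,
// so that any compiler-specific workaround lives in a single place.
#if defined(__GNUC__) && !defined(__clang__) && \
    (__GNUC__ < 8 || (__GNUC__ == 8 && __GNUC_MINOR__ < 2))
// Assumption: switching off tree vectorization for this function avoids the bug.
#define COMPAT_FMA_ATTR __attribute__((optimize("no-tree-vectorize")))
#else
#define COMPAT_FMA_ATTR
#endif

template <typename T>
COMPAT_FMA_ATTR inline T fma(T a, T b, T c) {
    return std::fma(a, b, c);
}

} // namespace compat
```

Call sites would then use `compat::fma(a, b, c)` in place of `std::fma(a, b, c)`, and the workaround can be dropped later by editing one function.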
-
- Aug 20, 2018
-
-
Global temperature for mechanisms.
* Make 'celsius' magic in modcc: now an indexed variable.
* Add a new temperature data source for indexed variables.
* Add support to printers for indexed variables that reference a scalar.
* Check that indexed variables aren't used in PROCEDURE blocks (this is a problem not just for 'celsius').
* Modify built-in mod files to pass celsius as a parameter to rates() procedures.
* Add global temperature to shared_state classes, and initialize through backend mechanism superclasses.
* Add some infrastructure for unit-test only mechanisms.
* Set modcc flags globally in top level CMakeLists.txt.
* Add test mechanism/module for checking celsius setting.
* Add unit test for multicore and gpu mechanism celsius setting.
* Make common mechanism private field data access helper for unit tests.
* Use helper in temperature, synapses tests.
* Fix warning in `distributed_context.hpp` about errant semicolon.
* Fix global scal...
-
- Aug 06, 2018
-
-
Sam Yates authored
Two MacPorts/gcc7 issues:
* std::uint64_t is unsigned long long on OS X, breaking an assumption about size_t in the distributed_context interface.
* Problems with missing errno defines in the standard library headers. With MacPorts gcc7, the installed c++config.h defines _GLIBCXX_HAVE_EOWNERDEAD and _GLIBCXX_HAVE_ENOTRECOVERABLE, but the corresponding errno defines are not provided by sys/errno.h unless __DARWIN_C_SOURCE, which takes its value from _POSIX_C_SOURCE if defined, is greater than or equal to 200809L. Technically a MacPorts configuration bug, but easily worked around.
Use basic integral types for communication collectives interfaces. Define _POSIX_C_SOURCE to be 200809L for glob.cpp.
Fixes #562.
-
- Jul 31, 2018
-
-
Sam Yates authored
* Remove dependency on memory library and range utils from `multi_event_stream.cu` source. Fixes #545
-
* Replace distributed_context with shared_ptr<distributed_context> in execution_context and pass around the shared pointer instead of a raw pointer.
* Fix construction of mpi_context.
* Remove num_threads() from arb and arb::threading. Modify mpi_context so it also returns a shared_ptr. proc_allocation is initialized from execution context to determine available resources.
* Rename threading backend files. Delete useless files.
* Pass execution_context by const reference or value.
* Remove code duplication in thread_system constructors.
-
- Jul 26, 2018
-
-
* Update the install docs for architecture build options.
* Update to reflect new install target.
-
Sam Yates authored
Reduce redundant functionality across event_generator, time_seq and schedule by providing a low-heap overhead interface to schedule and using that for time sequences in event_generator and specialized cell groups.
* Have schedule return pair of pointers as view to generated times.
* Fix missing DEBUG/TRACE functionality.
* Use rate instead of mean_dt for Poisson schedule.
* Move merge_events() functionality to simulator.cpp.
* Migrate event_generator to event span interface.
* Migrate tourney_tree to event span interface.
* Only invoke tourney_tree merge if generators have events in the epoch.
* Use schedule for times in event_generator implementations.
* Replace seq_generator with explicit_generator that keeps a copy of events.
* Replace vector_backed_generator and poisson_generator with schedule_generator.
* Replace time_seq uses with schedule.
* Add default empty schedule.
* Move rounding error test for regular time sequence into sch...
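The "pair of pointers as view" idea can be sketched as follows. This is an illustration under assumptions, not the schedule implementation from this commit: `regular_schedule_sketch` and `time_event_span` are hypothetical names, and only a regular (fixed `dt`) schedule is shown.

```cpp
#include <cassert>
#include <cmath>
#include <utility>
#include <vector>

using time_type = double;
// A view onto generated times: [first, second) is a half-open pointer range.
using time_event_span = std::pair<const time_type*, const time_type*>;

// Hypothetical regular schedule: times t = k*dt for k = 0, 1, 2, ...
class regular_schedule_sketch {
public:
    explicit regular_schedule_sketch(time_type dt): dt_(dt) { assert(dt > 0); }

    // Return the times in [t0, t1). The internal buffer is reused between
    // calls, so a call only allocates when the buffer has to grow.
    // The returned view is invalidated by the next call to events().
    time_event_span events(time_type t0, time_type t1) {
        times_.clear();
        if (t0 < 0) t0 = 0;
        // Index of the first multiple of dt that is not before t0.
        auto k = static_cast<long long>(std::ceil(t0/dt_));
        for (time_type t = k*dt_; t < t1; t = (++k)*dt_) {
            times_.push_back(t);
        }
        return {times_.data(), times_.data() + times_.size()};
    }

private:
    time_type dt_;
    std::vector<time_type> times_;
};
```

A caller iterates from `span.first` to `span.second` and must consume the times before asking the schedule for the next interval, which is the price paid for avoiding a fresh heap allocation per call.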
-
Use correct profiling and mpi flags. Fix profiler doc
-
- Jul 25, 2018
-
-
Benjamin Cumming authored
This small refactor simplifies the interface and implementation of the `tree` type.
* Use `std::vector` instead of `memory::array` for internal storage in `arb::tree`.
* Return a `util::range` instead of a view for `tree::children(int)`.
* Remove unused functionality for changing the root of a tree.
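As an illustration of the resulting shape, here is a minimal sketch of a tree that stores its children in `std::vector` and returns an iterator range from `children(int)`. All names (`tree_sketch`, the nested `range`) are hypothetical stand-ins, and the sketch assumes node 0 is the root and `parent[i] < i` for every other node.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical sketch: a tree built from a parent index.
class tree_sketch {
public:
    using int_type = int;
    using iterator = std::vector<int_type>::const_iterator;

    // Stand-in for util::range: a pair of iterators usable in range-for.
    struct range {
        iterator first, last;
        iterator begin() const { return first; }
        iterator end() const { return last; }
        std::size_t size() const { return last - first; }
    };

    explicit tree_sketch(const std::vector<int_type>& parent) {
        const std::size_t n = parent.size();
        // Count children per node, then prefix-sum the counts into offsets.
        child_index_.assign(n + 1, 0);
        for (std::size_t i = 1; i < n; ++i) {
            assert(parent[i] >= 0 && static_cast<std::size_t>(parent[i]) < i);
            ++child_index_[parent[i] + 1];
        }
        for (std::size_t i = 0; i < n; ++i) child_index_[i + 1] += child_index_[i];
        // Scatter each node into the next free slot of its parent.
        children_.resize(n ? n - 1 : 0);
        std::vector<int_type> next(child_index_.begin(), child_index_.end() - 1);
        for (std::size_t i = 1; i < n; ++i) {
            children_[next[parent[i]]++] = static_cast<int_type>(i);
        }
    }

    range children(int_type i) const {
        return {children_.begin() + child_index_[i],
                children_.begin() + child_index_[i + 1]};
    }

private:
    std::vector<int_type> children_;     // children of all nodes, grouped by parent
    std::vector<int_type> child_index_;  // children of i are in [child_index_[i], child_index_[i+1])
};
```

With this layout `for (auto c: t.children(i))` works directly, and no custom array or view type is needed.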
-
Add initialize method to the profiler to set up the needed threading parameters given a simulation's task system.
-
- Jul 24, 2018
-
-
- Task system is no longer a single system private to the implementation of the threading backend and used everywhere. A separate task_system can be used (with a specified number of threads) for every simulation.
- arb::execution_context is the interface to task_system and the previously defined distributed_context.
- TBB and serial support has been removed. Cthreads is the only threading backend available.
-
- Jul 20, 2018
-
-
Sam Yates authored
Fixes #139.
* Split colours and `pprintf(...)` into `io/pprintf.hpp` header.
* Remove generic `to_string()` function, replacing its very occasional usage with `pprintf`.
* Move block pretty printing into own .cpp file; this is the only place that the vector ostream printer was used.
* Remove `enum_hash`, as not needed with C++14.
* Move `is_in` utility function to `util.hpp`.
* Remove old SIMD printer backend code.
-
* Perform device-to-device copy when device_reference is assigned a device_reference.
-
Benjamin Cumming authored
Better error message on missing file.
-
Sam Yates authored
* Don't call a missing file an 'internal compiler error'.
-
- Jul 19, 2018
-
-
Cthreads classes:
- Notification queue: manages tasks; tries or forces popping and pushing tasks.
- Task system: manages the notification queues; controls which queue to pop from/push to; controls spinning on queues if necessary; manages creating/joining threads. Is a singleton.
- Task group: manages synchronization on a group of tasks.
Operation:
- Each thread has an associated queue.
- Task system _tries to_ push tasks in one of the available queues. If it is unable to acquire a lock on a queue, it tries the next in a round robin fashion. After it loops all queues if it still hasn't successfully pushed the task, it spins on a single queue until lock is acquired and task is pushed.
- Task system _tries to_ pop a task from the calling thread's queue. If it is unable to acquire the lock, it tries to steal the task from another thread's queue, in a round robin loop. If it is still unable to pop a task, it spins on its the c...
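The try-then-force push and the pop-then-steal logic described above can be sketched roughly as follows. This is a simplified illustration with hypothetical names (`notification_queue_sketch`, `task_system_sketch`): it creates no worker threads, is not a singleton, uses a blocking push as a stand-in for spinning on one queue, and returns `false` instead of spinning when nothing can be popped.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <functional>
#include <mutex>
#include <utility>
#include <vector>

using task = std::function<void()>;

// Hypothetical per-thread queue: the try_ variants give up when the lock is contended.
class notification_queue_sketch {
public:
    bool try_push(task& t) {
        std::unique_lock<std::mutex> lock(mutex_, std::try_to_lock);
        if (!lock) return false;
        tasks_.push_back(std::move(t));
        return true;
    }
    void push(task t) {  // forced push: block until the lock is acquired
        std::lock_guard<std::mutex> lock(mutex_);
        tasks_.push_back(std::move(t));
    }
    bool try_pop(task& out) {
        std::unique_lock<std::mutex> lock(mutex_, std::try_to_lock);
        if (!lock || tasks_.empty()) return false;
        out = std::move(tasks_.front());
        tasks_.pop_front();
        return true;
    }
private:
    std::deque<task> tasks_;
    std::mutex mutex_;
};

class task_system_sketch {
public:
    explicit task_system_sketch(std::size_t n): queues_(n) { assert(n > 0); }

    // Try every queue once in round robin order, then fall back to a forced push.
    // Note: next_ would need to be atomic if async() were called concurrently.
    void async(task t) {
        const std::size_t start = next_++;
        for (std::size_t k = 0; k < queues_.size(); ++k) {
            if (queues_[(start + k) % queues_.size()].try_push(t)) return;
        }
        queues_[start % queues_.size()].push(std::move(t));
    }

    // Try the caller's own queue first, then try to steal from the others.
    bool try_run_one(std::size_t self) {
        task t;
        for (std::size_t k = 0; k < queues_.size(); ++k) {
            if (queues_[(self + k) % queues_.size()].try_pop(t)) {
                t();
                return true;
            }
        }
        return false;
    }

private:
    std::vector<notification_queue_sketch> queues_;
    std::size_t next_ = 0;
};
```

Using `std::try_to_lock` is what makes the "try" operations non-blocking: a contended queue is skipped rather than waited on, so producers and consumers only block after every queue has been tried once.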
-
- Jul 13, 2018
-
-
-
All example code and validation tests no longer require access to private include directories. This provides the minimal requirement for an installable target. Note that it is still not possible to separately build mechanisms from NMODL with just the public includes, and there is not yet any package configuration file creation for use with CMake or pkg-config.
* Replace `hw::node_info` with `proc_allocation`, describing local resources for the purposes of domain decomposition.
* Group processor counting and gpu counting implementation under `node_info.cpp`.
* Remove `domain_decomposition` dependency from `cell_group_factory.hpp` so we can use the latter to test for backend support for a cell kind.
* Add `arb::cell_kind_implementation()` which performs the mapping from cell kind and backend kind to a `cell_group_ptr`-producing function (this will then become the site for custom cell group kind mapping support in future work).
* Move headers for aux library ...
-
- Jul 10, 2018
-
-
Fixes #526.
-
Sam Yates authored
Fixes issue #524.
-
- Jul 06, 2018
-
-
Fixes #182.
-
Migrate source/build to c++14 (#522)
* Update `CMakeLists.txt` for C++14 option.
* Update to gcc 6 minimum.
* Update travis CI from gcc-5 to gcc-6.
* Use `std::..._t` style type traits, replacing `util::` aliases.
* Use `std::cbegin`, `std::cend`, and `std::make_unique`, replacing `util::` versions.
* Remove `DEDUCED_RETURN_TYPE` macros.
* Remove redundant return type specifications.
* Use correct ADL for `begin` and `end` in (almost all) the range utilities.
* Remove redundant `mechinfo` ctor (aggregate initialization suffices).
* Use lambda capture initializers where appropriate.
* Use generic `std::equal_to`.
* Use variable templates for `math::infinity` and `math::pi`.
* Remove `enum_hash` workaround.
* Use `""s` string literals where we were using our own `""_s` construction.
* Use generic lambda for recursive lambda instead of `std::function` wrapper.
* Use generic lambda for generic arithmetic tests.
Fixes #358.
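For readers unfamiliar with the C++14 features named above, here is a small illustration of three of them: a variable template, a recursive generic lambda, and the `""s` literal. The names `math_sketch`, `depth_sum` and `greeting` are made up for this example and do not come from the code base.

```cpp
#include <cassert>
#include <string>

namespace math_sketch {
// Variable template: one definition serves float, double and long double.
template <typename T>
constexpr T pi = T(3.1415926535897932384626433832795L);
}

// A generic lambda can recurse by receiving itself as an argument,
// which avoids wrapping the lambda in a std::function.
inline int depth_sum(int n) {
    auto sum = [](auto&& self, int k) -> int {
        return k == 0 ? 0 : k + self(self, k - 1);
    };
    return sum(sum, n);
}

inline std::string greeting() {
    using namespace std::string_literals;  // enables the ""s suffix
    return "arb"s + "or";
}
```

The explicit `-> int` on the lambda is needed because its return type cannot be deduced while the body still refers to the lambda's own call operator.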
-
Who broke the build? Sam did!
-
- Jul 05, 2018
-
-
Fixes issue #517. Deprecate the IBM xlC compiler. xlC generates code that is an order of magnitude slower than gcc, while generating spurious warnings, and requiring hacks and workarounds to pass all tests. Supporting it makes no sense.
* Add test and fatal error for xlC detection in CheckCompilerXLC.cmake.
* Move xlC 13 misdetection work around to CheckCompilerXLC.cmake.
* Remove xlC-specific compatibility workarounds from code.
-
This time we're moving `recipe.hpp` and `simulation.hpp`, plus the requirements they bring. Code changes:
* Pimplize `simulation`.
* Consolidate arbor exceptions: all non-cell kind specific exceptions that might be expected to reach user code now have consistent messages and fit in an exception hierarchy based at `arb::arbor_exception`. Internal errors throw an `arb::arbor_internal_error` exception.
* Renamed `postsynaptic_spike_event` to `spike_event`. (Note: `pse_vector` name is unchanged.)
* Repurposed `pprintf` and moved it into `strprintf.h`; further consolidation is a TODO.
* Made a generic `util::to_string` to avoid redundancy of `operator<<` overloads and other `to_string` definitions. Defaults to ADL `to_string`, `std::to_string`, and finally tries using `operator<<`.
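The fallback order described for the generic `to_string` (ADL `to_string`, then `std::to_string`, then `operator<<`) can be sketched with ranked overloads. This is one possible way to get that behaviour, not necessarily how `util::to_string` is written; `util_sketch`, `impl::rank`, `impl::select` and the `demo::cell_id` type are all hypothetical.

```cpp
#include <cassert>
#include <sstream>
#include <string>

namespace util_sketch {
namespace impl {
    // Overload ranking: rank<2> is preferred, then rank<1>, then rank<0>.
    template <int N> struct rank: rank<N - 1> {};
    template <> struct rank<0> {};

    // 1. A to_string found by argument-dependent lookup.
    template <typename T>
    auto select(const T& x, rank<2>) -> decltype(std::string(to_string(x))) {
        return to_string(x);
    }

    // 2. std::to_string, which covers the arithmetic types.
    template <typename T>
    auto select(const T& x, rank<1>) -> decltype(std::to_string(x)) {
        return std::to_string(x);
    }

    // 3. Last resort: stream the value with operator<<.
    template <typename T>
    std::string select(const T& x, rank<0>) {
        std::ostringstream out;
        out << x;
        return out.str();
    }
} // namespace impl

template <typename T>
std::string to_string(const T& x) {
    return impl::select(x, impl::rank<2>{});
}
} // namespace util_sketch

// A user type that supplies its own to_string, found through ADL.
namespace demo {
struct cell_id { int gid; };
inline std::string to_string(const cell_id& c) { return "cell" + std::to_string(c.gid); }
}
```

An overload whose trailing return type fails to compile for a given `T` is silently discarded, so the call falls through to the next lower rank.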
-
- Jul 03, 2018
-
-
Further work to public install target.
* Move SIMD classes, cell description classes, simple sampler to public include.
* Rename `cell` to `mc_cell`, `segment` to `mc_segment`, and remove `_description` from cell description class names and includes.
* Move `compartment_model` out of `mc_cell` interface and use only in `fvm_layout.cpp`.
* (Provisionally) remove area/volume methods on `mc_cell` and `mc_segment`.
-
- Jun 25, 2018
-
-
CMake and build refactoring
* Use CUDA as first-class language (leading to CMake 3.9 minimum version requirement).
* Use 'modern CMake' interface libraries for compiler options, include file and library dependency tracking. Interface library targets:
    * `arbor-deps`: compiler options and library requirements for the `libarbor.a` static library, as governed by configure-time options and environment.
    * `arbor-private-headers`: include path for non-installed headers, as required by unit tests and arbor itself.
    * `arbor-aux`: helper classes and utilities used across tests and examples.
    * `ext-json`, `ext-tclap`, `ext-tbb`, `ext-benchmark`, `ext-sphinx_rtd_theme`: externally maintained software that we include (directly or via submodule) in the `ext/` subdirectory.
* Single static library `libarbor.a` includes all built-in modules and CUDA objects.
* Simplify configuration options:
    * `ARB_WITH_TRACE`, `ARB_AUTORUN_MODCC_ON_CHA...
-
- Jun 22, 2018
-
-
Benjamin Cumming authored
Add a new cell type, and corresponding cell_group implementation, for benchmarking the simulator library architecture. Add a benchmark_cell_group, where each cell in the group generates a spike train prescribed by a time_seq and takes a prescribed time interval per cell to perform the cell_group::advance method. With this cell type, one can easily build arbitrary networks with prescribed spiking and cell update overheads. A miniapp that uses this cell type to build a benchmark model is implemented in example/bench. Fixes #493. Fixes #501.
-
- Jun 07, 2018
-
-
Benjamin Cumming authored
The built-in profiler generates timings for state and current for individual multicore mechanisms. Modcc generates PE(advance_integrate_{state,current}_X) profiler calls, along with the corresponding PL() calls, around the calls to the multicore mechanism nrn_state and nrn_current API. No timings are made for the GPU back end, which is not properly supported by the current profiling tools.
-
- Jun 04, 2018
-
-
Changes have been made to the simd implementation of mechanism functions:
- The node_index array (array of indices that specifies for each mechanism the CVs where it is present) is now partitioned into 4 arrays according to the constraint on each simd_vector in node_index:
    1. contiguous array: contains the indices of all simd_vectors in node_index where the elements in simd_vector are contiguous
    2. constant array: contains the indices of all simd_vectors in node_index where the elements in simd_vector are identical
    3. independent array: contains the indices of all simd_vectors in node_index where the elements in simd_vector are independent (no repetitions) but not contiguous
    4. none array: contains the indices of all simd_vectors in node_index where none of the above constraints apply
  When mechanism functions are executed, they loop over each of the 4 arrays separately. This allows for optimizations in every category.
- The modcc...
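The four-way partition can be illustrated with a small sketch. It assumes `node_index` is already padded to a multiple of the SIMD width and records, for each category, the starting offset of every matching chunk; the names `index_constraint`, `classify` and `partition_by_constraint` are hypothetical and the real data layout may differ.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <vector>

enum class index_constraint { contiguous, constant, independent, none };

// Classify one SIMD-width chunk of node indices.
inline index_constraint classify(const int* idx, std::size_t width) {
    assert(width > 0);
    bool contiguous = true, constant = true;
    for (std::size_t i = 1; i < width; ++i) {
        contiguous = contiguous && idx[i] == idx[i - 1] + 1;
        constant = constant && idx[i] == idx[0];
    }
    if (contiguous) return index_constraint::contiguous;
    if (constant) return index_constraint::constant;
    // Independent means no index is repeated within the chunk.
    std::set<int> unique(idx, idx + width);
    return unique.size() == width ? index_constraint::independent
                                  : index_constraint::none;
}

struct constraint_partition_sketch {
    // Starting offsets into node_index of the chunks in each category.
    std::vector<std::size_t> contiguous, constant, independent, none;
};

inline constraint_partition_sketch
partition_by_constraint(const std::vector<int>& node_index, std::size_t width) {
    // Assumption: node_index has been padded to a multiple of the SIMD width.
    assert(width > 0 && node_index.size() % width == 0);
    constraint_partition_sketch p;
    for (std::size_t i = 0; i < node_index.size(); i += width) {
        switch (classify(node_index.data() + i, width)) {
        case index_constraint::contiguous:  p.contiguous.push_back(i);  break;
        case index_constraint::constant:    p.constant.push_back(i);    break;
        case index_constraint::independent: p.independent.push_back(i); break;
        case index_constraint::none:        p.none.push_back(i);        break;
        }
    }
    return p;
}
```

The payoff is in the kernels: a contiguous chunk can use plain vector loads and stores, a constant chunk needs one scalar access plus a reduction, an independent chunk can scatter without conflicts, and only the "none" chunks need the fully general path.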
-
Benjamin Cumming authored
Changes to libarbor
-------------------------
Time sequences were added in `src/time_sequence.hpp`:
- Added new `time_seq` type that implements a type-erasure interface for the concept of a time sequence generator.
- Added poisson, regular and vector-backed implementations of the time sequence concept.
Event generators:
- The poisson, regular and vector-backed implementations of the event generator concept were refactored to use the new time sequences.
Cell groups:
- Removed the `dss_cell_group` and `rss_cell_group` and associated types.
- Added a generic spike source cell that generates a sequence of spikes at time points specified by a `time_seq`. Using this approach, an additional `cell_group` specialization is not required for each type of sequence, and user-defined sequences can be used with minimal overhead.

Unit tests
------------
- Added unit tests for `time_seq`.
- Simplified `event_generator` unit tests, because much of the testing of the sequences was moved to the `time_seq` tests.
- Added unit tests for `spike_source_cell_group`.

Changes to miniapp
-------------------------
- Simplified the miniapp by removing the command line options for using an input spike chain from file.
- Updated the miniapp recipe to use `spike_source` cell group instead of `dss_cell_group`.
-
- Jun 01, 2018
-
-
Move from choosing the distributed communication model from a compile time choice (the old `arb::communication::communication_policy` type) to a run time decision.
* Add `arb::distributed_context` class that provides the required interface for distributed communication implementations, using type-erasure to provide value semantics.
* Add two implementations for the distributed context: `arb::mpi_context` and `arb::local_context`.
* Allow distribution over a user-supplied MPI communicator by providing it as an argument to `arb::mpi_context`.
* Add `mpi_error` exception type to wrap MPI errors.
* Move contents of the `arb::communication` namespace to the `arb` namespace.
* Add preprocessor for-each utility `ARB_PP_FOREACH`.
* Rewrite all examples and tests to use the new distributed context interface.
* Add documentation for distributed context class and semantics, and update documentation for load balancer and simulation classes accordingly.
Fixes #472
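"Type-erasure to provide value semantics" can be sketched as below: a wrapper class stores any implementation behind a small virtual interface and deep-copies it on copy. The class names carry a `_sketch` suffix to mark them as hypothetical, and the four methods shown (`id`, `size`, `sum`, `name`) are an assumed, reduced interface rather than the actual `arb::distributed_context` API.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

// Hypothetical single-process implementation: one rank, reductions are the identity.
struct local_context_sketch {
    int id() const { return 0; }
    int size() const { return 1; }
    double sum(double x) const { return x; }
    std::string name() const { return "local"; }
};

// Type-erasing wrapper: any type with the four methods above can be stored,
// copied and passed by value, with the choice made at run time.
class distributed_context_sketch {
public:
    distributed_context_sketch(): distributed_context_sketch(local_context_sketch{}) {}

    template <typename Impl>
    distributed_context_sketch(Impl impl):
        impl_(std::make_unique<wrap<Impl>>(std::move(impl))) {}

    distributed_context_sketch(const distributed_context_sketch& other):
        impl_(other.impl_->clone()) {}
    distributed_context_sketch(distributed_context_sketch&&) = default;
    distributed_context_sketch& operator=(distributed_context_sketch other) {
        impl_.swap(other.impl_);
        return *this;
    }

    int id() const { return impl_->id(); }
    int size() const { return impl_->size(); }
    double sum(double x) const { return impl_->sum(x); }
    std::string name() const { return impl_->name(); }

private:
    struct interface {
        virtual int id() const = 0;
        virtual int size() const = 0;
        virtual double sum(double) const = 0;
        virtual std::string name() const = 0;
        virtual std::unique_ptr<interface> clone() const = 0;
        virtual ~interface() = default;
    };

    template <typename Impl>
    struct wrap: interface {
        explicit wrap(Impl impl): wrapped(std::move(impl)) {}
        int id() const override { return wrapped.id(); }
        int size() const override { return wrapped.size(); }
        double sum(double x) const override { return wrapped.sum(x); }
        std::string name() const override { return wrapped.name(); }
        std::unique_ptr<interface> clone() const override {
            return std::make_unique<wrap>(wrapped);
        }
        Impl wrapped;
    };

    std::unique_ptr<interface> impl_;
};
```

Implementations need no common base class; an MPI-backed type with the same member functions could be passed to the same constructor, which is what turns the former compile-time policy into a run-time choice.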
-
- May 15, 2018
-
-
Replace hard-coded index variable names in modcc cuda printer with ones derived from the external variable. Uses `decode_indexed_variable`.
-
- May 11, 2018
-
-
* Use `sys/types.h` instead of `endian.h` for greater portability.
* Avoid use of constructor for `std::vector` in unit tests that is only available from C++14.
-
Sam Yates authored
* Update SIMD developer docs to reflect newly merged mechanism refactor work.
-
- May 09, 2018
-
-
Completes CUDA printing in modcc.
* Add CudaPrinter visitor, overriding CPrinter.
* Add `ostream` `operator<<` overloads for `arb::gpu::shared_state` and `device_view` for debugging.
* Fix GPU back-end bugs.
-