Commit 845077e6 authored by Sam Yates, committed by Benjamin Cumming

Update install docs for architecture build options. (#489)

* Update the install docs for architecture build options
* Update to reflect new install target
parent 0d20df25
......@@ -25,9 +25,9 @@ with very few tools.
=========== ============================================
Tool Notes
=========== ============================================
Git To check out the code, min version 2.0.
CMake To set up the build, min version 3.0.
compiler A C++11 compiler. See `compilers <compilers_>`_.
Git To check out the code, minimum version 2.0.
CMake To set up the build, minimum version 3.8 (3.9 for MPI).
compiler A C++14 compiler. See `compilers <compilers_>`_.
=========== ============================================
.. _compilers:
......@@ -35,8 +35,7 @@ with very few tools.
Compilers
~~~~~~~~~
Arbor requires a C++ compiler that fully supports C++11 (we have plans to move
to C++14 soon).
Arbor requires a C++ compiler that fully supports C++14.
We recommend using GCC or Clang, for which Arbor has been tested and optimised.
.. table:: Supported Compilers
......@@ -44,7 +43,7 @@ We recommend using GCC or Clang, for which Arbor has been tested and optimised.
=========== ============ ============================================
Compiler Min version Notes
=========== ============ ============================================
GCC 5.2.0 5.1 probably works, 5.0 doesn't.
GCC 6.1.0
Clang 4.0 Clang 3.8 and later probably work.
Apple Clang 9
Intel 17.0.1 Needs GCC 5 or later for standard library.
......@@ -95,10 +94,9 @@ We recommend using GCC or Clang, for which Arbor has been tested and optimised.
faster compilation times; fewer compiler bugs; and support for recent C++ standards.
.. Note::
The IBM xlc compiler versions 13.1.4 and 13.1.6 have been tested for compiling on
IBM power 8. Arbor contains some patches to work around xlc compiler bugs,
however we do not recommend using xlc because GCC produces faster code,
with faster compilation times.
The IBM XL C/C++ compiler for Linux up to version 14 is not supported, owing to unresolved
compiler issues. We strongly recommend building with GCC or Clang instead on PowerPC
platforms.
Optional Requirements
---------------------
......@@ -153,13 +151,13 @@ If you use the zip file, then don't forget to run Git submodule update manually.
.. _building:
Building Arbor
==============
Building and Installing Arbor
=============================
Once the Arbor code has been checked out, it can be built by first running CMake to configure the build, then running make.
Below is a simple workflow for: **1)** getting the source; **2)** configuring the build;
**3)** building; **4)** then running tests.
**3)** building; **4)** running tests; **5)** installing.
For more detailed build configuration options, see the `quick start <quickstart_>`_ guide.
......@@ -185,6 +183,9 @@ For more detailed build configuration options, see the `quick start <quickstart_
./test/test.exe
./test/global_communication.exe
# 5) Install (by default, to /usr/local).
make install
This will build Arbor in release mode with the `default C++ compiler <note_CC_>`_.
.. _quickstart:
......@@ -195,20 +196,23 @@ Quick Start: Examples
Below are some examples of CMake configurations for Arbor. For more detail on individual
CMake parameters and flags, follow links to the more detailed descriptions below.
.. topic:: `Debug <buildtarget_>`_ mode with `assertions <debugging_>`_,
`single threaded <threading_>`_.
.. topic:: `Debug <buildtarget_>`_ mode with `assertions <debugging_>`_ enabled.
If you encounter problems building or running Arbor, compile with these options
for testing and debugging.
.. code-block:: bash
cmake .. -DARB_THREADING_MODEL=serial \
-DARB_WITH_ASSERTIONS=ON \
-DCMAKE_BUILD_TYPE=debug
cmake .. -DARB_WITH_ASSERTIONS=ON -DCMAKE_BUILD_TYPE=debug
.. topic:: `Release <buildtarget_>`_ mode (compiler optimizations enabled) with the default
compiler, optimized for the local `system architecture <architecture_>`_.
.. code-block:: bash
cmake .. -DARB_ARCH=native
.. topic:: `Release <buildtarget_>`_ mode (i.e. build with optimization flags)
with `Clang <compilers_>`_
.. topic:: `Release <buildtarget_>`_ mode with `Clang <compilers_>`_.
.. code-block:: bash
......@@ -216,25 +220,25 @@ CMake parameters and flags, follow links to the more detailed descriptions below
export CXX=`which clang++`
cmake ..
.. topic:: `Release <buildtarget_>`_ mode on `Haswell <vectorize_>`_ with `cthread threading <threading_>`_
.. topic:: `Release <buildtarget_>`_ mode for the `Haswell architecture <architecture_>`_ and `explicit vectorization <vectorize_>`_ of kernels.
.. code-block:: bash
cmake .. -DARB_THREADING_MODEL=cthread -DARB_VECTORIZE_TARGET=AVX2
cmake .. -DARB_VECTORIZE=ON -DARB_ARCH=haswell
.. topic:: `Release <buildtarget_>`_ mode on `KNL <vectorize_>`_ with `TBB threading <threading_>`_
.. topic:: `Release <buildtarget_>`_ mode with `explicit vectorization <vectorize_>`_, targeting the `Broadwell architecture <architecture_>`_, with support for `P100 GPUs <gpu_>`_, and building with `GCC 5 <compilers_>`_.
.. code-block:: bash
cmake .. -DARB_THREADING_MODEL=tbb -DARB_VECTORIZE_TARGET=KNL
export CC=gcc-5
export CXX=g++-5
cmake .. -DARB_VECTORIZE=ON -DARB_ARCH=broadwell -DARB_GPU_MODEL=P100
.. topic:: `Release <buildtarget_>`_ mode with support for: `P100 GPUs <gpu_>`_; `AVX2 <vectorize_>`_; and `GCC 5 <compilers_>`_
.. topic:: `Release <buildtarget_>`_ mode with `explicit vectorization <vectorize_>`_, optimized for the `local system architecture <architecture_>`_ and `install <install_>`_ in ``/opt/arbor``
.. code-block:: bash
export CC=gcc-5
export CXX=g++-5
cmake .. -DARB_VECTORIZE_TARGET=AVX2 -DARB_GPU_MODEL=P100
cmake .. -DARB_VECTORIZE=ON -DARB_ARCH=native -DCMAKE_INSTALL_PREFIX=/opt/arbor
.. _buildtarget:
......@@ -249,101 +253,51 @@ with ``-g -O0`` flags), use the ``CMAKE_BUILD_TYPE`` CMake parameter.
cmake -DCMAKE_BUILD_TYPE={debug,release}
.. _vectorize:
.. _architecture:
Vectorization
-------------
Explicit vectorization of key computational kernels can be enabled in Arbor by setting the
``ARB_VECTORIZE_TARGET`` CMake parameter:
.. code-block:: bash
cmake -DARB_VECTORIZE_TARGET={none,KNL,AVX2,AVX512}
By default the ``none`` target is selected, which relies on compiler auto-vectorization.
.. Warning::
The vectorization target must be supported by the target architecture.
A sure sign that an unsuported vectorization was chosen is an ``Illegal instruction``
error at runtime. In the example below, the unit tests for an ``ARB_VECTORIZE_TARGET=AVX2``
build are run on an Ivy Bridge CPU, which does not support AVX2 vector instructions:
.. code-block:: none
$ ./tests/test.exe
[==========] Running 581 tests from 105 test cases.
[----------] Global test environment set-up.
[----------] 15 tests from algorithms
[ RUN ] algorithms.parallel_sort
Illegal instruction
See the hints on `cross compiling <crosscompiling_>`_ if you get illegal instruction
errors when trying to compile on HPC systems.
.. Note::
The vectorization selection will change soon, to an interface with two parameters. The first
will toggle vectorization, and the second will specify a specific architecture to target.
For example, to generate optimized code for Intel Broadwell (i.e. AVX2 intrinsics):
.. code-block:: bash
cmake -DCMAKE_BUILD_TYPE=release \
-DARB_ARCH=broadwell \
-DARB_VECTORIZE=ON \
.. _threading:
Architecture
------------
Multithreading
--------------
By default, Arbor is built to target whichever architecture is the compiler default,
which often involves a sacrifice of performance for binary portability. The target
architecture can be explicitly set with the ``ARB_ARCH`` configuration option. This
will be used to direct the compiler to use the corresponding instruction sets and
to optimize for that architecture.
Arbor provides three possible multithreading implementations. The implementation
is selected at compile time by setting the ``ARB_THREADING_MODEL`` CMake option:
When building and installing on the same machine, a good choice for many environments
is to set ``ARB_ARCH`` to ``native``:
.. code-block:: bash
cmake -DARB_THREADING_MODEL={serial,cthread,tbb}
By default Arbor is built with multithreading enabled with the **cthread** backend,
which is implemented in the Arbor source code.
cmake -DARB_ARCH=native
.. table:: Threading Models.
When deploying on a different machine, specify the architecture of that machine
explicitly to obtain an optimized library. The valid values correspond to those given
to the ``-mcpu`` or ``-march`` options for GCC and Clang; the build system will translate
these names to corresponding values for other supported compilers.
=========== ============== =================================================
Model Source Description
=========== ============== =================================================
**cthread** Arbor Default. Multithreaded, based on C++11 ``std::thread``.
**serial** Arbor Single threaded.
**tbb** Git submodule `Intel TBB <https://www.threadingbuildingblocks.org/>`_.
Recommended when using many threads.
=========== ============== =================================================
Specific recent x86-family Intel CPU architectures include ``broadwell``, ``skylake`` and
``knl``. Complete lists of architecture names can be found in the compiler documentation:
for example GCC `x86 options <https://gcc.gnu.org/onlinedocs/gcc/x86-Options.html>`_,
`PowerPC options <https://gcc.gnu.org/onlinedocs/gcc/RS_002f6000-and-PowerPC-Options.html#RS_002f6000-and-PowerPC-Options>`_,
and `ARM options <https://gcc.gnu.org/onlinedocs/gcc/ARM-Options.html>`_.
.. Note::
The default `cthread` threading is suitable for most applications.
However there are some situations when the overheads of the threading runtime
become significant. This is often the case for:
* simulations with many small/light cells (e.g. LIF cells);
* running with many threads, such as on IBM Power 8 (80 threads/socket) or Intel
KNL (64-256 threads/socket).
The TBB threading back end is highly optimized, and well suited to these cases.
.. _vectorize:
Vectorization
-------------
.. Note::
If the TBB back end is selected, Arbor's CMake uses a Git submodule of the TBB
repository to build and link a static version of the the TBB library. If you get
an error stating that the TBB submodule is not available, you must update the Git
submodules:
Explicit vectorization of computational kernels can be enabled in Arbor by setting the
``ARB_VECTORIZE`` CMake flag:
.. code-block:: bash
.. code-block:: bash
git submodule update --init --recursive
cmake -DARB_VECTORIZE=ON
.. Note::
The TBB back end can be used on IBM Power 8 systems.
With this flag set, the library will use architecture-specific vectorization intrinsics
to implement these kernels. Arbor currently has vectorization support for x86 architectures
with AVX, AVX2 or AVX512 ISA extensions. Enabling the ``ARB_VECTORIZE`` option for a target
without support in Arbor will give a compilation error.
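As a rough illustration of the paragraph above, the following sketch checks whether the build machine reports one of the x86 vector extensions before suggesting ``ARB_VECTORIZE=ON``. The helper name ``has_vector_isa`` is hypothetical, and reading ``/proc/cpuinfo`` assumes a Linux host; the script only prints a suggested command and does not run CMake.

```shell
#!/bin/sh
# Hypothetical helper (not part of Arbor): succeed if the space-separated
# CPU flag list in $1 contains avx, avx2 or avx512f.
has_vector_isa() {
    for flag in $1; do
        case "$flag" in
            avx|avx2|avx512f) return 0 ;;
        esac
    done
    return 1
}

# Assumption: Linux, where x86 CPU flags are listed in /proc/cpuinfo.
# On other systems the list is simply empty and vectorization is left off.
flags=$(grep -m1 '^flags' /proc/cpuinfo 2>/dev/null | cut -d: -f2)

if has_vector_isa "$flags"; then
    echo "cmake .. -DARB_ARCH=native -DARB_VECTORIZE=ON"
else
    echo "cmake .. -DARB_ARCH=native"
fi
```

This check only makes sense when building and running on the same machine; when targeting a different machine, the flags of the target CPU are the ones that matter.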
.. _gpu:
......@@ -360,11 +314,48 @@ CMake ``ARB_GPU_MODEL`` option to match the GPU model to target:
By default ``ARB_GPU_MODEL=none``, and a GPU target must explicitly be set to
build for and run on GPUs.
Depending on the configuration of the system where Arbor is being built, the
C++ compiler may not be able to find the ``cuda.h`` header. The easiest workaround
is to add the path to the include directory containing the header to the
``CPATH`` environment variable before configuring and building Arbor, for
example:
.. code-block:: bash
export CPATH="/opt/cuda/include:$CPATH"
cmake -DARB_GPU_MODEL=P100
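If the location of ``cuda.h`` is not known in advance, a small search along the following lines may help. This is only a sketch: ``find_cuda_include`` is a hypothetical helper, and the candidate directories are guesses that will differ between systems.

```shell
#!/bin/sh
# Hypothetical helper: print the first directory among the arguments
# that contains cuda.h; fail if none does.
find_cuda_include() {
    for dir in "$@"; do
        if [ -f "$dir/cuda.h" ]; then
            printf '%s\n' "$dir"
            return 0
        fi
    done
    return 1
}

# Candidate locations are assumptions; adjust them for your system.
if inc=$(find_cuda_include /usr/local/cuda/include /opt/cuda/include); then
    # Append the old CPATH only if it is non-empty, to avoid a trailing ':'
    # (an empty CPATH entry is treated as the current directory).
    export CPATH="$inc${CPATH:+:$CPATH}"
    echo "CPATH=$CPATH"
else
    echo "cuda.h not found in the candidate directories"
fi
```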
.. Note::
The main difference between the Kepler (K20 & K80) and Pascal (P100) GPUs is
the latter's built-in support for double precision atomics and fewer GPU
synchronizations when accessing managed memory.
.. _install:
Installation
------------
Arbor can be installed with ``make install`` after configuration. The
installation comprises:
- The static library ``libarbor.a``.
- Public header files.
- The ``modcc`` NMODL compiler if built.
- The HTML documentation if built.
The default install path (``/usr/local``) can be overridden with the standard
``CMAKE_INSTALL_PREFIX`` configuration option.
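After installing, one quick sanity check is to look for the static library under the chosen prefix. The sketch below assumes the library lands in ``lib/`` or ``lib64/`` beneath the prefix, which is a guess about the install layout rather than something these docs state; ``arbor_installed`` and ``ARBOR_PREFIX`` are hypothetical names.

```shell
#!/bin/sh
# Hypothetical helper: succeed if libarbor.a exists under the prefix in $1.
# The lib/ and lib64/ subdirectories are assumptions about the layout.
arbor_installed() {
    [ -f "$1/lib/libarbor.a" ] || [ -f "$1/lib64/libarbor.a" ]
}

# ARBOR_PREFIX is a hypothetical variable; it should match the value
# given to CMAKE_INSTALL_PREFIX (default /usr/local).
prefix="${ARBOR_PREFIX:-/usr/local}"

if arbor_installed "$prefix"; then
    echo "libarbor.a found under $prefix"
else
    echo "libarbor.a not found under $prefix"
fi
```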
Provided that Sphinx is available, HTML documentation for Arbor can be built
with ``make html``. Note that documentation is not built by default — if
built, it too will be included in the installation.
Note that the ``modcc`` compiler will not be built by default if the ``ARB_MODCC``
configuration setting is used to specify a different executable for ``modcc``.
While ``modcc`` can be used to translate user-supplied NMODL mechanism
descriptions into C++ and CUDA code for use with Arbor, this generated code
currently relies upon private headers that are not installed.
.. _cluster:
HPC Clusters
......@@ -380,8 +371,8 @@ MPI
---
Arbor uses MPI for distributed systems. By default it is built without MPI support, which
can enabled by setting the ``DARB_DISTRIBUTED_MODEL`` CMake parameter.
An example of building Arbor with MPI, high-performance threading and optimizations enabled
can be enabled by setting the ``ARB_WITH_MPI`` configuration flag.
An example of building a 'release' (optimized) version of Arbor with MPI
is:
.. code-block:: bash
......@@ -391,17 +382,18 @@ is:
export CXX=`which mpicxx`
# configure with MPI, compiled with optimizations
cmake .. -DARB_DISTRIBUTED_MODEL=mpi \ # Use MPI
-DCMAKE_BUILD_TYPE=release \ # Optimizations on
-DARB_THREADING_MODEL=tbb \ # TBB threading library
# -DARB_WITH_MPI=ON enables MPI; -DCMAKE_BUILD_TYPE=release turns optimizations on
cmake .. -DARB_WITH_MPI=ON \
-DCMAKE_BUILD_TYPE=release
# run unit tests for global communication on 2 MPI ranks
mpirun -n 2 ./tests/global_communication.exe
The example above set ``CC`` and ``CXX`` environment variables to use compiler
wrappers provided by the MPI implementation. It is recommended to use compiler
wrappers for MPI, unless you know what you are doing and have a specific use
case or issue to work around.
(Note that a 'release' build is in fact the default configuration for Arbor.)
The example above sets the ``CC`` and ``CXX`` environment variables to use compiler
wrappers provided by the MPI implementation. While the configuration process
will attempt to find MPI libraries and build options automatically, we recommend
using the supplied MPI compiler wrappers in preference.
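One way to follow this recommendation in a script is to look for a wrapper on ``PATH`` and only set ``CXX`` if one is found. This is a sketch under assumptions: ``first_available`` is a hypothetical helper, and ``mpicxx`` and ``mpic++`` are wrapper names commonly shipped by MPI distributions, which may be named differently on your system.

```shell
#!/bin/sh
# Hypothetical helper: print the full path of the first argument that is
# an available command, or fail if none is.
first_available() {
    for cmd in "$@"; do
        if found=$(command -v "$cmd"); then
            printf '%s\n' "$found"
            return 0
        fi
    done
    return 1
}

# Wrapper names are assumptions; check what your MPI distribution provides.
if wrapper=$(first_available mpicxx mpic++); then
    export CXX="$wrapper"
    echo "using MPI C++ wrapper: $CXX"
else
    echo "no MPI C++ wrapper found; CMake will try to locate MPI itself"
fi
```

The same pattern applies to ``CC`` with the corresponding C wrapper.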
.. Note::
MPI distributions provide **compiler wrappers** for compiling MPI applications.
......@@ -473,10 +465,8 @@ then build Arbor is:
module swap PrgEnv-cray PrgEnv-gnu
module swap gcc/7.1.0
export CC=`which cc`; export CXX=`which CC`;
cmake .. -DARB_DISTRIBUTED_MODEL=mpi \ # MPI support
-DCMAKE_BUILD_TYPE=release \ # optimized
-DARB_THREADING_MODEL=tbb \ # tbb threading
-DARB_SYSTEM_TYPE=Cray # turn on Cray specific options
# -DARB_WITH_MPI=ON enables MPI support; -DCMAKE_BUILD_TYPE=release builds optimized
cmake .. -DARB_WITH_MPI=ON \
-DCMAKE_BUILD_TYPE=release
.. Note::
If ``CRAYPE_LINK_TYPE`` isn't set, there will be warnings like the following when linking:
......@@ -500,25 +490,40 @@ Troubleshooting
Cross Compiling NMODL
---------------------
Care must be taken when Arbor is compiled on a system with a different architecture to the target system where Arbor will run.
This occurs quite frequently on HPC systems, for example when building on a login/service node that has a different architecture to the compute nodes.
Care must be taken when Arbor is compiled on a system with a different
architecture to the target system where Arbor will run. This occurs quite
frequently on HPC systems, for example when building on a login/service node
that has a different architecture to the compute nodes.
.. Note::
If building Arbor on a laptop or desktop system, i.e. on the same computer that
you will run Arbor on, cross compilation is not an issue.
.. Note::
The ``ARB_ARCH`` setting is not applied to the building of ``modcc``.
On systems where the build node and compute node have different architectures
within the same family, this may mean that separate compilation of ``modcc``
is not necessary.
.. Warning::
``Illegal instruction`` errors are a sure sign that
Arbor is running on a system that does not support the architecture it was compiled for.
When cross compiling, we have to take care that the *modcc* compiler, which is used to convert NMODL to C++/CUDA code, is able to run on the compilation node.
When cross compiling, we have to take care that the *modcc* compiler, which is
used to convert NMODL to C++/CUDA code, is able to run on the compilation node.
By default, building Arbor will build the ``modcc`` executable from source,
and then use that to build the built-in mechanisms specified in NMODL. This
behaviour can be overridden with the ``ARB_MODCC`` configuration option, for
example:
By default, CMake looks for the *modcc* executable, ``modcc``, in paths specified by the ``PATH`` environment variable, and will use this executable if it finds it.
Otherwise, the CMake script will build *modcc* from source.
To ensure that cross compilation works, a copy of modcc that is compiled for the build system should be in ``PATH``.
.. code-block:: bash
Here we will use the example of compiling for Intel KNL on a Cray system, which has Intel Sandy Bridge CPUs on login nodes that don't support the AVX512 instructions used by KNL.
cmake .. -DARB_MODCC=path-to-local-modcc
Here we will use the example of compiling for Intel KNL on a Cray system, which
has Intel Sandy Bridge CPUs on login nodes that don't support the AVX512
instructions used by KNL.
.. code-block:: bash
......@@ -540,22 +545,19 @@ Here we will use the example of compiling for Intel KNL on a Cray system, which
cmake ..
make -j modcc
# set PATH to find modcc
cd ..
export PATH=`pwd`/build_modcc/modcc:$PATH
#
# Step 2: Build Arbor.
#
cd ..
mkdir build; cd build;
# use the compiler wrappers to build Arbor
export CC=`which cc`; export CXX=`which CC`;
cmake .. -DARB_DISTRIBUTED_MODEL=mpi \
-DCMAKE_BUILD_TYPE=release \
-DARB_THREADING_MODEL=tbb \
-DARB_SYSTEM_TYPE=Cray \
-DARB_VECTORIZE_TARGET=KNL
cmake .. -DCMAKE_BUILD_TYPE=release \
-DARB_WITH_MPI=ON \
-DARB_ARCH=knl \
-DARB_VECTORIZE=ON \
-DARB_MODCC=../build_modcc/bin/modcc
.. Note::
......@@ -576,7 +578,7 @@ Here we will use the example of compiling for Intel KNL on a Cray system, which
mechanisms/CMakeFiles/build_all_mods.dir/build.make:69: recipe for target '../mechanisms/multicore/pas_cpu.hpp' failed
If you have errors when running the tests or a miniapp, then either the wrong
``ARB_VECTORIZE_TARGET`` was selected; or you might have forgot to launch on the
``ARB_ARCH`` target architecture was selected; or you might have forgotten to launch on the
compute node, e.g.:
.. code-block:: none
......@@ -617,40 +619,25 @@ and have to be turned on by setting the ``ARB_WITH_ASSERTIONS`` CMake option:
cmake -DARB_WITH_ASSERTIONS=ON
.. Note::
These assertions are in the form of ``EXPECTS`` statements inside the code,
These assertions are in the form of ``arb_assert`` macros inside the code,
for example:
.. code-block:: cpp
void decrement_min_remaining() {
EXPECTS(min_remaining_steps_>0);
arb_assert(min_remaining_steps_>0);
if (!--min_remaining_steps_) {
compute_min_remaining();
}
}
A failing ``EXPECT`` statement indicates that an error inside the Arbor
A failing ``arb_assert`` indicates an error inside the Arbor
library, caused either by a logic error in Arbor or by incorrectly checked user input.
If this occurs, it is highly recommended that you attach the output to the
`bug report <https://github.com/eth-cscs/arbor/issues>`_ you send to the Arbor developers!
CMake CMP0023 Warning
---------------------
On version 3.9 or greater CMake generates the following warning:
.. code-block:: none
CMake Deprecation Warning at CMakeLists.txt:11 (cmake_policy):
The OLD behavior for policy CMP0023 will be removed from a future version
of CMake.
This is caused because we have to work around conflicting modules in CMake, and
isn't a problem. It will be fixed when we start using the built in support for
CUDA introduced in CMake 3.9.
CMake Git Submodule Warnings
----------------------------
......