From 0d6c0a77880ae033b556a17ff48b8e5f2b8325fa Mon Sep 17 00:00:00 2001 From: Nora Abi Akar <nora.abiakar@gmail.com> Date: Tue, 9 Mar 2021 15:37:58 +0100 Subject: [PATCH] Docs: Fix file formats (#1418) * Remove parts of the C++ API which leaked into `doc/fileformat/neuroml.rst` * Add links to the relevant C++ and Python API --- doc/cpp/morphology.rst | 312 +++++++++++++++++++++++++++---------- doc/fileformat/index.rst | 14 -- doc/fileformat/neuroml.rst | 159 +------------------ doc/fileformat/swc.rst | 17 +- doc/index.rst | 11 +- doc/python/morphology.rst | 4 + 6 files changed, 264 insertions(+), 253 deletions(-) delete mode 100644 doc/fileformat/index.rst diff --git a/doc/cpp/morphology.rst b/doc/cpp/morphology.rst index cc148054..30fe87e9 100644 --- a/doc/cpp/morphology.rst +++ b/doc/cpp/morphology.rst @@ -118,86 +118,6 @@ by two stitches: cell.paint("\"soma\"", "hh"); -Supported morphology formats -^^^^^^^^^^^^^^^^^^^^^^^^^^^^ - -Arbor supports morphologies described using the SWC file format and the NeuroML file format. - -SWC -""" - -Arbor supports reading morphologies described using the -`SWC <http://www.neuronland.org/NLMorphologyConverter/MorphologyFormats/SWC/Spec.html>`_ file format. And -has three different interpretation of that format. - -A :cpp:func:`parse_swc()` function is used to parse the SWC file and generate a :cpp:type:`swc_data` object. -This object contains a vector of :cpp:type:`swc_record` objects that represent the SWC samples, with a number of -basic checks performed on them. The :cpp:type:`swc_data` object can then be used to generate a -:cpp:type:`morphology` object using one of the following functions: (See the morphology concepts -:ref:`page <morph-formats>` for more details). - - * :cpp:func:`load_swc_arbor` - * :cpp:func:`load_swc_allen` - * :cpp:func:`load_swc_neuron` - -.. cpp:class:: swc_record - - .. cpp:member:: int id - - ID of the record - - .. cpp:member:: int tag - - Structure identifier (tag). - - .. 
cpp:member:: double x - - x coordinate in space. - - .. cpp:member:: double y - - y coordinate in space. - - .. cpp:member:: double z - - z coordinate in space. - - .. cpp:member:: double r - - Sample radius. - - .. cpp:member:: int parent_id - - Record parent's sample ID. - -.. cpp:class:: swc_data - - .. cpp:member:: std::string metadata - - Contains the comments of an SWC file. - - .. cpp:member:: std::vector<swc_record> records - - Stored the list of samples from an SWC file, after performing some checks. - -.. cpp:function:: swc_data parse_swc(std::istream&) - - Returns an :cpp:type:`swc_data` object given an std::istream object. - -.. cpp:function:: morphology load_swc_arbor(const swc_data& data) - - Returns a :cpp:type:`morphology` constructed according to Arbor's SWC specifications. - -.. cpp:function:: morphology load_swc_allen(const swc_data& data, bool no_gaps=false) - - Returns a :cpp:type:`morphology` constructed according to the Allen Institute's SWC - specifications. By default, gaps in the morphology are allowed, this can be toggled - using the ``no_gaps`` argument. - -.. cpp:function:: morphology load_swc_neuron(const swc_data& data) - - Returns a :cpp:type:`morphology` constructed according to NEURON's SWC specifications. - .. _locsets-and-regions: Identifying sites and subsets of the morphology @@ -413,3 +333,235 @@ given branch will be chosen to be the smallest number that ensures no CV will have an extent on the branch longer than ``max_extent`` micrometres. +Supported morphology formats +---------------------------- + +Arbor supports morphologies described using the SWC file format and the NeuroML file format. + +.. _cppswc: + +SWC +^^^ + +Arbor supports reading morphologies described using the +`SWC <http://www.neuronland.org/NLMorphologyConverter/MorphologyFormats/SWC/Spec.html>`_ file format, and +has three different interpretations of that format. 
+ +The :cpp:func:`parse_swc()` function is used to parse the SWC file and generate a :cpp:type:`swc_data` object. +This object contains a vector of :cpp:type:`swc_record` objects that represent the SWC samples, with a number of +basic checks performed on them. The :cpp:type:`swc_data` object can then be used to generate a +:cpp:type:`morphology` object using one of the following functions (see the morphology concepts +:ref:`page <morph-formats>` for more details): + + * :cpp:func:`load_swc_arbor` + * :cpp:func:`load_swc_allen` + * :cpp:func:`load_swc_neuron` + +.. cpp:class:: swc_record + + .. cpp:member:: int id + + ID of the record. + + .. cpp:member:: int tag + + Structure identifier (tag). + + .. cpp:member:: double x + + x coordinate in space. + + .. cpp:member:: double y + + y coordinate in space. + + .. cpp:member:: double z + + z coordinate in space. + + .. cpp:member:: double r + + Sample radius. + + .. cpp:member:: int parent_id + + Record parent's sample ID. + +.. cpp:class:: swc_data + + .. cpp:member:: std::string metadata + + Contains the comments of an SWC file. + + .. cpp:member:: std::vector<swc_record> records + + Stores the list of samples from an SWC file, after performing some checks. + +.. cpp:function:: swc_data parse_swc(std::istream&) + + Returns an :cpp:type:`swc_data` object given an std::istream object. + +.. cpp:function:: morphology load_swc_arbor(const swc_data& data) + + Returns a :cpp:type:`morphology` constructed according to Arbor's SWC specifications. + +.. cpp:function:: morphology load_swc_allen(const swc_data& data, bool no_gaps=false) + + Returns a :cpp:type:`morphology` constructed according to the Allen Institute's SWC + specifications. By default, gaps in the morphology are allowed; this can be toggled + using the ``no_gaps`` argument. + +.. cpp:function:: morphology load_swc_neuron(const swc_data& data) + + Returns a :cpp:type:`morphology` constructed according to NEURON's SWC specifications. + +.. 
_cppneuroml: + +NeuroML +^^^^^^^ + +Arbor offers limited support for models described in +`NeuroML version 2 <https://neuroml.org/neuromlv2>`_. +This is not built by default, but can be enabled by +providing the ``-DARB_NEUROML=ON`` argument to CMake at +configuration time (see :ref:`install-neuroml`). This will +build the ``arborio`` library with NeuroML support. + +The ``arborio`` library uses `libxml2 <http://xmlsoft.org/>`_ +for XML parsing. Applications using NeuroML through ``arborio`` +will need to link against ``libxml2`` in addition, though this +is performed implicitly within CMake projects that add ``arbor::arborio`` +as a link library. + +All classes and functions provided by the ``arborio`` library +are provided in the ``arborio`` namespace. + +Libxml2 interface +================= + +Libxml2 offers threadsafe XML parsing, but not by default. If +the application uses NeuroML support from ``arborio`` in an +unthreaded context, or has already explicitly initialized ``libxml2``, +nothing more needs to be done. Otherwise, the ``libxml2`` function +``xmlInitParser()`` must be called explicitly. + +``arborio`` provides a helper guard object for this purpose, defined +in ``arborio/with_xml.hpp``: + +.. cpp:namespace:: arborio + +.. cpp:class:: with_xml + + An RAII guard object that calls ``xmlInitParser()`` upon construction, and + ``xmlCleanupParser()`` upon destruction. The constructor takes no parameters. + +NeuroML2 morphology support +=========================== + +NeuroML documents are represented by the ``arborio::neuroml`` class, +which in turn provides methods for the identification and translation +of morphology data. ``neuroml`` objects are moveable and move-assignable, but not copyable. + +An implementation limitation restricts valid segment id values to +those which can be represented by an ``unsigned long long`` value. + +.. cpp:class:: neuroml + + .. 
cpp:function:: neuroml(std::string) + + Build a NeuroML document representation from the supplied string. + + .. cpp:function:: std::vector<std::string> cell_ids() const + + Return the id of each ``<cell>`` element defined in the NeuroML document. + + .. cpp:function:: std::vector<std::string> morphology_ids() const + + Return the id of each top-level ``<morphology>`` element defined in the NeuroML document. + + .. cpp:function:: std::optional<morphology_data> morphology(const std::string&) const + + Return a representation of the top-level morphology with the supplied identifier, or + ``std::nullopt`` if no such morphology could be found. Parse errors or an inconsistent + representation will raise an exception derived from ``neuroml_exception``. + + .. cpp:function:: std::optional<morphology_data> cell_morphology(const std::string&) const + + Return a representation of the morphology associated with the cell with the supplied identifier, + or ``std::nullopt`` if the cell or its morphology could not be found. Parse errors or an + inconsistent representation will raise an exception derived from ``neuroml_exception``. + +The morphology representation contains the corresponding Arbor ``arb::morphology`` object, +label dictionaries for regions corresponding to its segments and segment groups by name +and id, and a map providing the explicit list of segments contained within each defined +segment group. + +.. cpp:class:: morphology_data + + .. cpp:member:: std::optional<std::string> cell_id + + The id attribute of the cell that was used to find the morphology in the NeuroML document, if any. + + .. cpp:member:: std::string id + + The id attribute of the morphology. + + .. cpp:member:: arb::morphology morphology + + The corresponding Arbor morphology. + + .. cpp:member:: arb::label_dict segments + + A label dictionary with a region entry for each segment, keyed by the segment id (as a string). + + .. 
cpp:member:: arb::label_dict named_segments + + A label dictionary with a region entry for each name attribute given to one or more segments. + The region corresponds to the union of all segments sharing the same name attribute. + + .. cpp:member:: arb::label_dict groups + + A label dictionary with a region entry for each defined segment group. + + .. cpp:member:: std::unordered_map<std::string, std::vector<unsigned long long>> group_segments + + A map from each segment group id to its corresponding collection of segments. + + +Exceptions +========== + +All NeuroML-specific exceptions are defined in ``arborio/arbornml.hpp``, and are +derived from ``arborio::neuroml_exception`` which in turn is derived from ``std::runtime_error``. +With the exception of the ``no_document`` exception, all contain an unsigned member ``line`` +which is intended to identify the problematic construct within the document. + +.. cpp:class:: xml_error: neuroml_exception + + A generic XML error generated by the ``libxml2`` library. + +.. cpp:class:: no_document: neuroml_exception + + A request was made on a :cpp:class:`neuroml` document without any content. + +.. cpp:class:: parse_error: neuroml_exception + + Failure parsing an element or attribute in the NeuroML document. These + can be generated if the document does not conform to the NeuroML2 schema, + for example. + +.. cpp:class:: bad_segment: neuroml_exception + + A ``<segment>`` element has an improper ``id`` attribute, refers to a non-existent + parent, is missing a required parent or proximal element, or otherwise is missing + a mandatory child element or has a malformed child element. + +.. cpp:class:: bad_segment_group: neuroml_exception + + A ``<segmentGroup>`` element has a malformed child element or references + a non-existent segment group or segment. + +.. cpp:class:: cyclic_dependency: neuroml_exception + + A segment or segment group ultimately refers to itself via ``parent`` + or ``include`` elements respectively. 
\ No newline at end of file diff --git a/doc/fileformat/index.rst b/doc/fileformat/index.rst deleted file mode 100644 index f92b352b..00000000 --- a/doc/fileformat/index.rst +++ /dev/null @@ -1,14 +0,0 @@ -.. _format-overview: - -File formats -============ - -Arbor supports the following file formats. - -.. toctree:: - :maxdepth: 1 - - swc - nmodl - neuroml - diff --git a/doc/fileformat/neuroml.rst b/doc/fileformat/neuroml.rst index f027f978..9c3347af 100644 --- a/doc/fileformat/neuroml.rst +++ b/doc/fileformat/neuroml.rst @@ -1,46 +1,7 @@ .. _formatneuroml: -NeuroML support -=============== - -Arbor offers limited support for models described in -`NeuroML version 2 <https://neuroml.org/neuromlv2>`_. -This is not built by default, but can be enabled by -providing the `-DARB_NEUROML=ON` argument to CMake at -configuration time (see :ref:`install-neuroml`). This will -build the ``arborio`` libray with neuroml support. - -The ``arborio`` library uses `libxml2 <http://xmlsoft.org/>`_ -for XML parsing. Applications using NeuroML through ``arborio`` -will need to link against ``libxml2`` in addition, though this -is performed implicitly within CMake projects that add ``arbor::arborio`` -as a link library. - -All classes and functions provided by the ``arborio`` library -are provided in the ``arborio`` namespace. - -Libxml2 interface ------------------ - -Libxml2 offers threadsafe XML parsing, but not by default. If -the application uses NeuromML support from ``arborio`` in an -unthreaded context, or has already explicitly initialized ``libxml2``, -nothing more needs to be done. Otherwise, the ``libxml2`` function -``xmlInitParser()`` must be called explicitly. - -``arborio`` provides a helper guard object for this purpose, defined -in ``arborio/with_xml.hpp``: - -.. cpp:namespace:: arborio - -.. cpp:class:: with_xml - - An RAII guard object that calls ``xmlInitParser()`` upon construction, and - ``xmlCleanupParser()`` upon destruction. 
The constructor takes no parameters. - - -NeuroML 2 morphology support ----------------------------- +NeuroML2 +-------- Arbor offers limited support for models described in `NeuroML version 2 <https://neuroml.org/neuromlv2>`_. This is not built by default (see :ref:`NeuroML support <install-neuroml>` for instructions on how @@ -51,6 +12,9 @@ and present the encoded data to the user. This is more than a simple a `segment NeuroML can encode in the same file multiple top-level morphologies, as well as cells: +Example +^^^^^^^ + .. code:: XML <neuroml xmlns="http://www.neuroml.org/schema/neuroml2"> @@ -78,115 +42,8 @@ The morphological data includes the actual morphology as well as the named segme For example, the above ``m1`` morphology has one named segment ``seg-0`` and one named group ``group-0`` that are both represented using Arbor's :ref:`region expressions <labels-expressions>`. -C++ +API ^^^ -NeuroML documents are represented by the ``arborio::neuroml`` class, -which in turn provides methods for the identification and translation -of morphology data. ``neuroml`` objects are moveable and move-assignable, but not copyable. - -An implementation limitation restricts valid segment id values to -those which can be represented by an ``unsigned long long`` value. - -.. cpp:class:: neuroml - - .. cpp:function:: neuroml(std::string) - - Build a NeuroML document representation from the supplied string. - - .. cpp:function:: std::vector<std::string> cell_ids() const - - Return the id of each ``<cell>`` element defined in the NeuroML document. - - .. cpp:function:: std::vector<std::string> morphology_ids() const - - Return the id of each top-level ``<morphology>`` element defined in the NeuroML document. - - .. cpp:function:: std::optional<morphology_data> morphology(const std::string&) const - - Return a representation of the top-level morphology with the supplied identifier, or - ``std::nullopt`` if no such morphology could be found. 
Parse errors or an inconsistent - representation will raise an exception derived from ``neuroml_exception``. - - .. cpp:function:: std::optional<morphology_data> cell_morphology(const std::string&) const - - Return a representation of the morphology associated with the cell with the supplied identifier, - or ``std::nullopt`` if the cell or its morphology could not be found. Parse errors or an - inconsistent representation will raise an exception derived from ``neuroml_exception``. - -The morphology representation contains the corresponding Arbor ``arb::morphology`` object, -label dictionaries for regions corresponding to its segments and segment groups by name -and id, and a map providing the explicit list of segments contained within each defined -segment group. - -.. cpp:class:: morphology_data - - .. cpp:member:: std::optional<std::string> cell_id - - The id attribute of the cell that was used to find the morphology in the NeuroML document, if any. - - .. cpp:member:: std::string id - - The id attribute of the morphology. - - .. cpp:member:: arb::morphology morphology - - The corresponding Arbor morphology. - - .. cpp:member:: arb::label_dict segments - - A label dictionary with a region entry for each segment, keyed by the segment id (as a string). - - .. cpp:member:: arb::label_dict named_segments - - A label dictionary with a region entry for each name attribute given to one or more segments. - The region corresponds to the union of all segments sharing the same name attribute. - - .. cpp:member:: arb::label_dict groups - - A label dictionary with a region entry for each defined segment group - - .. cpp:member:: std::unordered_map<std::string, std::vector<unsigned long long>> group_segments - - A map from each segment group id to its corresponding collection of segments. 
- - -Exceptions ----------- - -All NeuroML-specific exceptions are defined in ``arborio/arbornml.hpp``, and are -derived from ``arborio::neuroml_exception`` which in turn is derived from ``std::runtime_error``. -With the exception of the ``no_document`` exception, all contain an unsigned member ``line`` -which is intended to identify the problematic construct within the document. - -.. cpp:class:: xml_error: neuroml_exception - - A generic XML error generated by the ``libxml2`` library. - -.. cpp:class:: no_document: neuroml_exception - - A request was made on an :cpp:class:`neuroml` document without any content. - -.. cpp:class:: parse_error: neuroml_exception - - Failure parsing an element or attribute in the NeuroML document. These - can be generated if the document does not confirm to the NeuroML2 schema, - for example. - -.. cpp:class:: bad_segment: neuroml_exception - - A ``<segment>`` element has an improper ``id`` attribue, refers to a non-existent - parent, is missing a required parent or proximal element, or otherwise is missing - a mandatory child element or has a malformed child element. - -.. cpp:class:: bad_segment_group: neuroml_exception - - A ``<segmentGroup>`` element has a malformed child element or references - a non-existent segment group or segment. - -.. cpp:class:: cyclic_dependency: neuroml_exception - - A segment or segment group ultimately refers to itself via ``parent`` - or ``include`` elements respectively. - - +* :ref:`Python <pyneuroml>` +* :ref:`C++ <cppneuroml>` \ No newline at end of file diff --git a/doc/fileformat/swc.rst b/doc/fileformat/swc.rst index 09d3affa..b4d7048b 100644 --- a/doc/fileformat/swc.rst +++ b/doc/fileformat/swc.rst @@ -30,8 +30,8 @@ its parent and inherits the tag of the sample; and if more than 1 sample have th is interpreted as a fork point in the morphology, and acts as the proximal point to a new branch for each of its "child" samples. There a couple of exceptions to these rules which are listed below. 
-Arbor interpretation: -""""""""""""""""""""" +Arbor interpretation +"""""""""""""""""""" In addition to the previously listed checks, the arbor interpretation explicitly disallows SWC files where the soma is described by a single sample. It constructs the soma from 2 or more samples, forming 1 or more segments. A *segment* is always constructed between a sample and its parent. This means that there are no gaps in the resulting morphology. @@ -55,8 +55,8 @@ like this: :align: center -Allen interpretation: -""""""""""""""""""""" +Allen interpretation +"""""""""""""""""""" In addition to the previously mentioned checks, the Allen interpretation expects a single-sample soma to be the first sample of the file and to be interpreted as a spherical soma. Arbor represents the spherical soma as a cylinder with length and diameter equal to the diameter of the sample representing the sphere. @@ -71,8 +71,8 @@ or to the proximal end of the soma if they are axons or apical dendrites. Only a Finally the Allen institute interpretation of SWC files centres the morphology around the soma at the origin (0, 0, 0) and all samples are translated in space towards the origin. -NEURON interpretation: -"""""""""""""""""""""" +NEURON interpretation +""""""""""""""""""""" The NEURON interpretation was obtained by experimenting with the ``Import3d_SWC_read`` function. We came up with the following set of rules that govern NEURON's SWC behavior and enforced them in arbor's NEURON-complaint SWC interpreter: @@ -91,3 +91,8 @@ interpreter: * To create a segment with a certain tag, that is to be attached to the soma, we need at least 2 samples with that tag. 
+API +""" + +* :ref:`Python <pyswc>` +* :ref:`C++ <cppswc>` \ No newline at end of file diff --git a/doc/index.rst b/doc/index.rst index 7a55e36a..e61d4f3a 100644 --- a/doc/index.rst +++ b/doc/index.rst @@ -27,7 +27,7 @@ Documentation organisation * :ref:`tutorial` contains a few ready-made examples you can use to quickly get started using Arbor. In the tutorial descriptions we link to the relevant Arbor concepts. * :ref:`modelintro` describes the design and concepts used in Arbor. The breakdown of concepts is mirrored (as much as possible) in the :ref:`pyoverview` and :ref:`cppoverview`, so you can easily switch between languages and concepts. -* The API section details our :ref:`pyoverview` and :ref:`cppoverview` API, as well as :ref:`supported file formats <format-overview>`. :ref:`internals-overview` describes Arbor code that is not user-facing; convenience classes, architecture abstractions, etc. +* The API section details our :ref:`pyoverview` and :ref:`cppoverview` API. :ref:`internals-overview` describes Arbor code that is not user-facing; convenience classes, architecture abstractions, etc. * Contributions to Arbor are very welcome! Under :ref:`contribindex` describe conventions and procedures for all kinds of contributions. Citing Arbor @@ -101,13 +101,20 @@ Arbor is an `eBrains project <https://ebrains.eu/service/arbor/>`_. concepts/spike_source_cell concepts/benchmark_cell +.. toctree:: + :caption: File formats: + :maxdepth: 1 + + fileformat/swc + fileformat/neuroml + fileformat/nmodl + .. toctree:: :caption: API reference: :maxdepth: 1 python/index cpp/index - fileformat/index internals/index .. toctree:: diff --git a/doc/python/morphology.rst b/doc/python/morphology.rst index fd18f618..8f58623d 100644 --- a/doc/python/morphology.rst +++ b/doc/python/morphology.rst @@ -315,6 +315,8 @@ Cable cell morphology :param int i: branch index :rtype: list +.. _pyswc: + .. 
py:function:: load_swc_arbor(filename) Loads the :class:`morphology` from an SWC file according to arbor's SWC specifications. @@ -557,6 +559,8 @@ constitute part of the CV boundary point set. :param float max_etent: The maximum length for generated CVs. :param str domain: The region on which the policy is applied. +.. _pyneuroml: + .. py:class:: neuroml_morph_data A :class:`neuroml_morphology_data` object contains a representation of a morphology defined in -- GitLab