# pipeline
This is a workflow for preparing and using TVB brain network models, comprising
three main components
### Targets
- `fs-recon`: FreeSurfer reconstruction. Consists mainly of running `recon-all -all`.
Uses `T1`.
- `resamp-anat`: Lower resolution cortical surfaces & annotations.
Uses `T1`.
- `conn`: Connectivity matrices in text format.
Uses `T1` and `DWI`.
- `tvb`: TVB zipfile, cortical and subcortical surfaces in TVB formats, region mappings.
Uses `T1` and `DWI`.
- `elec`: Positions of the contacts of depth electrodes and gain matrices.
Uses `T1`, `DWI`, `ELEC`, and either `ELEC_ENDPOINTS` or `ELEC_POS_GARDEL`.
- `seeg`: Conversion of SEEG recordings to FIF format and plotting the recordings.
Uses `SEEGRECDIR`, `XLSX` and everything that `elec` uses.
- `ez`: Extraction of the epileptogenic zone from the patient Excel file.
Uses `XLSX` and everything that `elec` uses.
_TODO_ more details & help on this
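For example, building the `tvb` target for a single subject directly with `make` might look like the following minimal sketch (the file names are placeholders; the invocation mirrors the interactive-mode example below):

```
make -f <PIPELINE_DIR>/Makefile SUBJECTS_DIR=fs SUBJECT=foo \
    T1=data/T1.nii.gz DWI=data/dwi.nii.gz tvb
```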
### Marseille Cluster
For a quick introduction, look at the basic [step-by-step tutorial](doc/TutorialCluster.md).
There are two options for running the pipeline on the cluster: non-interactive and interactive.
For running the full pipeline, the non-interactive mode is recommended due to the large time requirements;
for small updates and testing, the interactive mode might be more suitable.
#### Non-interactive mode
In the non-interactive regime, you prepare the data and submit the job(s), and the scheduler takes care of the execution.
The `cluster/run` script assists in running the pipeline on the Marseille cluster in two ways.
First, invoke it with typical arguments
```bash
<PIPELINE_DIR>/cluster/run SUBJECTS_DIR=fs SUBJECT=foo T1=data/T1.nii.gz fs-recon
```
for a single run in a single SLURM job. If you have many subjects, create a file `params.txt` with multiple lines of arguments, e.g.
```
SUBJECTS_DIR=fs SUBJECT=foo T1=data/T1.nii.gz fs-recon
SUBJECTS_DIR=fs SUBJECT=bar T1=data/T2.nii.gz fs-recon conn
SUBJECTS_DIR=fs SUBJECT=baz T1=data/T3.nii.gz conn
```
then
```
<PIPELINE_DIR>/cluster/run params.txt
```
Each line will result in a separate SLURM job running the pipeline with the arguments on that
line. You can comment out a line by prepending a `#` sign,
```
# SUBJECTS_DIR=fs SUBJECT=foo T1=data/T1.nii.gz fs-recon
```
NB: You need to provide a custom, valid FreeSurfer `SUBJECTS_DIR`,
since the default directories on the cluster (`/soft/freesurfer*/subjects`)
are not writable by users.
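For instance, a writable per-project subjects directory can be created up front and passed explicitly (a minimal sketch; the directory name is a placeholder):

```
mkdir -p fs
<PIPELINE_DIR>/cluster/run SUBJECTS_DIR=fs SUBJECT=foo T1=data/T1.nii.gz fs-recon
```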
#### Interactive mode
First, request a compute node in interactive mode
```
srun --pty bash
```
which should give you an interactive node if one is available.
If you need to run the reconstruction and tractography in interactive mode
(although that is discouraged), you need to request a full node with enough memory:
```
srun -N 1 -n 1 --exclusive --mem=60G --pty bash
```
Then set up your working environment by loading the environment file,
```
source <PIPELINE_DIR>/cluster/env
```
and run `make` by hand:
```
make -f <PIPELINE_DIR>/Makefile SUBJECTS_DIR=fs SUBJECT=foo T1=data/T1.nii.gz fs-recon
```
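Standard `make` options also apply here; for instance, assuming the usual GNU make behavior, a dry run with `-n` prints the commands that would be executed without running them:

```
make -n -f <PIPELINE_DIR>/Makefile SUBJECTS_DIR=fs SUBJECT=foo T1=data/T1.nii.gz fs-recon
```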
## Special Cases
### JPEG encoded images
…which generate or use files in the `stan` subfolder of the subjects' folder
- `$(sd)/stan/{model_name}.samp.pkl` - posterior samples found during fit
- `$(sd)/stan/{model_name}.png` - visualization produced by `stan/{model_name}.vis.py`
See the [`stan`](stan) folder for an example, to be completed.
# Running the pipeline on INS cluster
Brief tutorial on how to run the reconstruction pipeline on the INS cluster.
## Setting up the environment (do once)
Clone the pipeline repository. Note that to be able to clone the repo from the cluster, the SSH
key generated on the cluster needs to be added to your account in the GitLab interface.
```
cd ~/soft # Or wherever you want the code to be
git clone git@gitlab.thevirtualbrain.org:tvb/pipeline.git
```
## Preparing the data
By default, the pipeline uses the following data structure:
```
data/SUBJECT-1/
     SUBJECT-2/
     SUBJECT-3/
     ...
fs/SUBJECT-1/
   SUBJECT-2/
   SUBJECT-3/
   ...
```
where `data/` contains the raw data and `fs/` contains the processed results. You should prepare
the contents of the `data/` directory; the contents of the `fs/` directory are filled in by the pipeline.
If needed, you can change the name of the raw data directory by setting the make variable `DATA`
(see below for how to set variables), and the `fs/` directory by setting the variable `SUBJECTS_DIR`.
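For example, using the `VAR=value` argument syntax described below, a run overriding both directories might look like this sketch (the directory names are placeholders):

```
SUBJECT=SUBJECT-1 DATA=rawdata SUBJECTS_DIR=results T1=rawdata/SUBJECT-1/t1/t1.nii.gz tvb
```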
First, let's create this main directory structure in `~/reconstruction`
```
cd
mkdir -p reconstruction/data reconstruction/fs
```
Then start with a single patient, `SUBJECT-1`.
For the basic TVB dataset, you need at least T1 and DWI scans. Place them in the `t1/`
and `dwi/` directories under the patient directory:
```
data/SUBJECT-1/dwi/
               t1/
```
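A minimal sketch of this step, assuming the scans are single NIfTI files at hypothetical source paths:

```
cd ~/reconstruction
mkdir -p data/SUBJECT-1/t1 data/SUBJECT-1/dwi
cp /path/to/scans/T1.nii.gz data/SUBJECT-1/t1/t1.nii.gz
cp /path/to/scans/DWI.nii.gz data/SUBJECT-1/dwi/dwi.nii.gz
```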
## Running the pipeline
You need to create a file containing the subject specification. In your working directory, create
a file `params.txt` containing a single line:
```
SUBJECT=SUBJECT-1 T1=data/SUBJECT-1/t1/t1.nii.gz DWI=data/SUBJECT-1/dwi/dwi.nii.gz tvb
```
What does this mean? Every line specifies a single subject; here we have specified a single
subject named `SUBJECT-1`.
By setting the variables `T1` and `DWI` we have told the pipeline where the raw data are.
If you have the raw data in DICOM format (many `.dcm` files in the `t1` and `dwi`
directories) rather than single files, simply point to the directories:
```
SUBJECT=SUBJECT-1 T1=data/SUBJECT-1/t1/ DWI=data/SUBJECT-1/dwi/ tvb
```
The last keyword on the line is the *target* of the pipeline. In this case, `tvb` stands for the
TVB data set with connectomes and surfaces. Other targets that may be useful are `fs-recon` for
the FreeSurfer reconstruction only, `elec` for the depth electrode positions, and `seeg` for SEEG
recordings.
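For example, a hypothetical line requesting the `elec` target would also point to the electrode-related inputs; per the target list in the main README, `ELEC` plus either `ELEC_ENDPOINTS` or `ELEC_POS_GARDEL` is needed (the file names here are placeholders):

```
SUBJECT=SUBJECT-1 T1=data/SUBJECT-1/t1/t1.nii.gz DWI=data/SUBJECT-1/dwi/dwi.nii.gz ELEC=data/SUBJECT-1/elec/ct.nii.gz ELEC_POS_GARDEL=data/SUBJECT-1/elec/pos.txt elec
```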
The pipeline job is submitted simply by running the following command:
```
~/soft/pipeline/cluster/run params.txt
```
The output of the command should show that a Slurm job was submitted for every line in
`params.txt` that is not commented out.
## Examining the results
The status of the Slurm jobs on the cluster can be checked by
```
squeue -u USERNAME
```
If you have submitted a job and it finished after only a few seconds, something probably
went wrong. Have a look at the logs in `fs/_logs/SUBJECT-1.*.stdout` and
`fs/_logs/SUBJECT-1.*.stderr`.
After the job has ended, examine the logs and the created subject directory
`fs/SUBJECT-1/`, especially the `tvb` subdirectory, where your desired data (TVB zipfiles,
surfaces, region mappings) should be.
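A quick check might look like this sketch (the log and output paths follow the conventions above):

```
tail fs/_logs/SUBJECT-1.*.stderr   # inspect the end of the error log
ls fs/SUBJECT-1/tvb/               # list the generated TVB dataset
```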