diff --git a/README.md b/README.md
index 2943921657522a2691e2c2bd742242876993cda7..354c15a0a67e8f648c1f496f816edb353504ca6b 100644
--- a/README.md
+++ b/README.md
@@ -1,4 +1,4 @@
-# tvb-make
+# pipeline
 
 This is a workflow for preparing and using TVB brain network models,
 comprised of three main components
@@ -59,9 +59,27 @@ outputs.
 
 ### Targets
 
-- `fs-recon` - FreeSurfer reconstruction, `recon-all -all ...`
-- `resamp-anat` - Lower resolution cortical surfaces & annotations
-- `conn` - Connectivity files
+- `fs-recon`: FreeSurfer reconstruction. Consists mainly of running `recon-all -all`.
+  Uses `T1`.
+
+- `resamp-anat`: Lower-resolution cortical surfaces & annotations.
+  Uses `T1`.
+
+- `conn`: Connectivity matrices in text format.
+  Uses `T1` and `DWI`.
+
+- `tvb`: TVB zipfile, cortical and subcortical surfaces in TVB formats, and region mappings.
+  Uses `T1` and `DWI`.
+
+- `elec`: Positions of the contacts of depth electrodes and gain matrices.
+  Uses `T1`, `DWI`, `ELEC`, and either `ELEC_ENDPOINTS` or `ELEC_POS_GARDEL`.
+
+- `seeg`: Conversion of SEEG recordings to FIF format and plotting of the recordings.
+  Uses `SEEGRECDIR`, `XLSX`, and everything that `elec` uses.
+
+- `ez`: Extraction of the epileptogenic zone from the patient Excel file.
+  Uses `XLSX` and everything that `elec` uses.
 
 _TODO_ more details & help on this
 
@@ -107,13 +125,21 @@ environment variable.
 
 ### Marseille Cluster
 
-The `cluster/run` script assists in running the pipeline on the Marseille
-cluster through two modes. First, invoke with typical arguments
+For a quick introduction, look at the basic [step-by-step tutorial](doc/TutorialCluster.md).
+
+There are two options for running the pipeline on the cluster: non-interactive and interactive.
+For running the full pipeline, the non-interactive mode is recommended due to the large time requirements.
+For small updates and testing, the interactive mode might be more suitable.
+
+#### Non-interactive mode
+
+In the non-interactive mode, you prepare the data and submit the job(s), and the scheduler takes care of the execution.
+The `cluster/run` script assists in running the pipeline on the Marseille cluster and can be invoked in two ways.
+First, invoke it with typical arguments
 ```bash
-cluster/run SUBJECTS_DIR=fs SUBJECT=foo T1=data/T1.nii.gz fs-recon
+<PIPELINE_DIR>/cluster/run SUBJECTS_DIR=fs SUBJECT=foo T1=data/T1.nii.gz fs-recon
 ```
-for a single run in a single OAR job, or for many subjects,
-create a file `params.txt` with multiple lines of arguments, e.g.
+for a single run in a single SLURM job. If you have many subjects, create a file `params.txt` with multiple lines of arguments, e.g.
 ```
 SUBJECTS_DIR=fs SUBJECT=foo T1=data/T1.nii.gz fs-recon
 SUBJECTS_DIR=fs SUBJECT=bar T1=data/T2.nii.gz fs-recon conn
@@ -121,15 +147,41 @@ SUBJECTS_DIR=fs SUBJECT=baz T1=data/T3.nii.gz conn
 ```
 then
 ```
-cluster/run params.txt
+<PIPELINE_DIR>/cluster/run params.txt
+```
+Each line results in a separate SLURM job that runs the pipeline with that line's arguments. You can comment out a line by prepending a `#` sign,
+```
+ # SUBJECTS_DIR=fs SUBJECT=foo T1=data/T1.nii.gz fs-recon
 ```
-Each line will result in the pipeline running once for the arguments
-on a given line, and an OAR job.
 
 NB You need to provide a custom, valid FreeSurfer `SUBJECTS_DIR`, since
 the default directories on the cluster (`/soft/freesurfer*/subjects`)
 are not writeable by users.
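+
+As a worked example, a complete non-interactive session could look as follows. This is only a
+minimal sketch: the subject name and file paths are illustrative, and `<PIPELINE_DIR>` stands for
+wherever you cloned this repository.
+```bash
+cd ~/reconstruction
+
+# One pipeline run per line; a line prefixed with '#' is skipped.
+cat > params.txt <<'EOF'
+SUBJECTS_DIR=fs SUBJECT=foo T1=data/foo/t1/t1.nii.gz DWI=data/foo/dwi/dwi.nii.gz tvb
+# SUBJECTS_DIR=fs SUBJECT=bar T1=data/bar/t1/t1.nii.gz fs-recon
+EOF
+
+# Submit one SLURM job per uncommented line.
+<PIPELINE_DIR>/cluster/run params.txt
+
+# Check the job status afterwards.
+squeue -u $USER
+```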
+#### Interactive mode
+
+First, request a compute node in interactive mode
+```
+srun --pty bash
+```
+which should give you an interactive shell on a node, if one is available.
+
+If you need to run the reconstruction and tractography in interactive mode
+(although that is discouraged), you need to request a full node with enough memory:
+```
+srun -N 1 -n 1 --exclusive --mem=60G --pty bash
+```
+
+Then set up your working environment by loading the environment file,
+```
+source <PIPELINE_DIR>/cluster/env
+```
+and run `make` by hand:
+```
+make -f <PIPELINE_DIR>/Makefile SUBJECTS_DIR=fs SUBJECT=foo T1=data/T1.nii.gz fs-recon
+```
+
 
 ## Special Cases
 
 ### JPEG encoded images
@@ -181,4 +233,4 @@ which generate or use files in the `stan` subfolder of the subjects' folder
 - `$(sd)/stan/{model_name}.samp.pkl` - posterior samples found during fit
 - `$(sd)/stan/{model_name}.png` - visualization produced by `stan/{model_name}.vis.py`
 
-See the [`stan`](stan) folder for an example, to be completed.
\ No newline at end of file
+See the [`stan`](stan) folder for an example, to be completed.
diff --git a/doc/TutorialCluster.md b/doc/TutorialCluster.md
new file mode 100644
index 0000000000000000000000000000000000000000..9cb4afa29710449490c121dae228c0eb755426d9
--- /dev/null
+++ b/doc/TutorialCluster.md
@@ -0,0 +1,101 @@
+# Running the pipeline on the INS cluster
+
+A brief tutorial on how to run the reconstruction pipeline on the INS cluster.
+
+## Setting up the environment (do once)
+
+Clone the pipeline repository. Note that to be able to clone the repo from the cluster, the SSH
+key generated on the cluster needs to be added in the GitLab interface.
+```
+cd ~/soft # Or wherever you want the code to be
+git clone git@gitlab.thevirtualbrain.org:tvb/pipeline.git
+```
+
+## Preparing the data
+
+By default, the pipeline uses the following data structure:
+```
+data/SUBJECT-1/
+     SUBJECT-2/
+     SUBJECT-3/
+     ...
+fs/SUBJECT-1/
+   SUBJECT-2/
+   SUBJECT-3/
+   ...
+```
+where `data/` contains the raw data and `fs/` contains the processed results. You should prepare
+the contents of the `data/` directory; the contents of the `fs/` directory are filled in by the
+pipeline. If needed, you can change the name of the raw data directory by setting the make
+variable `DATA` (see below for how variables are set), and the name of the `fs/` directory by
+setting the variable `SUBJECTS_DIR`.
+
+First, let's create this main directory structure in `~/reconstruction`:
+```
+cd
+mkdir -p reconstruction/data reconstruction/fs
+```
+
+Then start with a single patient, `SUBJECT-1`. For the basic TVB dataset, you need at least the
+T1 and DWI scans. Place them in the `t1/` and `dwi/` directories under the patient directory:
+```
+data/SUBJECT-1/dwi/
+               t1/
+```
+
+## Running the pipeline
+
+You need to create a file containing the subject specification. In your working directory, create
+a file `params.txt` and insert a single line inside:
+```
+SUBJECT=SUBJECT-1 T1=data/SUBJECT-1/t1/t1.nii.gz DWI=data/SUBJECT-1/dwi/dwi.nii.gz tvb
+```
+What does this mean? Every line specifies a single subject, so here we have specified a single
+subject named `SUBJECT-1`. By setting the variables `T1` and `DWI`, we have told the pipeline
+where the raw data are. If you have the raw data in DICOM format (many .dcm files in the `t1`
+and `dwi` directories) rather than single files, simply point to the directories:
+```
+SUBJECT=SUBJECT-1 T1=data/SUBJECT-1/t1/ DWI=data/SUBJECT-1/dwi/ tvb
+```
+
+The last keyword on the line is the *target* of the pipeline. In this case, `tvb` stands for the
+TVB data set with connectomes and surfaces. Other targets that may be useful are `fs-recon` for
+the FreeSurfer reconstruction only, `elec` for the depth electrode positions, or `seeg` for the
+SEEG recordings.
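+
+When you process more subjects later, `params.txt` simply grows by one line per run. The subject
+names below are only an illustration:
+```
+SUBJECT=SUBJECT-1 T1=data/SUBJECT-1/t1/t1.nii.gz DWI=data/SUBJECT-1/dwi/dwi.nii.gz tvb
+SUBJECT=SUBJECT-2 T1=data/SUBJECT-2/t1/t1.nii.gz DWI=data/SUBJECT-2/dwi/dwi.nii.gz tvb
+# SUBJECT=SUBJECT-3 T1=data/SUBJECT-3/t1/t1.nii.gz DWI=data/SUBJECT-3/dwi/dwi.nii.gz tvb
+```
+The commented-out third line is skipped when the jobs are submitted.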
+
+The pipeline job is submitted simply by running the following command:
+```
+~/soft/pipeline/cluster/run params.txt
+```
+The output of the command should show that a SLURM job was submitted for every line in
+`params.txt` that is not commented out.
+
+## Examining the results
+
+The status of the SLURM jobs on the cluster can be checked by
+```
+squeue -u USERNAME
+```
+If a job you have submitted finishes after only a few seconds, something probably went wrong.
+Have a look at the logs in `fs/_logs/SUBJECT-1.*.stdout` and `fs/_logs/SUBJECT-1.*.stderr`.
+
+After the job has ended, examine the logs and the created subject directory `fs/SUBJECT-1/`,
+especially the `tvb` subdirectory, where your desired data (TVB zipfiles, surfaces, region
+mappings) should be.
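+
+For a quick look at what was produced, you can list the subject's `tvb` subdirectory and the
+tails of the logs. The exact file names may vary, so treat these as examples:
+```
+ls fs/SUBJECT-1/tvb/
+tail -n 20 fs/_logs/SUBJECT-1.*.stdout fs/_logs/SUBJECT-1.*.stderr
+```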