CLI Reference

CLI Reference [OPTIONS] COMMAND [ARGS]...

bt2-log-to-csv

Converts Bowtie2 alignment statistics to csv format.

CLI Reference bt2-log-to-csv [OPTIONS]

Options

-l, --log <log>

Bowtie2 log file. [required]

-n, --name <name>

Name of the experiment which will be the column label. [required]

-p, --prefix <prefix>

Pre-appended to row names. [required]

-o, --out <out>

Output csv file.

compile-step-stats

Puts statistics coming from various steps into one file.

Merges cutadapt statistics and alignment statistics from Bowtie2 or Hisat for the given samples into one file. This is done by summing the corresponding counts and calculating the percentages. This version is implemented for single-end reads only, so it does not work for paired-end statistics yet.

For convenience, percentages of these statistics are also provided. All percentages are rounded up to integers for simplicity. If you need higher precision, you can recalculate them from the counts given in these tables.
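Recomputing a percentage from the counts is a one-line calculation. The following is a minimal sketch, not the tool's own code; the function name `percentage` and the example counts are hypothetical.

```python
def percentage(count, total, ndigits=2):
    """Return count as a percentage of total, rounded to ndigits decimals."""
    if total == 0:
        # Avoid division by zero for empty inputs.
        return 0.0
    return round(100.0 * count / total, ndigits)


# Hypothetical counts, for illustration only.
total_reads = 1_000_000
aligned_once = 123_457

print(percentage(aligned_once, total_reads))     # two decimal places
print(percentage(aligned_once, total_reads, 4))  # four decimal places
```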

CLI Reference compile-step-stats [OPTIONS]

Options

-o, --out <out>

Output file. The stats are written in csv format to this file. [required]

-c, --cutadapt <cutadapt>

Cutadapt log file. [required]

-f, --filter <filter>

Filter alignment log file. [required]

-t, --trans <trans>

Transcriptome alignment log file. [required]

-q, --quality <quality>

Transcriptome alignment quality file. [required]

-d, --dedup <dedup>

Deduplication count file. This file should only contain one number. [required]

-n, --name <name>

Name of the experiment. [required]

dedup

Removes duplicate entries in a given SORTED bed file.

Two entries are considered duplicates if and only if they map to the same location and have the same length. This can be checked by comparing the following fields:

  1. Reference names
  2. Start and stop (end) positions
  3. Strand

The number of reported reads (unique entries) is printed.
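The duplicate check described above can be sketched as follows. This is an illustration under stated assumptions, not the command's actual implementation: it assumes tab-separated BED6 lines (strand in the sixth column) sorted by reference name and start position, and the function name `dedup_sorted_bed` is hypothetical.

```python
def dedup_sorted_bed(lines):
    """Yield the first entry seen for each (chrom, start, end, strand).

    Because the input is assumed sorted by chrom and start, all possible
    duplicates of an entry share its (chrom, start) and are adjacent, so
    only the keys of the current position need to be remembered.
    """
    current_pos = None
    seen = set()
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 6:
            continue  # skip blank or malformed lines
        chrom, start, end, strand = fields[0], fields[1], fields[2], fields[5]
        if (chrom, start) != current_pos:
            # New position: earlier keys can no longer match.
            current_pos = (chrom, start)
            seen = set()
        key = (end, strand)
        if key in seen:
            continue  # same location, length, and strand: duplicate
        seen.add(key)
        yield line


# Hypothetical input, for illustration only.
bed_lines = [
    "chr1\t10\t40\tr1\t0\t+\n",
    "chr1\t10\t40\tr2\t0\t+\n",  # duplicate of r1
    "chr1\t10\t40\tr3\t0\t-\n",  # other strand, kept
    "chr1\t10\t45\tr4\t0\t+\n",  # different end, kept
    "chr1\t10\t40\tr5\t0\t+\n",  # duplicate of r1
    "chr2\t10\t40\tr6\t0\t+\n",  # other reference, kept
]
kept = list(dedup_sorted_bed(bed_lines))
print(len(kept))  # number of unique entries
```

Remembering only the keys of the current position keeps memory use small, which is the practical reason for requiring sorted input.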

CLI Reference dedup [OPTIONS]

Options

-i, --inbed <inbed>

Input bed file.

-o, --outbed <outbed>

Output bed file.

merge

Merges logs and csv files.

CLI Reference merge [OPTIONS] COMMAND [ARGS]...

bowtie2-logs

Merge alignment statistics coming from Bowtie2 or Hisat2.

This is done by summing the corresponding counts and calculating the percentages. This version is implemented for single-end reads only, so it does not work for paired-end statistics yet. Extending this script to the paired-end case should not be hard, though.
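The summing step can be illustrated with a short sketch. It assumes the usual single-end Bowtie2 alignment summary layout (an integer count at the start of each line, optionally followed by a percentage in parentheses); the function names and the example logs are hypothetical, and the command's real parsing may differ.

```python
import re
from collections import Counter

# An integer count, an optional "(xx.xx%)", then a label,
# e.g. "596 (5.96%) aligned 0 times" or "10000 reads; of these:".
LINE_RE = re.compile(r"^\s*(\d+)\s+(?:\([\d.]+%\)\s+)?(.*?)(?:; of these:)?\s*$")


def parse_log(text):
    """Map each label in one log to its count."""
    counts = Counter()
    for line in text.splitlines():
        match = LINE_RE.match(line)
        if match:
            counts[match.group(2)] += int(match.group(1))
    return counts


def merge_logs(texts):
    """Sum the counts of corresponding labels across logs."""
    merged = Counter()
    for text in texts:
        merged.update(parse_log(text))  # update() keeps zero counts
    return merged


def with_percentages(counts, total_label="reads"):
    """Attach a percentage of the total read count to every entry."""
    total = counts[total_label]
    return {
        label: (n, (100.0 * n / total) if total else 0.0)
        for label, n in counts.items()
    }


# Hypothetical logs, for illustration only.
log_a = """10000 reads; of these:
  10000 (100.00%) were unpaired; of these:
    596 (5.96%) aligned 0 times
    9404 (94.04%) aligned exactly 1 time
    0 (0.00%) aligned >1 times
94.04% overall alignment rate
"""
log_b = """5000 reads; of these:
  5000 (100.00%) were unpaired; of these:
    1000 (20.00%) aligned 0 times
    3500 (70.00%) aligned exactly 1 time
    500 (10.00%) aligned >1 times
80.00% overall alignment rate
"""
merged = merge_logs([log_a, log_b])
for label, (count, pct) in with_percentages(merged).items():
    print(f"{label}\t{count}\t{pct:.2f}")
```

Percentages are recomputed from the summed counts rather than averaged across logs, since averaging would weight runs of different sizes equally.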

CLI Reference merge bowtie2-logs [OPTIONS] [INPUT_LOG_PATHS]...

Options

-o, --out <out>

Arguments

INPUT_LOG_PATHS

Optional argument(s)

concat-csv

Concatenates the given csv files.

Concatenates the given csv files in the given order and writes the output in csv format. The concatenation is done using pandas, so the column names must be compatible across the given csv files.
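A minimal sketch of this kind of concatenation with pandas is shown below. It assumes row-wise concatenation of files that share the same header; the function name `concat_csvs` and the column names are hypothetical, and the command may read or write the files with different options.

```python
import io

import pandas as pd


def concat_csvs(sources):
    """Read each csv and stack the rows in the order given."""
    frames = [pd.read_csv(source) for source in sources]
    # Columns are aligned by name, which is why they must be compatible.
    return pd.concat(frames, ignore_index=True)


# Hypothetical in-memory inputs standing in for csv files on disk.
first = io.StringIO("sample,total_reads\nA,100\n")
second = io.StringIO("sample,total_reads\nB,250\n")

combined = concat_csvs([first, second])
print(combined.to_csv(index=False))
```

If the column names differ between inputs, pandas fills the missing cells with NaN instead of raising an error, so mismatched headers can go unnoticed.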

CLI Reference merge concat-csv [OPTIONS] [INPUT_CSVS]...

Options

-o, --out <out>

Arguments

INPUT_CSVS

Optional argument(s)

overall-stats

Combine individual stats coming from separate files into one.

This script takes the overall alignment stats files (in csv format), where each file comes from one sample only. It merges these files into one big table where each column corresponds to one experiment.
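The column-wise merge can be sketched with pandas as follows. The input layout (row labels in the first column, one value column per file), the function name `merge_overall_stats`, and the labels used are assumptions made for illustration.

```python
import io

import pandas as pd


def merge_overall_stats(sources):
    """Place one column per input file side by side, aligned on row labels."""
    columns = [pd.read_csv(source, index_col=0) for source in sources]
    return pd.concat(columns, axis=1)


# Hypothetical per-sample stats files, for illustration only.
sample_a = io.StringIO(",sample_A\ntotal_reads,100\naligned,80\n")
sample_b = io.StringIO(",sample_B\ntotal_reads,250\naligned,200\n")

table = merge_overall_stats([sample_a, sample_b])
print(table.to_csv())
```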

CLI Reference merge overall-stats [OPTIONS] [INPUT_STATS]...

Options

-o, --out <out>

Arguments

INPUT_STATS

Optional argument(s)

stats-percentage

Adds percentage values to the alignment statistics.

CLI Reference stats-percentage [OPTIONS]

Options

-i, --inputstats <inputstats>
-o, --out <out>

sum-stats

Combines given stats into one by summing up the corresponding values.

This script takes the overall alignment stats files (in csv format), where each file comes from one sequencing run (fastq file) only. It aggregates these files by summation and outputs the result in one big table where each column corresponds to one sample.
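Aggregation by summation can be sketched with the standard library alone. The two-column layout (row label, integer count), the function names, and the example values are assumptions for illustration; the command's actual file handling may differ.

```python
import csv
import io


def sum_stats(csv_texts):
    """Sum the counts of matching row labels across several stats tables."""
    totals = {}
    for text in csv_texts:
        reader = csv.reader(io.StringIO(text))
        next(reader, None)  # skip the header row
        for row in reader:
            if len(row) < 2:
                continue  # skip blank or malformed rows
            totals[row[0]] = totals.get(row[0], 0) + int(row[1])
    return totals


def to_csv(totals, name):
    """Write the totals as a single column labelled with the given name."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["", name])
    writer.writerows(totals.items())
    return out.getvalue()


# Hypothetical stats from two sequencing runs of the same sample.
run_1 = ",run_1\ntotal_reads,100\naligned,80\n"
run_2 = ",run_2\ntotal_reads,250\naligned,200\n"

totals = sum_stats([run_1, run_2])
print(to_csv(totals, "sample_A"))
```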

CLI Reference sum-stats [OPTIONS] [INPUT_STATS]...

Options

-o, --out <out>

Output stats file.

-n, --name <name>

Column name. [required]

Arguments

INPUT_STATS

Optional argument(s)