LOTUS USAGE EXAMPLES
Usage Example 1: 454 sequencing
Assume that the output of two separate 454 runs is stored in /Users/Tomas/data
and that the mapping file is formatted to include the fasta and quality file associated with each barcode (as described in the tutorial).
LotuS can then be started with the following command:
perl lotus.pl -i /Users/Tomas/data -m /Users/Tomas/data/map.txt -o /Users/Tomas/results/example
To additionally pass an sdm option file, add the -s option:
perl lotus.pl -i /Users/Tomas/data -m /Users/Tomas/data/map.txt -o /Users/Tomas/results/example -s /Users/Tomas/data/example_opt.txt
Usage Example 2: miSeq paired-end sequencing
The -barcode option is implemented specifically for preprocessed miSeq and hiSeq paired-end sequencer output, which consists of three files:
- the forward and reverse paired sequences, which are passed to LotuS using the "-i fwdPair.fastq,revPair.fastq" option
- one file with the barcodes separated from the sequences, which is passed to LotuS using the "-barcode barcodes.fastq" option
perl lotus.pl -i /Users/Tomas/data/fwdMiSeq.fastq,/Users/Tomas/data/revMiSeq.fastq -barcode /Users/Tomas/data/MIDs.fastq -m /Users/Tomas/data/map.txt -o /Users/Tomas/results/example -s /Users/Tomas/data/sdm_miSeq.txt
The LotuS log files on quality distribution report the first and second read separately, so a difference in quality between the two reads can be observed there.
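The forward, reverse and barcode files are expected to describe the same reads, so it can help to confirm before a run that all three hold the same number of records. The following is a minimal sketch of such a check and is not part of LotuS. It assumes uncompressed FASTQ files with exactly four lines per record; the function names are hypothetical.

```python
def count_fastq_records(path):
    """Count records in an uncompressed FASTQ file with 4 lines per record."""
    with open(path) as handle:
        n_lines = sum(1 for _ in handle)
    # Assumption: no wrapped sequences and no trailing blank lines.
    if n_lines % 4 != 0:
        raise ValueError(f"{path}: {n_lines} lines is not a multiple of 4")
    return n_lines // 4


def same_record_count(paths):
    """Return True if all given FASTQ files hold the same number of records."""
    counts = {count_fastq_records(p) for p in paths}
    return len(counts) == 1


# Example call with the file names used above:
# same_record_count([
#     "/Users/Tomas/data/fwdMiSeq.fastq",
#     "/Users/Tomas/data/revMiSeq.fastq",
#     "/Users/Tomas/data/MIDs.fastq",
# ])
```

If the counts differ, the files were probably truncated or filtered independently and should be checked before starting LotuS.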
Usage Example 3: raw sequence files demultiplexed into single files
If the fasta / fastq files from the sequencing facility are already demultiplexed into single files, LotuS can still process them through a suitably set up mapping file. Each sample should get a specific name, but the barcodes have been removed and one file represents each sample. The mapping file should look like this:
#SampleID	BarcodeSequence	LinkerPrimerSequence	fastqFile	Description
bl9			file_with_bl9.fastq	FVB
bl10			file_with_bl10.fastq	FVB
...
bl36			file_with_bl36.fastq	FVB
Note that "BarcodeSequence" and "LinkerPrimerSequence" are just empty fields, separated by a tab character. It is important that they are truly empty, so that LotuS doesn't start looking for e.g. a space character (" ") as primer or barcode. Also, check that in the sdm_XX.txt (sdm option file) the following key is set to F (False): RejectSeqWithoutFwdPrim F.
In the lotus.pl call, the absolute or relative path to the folder containing all the file_with_bl9.fastq, ... files is given as the -i argument.
If paired-end files are being used, the column fastqFile has to contain the two fastq files, separated by a comma. I.e. instead of "file_with_bl10.fastq" this would be the two paired-end files: "file_with_bl10.1.fastq,file_with_bl10.2.fastq". The same applies to separate fasta and quality files and their respective columns (separate pair1,pair2 by a comma).
perl lotus.pl -i /Users/Tomas/data/dir_with_single_files/ -m /Users/Tomas/data/map.txt -o /Users/Tomas/results/example -s /Users/Tomas/data/sdm_XX.txt
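For many samples, writing such a mapping file by hand is error-prone, in particular the empty barcode and primer fields. The following is a minimal sketch of how the lines could be generated; it is not part of LotuS. The column names are taken from the example above, while the function name and the input dictionary are hypothetical.

```python
def mapping_lines(samples, description=""):
    """Build tab-separated mapping-file lines for already-demultiplexed reads.

    samples maps a sample ID to one fastq file name, or to a
    (forward, reverse) pair of file names for paired-end data.
    """
    header = ["#SampleID", "BarcodeSequence", "LinkerPrimerSequence",
              "fastqFile", "Description"]
    lines = ["\t".join(header)]
    for sample_id, files in samples.items():
        if isinstance(files, str):
            files = (files,)
        # Paired-end files share one column, separated by a comma.
        fastq_field = ",".join(files)
        # Barcode and primer fields stay truly empty (no space character).
        lines.append("\t".join([sample_id, "", "", fastq_field, description]))
    return lines


if __name__ == "__main__":
    example = {
        "bl9": "file_with_bl9.fastq",
        "bl10": ("file_with_bl10.1.fastq", "file_with_bl10.2.fastq"),
    }
    print("\n".join(mapping_lines(example, description="FVB")))
```

The printed lines can be redirected into map.txt; the file names are relative to the folder given with -i.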
Usage Example 4: demultiplex with fasta/fastq header
In some cases (especially with data from short read archives), the sample identity is stored as a short string in the fasta or fastq header. If this string is known and uniquely identifies the sample, the mapping file can be configured to split the sequences by an ID within their header:
#SampleID	BarcodeSequence	LinkerPrimerSequence	ReversePrimer	SampleIDinHead	Description
bl9		GTGCCAGCAGCCGCGGTAA		bl9ID882	FVB
bl10		GTGCCAGCAGCCGCGGTAA		bl10IDXXY	FVB
...
bl36		GTGCCAGCAGCCGCGGTAA		some_other_string	FVB
Then execute LotuS:
perl lotus.pl -i /Users/Tomas/data/data/Reads.fastq -m /Users/Tomas/data/map.txt -o /Users/Tomas/results/example -s /Users/Tomas/data/sdm_XX.txt
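Since the header strings have to be uniquely identifiable, it can be worth checking them before the run. The sketch below flags any ID that occurs inside another ID (or is duplicated). It is not part of LotuS, and it assumes the conservative case that the ID is matched as a substring of the header; how LotuS actually matches the string is not stated here. The function name is hypothetical.

```python
def ambiguous_header_ids(header_ids):
    """Return (id_a, id_b) pairs where id_a occurs inside id_b.

    Duplicated IDs are reported as well, since they cannot
    distinguish two samples.
    """
    clashes = []
    for i, id_a in enumerate(header_ids):
        for j, id_b in enumerate(header_ids):
            # Compare every ID with every other entry, not with itself.
            if i != j and id_a in id_b:
                clashes.append((id_a, id_b))
    return clashes


if __name__ == "__main__":
    # Values of the SampleIDinHead column from the mapping file above.
    ids = ["bl9ID882", "bl10IDXXY", "some_other_string"]
    print(ambiguous_header_ids(ids))
```

An empty result means that no ID string can be mistaken for another one under this assumption.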
Usage Example 5: Only sequences without quality information available
I do not recommend this: the richness and diversity estimates will be unreliable. Composition itself should be fine, but I never investigated this in detail. Since some databases (MG-RAST) do not always supply quality files, this option is available in LotuS, but you proceed at your own risk.
The setup is very similar to the examples above, but the quality file location is left empty. E.g. use lotus.pl -i XX.fna, or give the location of each single fna file in the mapping file (column header "fnaFile") and leave the "qualFile" column empty.
Usage Example 6: Eukaryotic LSU amplicons
Adapt the demultiplexing as described above, then execute LotuS with the following additional options:
perl lotus.pl -i .. -m .. -o .. -s .. -amplicon_type LSU -tax_group fungi
Note that with -simBasedTaxo 1/2 you can classify much more than just fungi (although the option says -tax_group fungi). The restriction comes from RDP, which only includes fungal LSU training sets, but it is overridden by the Silva annotations in the described case.
Usage Example 7: Fungal ITS amplicons
As in Example 6, but with:
perl lotus.pl -i .. -m .. -o .. -s .. -amplicon_type ITS
See more advanced options for LotuS usage in the commandline documentation.
Learn more about setting up the quality filter for your reads in the sdm configuration.