Sequence Quality Control Studio (SeQCoS) is an open source .NET software suite designed to perform quality control of massively parallel sequencing reads.
- SeQCoS generates a series of standard plots to illustrate the quality of the input data. These plots (saved in JPEG file format) provide information on commonly observed measurements, such as GC content and distribution of quality scores at position-specific and sequence-specific levels.
- Basic trimming and discarding functions are provided to manipulate sequence files, according to:
  - Minimum read length; or
  - Minimum base quality score; or
  - Pattern matching by regular expression
- (Experimental) As an optional step, SeQCoS can invoke NCBI BLAST (standalone blast+ toolkit) to search the input against a BLAST-formatted database. A pre-formatted database of NCBI UniVec, a repository of vector sequences, adapters, linkers and PCR primers that are used in DNA sequencing, is provided here; however, users are free to supply their own database for searching.
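
To make the three trimming and discarding criteria above concrete, here is a minimal Python sketch. It is not SeQCoS code (SeQCoS itself is written in C#), and the function names, the Phred+33 quality encoding, and the "cut at the first failing base or first match" behaviour are all assumptions chosen for illustration; the actual tool may apply these criteria differently.

```python
import re


def phred_scores(quality, offset=33):
    """Decode a FASTQ quality string into integer Phred scores.

    Assumes Phred+33 encoding by default (an assumption, not a SeQCoS setting).
    """
    return [ord(ch) - offset for ch in quality]


def trim_by_quality(seq, qual, min_q, offset=33):
    """Cut the read at the first base whose score falls below min_q."""
    for i, q in enumerate(phred_scores(qual, offset)):
        if q < min_q:
            return seq[:i], qual[:i]
    return seq, qual


def trim_by_pattern(seq, qual, pattern):
    """Remove the first regex match and everything after it."""
    m = re.search(pattern, seq)
    if m is None:
        return seq, qual
    return seq[:m.start()], qual[:m.start()]


def keep_by_length(seq, min_len):
    """Return True if the read is long enough to keep."""
    return len(seq) >= min_len


# Hypothetical read: the fifth base has a very low quality score ('#').
seq, qual = trim_by_quality("ACGTACGT", "IIII#III", min_q=20)
print(seq, qual, keep_by_length(seq, 5))
```

In this sketch a read is trimmed first and then checked against the minimum length, so a read that becomes too short after trimming would be discarded.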
SeQCoS was written in C# using .NET 4.0. One of the goals of this project was to develop an application that integrates with .NET Bio, an open-source bioinformatics library. Hence, this application takes advantage of functionality offered by .NET Bio for handling sequence data, as well as Sho, a data analysis and visualization application. The SeQCoS GUI was developed using Windows Presentation Foundation 4.0.
Currently, the input and output formats supported by SeQCoS are limited to FASTA and FASTQ. For FASTA, only sequence-level analysis is performed. Future support for other sequence formats (e.g. BAM/SAM) is planned.
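
The difference between the two formats can be illustrated with a short Python sketch (again, not SeQCoS code; the function names and the Phred+33 encoding are assumptions). GC content needs only the bases, so it can be computed for both FASTA and FASTQ input, whereas a position-specific quality summary needs the per-base quality strings that only FASTQ carries.

```python
def gc_content(seq):
    """Fraction of G and C bases in a sequence (works for FASTA or FASTQ)."""
    if not seq:
        return 0.0
    s = seq.upper()
    return (s.count("G") + s.count("C")) / len(s)


def mean_quality_by_position(quals, offset=33):
    """Mean Phred score at each read position (FASTQ only).

    quals is a list of quality strings; reads may differ in length, so each
    position is averaged over the reads that are long enough to cover it.
    """
    totals, counts = [], []
    for q in quals:
        for i, ch in enumerate(q):
            if i == len(totals):
                totals.append(0)
                counts.append(0)
            totals[i] += ord(ch) - offset
            counts[i] += 1
    return [t / c for t, c in zip(totals, counts)]


print(gc_content("GGCCAATT"))
print(mean_quality_by_position(["II", "5"]))
```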
* Windows 7 or newer (older versions of Windows should work but have not been tested)
* Microsoft .NET 4.0
* Sho 2.0.5 or higher (tested on 2.0.5)
* Standalone blast+ (NCBI BLAST for Windows) - required only for executing BLAST searches; otherwise optional
* 4 GB RAM or more
for the latest release or view the Source Code
- Download UniVec pre-formatted BLAST databases
Please visit the Documentation page for more info. Please report any bugs using the Issue Tracker.
SeQCoS was written by Kevin Ha, currently a PhD candidate in the Department of Molecular Genetics, University of Toronto, Toronto, Canada. E-mail (requires reCAPTCHA)
The SeQCoS project was initiated by the Microsoft Biology Initiative group. Kevin started and developed the project while interning at Microsoft Research (Redmond, WA) in the summer of 2011. SeQCoS is an open source project licensed under Apache 2.0 (see License).