"Cookiecutter: A Tool for Kmer-Based Read Filtering and Extraction" by Ekaterina Starostina

Selected Works of Stephen O'Brien

Article

Cookiecutter: A Tool for Kmer-Based Read Filtering and Extraction

bioRxiv

Ekaterina Starostina, Bioinformatics Institute - Russia; St. Petersburg State University - Russia
Gaik Tamazian, St. Petersburg State University - Russia
Pavel Dobrynin, St. Petersburg State University - Russia
Stephen J. O'Brien, St. Petersburg State University - Russia; Nova Southeastern University
Aleksey Komissarov, St. Petersburg State University - Russia

Document Type

Article

Publication Date

8-14-2015

Abstract

Motivation: K-mer-based analysis is a powerful method used in read error correction and implemented in various genome assembly tools. A number of read processing routines include extracting or removing sequence reads from the results of high-throughput sequencing experiments prior to further analysis. Here we present a new approach to sorting or filtering raw reads based on a provided list of k-mers.

Results: We developed Cookiecutter, a computational tool for rapid read extraction or removal according to a provided list of k-mers generated from a FASTA file. Cookiecutter is based on an implementation of the Aho-Corasick algorithm and is useful in routine processing of high-throughput sequencing datasets. Cookiecutter can be used both for removing undesirable reads and for extracting reads from a user-defined region of interest.
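The general idea of k-mer-based read filtering can be illustrated with a minimal Python sketch. This is not Cookiecutter's code or interface: it replaces the Aho-Corasick automaton with a plain set lookup (which gives the same match decision when all k-mers have the same length), ignores reverse complements, and uses hypothetical function names and toy sequences chosen only for illustration.

```python
from typing import Iterable, List, Set, Tuple


def kmers_from_sequence(seq: str, k: int) -> Set[str]:
    """Collect every length-k substring of a reference sequence."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def read_matches(read: str, kmers: Set[str], k: int) -> bool:
    """Return True if the read contains at least one k-mer from the set."""
    read = read.upper()
    return any(read[i:i + k] in kmers for i in range(len(read) - k + 1))


def split_reads(
    reads: Iterable[Tuple[str, str]], kmers: Set[str], k: int
) -> Tuple[List[str], List[str]]:
    """Split (name, sequence) pairs into matched and unmatched read names."""
    matched: List[str] = []
    unmatched: List[str] = []
    for name, seq in reads:
        (matched if read_matches(seq, kmers, k) else unmatched).append(name)
    return matched, unmatched


# Toy data (assumed for illustration; not taken from the paper).
K = 5
reference = "ACGTACGTTAGC"
reads = [("r1", "GGGTACGTTCCC"), ("r2", "TTTTTTTTTTTT")]

kmer_set = kmers_from_sequence(reference, K)
matched, unmatched = split_reads(reads, kmer_set, K)
print("matched:", matched)
print("unmatched:", unmatched)
```

Keeping the matched list corresponds to read extraction, while keeping the unmatched list corresponds to removing undesirable reads.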

Availability: The open-source implementation with user instructions can be obtained from GitHub: https://github.com/ad3002/Cookiecutter

Comments

The copyright holder for this preprint is the author/funder. It is made available under a CC-BY 4.0 International license.

ORCID ID

0000-0001-7353-8301

ResearcherID

N-1726-2015

DOI

10.1101/024679

Citation Information

Ekaterina Starostina, Gaik Tamazian, Pavel Dobrynin, Stephen J. O'Brien, et al. "Cookiecutter: A Tool for Kmer-Based Read Filtering and Extraction" bioRxiv (2015) p. 1-6
Available at: http://works.bepress.com/stephen-obrien/163/