This is the README file for MCclassifier

Copyright (C) 2016 Pawel Gajer (pgajer@gmail.com) and Jacques Ravel (jravel@som.umaryland.edu)

Permission to use, copy, modify, and distribute this software and its
documentation with or without modifications and for any purpose and
without fee is hereby granted, provided that any copyright notices
appear in all copies and that both those copyright notices and this
permission notice appear in supporting documentation, and that the
names of the contributors or copyright holders not be used in
advertising or publicity pertaining to distribution of the software
without specific prior permission.

THE CONTRIBUTORS AND COPYRIGHT HOLDERS OF THIS SOFTWARE DISCLAIM ALL
WARRANTIES WITH REGARD TO THIS SOFTWARE, INCLUDING ALL IMPLIED
WARRANTIES OF MERCHANTABILITY AND FITNESS, IN NO EVENT SHALL THE
CONTRIBUTORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY SPECIAL, INDIRECT
OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES WHATSOEVER RESULTING FROM LOSS
OF USE, DATA OR PROFITS, WHETHER IN AN ACTION OF CONTRACT, NEGLIGENCE
OR OTHER TORTIOUS ACTION, ARISING OUT OF OR IN CONNECTION WITH THE USE
OR PERFORMANCE OF THIS SOFTWARE.


Introduction
************

MC classifier is a very fast species-level classifier of 16S rRNA gene
fragments. For each 16S rRNA amplicon region, a reference set of high-quality
sequences known to be found in a given environment is used to build higher-order
Markov chain models for each taxonomic rank, from the reference species through
the corresponding genera up to the phylum level. A query sequence is first
classified at the phylum level and then through successively lower taxonomic
ranks. At each level, the most probable model is accepted if the log odds of the
sequence having been generated by that model are above a certain threshold.
Because MC classifier is so fast, all reads can be classified, even in a large
project.
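
The top-down scheme described above can be illustrated with a small Python
sketch. It is not the code of the classify program. It is a toy version under
stated assumptions: add-one pseudocounts, a uniform fallback for k-mer contexts
never seen in the reference set, and log odds measured against a uniform
background model. All names (train_markov_model, log_prob, classify_top_down,
the toy taxa and the threshold value) are hypothetical.

```python
import math
from collections import defaultdict

ALPHABET = "ACGT"
UNIFORM_LOGP = math.log(1.0 / len(ALPHABET))

def train_markov_model(seqs, order, pseudocount=1.0):
    """Estimate log P(base | preceding `order` bases) from reference sequences."""
    counts = defaultdict(lambda: dict.fromkeys(ALPHABET, 0.0))
    for seq in seqs:
        for i in range(order, len(seq)):
            context, base = seq[i - order:i], seq[i]
            # Skip positions that contain ambiguous bases such as N.
            if base in ALPHABET and all(c in ALPHABET for c in context):
                counts[context][base] += 1.0
    model = {}
    for context, base_counts in counts.items():
        total = sum(base_counts.values()) + pseudocount * len(ALPHABET)
        model[context] = {b: math.log((n + pseudocount) / total)
                          for b, n in base_counts.items()}
    return model

def log_prob(model, seq, order):
    """Log probability of seq under the model (assumed: uniform if unseen)."""
    return sum(model.get(seq[i - order:i], {}).get(seq[i], UNIFORM_LOGP)
               for i in range(order, len(seq)))

def classify_top_down(seq, tree, models, thresholds, order, root="root"):
    """Descend the taxonomy while the best child's log odds pass its threshold."""
    background = (len(seq) - order) * UNIFORM_LOGP  # assumed background model
    node = root
    while tree.get(node):
        scores = {child: log_prob(models[child], seq, order)
                  for child in tree[node]}
        best = max(scores, key=scores.get)
        if scores[best] - background < thresholds.get(best, 0.0):
            break  # stop at the last rank that was accepted
        node = best
    return node

# Toy data: one phylum with two species, order-2 models.
ORDER = 2
refs = {"sp_A": ["ACGTACGTACGTACGT"], "sp_B": ["AAAACCCCAAAACCCC"]}
tree = {"root": ["phylum_X"], "phylum_X": ["sp_A", "sp_B"]}
models = {name: train_markov_model(seqs, ORDER) for name, seqs in refs.items()}
models["phylum_X"] = train_markov_model(refs["sp_A"] + refs["sp_B"], ORDER)
thresholds = {name: 1.0 for name in models}

for query in ("ACGTACGTACGT", "AAAACCCCAAAA", "GGGGGGGGGGGG"):
    print(query, classify_top_down(query, tree, models, thresholds, ORDER))
```

In this toy setup a query that no model explains better than the background
stays at the root, which mirrors the idea of leaving a read unclassified below
the last rank whose threshold it passed.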


Installation
************

Assuming that you are in the top level directory (containing this README file),
run

cd src
make

This will generate an executable 'classify' in the bin subdirectory of the top
level directory. Copy the executable to a directory included in your PATH
variable, or add the path of the bin directory to your PATH variable.


Usage
*****

Here is an example of running classify on a FASTA file of 10,000 vaginal Illumina
sequences from the V3-V4 region:

   classify --rev-comp -i test10k.fa -d vaginal_319_806_rc_MCo7p2 -o mcDir

The count_tbl.pl script (located in the bin directory) can be used to generate a
sample x count contingency table:

   count_tbl.pl -i mcDir/MC.order7.results.txt -o mcDir/spp.count.tbl.txt



The MC classifier can be run in a mode where all sequences are classified to
the species level by suppressing taxon error thresholding. To run MC classifier
in this mode, use the --skip-err-thld flag:

   classify --skip-err-thld -i test10k.fa -d vaginal_319_806_rc_MCo7p2 -o mcDir


To get more information about the classifier's options, run

   classify -h


To build MC models for a specific environment, please contact the author.
