Sunday, June 14, 2009

Clusters of Orthologous Groups

COGs
Phylogenetic classification of proteins encoded in complete genomes
Clusters of Orthologous Groups of proteins (COGs) were delineated by comparing protein sequences encoded in complete genomes, representing major phylogenetic lineages. Each COG consists of individual proteins or groups of paralogs from at least 3 lineages and thus corresponds to an ancient conserved domain.
66 genomes
38 orders
28 classes
14 phyla
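
The membership rule described above (proteins or groups of paralogs from at least 3 lineages) can be illustrated with a minimal Python sketch. The function names, the `(protein_id, lineage)` pairs, and the example identifiers are hypothetical and chosen only for illustration. Treating a phylum as a "lineage" is likewise an assumption. This is not the actual COG construction procedure, which relies on genome-wide sequence comparison.

```python
MIN_LINEAGES = 3  # threshold stated in the COG description


def count_lineages(members):
    """Return the number of distinct lineages among (protein_id, lineage) pairs."""
    return len({lineage for _protein_id, lineage in members})


def is_valid_cog(members, min_lineages=MIN_LINEAGES):
    """True if the candidate cluster spans at least `min_lineages` lineages.

    Paralogs from the same lineage add members but not lineages.
    """
    return count_lineages(members) >= min_lineages


# Hypothetical candidate clusters (protein IDs are made up).
cluster_a = [("p1", "Euryarchaeota"), ("p2", "Firmicutes"),
             ("p3", "Proteobacteria"), ("p4", "Proteobacteria")]
cluster_b = [("q1", "Firmicutes"), ("q2", "Firmicutes"), ("q3", "Chlamydiae")]

print(is_valid_cog(cluster_a))  # True
print(is_valid_cog(cluster_b))  # False
```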

Unicellular clusters (FTP, initial version)
Science 1997 Oct 24;278(5338):631-7
BMC Bioinformatics 2003 Sep 11;4(1):41

Euryarchaeota
Methanobacteriales Mth
Methanococcales Mja
Halobacteriales Hbs
Thermoplasmatales Tac Tvo
Thermococcales Pho Pab
Archaeoglobales Afu
Methanopyrales Mka
Methanosarcinales Mac

Crenarchaeota
Thermoproteales Pya
Sulfolobales Sso
Desulfurococcales Ape

Ascomycota
Saccharomycetales Sce
Schizosaccharomycetales Spo

Microsporidia
Apansporoblastina Ecu

Aquificae
Aquificales Aae

Thermotogae
Thermotogales Tma

Cyanobacteria
Nostocales Nos
Chroococcales Syn

Deinococcus-Thermus
Deinococcales Dra

Fusobacteria
Fusobacteriales Fnu

Spirochaetes
Spirochaetales Tpa Bbu

Chlamydiae
Chlamydiales Ctr Cpn

Actinobacteria
Actinomycetales Cgl Mtu MtC Mle

Firmicutes
Clostridiales Cac
Bacillales Sau Lin Bsu Bha
Lactobacillales Lla Spy Spn
Mycoplasmatales Uur Mpu Mpn Mge

Proteobacteria
Pseudomonadales Pae
Enterobacteriales Eco EcZ Ecs Ype Sty Buc
Xanthomonadales Xfa
Vibrionales Vch
Pasteurellales Hin Pmu
Burkholderiales Rso
Neisseriales Nme NmA
Campylobacterales Hpy jHp Cje
Caulobacterales Ccr
Rhizobiales Atu Sme Bme Mlo
Rickettsiales Rpr Rco
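
As a cross-check, the listing above can be transcribed into a nested mapping and tallied against the summary counts (66 genomes, 38 orders, 14 phyla). This is only a sketch: the names `COG_GENOMES` and `tally` are made up for illustration, and the class count (28) cannot be checked this way because the listing does not give classes.

```python
# Phylum -> order -> genome codes, transcribed from the listing above.
COG_GENOMES = {
    "Euryarchaeota": {
        "Methanobacteriales": ["Mth"],
        "Methanococcales": ["Mja"],
        "Halobacteriales": ["Hbs"],
        "Thermoplasmatales": ["Tac", "Tvo"],
        "Thermococcales": ["Pho", "Pab"],
        "Archaeoglobales": ["Afu"],
        "Methanopyrales": ["Mka"],
        "Methanosarcinales": ["Mac"],
    },
    "Crenarchaeota": {
        "Thermoproteales": ["Pya"],
        "Sulfolobales": ["Sso"],
        "Desulfurococcales": ["Ape"],
    },
    "Ascomycota": {
        "Saccharomycetales": ["Sce"],
        "Schizosaccharomycetales": ["Spo"],
    },
    "Microsporidia": {"Apansporoblastina": ["Ecu"]},
    "Aquificae": {"Aquificales": ["Aae"]},
    "Thermotogae": {"Thermotogales": ["Tma"]},
    "Cyanobacteria": {"Nostocales": ["Nos"], "Chroococcales": ["Syn"]},
    "Deinococcus-Thermus": {"Deinococcales": ["Dra"]},
    "Fusobacteria": {"Fusobacteriales": ["Fnu"]},
    "Spirochaetes": {"Spirochaetales": ["Tpa", "Bbu"]},
    "Chlamydiae": {"Chlamydiales": ["Ctr", "Cpn"]},
    "Actinobacteria": {"Actinomycetales": ["Cgl", "Mtu", "MtC", "Mle"]},
    "Firmicutes": {
        "Clostridiales": ["Cac"],
        "Bacillales": ["Sau", "Lin", "Bsu", "Bha"],
        "Lactobacillales": ["Lla", "Spy", "Spn"],
        "Mycoplasmatales": ["Uur", "Mpu", "Mpn", "Mge"],
    },
    "Proteobacteria": {
        "Pseudomonadales": ["Pae"],
        "Enterobacteriales": ["Eco", "EcZ", "Ecs", "Ype", "Sty", "Buc"],
        "Xanthomonadales": ["Xfa"],
        "Vibrionales": ["Vch"],
        "Pasteurellales": ["Hin", "Pmu"],
        "Burkholderiales": ["Rso"],
        "Neisseriales": ["Nme", "NmA"],
        "Campylobacterales": ["Hpy", "jHp", "Cje"],
        "Caulobacterales": ["Ccr"],
        "Rhizobiales": ["Atu", "Sme", "Bme", "Mlo"],
        "Rickettsiales": ["Rpr", "Rco"],
    },
}


def tally(genomes):
    """Return (genome count, order count, phylum count) for the mapping."""
    n_phyla = len(genomes)
    n_orders = sum(len(by_order) for by_order in genomes.values())
    n_genomes = sum(len(codes)
                    for by_order in genomes.values()
                    for codes in by_order.values())
    return n_genomes, n_orders, n_phyla


print(tally(COG_GENOMES))  # (66, 38, 14)
```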

Upcoming microbial genomes
261 genomes
126 genera
63 orders
33 classes
17 phyla
[N] Nanoarchaeota
[A] Euryarchaeota (8)

* Methanobacteria
* Methanococci
* Methanomicrobia
* Halobacteria
* Thermoplasmata
* Thermococci
* Archaeoglobi
* Methanopyri
[R] Crenarchaeota (3)
[D] Deinococcus-Thermus (2)
[T] Actinobacteria (3)
[P] Proteobacteria (26)

α (6)
β (5)
γ (10)
δ (4)
ε (1)
[O] Other (9)
* Bacteroidetes
* Chlorobi
* Fusobacteria
* Aquificae
* Chloroflexi
* Thermotogae
* Planctomycetes
* Spirochaetes
* Chlamydiae
[F] Firmicutes (7)

Mollicutes (3)
Bacilli (2)
Clostridia (2)
[C] Cyanobacteria (4)

* Gloeobacteria
* Nostocali
* Prochlorali
* Chroococcali
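
The parenthesized counts in the upcoming-genomes listing can be checked for internal consistency with a short Python sketch. Two assumptions are made here that the source does not state: the [N] entry, which carries no count, is taken as 1, and the per-group counts are read as numbers of orders. Under that reading the counts add up to the 63 orders given in the summary. The variable names are hypothetical.

```python
# Per-group counts as listed; keys are the bracketed one-letter codes.
UPCOMING_COUNTS = {
    "N": 1,   # ASSUMPTION: no count is given for [N]; taken as 1
    "A": 8,
    "R": 3,
    "D": 2,
    "T": 3,
    "P": 26,
    "O": 9,
    "F": 7,
    "C": 4,
}

# Subdivision counts listed under [P] and [F].
PROTEOBACTERIA = {"alpha": 6, "beta": 5, "gamma": 10, "delta": 4, "epsilon": 1}
FIRMICUTES = {"Mollicutes": 3, "Bacilli": 2, "Clostridia": 2}

# The subdivision counts should add up to their group totals.
proteo_ok = sum(PROTEOBACTERIA.values()) == UPCOMING_COUNTS["P"]
firmi_ok = sum(FIRMICUTES.values()) == UPCOMING_COUNTS["F"]

total = sum(UPCOMING_COUNTS.values())
print(proteo_ok, firmi_ok, total)  # True True 63
```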

Eukaryotic clusters (FTP)
Code Name Abbreviation
A Arabidopsis thaliana (thale cress) ath
C Caenorhabditis elegans (worm) cel
D Drosophila melanogaster (fruit fly) dme
H Homo sapiens (human) hsa
Y Saccharomyces cerevisiae (baker's yeast) sce
P Schizosaccharomyces pombe (fission yeast) spo
E Encephalitozoon cuniculi (Microsporidia) ecu
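
The one-letter codes in the table above lend themselves to a simple lookup. The following Python sketch holds the seven listed genomes in a dictionary and expands a string of codes into species names. The names `Genome`, `EUKARYOTIC_GENOMES`, and `expand_codes` are hypothetical, introduced only for illustration.

```python
from typing import NamedTuple


class Genome(NamedTuple):
    species: str
    common_name: str
    abbreviation: str


# One-letter code -> genome, transcribed from the table above.
EUKARYOTIC_GENOMES = {
    "A": Genome("Arabidopsis thaliana", "thale cress", "ath"),
    "C": Genome("Caenorhabditis elegans", "worm", "cel"),
    "D": Genome("Drosophila melanogaster", "fruit fly", "dme"),
    "H": Genome("Homo sapiens", "human", "hsa"),
    "Y": Genome("Saccharomyces cerevisiae", "baker's yeast", "sce"),
    "P": Genome("Schizosaccharomyces pombe", "fission yeast", "spo"),
    "E": Genome("Encephalitozoon cuniculi", "Microsporidia", "ecu"),
}


def expand_codes(codes):
    """Translate a string of one-letter codes into a list of species names.

    Raises KeyError for a code that is not in the table.
    """
    return [EUKARYOTIC_GENOMES[code].species for code in codes]


print(expand_codes("HY"))  # ['Homo sapiens', 'Saccharomyces cerevisiae']
```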

Upcoming eukaryotic genomes
O Oryza sativa (rice) osa
Q Anopheles gambiae (mosquito) aga
Z Pan troglodytes (chimpanzee) ptr
W Canis familiaris (dog) cfa
M Mus musculus (mouse) mmu
R Rattus norvegicus (rat) rno
Ascomycota genomes, including:
L Magnaporthe grisea mgr
N Neurospora crassa ncr

