Protein
|
Each protein is identified by a combination of trivial names, gene names, and locus_tags. In addition, a standard name following the suggested nomenclature [1] may be shown in bold. Gene names, locus_tags and standard names are those given by the authors or by researchers who further characterized the corresponding protein. The CAZy team may edit protein names for consistency. Examples are:
|
| |
| Trivial name | α-amylase |
| Gene Name | AmyA |
| Locus_Tag name | TM1840 |
| Standard name | Amy13A |
|
| |
| [1] Henrissat B, Teeri TT, Warren RA (1998) A scheme for designating enzymes that hydrolyse the polysaccharides in the cell walls of plants. FEBS Lett 425:352-354 [PMID:] |
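As an illustration of how the standard name in the example table differs from the gene name and locus_tag, here is a minimal sketch that splits a name such as Amy13A into its parts. It assumes standard names take the shape of that example (a three-letter activity abbreviation, the family number, then a capital letter); the function name `parse_standard_name` and the exact pattern are hypothetical and are not part of CAZy.

```python
import re
from typing import NamedTuple, Optional


class StandardName(NamedTuple):
    enzyme: str  # activity abbreviation, e.g. "Amy"
    family: int  # family number, e.g. 13
    suffix: str  # capital letter distinguishing enzymes, e.g. "A"


# Assumed shape: capital + two lowercase letters, digits, one capital letter.
_STANDARD_NAME = re.compile(r"^([A-Z][a-z]{2})(\d+)([A-Z])$")


def parse_standard_name(name: str) -> Optional[StandardName]:
    """Return the parts of a standard name, or None if the shape differs."""
    match = _STANDARD_NAME.match(name.strip())
    if match is None:
        return None
    return StandardName(match.group(1), int(match.group(2)), match.group(3))


if __name__ == "__main__":
    # The three identifiers from the example table above.
    for candidate in ("Amy13A", "AmyA", "TM1840"):
        print(candidate, parse_standard_name(candidate))
```

Under these assumptions only Amy13A parses; the gene name AmyA and the locus_tag TM1840 return `None`. Real entries may deviate from this shape, so a failed parse does not show that a name is wrong.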

|
EC
|
Proteins with a biochemical characterization are assigned an EC number according to NC-IUBMB rules. Where no appropriate full EC number exists, a partial EC number (EC category) is given. Because EC numbers are often assigned improperly, on the basis of sequence similarity alone (as is common for sequences from genome projects), such EC numbers are often removed until experimental evidence is described in the literature or in another reliable source. The CAZy team tries to ensure that EC descriptions reflect unambiguous experimental characterization. The EC information is linked to IntEnz.
|
| |
Full EC: 3.2.1.1 (as in "regular" α-amylase) |
| Partial EC: 3.2.1.- (as in sucrose hydrolase) |
|
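The distinction between full and partial EC numbers shown above can be checked mechanically. The following is a minimal sketch, assuming four dot-separated fields in which a dash stands for an unassigned level and dashes may only appear at the end; the function name `classify_ec` is hypothetical, and provisional numbers (for example those with an "n" prefix in the last field) are deliberately not handled.

```python
import re

# Four fields; the first must be a number, the others a number or "-".
# An optional leading "EC" is tolerated.
_EC_PATTERN = re.compile(r"^(?:EC\s*)?(\d+)\.(\d+|-)\.(\d+|-)\.(\d+|-)$")


def classify_ec(ec: str) -> str:
    """Return "full", "partial", or "invalid" for an EC string."""
    match = _EC_PATTERN.match(ec.strip())
    if match is None:
        return "invalid"
    seen_dash = False
    for field in match.groups():
        if field == "-":
            seen_dash = True
        elif seen_dash:
            # A number after a dash (e.g. 3.2.-.1) is not a valid category.
            return "invalid"
    return "partial" if seen_dash else "full"


if __name__ == "__main__":
    for value in ("3.2.1.1", "3.2.1.-", "3.2.-.1"):
        print(value, classify_ec(value))
```

With the two examples from the table, `3.2.1.1` is classified as full and `3.2.1.-` as partial.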

|
Organism
|
| The systematic names of organisms are assigned according to the NCBI Taxonomy and, where available, complemented by extra information (strain, serovar, variant, etc.). This information is linked directly to the appropriate Tax_id or, for fully sequenced genomes analyzed by CAZy, to the CAZy genome summary. The latter are identified in bold. Information on the same protein in different organisms is given in separate entries. The CAZy team changes the Tax_id of individual proteins if this is justified by the scientific or patent literature. We will continue our efforts to follow the significant rearrangements in Tax_id numbers issued by the NCBI. |

|
GenBank / GenPept
|
| Accession codes for GenBank (or EMBL or DDBJ; all nucleotide accessions) are provided in addition to GenPept (or PID, Protein Identification Number) codes. These entries come from our daily analysis of protein sequences publicly released by the NCBI. As individual protein sequences may vary in length, quality and content, the "best" model is identified in bold. Sequences from patents and other external sources may have no corresponding GenBank (or nucleotide) code. Occasionally, sequences from RefSeq are integrated here, but such entries only compensate for delays in the release of genomic information. Sequences deposited without an identified coding sequence, or derived from pseudogenes, may have no corresponding PID. |

|
UniProt
|
| Accessions from the UniProt database (covering SwissProt, TrEMBL and PIR) are given for convenience only. The many individual differences in taxonomy relative to our main (GenPept) reference, and the frequent merging or splitting of entries, result in irregular updates of these data by the CAZy team. Efforts to remedy this may be undertaken in the future if external interest is significant enough to justify them. |

|
PDB / 3D
|
| Structural biology efforts to characterize CAZy proteins are summarized here. PDB accession codes are provided for proteins with characterized structures corresponding to the present family. This information reflects the weekly analysis of new PDB releases, as well as other information on pending proteins that already have an accession code. PDB codes that are available for the same protein but do not correspond to the family currently displayed are not provided here; they may be available under the other CAZy families to which they correspond. Public reports of protein crystallization are also listed (cryst). |
 |