Genomic and transcriptomic analysis of the Asian honeybee Apis cerana provides novel insights into honeybee biology



Apis cerana is endemic to Asia, where six morphologically distinct subspecies are distributed across a range of climatic zones, and it has been used for pollination and commercial beekeeping in Asia for thousands of years. As close relatives, A. cerana and A. mellifera are very similar in morphology and behavior. Nevertheless, A. cerana has several distinct biological characteristics compared with A. mellifera.

Here we present a high-quality assembly and annotation of the genome of the southern strain of A. cerana. We sequenced 23.7 gigabases (Gb) of Illumina paired-end reads, 8.7 Gb of Illumina mate-pair reads, and 636 megabases (Mb) of 454 GS FLX shotgun sequences using genomic DNA extracted from two haploid drone pupae from one A. cerana colony. A total of 9,902 contigs spanning 209.2 Mb were assembled, ranging from 500 bp to 535,036 bp with an N50 of 67,909 bp. These contigs were joined into 890 scaffolds with a total length of 229.5 Mb and an N50 of 1.5 Mb.
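The contig and scaffold N50 values quoted above can be recomputed from the FASTA files. The following is a minimal sketch, not the authors' pipeline: it assumes the files are uncompressed plain-text FASTA, and the function names sequence_lengths and n50 are illustrative only.

```python
from typing import Iterable, Iterator, TextIO


def sequence_lengths(handle: TextIO) -> Iterator[int]:
    """Yield the length of each record in an open FASTA file."""
    length = None  # None until the first '>' header is seen
    for line in handle:
        line = line.strip()
        if line.startswith(">"):
            if length is not None:
                yield length
            length = 0
        elif line and length is not None:
            length += len(line)
    if length is not None:
        yield length


def n50(lengths: Iterable[int]) -> int:
    """Return the N50: the largest length L such that sequences of
    length >= L together cover at least half of the total length."""
    ordered = sorted(lengths, reverse=True)
    half = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half:
            return length
    return 0  # empty input


# Hypothetical usage (file name taken from the file description):
# with open("Acc-contigs.fas") as fh:
#     print(n50(sequence_lengths(fh)))
```

Sorting from longest to shortest and stopping once half the total is reached follows the usual definition of N50; a value computed this way may differ slightly from the published figure if the assembler excluded or filtered sequences differently.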

Transcriptome sequencing of mixed brains of A. cerana workers produced 469,162 ESTs with an average length of 390 bp. Of these ESTs, 98% could be mapped to the A. cerana genome, suggesting that the assembly covers most genes. By combining a gene prediction program based on EST alignments (Gnomon) and three ab initio prediction programs (GeneMark.hmm, Augustus and Snap) based on the A. mellifera model, we identified 10,182 protein-encoding genes in the A. cerana genome, with an average gene size of 7,577 bp and an average CDS size of 1,695 bp.
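An average gene size such as the one reported above can be estimated from a GFF annotation. The sketch below relies only on the standard GFF column layout (feature type in column 3, start in column 4, end in column 5, tab-separated, 1-based inclusive coordinates). That the annotation file labels its gene records with the type "gene" is an assumption, and mean_feature_length is an illustrative name.

```python
from typing import Iterable


def mean_feature_length(lines: Iterable[str], feature_type: str = "gene") -> float:
    """Return the mean length of all GFF features of the given type."""
    total = 0
    count = 0
    for line in lines:
        # Skip blank lines and comment/directive lines.
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 5 or fields[2] != feature_type:
            continue
        start, end = int(fields[3]), int(fields[4])
        total += end - start + 1  # GFF coordinates are inclusive
        count += 1
    return total / count if count else 0.0


# Hypothetical usage:
# with open("Acc-Gene.gff") as fh:
#     print(mean_feature_length(fh, "gene"))
```

Note that this averages individual features. A per-gene CDS size would instead require summing the CDS segments that belong to each gene before averaging, which depends on how the attributes column links CDS records to their parent.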


File Description:

Acc-Genome.fas          The 890 scaffolds of the Apis cerana genome
Acc-contigs.fas         The 9,902 contigs of the Apis cerana genome
Acc-Gene.fas            The nucleotide sequences of the predicted genes
Acc-Pr.fas              The protein sequences of the predicted genes
Acc-Gene.gff            The GFF file of the predicted genes
Acc-Annotation.xlsx     The annotation file of the predicted genes
Acc-cDNA.fas            The 469,162 ESTs of A. cerana workers
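
After downloading, the record counts stated in the file description can be used as a quick integrity check. This is a minimal sketch under two assumptions: the files are uncompressed FASTA with one ">" header line per record, and they sit together in one directory. The names EXPECTED_RECORDS, count_fasta_records and check_counts are illustrative.

```python
from pathlib import Path
from typing import Dict, Iterable, Tuple

# Record counts stated in the description above.
EXPECTED_RECORDS = {
    "Acc-Genome.fas": 890,
    "Acc-contigs.fas": 9902,
    "Acc-cDNA.fas": 469162,
}


def count_fasta_records(lines: Iterable[str]) -> int:
    """Count FASTA records by counting header lines."""
    return sum(1 for line in lines if line.startswith(">"))


def check_counts(directory: str = ".") -> Dict[str, Tuple[int, int]]:
    """Map each file found in `directory` to (observed, expected) counts.

    Files that are not present are skipped rather than reported.
    """
    results = {}
    for name, expected in EXPECTED_RECORDS.items():
        path = Path(directory) / name
        if not path.is_file():
            continue
        with path.open() as handle:
            results[name] = (count_fasta_records(handle), expected)
    return results


# Hypothetical usage:
# for name, (observed, expected) in check_counts(".").items():
#     print(name, "OK" if observed == expected else "MISMATCH", observed, expected)
```

The gene and protein files are left out of the table on purpose: the description gives 10,182 predicted genes, but it does not say whether those files hold exactly one sequence per gene.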