We (the medaka genome sequencing project) hereby make the draft assembly
of the medaka (Oryzias latipes) Hd-rR genome publicly available, by
providing a BLAST search function, ahead of scientific publication. We reserve the
exclusive right to publish, in a timely manner, the assembly and sequence of the
medaka genome along with an initial genome-scale analysis. Reserved analyses
include the identification of complete sets of genomic features such as genes,
gene families, regulatory elements, repeat structures, GC content, etc., and the
identification of regions of evolutionary conservation across the entire genome.

All users may search and use the draft assembly freely, subject to the
restrictions in the previous paragraph. Because the current version is still
preliminary and may contain errors, users use the data at their own risk. Users
are not allowed to redistribute or repackage the data. When users publish
analyses of individual genes or genomic regions based on data from this site,
they should include the acknowledgement "The data has been provided freely by
the National Institute of Genetics and the University of Tokyo for use in this

Finally, we are continuing to improve the assembly, so any feedback on the
assembly from users is highly welcome.

A BLAST search with a long query sequence may take a considerable time to
return results. We recommend that users minimize the length of the query
sequence.
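
One way to follow this recommendation is to cut a long sequence into shorter,
overlapping windows and submit each window as a separate query. The sketch below
is only an illustration: the function name split_query and the window and
overlap sizes are assumptions made for this example, not limits or tools
provided by this site.

```python
from typing import Iterator, Tuple


def split_query(seq: str, window: int = 1000,
                overlap: int = 100) -> Iterator[Tuple[int, str]]:
    """Yield (start_offset, subsequence) windows that together cover seq.

    The default window and overlap are illustrative values only; they are
    not limits set by the BLAST service.
    """
    if window <= 0 or not 0 <= overlap < window:
        raise ValueError("need window > 0 and 0 <= overlap < window")
    if not seq:
        return
    step = window - overlap  # distance between consecutive window starts
    start = 0
    while True:
        yield start, seq[start:start + window]
        if start + window >= len(seq):
            break  # this window already reaches the end of the sequence
        start += step


if __name__ == "__main__":
    example = "ACGT" * 625  # a made-up 2500-base query
    for offset, chunk in split_query(example):
        print(offset, len(chunk))
```

Hit coordinates reported for a window are relative to that window, so the
start offset has to be added back to place a hit on the original sequence.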

The current assembly was produced by the newly developed RAMEN assembler from
6.7X coverage of shotgun reads (mostly from a 2Kb insert library). Comparison of
the assembly with the finished sequences of some BAC clones confirmed the
accuracy of the assembly and suggested a genome coverage of 91% to 99%.
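
To give a sense of scale for the 6.7X figure, the sketch below multiplies the
read depth by the genome size of ca. 800Mb quoted in the project overview, which
gives roughly 5.36 Gb of read sequence. It also evaluates the idealized
Lander-Waterman expectation, 1 - exp(-c), for the fraction of a genome touched
by at least one read at depth c. That model is an assumption added here for
illustration. It assumes uniformly random reads and ignores repeats, cloning
bias and read quality, so it is not the method behind the BAC-based estimate
above and is not expected to reproduce it.

```python
import math

GENOME_SIZE_BP = 800e6  # "ca. 800Mb", as quoted in the project overview
READ_DEPTH = 6.7        # "6.7X" shotgun reads


def total_read_bases(genome_size_bp: float, depth: float) -> float:
    """Total bases sequenced for a given genome size and read depth."""
    return genome_size_bp * depth


def expected_fraction_covered(depth: float) -> float:
    """Idealized Lander-Waterman fraction of bases hit by at least one read.

    Assumes reads land uniformly at random (Poisson model); real shotgun
    data deviate from this.
    """
    return 1.0 - math.exp(-depth)


if __name__ == "__main__":
    print(f"read bases: {total_read_bases(GENOME_SIZE_BP, READ_DEPTH):.3e}")
    print(f"idealized coverage: {expected_fraction_covered(READ_DEPTH):.4f}")
```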

We are now adding reads from longer insert libraries (10Kb, 40Kb, etc.), which
should yield a much better assembly in the next version. Furthermore, the
next version will be presented in a newly developed genome browser, which will
integrate various kinds of information and enable much faster searches.

As one of the important targets of the group grant project "Genome Science"
(Grant-in-Aid for Scientific Research on Priority Areas supported by the
Ministry of Education, Culture, Sports, Science and Technology of Japan), we
started sequencing the medaka genome (ca. 800Mb) at the Academia
Sequencing Center of the National Institute of Genetics (NIG) in mid-2002. The
strain we chose is a southern inbred strain, Hd-rR, and sequencing is being
conducted by the whole-genome shotgun strategy. Our initial plan is to assemble
6-8X coverage of 2Kb shotgun libraries together with longer insert libraries
(10Kb, 40Kb, etc.) to produce a set of high-quality scaffolds. For this purpose,
we are also developing a brand-new genome assembler and a new genome browser.
Furthermore, we are preparing 3000 or more SNP markers and mapping them
genetically; with these, most of the scaffolds are expected to be arranged
properly on the genome. In this way, we aim to establish an extremely
high-quality draft sequence of the medaka genome by the end of 2004.

Project core members

Yuji Kohara (Shotgun sequencing)
Center for Genetic Resource Information
National Institute of Genetics

Shinichi Morishita (Genome assembler/browser and informatics)
Graduate School of Frontier Sciences,
University of Tokyo

Hiroyuki Takeda (Materials, SNP and EST mapping, and medaka biology)
Graduate School of Science,
University of Tokyo