NIG DNA Sequencing Center -medaka site-


Medaka genome data release policy

We (the medaka genome sequencing project) hereby make the draft assembly of the medaka (Oryzias latipes) Hd-rR genome publicly available, ahead of scientific publication, through a BLAST search function. We reserve the exclusive right to publish, in a timely manner, the assembly and sequence of the medaka genome along with an initial genome-scale analysis. Reserved analyses include the identification of complete sets of genomic features such as genes, gene families, regulatory elements, repeat structures and GC content, and the identification of regions of evolutionary conservation across the entire genome.
All users may search and use the draft assembly freely, subject to the restrictions in the previous paragraph. Because the current version is preliminary and may contain errors, users use the data at their own risk and may not redistribute or repackage the data. Users who publish analyses of individual genes or genomic regions based on data from this site should include the acknowledgement "The data has been provided freely by the National Institute of Genetics and the University of Tokyo for use in this publication/correspondence only."
Finally, we are continuing to improve the assembly, so any feedback on it from users is highly welcome.


A BLAST search with a long query sequence may take a considerably long time to return results. We recommend that users keep query sequences as short as possible.
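
One way to follow this recommendation is to cut a long sequence into shorter, overlapping windows and submit each window as its own query. The sketch below is only an illustration: the helper name split_query and the window and overlap sizes are assumptions chosen for the example, not values prescribed by this site.

```python
def split_query(sequence, window=1000, overlap=100):
    """Split a long query into overlapping windows.

    Hypothetical helper: the default window and overlap sizes are
    illustrative assumptions, not limits imposed by the BLAST service.
    Returns a list of (start_offset, subsequence) tuples.
    """
    if window <= 0 or not 0 <= overlap < window:
        raise ValueError("need window > 0 and 0 <= overlap < window")
    step = window - overlap  # distance between consecutive window starts
    chunks = []
    start = 0
    while start < len(sequence):
        end = min(start + window, len(sequence))
        chunks.append((start, sequence[start:end]))
        if end == len(sequence):
            break  # the final window reached the end of the sequence
        start += step
    return chunks


if __name__ == "__main__":
    long_query = "ACGT" * 625  # a 2500-base example sequence
    for offset, piece in split_query(long_query):
        print(offset, len(piece))
```

The start offset kept with each window makes it possible to map a hit in a window back to its position in the original sequence. The overlap reduces the chance that a match spanning a window boundary is missed.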

Contact information:

Status and schedule

The current assembly was produced with the newly developed RAMEN assembler from 6.7X shotgun reads (mostly from the 2Kb library). Comparison of the assembly with the finished sequences of some BAC clones confirmed its accuracy and suggested a genome coverage of 91% to 99%.
We are now adding reads from longer-insert libraries (10Kb, 40Kb etc.), which should yield a much better assembly in the next version. The next version will also be presented in a newly developed genome browser that will integrate various kinds of information and enable much faster searches.
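
To give a sense of scale for the coverage figures on this page, the following minimal sketch multiplies sequencing depth by genome size. It assumes the approximate genome size of 800Mb stated in the project description; the function name bases_for_coverage is introduced here for illustration only.

```python
def bases_for_coverage(genome_size_bp, coverage):
    """Total sequenced bases implied by a given depth of coverage."""
    return genome_size_bp * coverage


# Approximate genome size (ca. 800Mb) as stated in the project description.
GENOME_SIZE_BP = 800_000_000

if __name__ == "__main__":
    # 6.7X is the depth of the current assembly; 6X and 8X bracket the plan.
    for cov in (6.0, 6.7, 8.0):
        total = bases_for_coverage(GENOME_SIZE_BP, cov)
        print(f"{cov:.1f}X -> {total / 1e9:.2f} Gb of shotgun sequence")
```

Under these assumptions, 6.7X corresponds to roughly 5.4 billion bases of shotgun sequence, and the planned 6-8X range to roughly 4.8 to 6.4 billion bases.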

The medaka genome sequencing project

As one of the important targets of the group grant project "Genome Science" (Grant-in-Aid for Scientific Research on Priority Areas, supported by the Ministry of Education, Culture, Sports, Science and Technology of Japan), we started sequencing the medaka genome (ca. 800Mb) at the Academia Sequencing Center of the National Institute of Genetics (NIG) in mid-2002. The strain we chose is a southern inbred strain, Hd-rR, and sequencing is being conducted using the whole-genome shotgun strategy. Our initial plan is to assemble 6-8X coverage from 2Kb shotgun libraries together with longer-insert libraries (10Kb, 40Kb etc.) to produce a set of high-quality scaffolds. For this purpose, we are also developing a brand-new genome assembler and a new genome browser. Furthermore, we are preparing 3000 or more SNP markers and mapping them genetically, so that most of the scaffolds can be expected to be arranged properly on the genome. In this way, we aim to establish an extremely high-quality draft sequence of the medaka genome by the end of 2004.

Project core members

Yuji Kohara (Shotgun sequencing)
Center for Genetic Resource Information
National Institute of Genetics

Shinichi Morishita (Genome assembler/browser and informatics)
Graduate School of Frontier Sciences,
University of Tokyo

Hiroyuki Takeda (Materials, SNP and EST mapping, and medaka biology)
Graduate School of Science,
University of Tokyo