Analysis Tools

The Human Genome Center (HGC) has constructed and provides various kinds of biological databases. In addition, some major databases constructed by other institutes can also be searched through the entry retrieval system developed at HGC.

The following databases are available.


The Cell System Markup Language (CSML) is an XML format for modeling, visualizing and simulating biopathways. CSML can represent several pathway types, including metabolic, signaling, and genetic regulatory pathways. This project aims to facilitate the exchange of biopathway data between different formats, and effort has been made to convert data from other XML formats. In addition, to keep CSML extensible and flexible, the Cell System Ontology (CSO) has been developed.


Database of human transcriptional start sites and full-length cDNAs (Prof. Sugano and Prof. Nakai)

Full-length cDNA

Full-length cDNA database supported by NEDO (Prof. Sugano's group)

Aberrant Splicing Database

An old collection of aberrant splicing events (i.e., abnormal splicing caused mostly by point mutations and manifested as hereditary diseases), compiled by Prof. Nakai (HGC)


Hintdb: Database of homologous, experimentally determined, protein-protein interactions across 9 species.

HitPredict: Database of predicted true protein-protein interactions in high-throughput interaction datasets.


eF-site is a database of the molecular surfaces of proteins, together with their electrostatic potentials and functional site information. The name stands for "electrostatic surface of functional site".


Rat Genome Map

Radiation hybrid map of the OLETF rat by Otsuka GEN Res. Inst., Otsuka Pharm. Co., Ltd. and others.


Full-malaria is a database of full-length cDNAs of parasites. 5'-end one-pass sequences of the cDNA libraries produced from erythrocytic-stage malaria parasites and from tachyzoites of Toxoplasma parasites are mapped onto the genome sequences. It also contains 5'-end one-pass sequences from the cyst of Echinococcus multilocularis.


ATTED-II is a database of gene coexpression networks in Arabidopsis. The networks are constructed using publicly available microarray data.


COXPRESdb is a database of gene coexpression networks in human, mouse and rat. The networks are constructed using publicly available microarray data.


Database of transcription factors and promoters of Bacillus subtilis, compiled from the literature (Prof. Nakai et al.)


Database of tunicate promoters, transcription factors and conserved regulatory regions.


MBGD is a database for comparative analysis of completely sequenced microbial genomes, the number of which is growing rapidly. The aim of MBGD is to facilitate comparative genomics from various points of view, such as ortholog identification, paralog clustering, motif analysis and gene order comparison. MBGD is now maintained at the National Institute for Basic Biology.


Bacillus subtilis ORF DB by JAFAN (Japan Functional Analysis Network of B. subtilis)