Dataresource

› Database

› Software

SDRF2GRAPH

  • DESCRIPTION: SDRF2GRAPH is an application that produces a graphical image of an investigation design graph (IDG) from SDRFs written in a MAGE-TAB formatted spreadsheet (*.xlsx).
  • REQUIREMENTS: Ruby, rexml, rubyzip, GraphViz.
  • LICENSE: Ruby's license
  • CITATION:
    • Hideya Kawaji et al., "SDRF2GRAPH - a visualization tool of a spreadsheet-based description of experimental processes". BMC Bioinformatics 2009, 10:133
  • AVAILABILITY: http://fantom.gsc.riken.jp/4/sdrf2graph
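As a rough illustration of the idea behind an investigation design graph, the sketch below turns a tab-delimited SDRF table into GraphViz DOT text. It is not SDRF2GRAPH itself (which is written in Ruby and reads *.xlsx files). The function name `sdrf_to_dot`, the tab-delimited input, and the rule that columns ending in "Name" or "File" become nodes while "Protocol REF" values label the edges are all assumptions made for this example.

```python
import csv
import io


def sdrf_to_dot(sdrf_text):
    """Convert tab-delimited SDRF text into GraphViz DOT (hypothetical helper)."""
    reader = csv.reader(io.StringIO(sdrf_text), delimiter="\t")
    header = next(reader)
    edges = []
    seen = set()
    for row in reader:
        previous_node = None
        protocols = []
        for column, value in zip(header, row):
            value = value.strip()
            if not value:
                continue
            if column == "Protocol REF":
                # Protocols seen since the last node label the next edge.
                protocols.append(value)
            elif column.endswith(" Name") or column.endswith(" File"):
                # Assumption: these columns hold materials or data, i.e. nodes.
                if previous_node is not None:
                    edge = (previous_node, value, ", ".join(protocols))
                    if edge not in seen:
                        seen.add(edge)
                        edges.append(edge)
                previous_node = value
                protocols = []
    lines = ["digraph IDG {", "  rankdir=LR;"]
    for source, target, label in edges:
        # Note: values are not escaped, so embedded double quotes would break the output.
        lines.append(f'  "{source}" -> "{target}" [label="{label}"];')
    lines.append("}")
    return "\n".join(lines)


example = (
    "Source Name\tProtocol REF\tSample Name\n"
    "liver\textraction\tRNA1\n"
    "liver\textraction\tRNA2\n"
)
print(sdrf_to_dot(example))
```

The resulting text can be saved to a file and rendered with GraphViz, for example `dot -Tpng graph.dot -o graph.png`.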

Nexalign

  • DESCRIPTION: Nexalign is a program to align millions of short reads from next-generation sequencing data sets to reference genomes.
  • REQUIREMENTS: Unix / Linux.
  • LICENSE: GNU General Public License
  • SOFTWARE DOWNLOAD: nexalign-1.3.5.tgz
  • CONTACT: timolassmann@gmail.com

TagDust

  • DESCRIPTION: TagDust is a program to eliminate artifactual reads from next-generation sequencing data sets.
  • REQUIREMENTS: Unix / Linux.
  • LICENSE: GNU General Public License
  • CITATION:
    • Lassmann T., et al. (2009) TagDust - A program to eliminate artifacts from next generation sequencing data. Bioinformatics.
  • SOFTWARE DOWNLOAD: tagdust.tgz
  • CONTACT: timolassmann@gmail.com

EdgeExpressDB (eeDB)

  • DESCRIPTION: EdgeExpressDB (eeDB) is a federated data abstraction system designed for integrating, interpreting, and visualizing very large biology datasets. It is designed to scale beyond petabytes and 10^13 objects. For those interested in installing their own instance of eeDB, the source code is available via CPAN; it is being further developed within the Omics Science Center by Jessica Severin.
  • REQUIREMENTS: Perl DBI/DBD, MySQL, SQLite.
  • LICENSE: BSD License
  • CITATION:
    • Jessica Severin, et al. FANTOM4 EdgeExpressDB: an integrated database of genes, microRNAs, their promoters, expression dynamics and regulatory interactions. Genome Biology, 10:R39, 1-9 (2009)
  • AVAILABILITY: http://sourceforge.net/projects/eedb/; also available via CPAN (http://search.cpan.org/~jms/EdgeExpressDB_0.953h/).

MuMRescueLite

  • DESCRIPTION: MuMRescueLite is software that makes it possible to use sequence tags that map to multiple genomic loci in expression analysis. When short sequence tags from CAGE or ChIP-Seq are mapped to the genome, tags that map to multiple genomic loci (multi-mapping tags, or MuMs) are routinely omitted from further analysis, leading to experimental bias and reduced coverage. MuMRescueLite probabilistically reincorporates multi-mapping tags into mapped short read data with acceptable computational requirements.
  • REQUIREMENTS: Python 2.4 or later; runs on any platform supported by Python.
  • LICENSE: The MIT License.
  • CITATION:
    • Faulkner, G.J., et al. (2008) A rescue strategy for multi-mapping short sequence tags refines surveys of transcriptional activity by CAGE, Genomics.
    • Hashimoto, T., et al. (2009) Probabilistic resolution of multi-mapping reads in massively parallel sequencing data using MuMRescueLite, Bioinformatics.
  • SOFTWARE DOWNLOAD: MuMRescueLite_090522.tar.gz
  • SAMPLE DATASET: MuMRescueLite_test_data.tsv.gz
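To make "probabilistically reincorporates" more concrete, here is a minimal sketch of one way such a rescue can work: each multi-mapping tag is split across its candidate loci in proportion to the number of uniquely mapped tags found near each locus. This is a simplified illustration only, not MuMRescueLite's actual code or command-line interface; the function name, the data layout, and the `window` parameter are assumptions made for the example.

```python
from collections import defaultdict


def rescue_multimappers(unique_counts, multi_tags, window=200):
    """Split multi-mapping tags across candidate loci (hypothetical helper).

    unique_counts: dict mapping (chrom, pos) to the count of uniquely mapped tags
    multi_tags: list of (tag_count, [(chrom, pos), ...]) with candidate loci
    window: half-width in bases of the neighbourhood used as supporting evidence
    """
    # Group unique-mapper counts by chromosome for the neighbourhood lookup.
    by_chrom = defaultdict(list)
    for (chrom, pos), count in unique_counts.items():
        by_chrom[chrom].append((pos, count))

    def nearby_support(chrom, pos):
        # Sum unique-mapper counts within `window` bases of the locus.
        return sum(c for p, c in by_chrom[chrom] if abs(p - pos) <= window)

    rescued = defaultdict(float)
    for tag_count, loci in multi_tags:
        support = [nearby_support(chrom, pos) for chrom, pos in loci]
        total = sum(support)
        if total == 0:
            # No nearby unique evidence: this sketch leaves the tag unassigned.
            continue
        for locus, s in zip(loci, support):
            rescued[locus] += tag_count * s / total
    return dict(rescued)


unique = {("chr1", 100): 30, ("chr2", 500): 10}
multi = [(8, [("chr1", 120), ("chr2", 480)])]
result = rescue_multimappers(unique, multi)
print(result)
```

In this toy input, the eight copies of the multi-mapping tag are divided 3:1 between the two loci, mirroring the 30:10 ratio of nearby uniquely mapped tags.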

Cross-mapping correction software

  • DESCRIPTION: Modern high-throughput technologies enable deep sequencing of non-coding RNA species, such as miRNAs, on an unprecedented scale. When mapping such small RNAs to the genome, cross-mapping may occur, in which RNA sequences originating from one genomic locus are inadvertently mapped to a different locus. This may give rise to spurious novel RNAs, as well as spurious editing sites in known miRNAs. The cross-mapping correction software is a Python script that aims to correct for such cross-mapping effects.
  • REQUIREMENTS: Python 2.4; Numerical Python (NumPy) version 1.3 or later.
  • LICENSE: The Python License.
  • CITATION:
    • De Hoon, M.J.L., et al. (2010) Cross-mapping and the identification of editing sites in mature microRNAs in high-throughput sequencing libraries. Genome Research 20: 257-264.
  • SOFTWARE DOWNLOAD: cmc.tar.gz
  • SAMPLE DATASET: A sample data set is included with the software package.

› Sequence data

All published RIKEN sequence data have been submitted to the public databases.

1. Use of RIKEN sequence data

Note: RIKEN has assembled some data using EST data from other sources. Because DDBJ cannot provide RIKEN with accession numbers for data that include sources outside RIKEN, part of the FANTOM3 data cannot be submitted to DDBJ and is not included in the RIKEN data posted at DDBJ.

All sequence data files are compressed in the RIKEN Mouse Encyclopedia sequence data archive and can be downloaded via HTTP. Compressed files can be expanded to their original form using "uncompress". When you download these files, please email us, as we would like to keep track of users.

Note: Depending on your browser, you may have trouble downloading the files. If so, use your browser's "Save link as" function, which should resolve the problem.

2. Contents of Download Archive

Please refer to the "README.txt" file, which contains citation information. Please use the citation formats given there when referring to the RIKEN Mouse Encyclopedia in your publications.

› Clones

1. The FANTOM3 clones

102,801 fully annotated mouse cDNA clones are available through RIKEN's designated distributor.

2. Other clone(s)

Please contact us about other clones.

› Contact us

If you need additional information regarding clones or data, please do not hesitate to contact us via email.

E-mail:

RIKEN Omics Science Center
1-7-22 Suehiro-cho, Tsurumi-ku, Yokohama, 230-0045, Japan
Tel: +81 45-503-9222
Fax: +81 45-503-9216