Appendix 2 – Data Submission Checklist
Please note that EMBL stopped accepting email submissions on January
1, 2003.
If you expect to produce a large volume of genome sequence over an extended
period of time, please contact the EMBL database administrators at
datasubs@ebi.ac.uk.
2.2 GenBank
Submissions to GenBank can be made using the BankIt web submission
tool (http://www.ncbi.nlm.nih.gov/BankIt/) or the Sequin tool
(http://www.ncbi.nlm.nih.gov/Sequin/index.html). For simple submissions,
BankIt is recommended (Dennis A. et al. 2005). Sequin is available on
Bio-Linux.
2.3 Expressed Sequence Tag (EST) sequence
EST sequences can be submitted to the public EST repository dbEST
(http://www.ncbi.nlm.nih.gov/dbEST/). The trace2dbEST software,
developed by the EGTDC and available on Bio-Linux, can be used for EST
processing and direct submission to dbEST
(http://envgen.nox.ac.uk/est.html).
3. Transcriptomics Data
3.1 Microarray Experiments
Microarray experiment descriptions and results should be annotated to
the MIAME standard and submitted to a public repository such as ArrayExpress
(http://www.ebi.ac.uk/arrayexpress/). Further details on the MIAME standard
can be found at http://envgen.nox.ac.uk/miame/index.html and on the
MIAME/Env data standard at
http://envgen.nox.ac.uk/miame/miame_env.html.
We recommend the maxdLoad2 software, developed by the EGTDC and
installed on Bio-Linux, for annotating data and preparing a file in MAGE-ML
format suitable for submission to ArrayExpress.
The EGTDC works closely with ArrayExpress. As of March 2005, the EGTDC
recommends that microarray data be submitted via the EGTDC; a copy of
the annotated data will then be held at the data centre as well as in
ArrayExpress. Reasons for this include:
• Functions for data retrieval and searching across datasets held in
ArrayExpress are still under development, and holding the data locally
enables us to provide accessibility and functionality not currently
supported by the public repository.
• Partial datasets can be submitted and held.
• Datasets can potentially be searched alongside datasets of other types
held in compatible databases being developed by the EGTDC.