MarpoDB - About

Description and Motivation

MarpoDB is a gene-centric database for Marchantia polymorpha genetic parts designed for the purposes of genetic engineering and synthetic biology. Our motivation to develop this resource emerged from the need to handle and facilitate access to annotated sequences in the most simple, clean and straightforward manner possible.

Datasets

Cam Dataset

The Cam dataset is derived from the Marchantia polymorpha Cam-1 & Cam-2 strains isolated by Prof. Jim Haseloff in Cambridge, UK.

DNA and RNA data were obtained by 100 bp paired-end Illumina sequencing. The genome was assembled de novo with Meraculous 2.0, and the transcriptome was assembled de novo with the Bridger assembler. For more information about the methods, see MarpoDB: An Open Registry for Marchantia Polymorpha Genetic Parts.

The ORF predictions were produced using TransDecoder, guided by the protein motifs identified by InterProScan and by homologous protein sequences found by BLAST searches against a Viridiplantae dataset of Uniprot.
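To illustrate what an ORF prediction yields, here is a minimal sketch that scans all six reading frames of a transcript for the longest stretch running from an ATG to an in-frame stop codon. It is a toy illustration only and is not how TransDecoder works: TransDecoder additionally scores coding potential and uses the motif and homology evidence described above. The function names are assumptions made for this example.

```python
# Toy ORF finder: NOT TransDecoder's algorithm, only an illustration of
# what an ORF is on a transcript sequence. All names are hypothetical.

STOP_CODONS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def orfs_in_frame(seq: str, frame: int):
    """Yield (start, end) for each ATG..stop ORF in one forward reading frame."""
    start = None
    for i in range(frame, len(seq) - 2, 3):
        codon = seq[i:i + 3]
        if start is None and codon == "ATG":
            start = i
        elif start is not None and codon in STOP_CODONS:
            yield start, i + 3  # end is exclusive and includes the stop codon
            start = None


def longest_orf(seq: str) -> str:
    """Return the longest complete ORF over all six reading frames ('' if none)."""
    seq = seq.upper()
    best = ""
    for strand in (seq, reverse_complement(seq)):
        for frame in range(3):
            for start, end in orfs_in_frame(strand, frame):
                if end - start > len(best):
                    best = strand[start:end]
    return best
```

Only complete ORFs (with both a start and a stop codon) are reported; partial ORFs at transcript ends, which real predictors can also report, are ignored in this sketch.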

Tak Dataset

The Tak dataset was produced by processing version 3.1 of the Marchantia polymorpha genome and transcriptome data available at Phytozome. More information about the genome can be found in Insights into Land Plant Evolution Garnered from the Marchantia polymorpha Genome.

Annotation

Currently, we hold only CDS annotations and cross-references to other Marchantia polymorpha resources.

Interpro

A complete Interpro analysis is performed on all putative protein sequences and forms the basis for the keyword search. In addition, the protein motifs are displayed in the Details section for each gene in the Interpro HTML format.

Uniprot

We have selected all sequences from the Viridiplantae clade that have evidence at the protein level in the Uniprot database and blasted MarpoDB data against them. This allows you to search for close homologs in MarpoDB by supplying either a protein description or a Uniprot ID as a search term on the main page. In addition, the BLAST results can be viewed in the Details section for each gene.

Phytozome

We appreciate all the hard work behind the Marchantia polymorpha genome sequencing project and the official genome resource hosted at Phytozome. We hope that our complementary approaches to storing genomic data will give rise to a broad range of Marchantia polymorpha projects. We store cross-links to Phytozome, so you can use a Phytozome ID for the keyword search and follow a direct link to the Phytozome record from the Details section. All genes from the Tak dataset have an associated Mapoly ID. Some genes from the Cam dataset have no Phytozome association because we could not find a gene record with a similar sequence in the Tak dataset.

MarpolBase

Marchantia Gene Nomenclature, which aims to provide a consistent and organized nomenclature system for Marchantia genes, stores the name registry for Marchantia polymorpha gene names. You can use registered names in search queries, and the registered name of each gene is shown in the Details section.
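The annotation sources above all feed the same search box: a term can be an Interpro keyword, a Uniprot ID or protein description, a Phytozome ID, or a registered gene name. The following is a minimal sketch of that idea, assuming a simple in-memory record. The class, its field names and the matching rules are hypothetical and are not taken from the MarpoDB source code.

```python
from dataclasses import dataclass, field


@dataclass
class GeneRecord:
    """Hypothetical gene-centric record; field names are illustrative only."""
    marpodb_id: str
    registered_name: str = ""   # name from the nomenclature registry, if any
    phytozome_id: str = ""      # empty when no matching Tak record was found
    uniprot_hits: dict = field(default_factory=dict)    # Uniprot ID -> description
    interpro_terms: list = field(default_factory=list)  # motif descriptions


def search(genes, term):
    """Return records whose IDs equal the term or whose descriptions contain it."""
    needle = term.strip().lower()
    if not needle:
        return []
    hits = []
    for gene in genes:
        # Identifiers are matched exactly (case-insensitive).
        exact = {gene.registered_name.lower(), gene.phytozome_id.lower()}
        exact.update(uid.lower() for uid in gene.uniprot_hits)
        exact.discard("")
        # Free-text descriptions are matched by substring.
        text = list(gene.uniprot_hits.values()) + gene.interpro_terms
        if needle in exact or any(needle in t.lower() for t in text):
            hits.append(gene)
    return hits
```

In this sketch a Cam gene without a Phytozome association simply keeps an empty `phytozome_id`, so it can still be found through its Interpro or Uniprot annotations.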

Technical details

The backend is served by a Flask server, enhanced by Biopython and SQLAlchemy. The data are stored in a PostgreSQL database. If you are interested in building your own gene-centric resource, check out our source code: MarpoDB and PartsDB.

The frontend is written in old-school JavaScript + jQuery, with the help of the Scribl, SeqViewer and Clipboard libraries.

People

MarpoDB was developed by Bernardo Pollak (BP)* and Mihails Delmans (MD)* under the supervision of Prof. Jim Haseloff at the Department of Plant Sciences of the University of Cambridge. The project was funded by the OpenPlant initiative. BP performed extraction experiments, carried out high-throughput sequencing and assembled the Cam genome and transcriptome. BP and MD together carried out the downstream analyses, prepared the datasets and programmed the frontend. MD developed the backend and the database models.

* Equal contribution.

Disclaimer

This software is provided by the authors "as is" and any express or implied warranties, including, but not limited to, the implied warranties of merchantability and fitness for a particular purpose are disclaimed. The entire risk as to the quality and performance of the program is with you. In no event shall the authors be liable for any direct, indirect, incidental, special, exemplary, or consequential damages (including, but not limited to, procurement of substitute goods or services; loss of data or data being rendered inaccurate; loss of use or profits; or business interruption) however caused and on any theory of liability, whether in contract, strict liability, or tort (including negligence or otherwise) arising in any way out of the use of this software, even if advised of the possibility of such damage.