About

Covalent DNA modifications have been found in numerous organisms, and more are continually being discovered and characterized as detection methods improve. Many of these modifications can affect the conformation of the DNA double helix, often resulting in downstream effects upon transcription factor binding. Some of these modifications have been demonstrated to be stable, while others are viewed as merely transient.

DNAmod catalogues information on known DNA modifications, of which the well-known 5-methylcytosine is only one. It aims to profile modifications' properties, building upon data contained within the Chemical Entities of Biological Interest (ChEBI) database. It also provides literature citations and includes curated annotations on mapping techniques and natural occurrence information.

We regularly update DNAmod and manually curate the modifications verified to occur in vivo.

Citation

If you use DNAmod in your work, please cite:

Sood AJ, Viner C, Hoffman MM. 2019. DNAmod: the DNA modification database. J Cheminform, 11:30.

Methods

DNAmod consists of this static website and a backing SQLite database. The database is created using Python, including the SOAP client suds and Biopython. The website is also created using Python. It makes use of Open Babel via its Python wrapper, Pybel, uses the Jinja2 templating engine, and provides search functionality through the elasticlunr.js JavaScript module.

DNAmod uses ChEBI to import potential covalent DNA modifications. It initially imports (via ChEBI Web Services) all children of any of the nucleobases, as indicated by ChEBI’s ontology (specifically, any entry whose 'has functional parent' ∈ {A, C, G, T, U}).
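The selection rule above can be illustrated with a minimal sketch. It does not call ChEBI Web Services. Instead, it assumes the ontology relationships have already been fetched into plain tuples. The tuple layout, the function name `functional_children`, and the sample records are hypothetical and are not copied from ChEBI or from the DNAmod source code.

```python
# Hypothetical in-memory stand-in for ontology relationships fetched from ChEBI.
# Each tuple is (child_name, relationship_type, parent_name).
NUCLEOBASES = {"adenine", "cytosine", "guanine", "thymine", "uracil"}


def functional_children(relations, parents=NUCLEOBASES):
    """Return sorted names of entries whose 'has functional parent' is in `parents`."""
    return sorted({
        child
        for child, rel_type, parent in relations
        if rel_type == "has functional parent" and parent in parents
    })


# Illustrative records only (assumed for this sketch).
relations = [
    ("5-methylcytosine", "has functional parent", "cytosine"),
    ("5-hydroxymethylcytosine", "has functional parent", "cytosine"),
    ("cytosine", "is a", "example-parent-class"),
    ("example-compound", "has functional parent", "example-non-nucleobase"),
]

candidates = functional_children(relations)
print(candidates)
```

Only the two entries whose functional parent is a nucleobase are kept. The 'is a' record and the record with an unrelated parent are dropped.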

We filter these putative modifications using ChEBI’s star-based rating system and retain only those which have been assigned three stars (indicating the highest level of manual curation). We import citations for references provided by ChEBI (via PubMed IDs) using the Biopython Entrez package. We created a list of DNA modifications that we call the "verified" set, and we import all non-blacklisted descendants of these modifications, defined as those having either a 'has functional parent' or an 'is a' relationship with their parent. All candidate modifications that were not verified are placed into the "unverified" category, pending curation or novel functional insights. We import all information into the SQLite database.
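The filtering and categorization steps can be sketched roughly as follows. This is an illustration under stated assumptions, not the DNAmod implementation: the data structures, function names, and sample entries are hypothetical. The sketch also assumes that descendants reachable only through a blacklisted entry are skipped, which the description above does not specify.

```python
from collections import deque


def keep_three_star(entries):
    """Keep only entries with the top curation rating (three stars)."""
    return {name: info for name, info in entries.items() if info["stars"] == 3}


def expand_verified(seeds, children, blacklist=frozenset()):
    """Breadth-first walk from the seed modifications through child links.

    `children` maps a parent name to the names related to it by either
    'has functional parent' or 'is a'. Blacklisted entries are skipped,
    as is anything reachable only through them (an assumption of this sketch).
    """
    verified, queue = set(), deque(seeds)
    while queue:
        name = queue.popleft()
        if name in verified or name in blacklist:
            continue
        verified.add(name)
        queue.extend(children.get(name, ()))
    return verified


# Hypothetical sample data.
entries = {
    "seed": {"stars": 3},
    "child": {"stars": 3},
    "grandchild": {"stars": 3},
    "blocked": {"stars": 3},
    "orphan": {"stars": 3},
    "low_rated": {"stars": 2},
}
children = {"seed": ["child", "blocked"], "child": ["grandchild"]}

curated = set(keep_three_star(entries))
verified = expand_verified(["seed"], children, blacklist={"blocked"}) & curated
unverified = curated - verified
print(sorted(verified), sorted(unverified))
```

In this toy run, the seed and its non-blacklisted descendants land in the verified set, the remaining three-star entries become unverified, and the two-star entry is excluded entirely.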

To create individual webpages that make up this static website, the SQLite database is imported into Python and processed using Jinja2 templates. The chemical structures displayed on the modification pages are created by converting ChEBI-provided simplified molecular-input line-entry system (SMILES) information into vector images, via Pybel.
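The page-generation step can be sketched with the standard library alone. In this sketch, `string.Template` stands in for Jinja2 so that it runs without extra dependencies, and the SMILES-to-image conversion via Pybel is omitted. The table name, column names, template, and ChEBI IDs are made up for illustration and do not reflect the actual DNAmod schema.

```python
import sqlite3
from string import Template

# Stand-in for a Jinja2 template; the placeholders are hypothetical.
# Unlike a real templating setup, this sketch does no HTML escaping.
PAGE = Template("<h1>$name</h1>\n<p>ChEBI ID: $chebi_id</p>\n")


def render_pages(conn):
    """Return {filename: html}, one page per row of the modifications table."""
    rows = conn.execute("SELECT chebi_id, name FROM modifications ORDER BY name")
    return {
        f"{name}.html": PAGE.substitute(name=name, chebi_id=chebi_id)
        for chebi_id, name in rows
    }


# In-memory database with placeholder rows (IDs are not real ChEBI IDs).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE modifications (chebi_id TEXT, name TEXT)")
conn.executemany(
    "INSERT INTO modifications VALUES (?, ?)",
    [
        ("CHEBI:0000001", "example-modification-a"),
        ("CHEBI:0000002", "example-modification-b"),
    ],
)

pages = render_pages(conn)
print(pages["example-modification-a.html"])
```

Each row becomes one static page keyed by filename. Writing the strings to disk and swapping in a real templating engine would be the natural next steps.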

License

All source code and web assets are licensed under the GNU General Public License, version 2 (GPLv2). DNAmod's data is licensed under a Creative Commons Attribution 4.0 International license (CC BY 4.0).

Availability

The DNAmod source code is available on GitHub.

The backing SQLite database is available.