G-NOME

G-NOME - Nuclear Organisation Modelling Environment for Hi-C data written in Python


A Python code base for easily and dynamically converting Hi-C data into 3D geometries for radiation-based simulations. Featured as PLOS Computational Biology’s December 2020 journal cover.

Warning

This code base is under continuous development and new features are added regularly. For the newest version, changelog, documentation and support, please check the code repository for updates.

Motivation

This project provides a simple interface for producing and optimising polymer-based models of the genome from Hi-C data. Because Python is interpreted, objectives can be adjusted dynamically during the optimisation process to meet the user’s requirements. Furthermore, this project has been designed specifically to provide the geometric detail required by track-structure radiation simulation toolkits, such as GEANT4-DNA and TOPAS-nBio.

Dependencies

Installation

G-NOME requires Python 3.6.0+ to run, and using a virtual environment is recommended. Please install the requirements using:

pip install -r requirements.txt

Then build the necessary Cython files using:

python setup.py build_ext --inplace

Code Example

For an example of how to write run scripts for G-NOME, see the gnome.py file. This is an example run file that parses command-line inputs. To test the setup of G-NOME, run the following:

python gnome.py -g gtrack/HMEC.gtrack -n 100

This should create .vert.txt and .cmm files, which are the output geometries from G-NOME.
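A quick way to confirm that the test run produced its geometries is to look for files with those two suffixes. The sketch below is only a convenience check: the helper name find_gnome_outputs is hypothetical, and it assumes the files land in the directory you pass in (the exact file names depend on the -o and -m flags described later).

```python
from pathlib import Path


def find_gnome_outputs(directory="."):
    """Return the names of .vert.txt and .cmm files found in `directory`.

    Only the two suffixes come from the README; the helper itself is a
    hypothetical convenience and is not part of the G-NOME package.
    """
    root = Path(directory)
    return {
        "vert": sorted(p.name for p in root.glob("*.vert.txt")),
        "cmm": sorted(p.name for p in root.glob("*.cmm")),
    }


if __name__ == "__main__":
    outputs = find_gnome_outputs(".")
    if outputs["vert"] and outputs["cmm"]:
        print("Found output geometries:", outputs)
    else:
        print("No output geometries found; check that the test run completed.")
```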

Using G-NOME

We highly recommend using the example gnome.py script to develop an understanding of how to use the G-NOME package. This script is also the fastest way to use G-NOME in order to create structures for subsequent radiobiological models. Below is a list of all flags which can be used with the gnome script:

Required flags:

  • -g path to the gTrackFile that describes the Hi-C data for structural modelling.

Optional flags:

  • -o set save directory including the name of the output files (default: “”)
  • -s set random seed for solving the geometry (default: 0)
  • -r set the desired nuclear radius in micrometres (default: 5.0)
  • -n set the number of iterations used to optimise the geometry (default: 1e6)
  • -t set maximum temperature for simulated annealing (default: 1)
  • -c set cool rate for simulated annealing (default: 0)

Note: If the cool rate is set to 0, the temperature stays fixed and the optimisation effectively uses the Metropolis–Hastings algorithm.

  • -v set the DNA occupancy volume of the cell nucleus (default: 0)

Note: If the occupancy volume is set to 0, the beads will not be scaled and the bead size will be taken from the radius column of the gTrackFile.

  • -l set the logging interval in iterations, at which features enabled by other flags are recorded and the optimisation score is printed to the console; if 0, logging is disabled (default: 0)
  • -m set model name which is used to name the output files (default: chrombuild)
  • --PrintStructures output *.cmm files at each logging iteration, which can be used for visualisation (default: False)
  • --ConstrainNucleus add nuclear constraints to every bead in the system; this applies an additional cost to beads positioned outside the user-defined nuclear radius (default: False)
  • --TwoPhaseOpt optimise in two phases: the first phase optimises contact constraints and the second phase optimises both contact and nuclear constraints (default: False)
  • --MoveExclusion bead movement types to exclude from the optimisation; the moves that can be excluded are “crankshaft”, “microcrank”, “armwiggle”, “armrotate”, “translate” and “rotate” (default: [])
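To illustrate how the -t and -c flags interact, here is a minimal sketch of a simulated annealing acceptance step. It is not G-NOME's implementation: the exponential cooling schedule and the function names are assumptions made for illustration, and the package may cool differently. What the sketch does show is the point made in the note above: with a cool rate of 0 the temperature never changes, so every iteration applies the same Metropolis acceptance rule.

```python
import math
import random


def temperature(iteration, max_temp=1.0, cool_rate=0.0):
    """Assumed schedule: T_i = max_temp * exp(-cool_rate * i).

    With cool_rate = 0 this returns max_temp for every iteration.
    """
    return max_temp * math.exp(-cool_rate * iteration)


def acceptance_probability(delta_score, temp):
    """Metropolis criterion for a proposed move.

    delta_score is (new score - old score), where lower scores are better.
    Improvements are always accepted; worse moves are accepted with
    probability exp(-delta_score / temp).
    """
    if delta_score <= 0:
        return 1.0
    if temp <= 0:
        return 0.0  # fully cooled: never accept a worse move
    return math.exp(-delta_score / temp)


def accept_move(delta_score, temp, rng=random):
    """Draw a uniform random number and decide whether to keep the move."""
    return rng.random() < acceptance_probability(delta_score, temp)
```

With a non-zero cool rate the temperature falls over time, so moves that worsen the score are accepted less and less often as the optimisation proceeds.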

Whilst the provided gnome.py script is a good starting point for the majority of users, the package is very versatile and can be used for more custom needs. Users can achieve this by writing their own run scripts that use the classes provided in the G-NOME package.
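The package's classes are not listed here, so the sketch below takes a lighter approach: it drives the existing gnome.py script from Python using only the flags documented above, for example to generate several structures with different random seeds. The function names build_gnome_command and run_seeds are hypothetical, as is the choice to name each model after its seed, and the sketch assumes gnome.py is in the current working directory.

```python
import subprocess
import sys


def build_gnome_command(gtrack, iterations=None, seed=None, radius=None,
                        model_name=None, script="gnome.py"):
    """Build the argument list for one gnome.py run.

    Only flags documented in this README are used (-g, -n, -s, -r, -m).
    Options left as None are omitted so gnome.py falls back to its defaults.
    """
    cmd = [sys.executable, script, "-g", str(gtrack)]
    if iterations is not None:
        cmd += ["-n", str(int(iterations))]
    if seed is not None:
        cmd += ["-s", str(int(seed))]
    if radius is not None:
        cmd += ["-r", str(float(radius))]
    if model_name is not None:
        cmd += ["-m", str(model_name)]
    return cmd


def run_seeds(gtrack, seeds, **options):
    """Run gnome.py once per seed, giving each model a distinct name."""
    for seed in seeds:
        cmd = build_gnome_command(
            gtrack, seed=seed, model_name=f"model_seed{seed}", **options
        )
        # check=True raises CalledProcessError if gnome.py exits with an error.
        subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Mirrors the test command shown earlier, repeated for three seeds.
    run_seeds("gtrack/HMEC.gtrack", seeds=[0, 1, 2], iterations=100)
```

Building the command as a list and passing it to subprocess.run avoids shell quoting problems when paths contain spaces.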

Contribute

We welcome anyone wishing to expand the functionality of the G-NOME project. Therefore, if you wish to contribute to the project please see our contributing guidelines.

Credit

This project started as a re-implementation of the Chrom3D package, but has since evolved to add several features which are useful to the radiobiological community.

License

This package is licensed open-source under the GNU GENERAL PUBLIC LICENSE Version 3.

Cite

If you use this software in your own work it would be appreciated if you could cite the following:

Ingram et al., Hi-C implementation of genome structure for in silico models of radiation-induced DNA damage. PLoS Comput. Biol. https://doi.org/10.1371/journal.pcbi.1008476.