CCK-14 Hackathon-2: 13 December 2018

TL;DR: CCK-14 Hackathon-2 takes place on Thursday, December 13th, 2018, from 10 am – 5 pm in the Computer Suite, Department of Biochemistry, University of Oxford, South Parks Road, Oxford OX1 3QU, and will be followed by a wrap-up in Biochemistry’s Seminar Room from 5 – 6 pm. Bring your own ideas and programming problems, and join experts from the Cambridge Structural Database to learn about their Python API. Pizza provided. Hackathon tickets are limited, so please book by December 11th, 2018.

 

Dear Friends and Colleagues,

Please join us for our next “Comp Chem Kitchen” (CCK) Hackathon, 10 am – 5 pm on Thursday, December 13th, 2018, in the Computer Suite, Department of Biochemistry, South Parks Road, Oxford. Staff from the Cambridge Crystallographic Data Centre (CCDC) will be present to help with projects using the CSD Python API.

Registration is now open. Places are limited, so please register before December 11th, 2018.

The hackathon will follow this format:

In advance: assembly of ideas and teams via the website.

On the day:

  • 10.00 am: Introduction. 
    • Assemble 4-8 teams with some expertise in each.
  • 10.30 am – 12.30 pm: Hacking – design and tools
  • 12.30 – 1.00 pm: Five-minute updates / requests for input and help
  • 1.00 – 2.00 pm: Lunch
  • 2.00 – 5.00 pm: Hacking
  • 5.00 – 6.00 pm: Lightning summaries of hacked projects.

Pizza and refreshments will be provided.

 

We would like to thank the University of Oxford MPLS Network and Interdisciplinary Fund for making CCK possible.

About CCK

Comp Chem Kitchen is a regular forum and seminar series to hear about and discuss computational methods for tackling problems in chemistry, biochemistry and drug discovery. It focuses principally on cheminformatics, computational chemistry, and molecular modelling, and overlaps with neighboring areas such as materials properties and bioinformatics.

We’re keen to encourage people involved in coding and methods development (i.e. hackers, in the original untarnished sense of the word) to join us. Our hope is that we will share best practices, even code snippets and software tools, and avoid re-inventing wheels.

In addition to local researchers, we invite speakers from industry and non-profits from time to time, and occasionally organize software demos and tutorials.

If you’re interested in giving a talk, here are some possible topics:

  • Software development (e.g.: Python, C, C++, CUDA, shell, Matlab);
  • Optimizing force field parameters & EVB models;
  • Cheminformatics (e.g.: RDKit);
  • X-ray and NMR crystallography, including small molecule and macromolecular;
  • Protein & RNA modeling, including Molecular Dynamics;
  • Virtual screening and Docking;
  • Machine Learning;
  • Quantum Methods, including DFT.

Bring your laptops, by the way, if you have something you’d like to show!

Please pass this message on to friends, colleagues, and students who may be interested too!

The main CCK web site is: http://compchemkitchen.org/
Follow us on Twitter: @CompChemKitchen
See you soon! We’re looking forward to seeing and hearing about the diverse range of computational molecular science that you’re cooking up…

—Garrett, Richard, Phil

garrett.morris@stats.ox.ac.uk
richard.cooper@chem.ox.ac.uk
philip.biggin@bioch.ox.ac.uk

CCK-3

Our third meeting, CCK-3, on Tuesday, July 12th, 2016, will be a little different: it will be a two-parter, CCK-3.1 and CCK-3.2.

The Cambridge Crystallographic Data Centre (CCDC) has been hosting and distributing the Cambridge Structural Database (CSD) for 50 years, and its latest release, CSD 2016, is the “world’s essential database of crystal structures, [with] over 800,000 entries”.


CCK-3.1

1:30-3:30 pm
IT Teaching Laboratory (LG.02) of the Department of Statistics, 24-29 St Giles’, Oxford, OX1 3LB:

Andrew Maloney & Peter Wood (CCDC): CSD Python API Workshop and Hackathon. (We are planning to provide hands-on access to the API, and since there are only 48 desktops, you should make sure to book a ticket.)

Register for CCK-3.1.


CCK-3.2

5:00 – 6:00 pm
Talks and refreshments/beer will again be held in the Abbot’s Kitchen in the Inorganic Chemistry Laboratory, South Parks Road, Oxford, OX1 3QR:

Register for CCK-3.2.

If you have a 5-minute Lightning Talk you’d like to give, get in touch!

Also, if you have some code you’d like to demo, bring your laptop/mobile device.
Refreshments will be provided, including beer. (Thank you to Prof. Phil Biggin and the MRC Proximity to Discovery Fund for supporting CCK.)

CSD Python API example

Number of non-H atoms in molecules reported in the Cambridge Structural Database

An unexplained phenomenon in the CSD collection of molecular crystal structures is shown below.

The code below is included in Jerome Wicker’s talk from CCK-1 as a simple example of how to iterate through and extract information from the CSD using the Python API. The phenomenon has been noted many times by CCDC researchers.

The CSD Python API is used to retrieve each crystal structure entry from the database using the EntryReader() iterator. The number of heavy atoms (non-hydrogen atoms) in the heaviest molecular component of each organic crystal structure is appended to a list, heavy_atoms.

Finally, a histogram of these heavy atom counts shows that molecules with even numbers of heavy atoms are observed more frequently than those with odd numbers in the same range.

In [6]:
%pylab inline
Populating the interactive namespace from numpy and matplotlib

This page is a copy of an interactive Python 2 notebook exported from Jupyter. In Jupyter, the %pylab inline command above loads numerical and plotting libraries and ensures that plots appear in the notebook instead of in a separate window. It isn’t required if running python from the command line, and the required libraries (numpy and matplotlib) are reimported below for convenience.
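For a plain script run from the command line, the same setup can be done with explicit imports. The following is a minimal sketch rather than part of the original notebook: the Agg backend, the sample values and the output file name are all assumptions chosen for illustration, so that the figure is written to a file instead of a window.

```python
import os
import tempfile

import matplotlib
matplotlib.use('Agg')  # assumed choice: non-interactive backend, renders to file
from matplotlib import pyplot as plt
import numpy as np

values = np.array([3, 4, 4, 6, 6, 6])  # hypothetical data, not from the CSD

fig = plt.figure()
# Bin edges at half-integers so that each integer value gets its own bin.
plt.hist(values, bins=np.arange(values.min() - 0.5, values.max() + 1.5))
plt.xlabel('Value')
plt.ylabel('Count')

# Hypothetical output location in the system temporary directory.
out_path = os.path.join(tempfile.gettempdir(), 'example_hist.png')
fig.savefig(out_path)
plt.close(fig)
```

In a notebook, plt.show() displays the figure inline; in a script like this one, fig.savefig() is the usual way to keep the result.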

In [7]:
from ccdc.io import EntryReader
from matplotlib import pyplot as plt
import numpy as np

# Open the installed CSD and collect one heavy-atom count per organic entry
csd_reader = EntryReader('CSD')
heavy_atoms = []

for entry in csd_reader:
    if entry.is_organic:
        try:
            mol = entry.molecule.heaviest_component
        except Exception:
            # Skip entries whose heaviest component cannot be obtained
            continue
        heavy_atoms.append(len(mol.heavy_atoms))

# One bin per possible heavy-atom count, up to the largest value seen
plt.figure()
plt.hist(heavy_atoms, bins=np.max(heavy_atoms))
plt.xlabel('Number of heavy atoms', fontsize=20)
plt.ylabel('Hits in CSD', fontsize=20)
plt.show()

Limiting the x-axis range:

In [11]:
plt.figure()
plt.hist(heavy_atoms, bins=np.max(heavy_atoms))
plt.xlabel('Number of heavy atoms', fontsize=20)
plt.ylabel('Hits in CSD', fontsize=20)
plt.xlim([10, 60])
plt.show()
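
To check that the even/odd pattern is not just an effect of how the histogram is binned, the counts can also be tallied exactly. This is a minimal sketch, assuming a heavy_atoms list like the one built above. The function name parity_summary and the short sample list are hypothetical and stand in for real CSD data.

```python
from collections import Counter

def parity_summary(heavy_atoms, lo=10, hi=60):
    """Return (even, odd) totals for heavy-atom counts within [lo, hi]."""
    # Exact tally per integer count, restricted to the plotted range.
    tally = Counter(n for n in heavy_atoms if lo <= n <= hi)
    even = sum(count for n, count in tally.items() if n % 2 == 0)
    odd = sum(count for n, count in tally.items() if n % 2 == 1)
    return even, odd

# Hypothetical sample values, not taken from the CSD.
sample = [9, 12, 12, 13, 14, 20, 21, 22, 22, 59, 60, 61]
even, odd = parity_summary(sample)
print('even: %d, odd: %d' % (even, odd))  # prints "even: 7, odd: 3"
```

Passing the real heavy_atoms list in place of sample gives the two totals for the same 10 to 60 range used in the plot above.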

First meeting

CCK-1 (Tuesday 24th May)

In the spirit of the name, our inaugural meeting, CCK-1, will be held in the Abbot’s Kitchen in the Inorganic Chemistry Laboratory at 5 pm on Tuesday, May 24th, 2016 (5th Week).

Refreshments will be provided.

Jerome Wicker from Chemistry will be speaking about “Machine learning for classification of solid form data extracted from CSD and ZINC”. The software tools discussed include RDKit, CSD, and scikit-learn.

There will also be two lightning talks, each about 5 minutes long. Hannah Patel from the Department of Statistics will speak on “Novelty Score: Prioritising compounds that potentially form novel protein-ligand interactions and novel scaffolds using an interaction centric approach”. Software covered will include Django and RDKit. Dr Michael Charlton from InhibOx will also speak on his latest research.