Our third meeting, CCK-3, on Tuesday, July 12th, 2016, will be a little different: it will be a two-parter, CCK-3.1 and CCK-3.2:
The Cambridge Crystallographic Data Centre (CCDC) has been hosting and distributing the Cambridge Structural Database (CSD) for 50 years, and its latest release, CSD 2016, is the “world’s essential database of crystal structures, [with] over 800,000 entries”.
IT Teaching Laboratory (LG.02) of the Department of Statistics, 24-29 St Giles’, Oxford, OX1 3LB:
Andrew Maloney & Peter Wood (CCDC): “CSD Python API Workshop and Hackathon”. (We are planning to provide hands-on access to the API, and since there are only 48 desktops, you should make sure to book a ticket.)
5:00 – 6:00 pm
Talks and refreshments/beer will again be held in the Abbot’s Kitchen in the Inorganic Chemistry Laboratory, South Parks Road, Oxford, OX1 3QR:
- Andrew G. P. Maloney & Peter Wood (Cambridge Crystallographic Data Centre): “Harnessing the power of the Cambridge Structural Database in your own way: the CSD Python API”.
- Laura Domicevica (Biggin Group, Department of Biochemistry): Lightning Talk, “LINTools: time-resolved graphical representation of ligand-protein interactions”.
- Tim Dudgeon (Informatics Matters): Lightning Talk, “The Squonk Computational Notebook – making cheminformatics and comp chem accessible to normal people!”.
If you have a 5-minute Lightning Talk you’d like to give, get in touch!
Also, if you have some code you’d like to demo, bring your laptop/mobile device.
Refreshments will be provided, including beer. (Thank you to Prof. Phil Biggin and the MRC Proximity to Discovery Fund for supporting CCK.)
Here are the slides from Fernanda Duarte’s talk “Using Valence Bond Theory to Model (Bio)Chemical Reactivity” at CCK-2.
Mike Bodkin (Vice President, Research Informatics, Evotec) will be speaking about “Chemical space and how to warp drive discovery”. Mike says, “I’ll talk about some of the cheminformatics algorithms and tools we’ve developed for molecule design and discovery.” Should be good!
Our second meeting, CCK-2, will be held in the Abbot’s Kitchen in the Inorganic Chemistry Laboratory, South Parks Road, Oxford, OX1 3QR at 5 pm on Tuesday, June 14th, 2016 (8th Week). Free tickets are available.
- Mike Bodkin (Vice President, Research Informatics, Evotec), “Chemical space and how to warp drive discovery”.
- Jonathan Yates (Department of Materials, University of Oxford), Lightning talk, “A brief introduction to the Collaborative Computational Project for NMR Crystallography (CCP-NC)”.
- Jonny Brooks-Bartlett (Elspeth Garman Group, Department of Biochemistry): Lightning talk: “The Julia Programming Language”.
- Fernanda Duarte (Rob Paton Group, Department of Chemistry, University of Oxford): Lightning talk: “Exploring biochemical systems using the Empirical Valence Bond (EVB) approach”.
- Matteo Degiacomi (Justin Benesch Group, Department of Chemistry, University of Oxford): Lightning talk: “The Python package BiobOx: a collection of data structures, tools and methods for biomolecular modelling”. BiobOx is used for manipulation, measurement, analysis and assembly of atomistic and super coarse-grain structures as well as EM maps.
Talks will take place between 5 pm and 6 pm; please stay for refreshments and a chat afterwards.
We would like to thank Prof. Philip Biggin and the MRC Proximity to Discovery Fund for supporting CCK.
Here are the slides from Jerome Wicker, Department of Chemistry, that were presented at CCK-1:
Here are the slides from Michael Charlton, InhibOx, that were presented at CCK-1:
Number of non-H atoms in molecules reported in the Cambridge Structural Database
An unexplained phenomenon in the CSD collection of molecular crystal structures is shown below.
The code below is included in Jerome Wicker’s talk from CCK-1 as a simple example of how to iterate through and extract information from the CSD using the Python API. The phenomenon has been noted many times by CCDC researchers.
The CSD Python API is used to retrieve each crystal structure entry from the database using the EntryReader() iterator. For each organic crystal structure, the number of heavy atoms (non-hydrogen atoms) in the heaviest molecular component is appended to a list, heavy_atoms.
Finally, a histogram of these heavy atom counts shows that molecules with even numbers of heavy atoms are observed more frequently than those with odd numbers in the same range.
    %pylab inline

    Populating the interactive namespace from numpy and matplotlib
This page is a copy of an interactive Python 2 notebook exported from Jupyter. In Jupyter, the %pylab inline command above loads numerical and plotting libraries and ensures that plots appear in the notebook instead of in a separate window. It isn’t required when running Python from the command line, and the required libraries (numpy and matplotlib) are reimported below for convenience.
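    from ccdc.io import EntryReader
    from matplotlib import pyplot as plt
    import numpy as np

    # Open the installed CSD and collect one heavy-atom count per organic entry.
    csd_reader = EntryReader('CSD')
    heavy_atoms = []
    for entry in csd_reader:
        if entry.is_organic:
            try:
                mol = entry.molecule.heaviest_component
            except Exception:
                # Skip entries for which the heaviest component cannot be obtained.
                continue
            heavy_atoms.append(len(mol.heavy_atoms))

    # Histogram of heavy-atom counts across all organic entries.
    plt.figure()
    plt.hist(heavy_atoms, bins=np.max(heavy_atoms))
    plt.xlabel('Number of heavy atoms', fontsize=20)
    plt.ylabel('Hits in CSD', fontsize=20)
    plt.show()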
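Passing an integer as bins asks for that many equal-width bins across the data range, so the bin edges need not fall neatly between consecutive integers. As an optional variation, here is a minimal sketch that places the edges at half-integers so that each bar corresponds to exactly one heavy-atom count. The helper name integer_bin_edges and the short sample list are illustrative assumptions, not part of the original notebook or of the CSD data.

```python
import numpy as np


def integer_bin_edges(values):
    """Return bin edges at half-integers covering min(values)..max(values).

    Hypothetical helper: each resulting bin is centred on one integer.
    """
    lo, hi = int(min(values)), int(max(values))
    return np.arange(lo - 0.5, hi + 1.5, 1.0)


# Invented sample standing in for the CSD-derived heavy_atoms list.
sample_heavy_atoms = [10, 12, 12, 13, 14, 14, 14, 16]

edges = integer_bin_edges(sample_heavy_atoms)
counts, _ = np.histogram(sample_heavy_atoms, bins=edges)
# counts[i] is the number of molecules with (10 + i) heavy atoms in this sample.
```

The same edges array could be passed to plt.hist(heavy_atoms, bins=edges) in place of the integer bin count.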
Limiting the x-axis range:
    plt.figure()
    plt.hist(heavy_atoms, bins=np.max(heavy_atoms))
    plt.xlabel('Number of heavy atoms', fontsize=20)
    plt.ylabel('Hits in CSD', fontsize=20)
    plt.xlim([10, 60])
    plt.show()
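The even/odd preference visible in the histogram can also be expressed as a simple tally. The following is a minimal sketch, assuming a list of heavy-atom counts like heavy_atoms above; the function name parity_counts and the sample values are hypothetical and are not results from the CSD.

```python
import numpy as np


def parity_counts(heavy_atoms, lo=None, hi=None):
    """Return (n_even, n_odd) for counts within the inclusive range [lo, hi]."""
    counts = np.asarray(heavy_atoms, dtype=int)
    if lo is not None:
        counts = counts[counts >= lo]
    if hi is not None:
        counts = counts[counts <= hi]
    n_even = int(np.sum(counts % 2 == 0))
    n_odd = int(counts.size - n_even)
    return n_even, n_odd


# Invented sample; the value 61 falls outside the 10-60 window and is ignored.
sample = [10, 11, 12, 12, 14, 15, 16, 61]
n_even, n_odd = parity_counts(sample, lo=10, hi=60)
```

With the real list, calling parity_counts(heavy_atoms, lo=10, hi=60) would give the totals for the same window as the restricted plot.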
CCK-1 will be held in the Abbot’s Kitchen in the Inorganic Chemistry Laboratory on South Parks Road. Volunteers should be available before the start of the meeting to direct you to the Abbot’s Kitchen; the document below describes the route from South Parks Road to the meeting room.