Here are the slides from Jerome Wicker, Department of Chemistry, that were presented at CCK-1:
Author Archives: Richard Cooper
CSD Python API example
Number of non-H atoms in molecules reported in the Cambridge Structural Database¶
An unexplained phenomenon in the CSD collection of molecular crystal structures is shown below.
The code below is included in Jerome Wicker’s talk from CCK-1 as a simple example of how to iterate through and extract information from the CSD using the Python API. The phenomenon has been noted many times by CCDC researchers.
The CSD Python API is used to retrieve each crystal structure entry from the database using the EntryReader() iterator. For each organic crystal structure, the number of heavy atoms (non-hydrogen atoms) in the heaviest molecular component is appended to a list, heavy_atoms.
Finally a histogram of these heavy atom counts shows that molecules with even numbers of heavy atoms are observed more frequently than those with odd numbers in the same range.
%pylab inline
This page is a copy of an interactive Python 2 notebook exported from Jupyter. In Jupyter, the %pylab inline command above loads numerical and plotting libraries and ensures that plots appear in the notebook instead of in a separate window. It isn’t required when running Python from the command line, and the required libraries (numpy and matplotlib) are imported again below for convenience.
from ccdc.io import EntryReader
from matplotlib import pyplot as plt
import numpy as np

# Open the installed CSD and collect one heavy-atom count per organic entry
csd_reader = EntryReader('CSD')
heavy_atoms = []
for entry in csd_reader:
    if entry.is_organic:
        try:
            mol = entry.molecule.heaviest_component
        except Exception:
            # Skip entries whose molecule cannot be retrieved
            continue
        heavy_atoms.append(len(mol.heavy_atoms))
plt.figure()
plt.hist(heavy_atoms,bins=np.max(heavy_atoms))
plt.xlabel('Number of heavy atoms',fontsize=20)
plt.ylabel('Hits in CSD', fontsize=20)
plt.show()
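A note on the binning used above: passing an integer as bins asks for that many equal-width bins between the smallest and largest value, so the bin width is (max - min)/max rather than exactly 1. The following is a minimal sketch of one alternative, assuming a heavy_atoms list like the one built above: explicit bin edges at half-integers, so that each bar covers exactly one atom count. The helper name integer_bin_edges and the sample counts are hypothetical and only serve to illustrate the idea.

```python
import numpy as np

def integer_bin_edges(values):
    """Return bin edges at half-integers so each bin holds exactly one integer value."""
    lo, hi = int(np.min(values)), int(np.max(values))
    # Edges run lo-0.5, lo+0.5, ..., hi+0.5, giving one unit-width bin per integer
    return np.arange(lo - 0.5, hi + 1.5, 1.0)

# Made-up heavy-atom counts, standing in for the list built from the CSD
sample = [3, 4, 4, 6, 6, 6, 9]
edges = integer_bin_edges(sample)
counts, _ = np.histogram(sample, bins=edges)
```

With the real data, plt.hist(heavy_atoms, bins=integer_bin_edges(heavy_atoms)) would then draw one bar per heavy-atom count.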
Limiting the x-axis range:
plt.figure()
plt.hist(heavy_atoms,bins=np.max(heavy_atoms))
plt.xlabel('Number of heavy atoms',fontsize=20)
plt.ylabel('Hits in CSD', fontsize=20)
plt.xlim([10,60])
plt.show()
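To put a number on the even/odd difference rather than reading it off the plot, one option is to tally the counts by parity within the same window. This is only a sketch under the assumption that heavy_atoms is a list of integers as built above; the function name even_odd_totals and the sample values are hypothetical.

```python
import numpy as np

def even_odd_totals(values, lo, hi):
    """Count how many values in the closed range [lo, hi] are even and how many are odd."""
    arr = np.asarray(values)
    in_range = arr[(arr >= lo) & (arr <= hi)]
    n_even = int(np.sum(in_range % 2 == 0))
    n_odd = int(in_range.size - n_even)
    return n_even, n_odd

# Made-up counts; 61 and 9 fall outside the 10-60 window used for the plot
sample = [10, 11, 12, 12, 14, 15, 61, 9]
n_even, n_odd = even_odd_totals(sample, 10, 60)
```

Calling even_odd_totals(heavy_atoms, 10, 60) on the real list would give the two totals for the range shown in the second histogram.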
Directions
CCK-1 will be held in the Abbot’s Kitchen in the Inorganic Chemistry Laboratory on South Parks Road. Volunteers should be available before the start of the meeting to direct you to the Abbot’s Kitchen; the document below describes the route from South Parks Road to the meeting room.
Register
For further updates follow us on Twitter: @CompChemKitchen
First meeting
CCK-1 (Tuesday 24th May)
In the spirit of the name, our inaugural meeting, CCK-1, will be held in the Abbot’s Kitchen in the Inorganic Chemistry Laboratory at 5pm on Tuesday May 24th 2016 (5th Week).
Refreshments will be provided.
Jerome Wicker from Chemistry will be speaking about “Machine learning for classification of solid form data extracted from CSD and ZINC”. The software tools discussed include RDKit, CSD, and scikit-learn.
There will also be 2 lightning talks, each ~5 minutes long. Hannah Patel from the Department of Statistics will speak on “Novelty Score: Prioritising compounds that potentially form novel protein-ligand interactions and novel scaffolds using an interaction centric approach”. Software covered will include Django and RDKit. Dr Michael Charlton from InhibOx will also speak on his latest research.
Announcing Comp Chem Kitchen
TL;DR: A new series of meetings for computational chemists, cheminformaticians, and molecular modelers is starting.
Dear Colleagues,
Rob Paton, Richard Cooper and I are launching “Comp Chem Kitchen”, a regular forum and seminar series in Oxford to hear about and discuss computational methods for tackling problems in chemistry, biochemistry and drug discovery. It will principally focus on cheminformatics, computational chemistry, and molecular modelling, and may overlap with neighboring areas such as materials properties and bioinformatics. The first meeting will be at 5 pm in the Abbot’s Kitchen on Tuesday of 5th week (May 24th, 2016).
We’re keen to encourage people involved with coding and methods development (i.e. hackers, in the original untarnished sense of the word) to join us. Our hope is that we will share best practices, even code snippets and software tools, and avoid re-inventing wheels.
In addition to local researchers, we will invite speakers from industry and non-profit from time to time, and occasionally organize software demos and tutorials.
Here are some possible future topics:
• Software development (e.g.: Python, C, C++, CUDA, shell, Matlab, Gnuplot);
• Optimizing force field parameters & EVB models;
• Cheminformatics (e.g.: RDKit);
• X-ray and NMR crystallography, including small molecule and macromolecular;
• Protein & RNA modeling, including Molecular Dynamics;
• Virtual screening and Docking;
• Machine Learning;
• Quantum Methods, including DFT.
CCK-2 (Tuesday 14th June)
The second meeting in Trinity term, CCK-2, will also be in the Abbot’s Kitchen on June 14th, 2016 (Tuesday of 8th Week), at 5 pm. Fernanda Duarte from Chemistry will be speaking (Title TBA).
If you have ideas for speakers, or would like to give a talk, let us know. We also invite lightning talks of 5 minutes (or fewer) from attendees, so if you have some cool code you’ve been working on and would like to demo, bring your laptop, smartphone, tablet, (wearable?) and tell us all about it.
Please pass this message on to friends, colleagues, and students who may be interested too!
See you soon: we are looking forward to seeing the diverse range of science that you’re computationally cooking up.
Garrett, Richard, and Rob