Abstract

The success of machine learning has been demonstrated time and time again in classification, generative modelling, and reinforcement learning. This revolution has largely taken place in domains with at least one of two key properties: (1) the input space is continuous, so classifiers and generative models can smoothly interpolate to unseen data that is ‘similar’ to the training distribution, or (2) data is trivial to generate, as in controlled reinforcement learning settings such as Atari or Go, where agents can replay the game millions of times. Unfortunately, many important learning problems in chemistry, physics, materials science, and biology, namely those where the input is molecular or material data, share neither of these attractive properties.

Accurate prediction of atomistic properties is a crucial ingredient for rational compound design in the chemical and pharmaceutical industries. Many discoveries in chemistry can be guided by screening large databases of computed molecular structures and properties, but high-level quantum-chemical calculations can take up to several days per molecule or material at the required accuracy, placing the ultimate goal of in silico design out of reach for the foreseeable future. In large part, the current state of the art for such problems rests on the expertise of individual researchers or, at best, highly specific rule-based heuristic systems. Efficient machine learning methods, applied to the prediction of atomistic properties as well as compound design and crystal structure prediction, can therefore have a pivotal impact in enabling chemical discovery and fostering fundamental insights.

Because of this, the past few years have seen a flurry of work on designing machine learning techniques for molecule and material data [1-39]. These works have drawn inspiration from, and made significant contributions to, areas of machine learning as diverse as learning on graphs and models in natural language processing. Recent advances have accelerated molecular dynamics simulations, contributed to a better understanding of interactions within quantum many-body systems, and increased the efficiency of density-based quantum mechanical modeling methods. This young field offers unique opportunities for machine learning researchers and practitioners, as it presents a wide spectrum of challenges and open questions, including but not limited to representations of physical systems, physically constrained models, manifold learning, interpretability, model bias, and causality.

The goal of this workshop is to bring together researchers and industrial practitioners in the fields of computer science, chemistry, physics, materials science, and biology, all working to innovate and apply machine learning to the challenges involving molecules and materials. In a highly interactive format, we will outline the current frontiers and present emerging research directions. We aim to use this workshop as an opportunity to establish a common language between all communities, to actively discuss new research problems, and to collect datasets against which novel machine learning models can be benchmarked. The program is a collection of invited talks alongside contributed posters. A panel discussion will provide the perspectives and experiences of influential researchers from machine learning and the natural sciences, and will also invite open conversation with participants. An expected outcome of this workshop is the interdisciplinary exchange of ideas and the initiation of collaborations.

Schedule

08:30 Opening remarks Brooks Paige
08:40 Invited talk Boltzmann Generators – Sampling Equilibrium States of Many-Body Systems with Deep Learning Frank Noé
09:00 Invited talk Deep Generative Models for Knowledge-Free Molecular Geometry Kyunghyun Cho
09:20 Contributed talk Band gap prediction for large organic crystal structures with machine learning
09:30 Contributed talk Uncertainty quantification of molecular property prediction using Bayesian neural network models
09:40 Contributed talk Powerful, transferable representations for molecules through intelligent task selection in deep multitask networks
09:50 Contributed talk Incomplete Conditional Density Estimation for Fast Materials Discovery
10:00 Poster session
10:30 Coffee break
11:00 Invited talk Generative deep models for predicting the effects of mutations John Ingraham
11:20 Invited talk Tensor Field Networks: rotation-, translation-, and permutation-equivariant convolutional NNs for 3D points Tess Smidt
11:40 Invited talk Deep Reinforcement Learning for de-novo Drug Design Olexandr Isayev
12:00 Lunch
14:00 Invited talk A translation approach to molecular graph optimization Wengong Jin
14:20 Invited talk Predicting Electron-Ionization Mass Spectrometry using Neural Networks Jennifer Wei
14:40 Invited talk Statistical Perspective on Chemical Space with Quantum Mechanics and Machine Learning Alexandre Tkatchenko
15:00 Coffee break
15:30 Invited talk Application of graph neural networks in molecule design Alex Gaunt
15:50 Invited talk Design of Coarse-grained Molecular Models with Machine Learning Cecilia Clementi
16:10 Invited talk Covariant neural network architectures for learning physics Risi Kondor
16:30 Contributed talk Learning protein structure with a differentiable simulator
16:40 Contributed talk Generating equilibrium molecules with deep neural networks
16:50 Contributed talk Molecular Transformer for Chemical Reaction Prediction and Uncertainty Estimation
17:00 Contributed talk Steerable Wavelet Scattering for 3D Atomic Systems with Application to Li-Si Energy Prediction
17:10 Closing remarks Brooks Paige
17:20 Poster session

Accepted Papers

Graph-Based Network using Attention Mechanism for Predicting Molecular Properties
Amir H. K. Ahmadi, Parsa Moradi, Babak H. Khalaj
Efficient prediction of 3D electron densities using machine learning [arXiv]
Mihail Bogojeski, Felix Brockherde, Leslie Vogt-Maranto, Li Li, Mark E. Tuckerman, Kieron Burke, Klaus-Robert Müller
Spotlight Talk Generating equilibrium molecules with deep neural networks [arXiv]
Niklas W. A. Gebauer, Michael Gastegger, Kristof T. Schütt
Spotlight Talk Powerful, transferable representations for molecules through intelligent task selection in deep multitask networks [arXiv]
Clyde Fare, Lukas Turcani, Edward O. Pyzer-Knapp
Spotlight Talk Incomplete Conditional Density Estimation for Fast Materials Discovery [GitHub]
Phuoc Nguyen, Truyen Tran, Sunil Gupta, Santu Rana, Svetha Venkatesh
Spotlight Talk Learning protein structure with a differentiable simulator
John Ingraham, Adam Riesselman, Chris Sander, Debora Marks
Spotlight Talk Predicting Electron-Ionization Mass Spectrometry using Neural Networks
Jennifer N. Wei, David Belanger, Ryan P. Adams, D. Sculley
Spotlight Talk Band gap prediction for large organic crystal structures with machine learning [arXiv]
Bart Olsthoorn, R. Matthias Geilhufe, Stanislav S. Borysov, Alexander V. Balatsky
Spotlight Talk Uncertainty quantification of molecular property prediction using Bayesian neural network models [arXiv]
Seongok Ryu, Yongchan Kwon, Woo Youn Kim
Batched Stochastic Bayesian Optimization via Combinatorial Constraints Design
Kevin K. Yang, Yuxin Chen, Alycia Lee, Yisong Yue
Spotlight Talk Steerable Wavelet Scattering for 3D Atomic Systems with Application to Li-Si Energy Prediction [arXiv]
Xavier Brumwell, Paul Sinz, Kwang Jin Kim, Yue Qi, Matthew Hirn
Chemical Structure Elucidation from Mass Spectrometry by Matching Substructures [arXiv]
Jing Lim, Joshua Wong, Minn Xuan Wong, Lee Han Eric Tan, Hai Leong Chieu, Davin Choo, Neng Kai Nigel Neo
Design by Adaptive Sampling
David H. Brookes, Jennifer Listgarten
Spotlight Talk Molecular Transformer for Chemical Reaction Prediction and Uncertainty Estimation [chemRxiv]
Philippe Schwaller, Teodoro Laino, Théophile Gaudin, Peter Bolgar, Costas Bekas, Alpha A. Lee
Descriptor for Separating Base-material and Additive in Machine Learning of Thermoelectric Material Property Prediction
Reiko Hagawa, Hiromasa Tamaki, Koji Morikawa
Convolutional models of RNA energetics [bioRxiv]
Michelle J. Wu
PepCVAE: Semi-Supervised Targeted Design of Antimicrobial Peptide Molecules [arXiv]
Payel Das, Kahini Wadhawan, Oscar Chang, Tom Sercu, Cicero dos Santos, Matthew Riemer, Vijil Chenthamarakshan, Inkit Padhi, Aleksandra Mojsilovic
Transferrable End-to-End Learning for Protein Interface Prediction [arXiv]
Raphael J. L. Townshend, Rishi Bedi, Ron O. Dror
Pre-training Graph Neural Networks with Kernels [arXiv]
Nicolò Navarin, Dinh V. Tran, Alessandro Sperduti
DEFactor: Differentiable Edge Factorization-based Probabilistic Graph Generation [arXiv]
Rim Assouel, Mohamed Ahmed, Marwin H. Segler, Amir Saffari, Yoshua Bengio
TorchProteinLibrary: A computationally efficient, differentiable representation of protein structure
Georgy Derevyanko, Guillaume Lamoureux
CheMixNet: Mixed DNN Architectures for Predicting Chemical Properties using Multiple Molecular Representations [arXiv]
Arindam Paul, Dipendra Jha, Reda Al-Bahrani, Wei-keng Liao, Alok Choudhary, Ankit Agrawal
Generative Modeling for Multimodal Structure-Based Drug Design
Miha Skalic, Davide Sabbadin, Gianni De Fabritiis
N-Gram Graph, A Novel Molecule Representation [GitHub]
Shengchao Liu, Thevaa Chandereng, Yingyu Liang
Neural Reasoning for Chemical-Chemical Interaction [GitHub]
Trang Pham, Truyen Tran, Svetha Venkatesh
MT-CGCNN: Integrating Crystal Graph Convolutional Neural Network with Multitask Learning for Material Property Prediction [arXiv]
Soumya Sanyal, Janakiraman Balachandran, Naganand Yadati, Abhishek Kumar, Padmini Rajagopalan, Suchismita Sanyal, Partha Talukdar
Dataset Bias in the Natural Sciences: A Case Study in Chemical Reaction Prediction and Synthesis Design [chemRxiv]
Ryan-Rhys Griffiths, Philippe Schwaller, Alpha A. Lee
Bayesian Optimization of High Transparency, Low Haze, and High Oil Contact Angle Rigid and Flexible Optoelectronic Substrates
Sajad Haghanifar, Sooraj Sharma, Luke M. Tomasovic, Paul W. Leu, Bolong Cheng
Fast classification of small X-ray diffraction datasets using physics-based data augmentation and deep neural networks
Felipe Oviedo, Zekun Ren, Shijing Sun, Charlie Settens, Zhe Liu, Giuseppe Romano, Tonio Buonassisi, Ramasamy Savitha, Siyu I.P. Tian, Brian L. DeCost, Aaron Gilad Kusne
PaccMann: Prediction of anticancer compound sensitivity with multi-modal attention-based neural networks [arXiv]
Ali Oskooei, Jannis Born, Matteo Manica, Vigneshwari Subramanian, Julio Sáez-Rodríguez, María Rodríguez Martínez
Multiple-objective Reinforcement Learning for Inverse Design and Identification
Haoran Wei, Mariefel Olarte, Garrett B. Goh
Efficient nonmyopic active search with applications in drug and materials discovery [arXiv]
Shali Jiang, Gustavo Malkomes, Benjamin Moseley, Roman Garnett
Inference of the three-dimensional chromatin structure and its temporal behavior [arXiv]
Bianca-Cristina Cristescu, Zalán Borsos, John Lygeros, María Rodríguez Martínez, Maria Anna Rapsomaniki
Neural Message Passing with Edge Updates for Predicting Properties of Molecules and Materials [arXiv]
Peter Bjørn Jørgensen, Karsten Wedel Jacobsen, Mikkel N. Schmidt
Independent Vector Analysis for Data Fusion Prior to Molecular Property Prediction with Machine Learning [arXiv]
Zois Boukouvalas, Daniel C. Elton, Peter W. Chung, Mark D. Fuge
Optimizing Interface/Surface Roughness for Thermal Transport
Shenghong Ju, Thaer M. Dieb, Koji Tsuda, Junichiro Shiomi
Graph Convolutional Neural Networks for Polymers Property Prediction [arXiv]
Minggang Zeng, Jatin Nitin Kumar, Zeng Zeng, Ramasamy Savitha, Vijay Ramaseshan Chandrasekhar, Kedar Hippalgaonkar
Generative Model for Material Experiments Based on Prior Knowledge and Attention Mechanism [arXiv]
Mincong Luo, X. He, Li Liu
Physics-aware Deep Generative Models for Creating Synthetic Microstructures [arXiv]
Rahul Singh, Viraj Shah, Balaji Pokuri, Soumik Sarkar, Baskar Ganapathysubramanian, Chinmay Hegde
Modelling Non-Markovian Quantum Processes with Recurrent Neural Networks [arXiv]
Leonardo Banchi, Edward Grant, Andrea Rocchetto, Simone Severini
Optimizing Photonic Nanostructures via Multi-fidelity Gaussian Processes [arXiv]
Jialin Song, Yury S. Tokpanov, Yuxin Chen, Dagny Fleischman, Kate T. Fountaine, Harry A. Atwater, Yisong Yue
Interpretable deep learning for guided structure-property explorations in photovoltaics [arXiv]
Balaji Sesha Sarath Pokuri, Sambuddha Ghosal, Apurva Kokate, Baskar Ganapathysubramanian, Soumik Sarkar
Analysis of Atomistic Representations Using Weighted Skip-Connections [arXiv]
Kim A. Nicoli, Pan Kessel, Michael Gastegger, Kristof T. Schütt
Predicting thermoelectric properties from crystal graphs and material descriptors – first application for functional materials [arXiv]
Leo Laugier, Daniil Bash, Jose Recatala, Hong Kuan Ng, Savitha Ramasamy, Chuan-Sheng Foo, Vijay R. Chandrasekhar, Kedar Hippalgaonkar
3D Deep Learning with voxelized atomic configurations for modeling atomistic potentials in complex solid-solution alloys [arXiv]
Rahul Singh, Aayush Sharma, Onur Rauf Bingol, Aditya Balu, Ganesh Balasubramanian, Duane D. Johnson, Soumik Sarkar
High Quality Protein Q8 Secondary Structure Prediction by Diverse Neural Network Architectures [arXiv]
Iddo Drori, Isht Dwivedi, Pranav Shrestha, Jeffrey Wan, Yueqi Wang, Yunchu He, Anthony Mazza, Hugh Krogh-Freeman, Dimitri Leggas, Kendal Sandridge, Linyong Nan, Kaveri Thakoor, Chinmay Joshi, Sonam Goenka, Chen Keasar, Itsik Pe’er
Deep Learning and Density Functional Theory [arXiv]
Kevin Ryczko, David Strubbe, Isaac Tamblyn
Spectral Multigraph Networks for Discovering and Fusing Relationships in Molecules [arXiv]
Boris Knyazev, Xiao Lin, Mohamed R. Amer, Graham W. Taylor
Leveraging Sequence Embedding and Convolutional Neural Network for Protein Function Prediction
Wei-Cheng Tseng, Po-Han Chi, Jia-Hua Wu, Min Sun

Contact

Please direct any questions to nips2018moleculesworkshop@gmail.com.


References

[1]
Behler, J., Lorenz, S., Reuter, K. (2007). Representing molecule-surface interactions with symmetry-adapted neural networks. J. Chem. Phys., 127(1), 07B603.
[2]
Behler, J., Parrinello, M. (2007). Generalized neural-network representation of high-dimensional potential-energy surfaces. Phys. Rev. Lett., 98(14), 146401.
[3]
Kang, B., Ceder, G. (2009). Battery materials for ultrafast charging and discharging. Nature, 458(7235), 190.
[4]
Bartók, A. P., Payne, M. C., Kondor, R., Csányi, G. (2010). Gaussian approximation potentials: The accuracy of quantum mechanics, without the electrons. Phys. Rev. Lett., 104(13), 136403.
[5]
Behler, J. (2011). Atom-centered symmetry functions for constructing high-dimensional neural network potentials. J. Chem. Phys., 134(7), 074106.
[6]
Behler, J. (2011). Neural network potential-energy surfaces in chemistry: a tool for large-scale simulations. Phys. Chem. Chem. Phys., 13(40), 17930-17955.
[7]
Rupp, M., Tkatchenko, A., Müller, K.-R., von Lilienfeld, O. A. (2012). Fast and accurate modeling of molecular atomization energies with machine learning. Phys. Rev. Lett., 108(5), 058301.
[8]
Snyder, J. C., Rupp, M., Hansen, K., Müller, K.-R., Burke, K. (2012). Finding density functionals with machine learning. Phys. Rev. Lett., 108(25), 253002.
[9]
Montavon, G., Rupp, M., Gobre, V., Vazquez-Mayagoitia, A., Hansen, K., Tkatchenko, A., Müller, K.-R., von Lilienfeld, O. A. (2013). Machine learning of molecular electronic properties in chemical compound space. New J. Phys., 15(9), 095003.
[10]
Hansen, K., Montavon, G., Biegler, F., Fazli, S., Rupp, M., Scheffler, M., Tkatchenko, A., Müller, K.-R. (2013). Assessment and validation of machine learning methods for predicting molecular atomization energies. J. Chem. Theory Comput., 9(8), 3404-3419.
[11]
Bartók, A. P., Kondor, R., Csányi, G. (2013). On representing chemical environments. Phys. Rev. B, 87(18), 184115.
[12]
Schütt, K. T., Glawe, H., Brockherde, F., Sanna, A., Müller, K.-R., Gross, E. K. U. (2014). How to represent crystal structures for machine learning: towards fast prediction of electronic properties. Phys. Rev. B, 89(20), 205118.
[13]
Ramsundar, B., Kearnes, S., Riley, P., Webster, D., Konerding, D., Pande, V. (2015). Massively multitask networks for drug discovery. arXiv preprint arXiv:1502.02072.
[14]
Rupp, M., Ramakrishnan, R., & von Lilienfeld, O. A. (2015). Machine learning for quantum mechanical properties of atoms in molecules. J. Phys. Chem. Lett., 6(16), 3309-3313.
[15]
Botu, V., Ramprasad, R. (2015). Learning scheme to predict atomic forces and accelerate materials simulations. Phys. Rev. B, 92(9), 094306.
[16]
Hansen, K., Biegler, F., Ramakrishnan, R., Pronobis, W., von Lilienfeld, O. A., Müller, K.-R., Tkatchenko, A. (2015). Machine learning predictions of molecular properties: Accurate many-body potentials and nonlocality in chemical space. J. Phys. Chem. Lett., 6(12), 2326-2331.
[17]
Alipanahi, B., Delong, A., Weirauch, M. T., Frey, B. J. (2015). Predicting the sequence specificities of DNA- and RNA-binding proteins by deep learning. Nat. Biotechnol., 33(8), 831-838.
[18]
Duvenaud, D. K., Maclaurin, D., Aguilera-Iparraguirre, J., Gomez-Bombarelli, R., Hirzel, T., Aspuru-Guzik, A., Adams, R. P. (2015). Convolutional networks on graphs for learning molecular fingerprints. NeurIPS, 2224-2232.
[19]
Faber, F. A., Lindmaa, A., von Lilienfeld, O. A., Armiento, R. (2016). Machine learning energies of 2 million elpasolite (ABC2D6) crystals. Phys. Rev. Lett., 117(13), 135502.
[20]
Gomez-Bombarelli, R., Duvenaud, D., Hernandez-Lobato, J. M., Aguilera-Iparraguirre, J., Hirzel, T. D., Adams, R. P., Aspuru-Guzik, A. (2016). Automatic chemical design using a data-driven continuous representation of molecules. arXiv preprint arXiv:1610.02415.
[21]
Wei, J. N., Duvenaud, D, Aspuru-Guzik, A. (2016). Neural networks for the prediction of organic chemistry reactions. ACS Cent. Sci., 2(10), 725-732.
[22]
Sadowski, P., Fooshee, D., Subrahmanya, N., Baldi, P. (2016). Synergies between quantum mechanics and machine learning in reaction prediction. J. Chem. Inf. Model., 56(11), 2125-2128.
[23]
Lee, A. A., Brenner, M. P., Colwell L. J. (2016). Predicting protein-ligand affinity with a random matrix framework. Proc. Natl. Acad. Sci., 113(48), 13564-13569.
[24]
Behler, J. (2016). Perspective: Machine learning potentials for atomistic simulations. J. Chem. Phys., 145(17), 170901.
[25]
De, S., Bartók, A. P., Csányi, G., Ceriotti, M. (2016). Comparing molecules and solids across structural and alchemical space. Phys. Chem. Chem. Phys., 18(20), 13754-13769.
[26]
Schütt, K. T., Arbabzadah, F., Chmiela, S., Müller, K.-R., Tkatchenko, A. (2017). Quantum-chemical insights from deep tensor neural networks. Nat. Commun., 8, 13890.
[27]
Segler, M. H., Waller, M. P. (2017). Neural-symbolic machine learning for retrosynthesis and reaction prediction. Chem. Eur. J., 23(25), 5966-5971.
[28]
Kusner, M. J., Paige, B., Hernández-Lobato, J. M. (2017). Grammar variational autoencoder. arXiv preprint arXiv:1703.01925.
[29]
Coley, C. W., Barzilay, R., Jaakkola, T. S., Green, W. H., Jensen K. F. (2017). Prediction of organic reaction outcomes using machine learning. ACS Cent. Sci., 3(5), 434-443.
[30]
Altae-Tran, H., Ramsundar, B., Pappu, A. S., Pande, V. (2017). Low data drug discovery with one-shot learning. ACS Cent. Sci., 3(4), 283-293.
[31]
Gilmer, J., Schoenholz, S. S., Riley, P. F., Vinyals, O., Dahl, G. E. (2017). Neural message passing for quantum chemistry. arXiv preprint arXiv:1704.01212.
[32]
Chmiela, S., Tkatchenko, A., Sauceda, H. E., Poltavsky, Igor, Schütt, K. T., Müller, K.-R. (2017). Machine learning of accurate energy-conserving molecular force fields. Sci. Adv., 3(5), e1603015.
[33]
Ju, S., Shiga, T., Feng, L., Hou, Z., Tsuda, K., Shiomi, J. (2017). Designing nanostructures for phonon transport via Bayesian optimization. Phys. Rev. X, 7(2), 021024.
[34]
Ramakrishnan, R., von Lilienfeld, O. A. (2017). Machine learning, quantum chemistry, and chemical space. Reviews in Computational Chemistry, 225-256.
[35]
Hernandez-Lobato, J. M., Requeima, J., Pyzer-Knapp, E. O., Aspuru-Guzik, A. (2017). Parallel and distributed Thompson sampling for large-scale accelerated exploration of chemical space. arXiv preprint arXiv:1706.01825.
[36]
Smith, J., Isayev, O., Roitberg, A. E. (2017). ANI-1: an extensible neural network potential with DFT accuracy at force field computational cost. Chem. Sci., 8(4), 3192-3203.
[37]
Brockherde, F., Li, L., Burke, K., Müller, K.-R. (2017). By-passing the Kohn-Sham equations with machine learning. Nat. Commun., 8, 872.
[38]
Schütt, K. T., Kindermans, P.-J., Sauceda, H. E., Chmiela, S., Tkatchenko, A., Müller, K.-R. (2017). MolecuLeNet: A continuous-filter convolutional neural network for modeling quantum interactions. NeurIPS 2017.
[39]
Chmiela, S., Sauceda, H. E., Müller, K.-R., Tkatchenko, A. (2018). Towards exact molecular dynamics simulations with machine-learned force fields. Nat. Commun., 9(1), 3887.