18-06-2019 : ASOS Group Project Participants
Intent Modelling from Natural Language
We study the performance of customer intent classifiers to predict the most popular intent received through ASOS customer care, namely "Where is my order?". We conduct extensive experiments to compare the accuracy of two popular classification models: logistic regression via N-grams that account for sequences in the data, and recurrent neural networks that extract sequential patterns automatically. A Mann-Whitney U test indicated that the F1 score on a representative sample of held-out labelled messages was greater for linear N-gram classifiers than for recurrent neural network classifiers (M1 = 0.828, M2 = 0.815; U = 1,196, P = 1.46e−20), unless all neural layers, including the word representation layer, were trained jointly on the classification task (M1 = 0.831, M2 = 0.828; U = 4,280, P = 8.24e−4). Overall, our results indicate that using simple linear models in modern AI production systems is a judicious choice unless the need for higher accuracy significantly outweighs the cost of much longer training times.
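The linear baseline described above can be sketched in a few lines. This is an illustrative toy, not the production system: the messages, labels and hyperparameters below are invented for the example.

```python
import numpy as np

def ngrams(tokens, n_max=2):
    """All word N-grams up to length n_max (the linear model's features)."""
    return [" ".join(tokens[i:i + n])
            for n in range(1, n_max + 1)
            for i in range(len(tokens) - n + 1)]

# Toy labelled messages (1 = "Where is my order?" intent); illustrative only.
msgs = ["where is my order", "my order has not arrived",
        "track my order please", "i want to return this dress",
        "how do i change my size", "can i cancel my return"]
labels = np.array([1, 1, 1, 0, 0, 0])

# Bag-of-N-grams count matrix.
vocab = {g: j for j, g in enumerate(sorted({g for m in msgs for g in ngrams(m.split())}))}
X = np.zeros((len(msgs), len(vocab)))
for i, m in enumerate(msgs):
    for g in ngrams(m.split()):
        X[i, vocab[g]] += 1.0

# Logistic regression fitted by plain gradient descent on the log-loss.
w, b = np.zeros(X.shape[1]), 0.0
for _ in range(500):
    p = 1.0 / (1.0 + np.exp(-(X @ w + b)))
    w -= 0.5 * X.T @ (p - labels) / len(msgs)
    b -= 0.5 * (p - labels).mean()

pred = (1.0 / (1.0 + np.exp(-(X @ w + b))) > 0.5).astype(int)
```

In practice one would use a sparse vectoriser and a regularised solver, but the point stands: the whole pipeline is a single convex fit, which is what keeps training so much cheaper than the recurrent models.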
11-06-2019 : Simon Arridge (Institute of Inverse Problems, UCL)
Combining learned and model based approaches for inverse problems
Deep Learning (DL) has become a pervasive approach in many machine learning tasks, and in particular in image processing problems such as denoising, deblurring, inpainting and segmentation. Such problems can be classified as inverse problems where the forward operator is a mapping from image space to image space. More generally, inverse problems (IPs) involve the inference of solutions from data obtained by measurements in a data space with quite different properties to the image space, and result from a forward operator that may have spectral and range constraints. Inverse problems are typically ill-posed, exhibiting failure of one or more of existence, uniqueness and/or stability, as described by Hadamard's original classification. Thus the application of DL within inverse problems is less well explored, because it is not trivial to include physics-based knowledge of the forward operator in what is usually a purely data-driven framework. In addition, some inverse problems are at a scale much larger than image or video processing applications and may not have access to sufficiently large training sets. Some approaches to this idea consist of: i) fully learned (end-to-end) systems mapping data directly into a solution; ii) post-processing methods which perform a straightforward solution method such as back-projection (the adjoint operation) followed by "de-artefacting" to enhance the solution by treating artefacts as noise with a particular structure; iii) iterative methods that unroll a variational solution and apply networks as a generalisation of a proximal operator; iv) learned regularisation, where training sets are used to construct an equivalent prior distribution, followed by classical variational methods.
Finally, there is a class of methods in which the forward operator is learned, either by correcting a simple and computationally cheap operator by learning in the data domain, or by learning a physical model by interpreting the kernels of a feed-forward network as a generalisation of a PDE, with the layers representing time evolution.
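As a concrete toy instance of the unrolled iterative approach (iii), the sketch below runs the proximal-gradient iteration x ← prox(x − γ Aᵀ(Ax − y)) on a small synthetic sparse-recovery problem. The soft-thresholding step stands in for the learned network (in the learned version it becomes a trained proximal map); all sizes and constants are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.normal(size=(30, 50)) / np.sqrt(30)        # toy ill-posed forward operator
x_true = np.zeros(50)
x_true[[4, 17, 33]] = [1.5, -2.0, 1.0]             # sparse ground-truth "image"
y = A @ x_true + 0.01 * rng.normal(size=30)        # noisy measurements

def prox(z, tau=0.02):
    # Stand-in for the learned proximal network: classical soft-thresholding,
    # i.e. the proximal operator of the l1 norm.
    return np.sign(z) * np.maximum(np.abs(z) - tau, 0.0)

gamma = 1.0 / np.linalg.norm(A.T @ A, 2)           # step size from the operator norm
x = np.zeros(50)
for _ in range(200):                               # unrolled iterations
    x = prox(x - gamma * A.T @ (A @ x - y))        # gradient step + proximal step
```

In the learned variant, each iteration's proximal map (and possibly its step size) is a network trained end-to-end over a fixed number of unrolled iterations.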
In this talk I will present some of our work within this framework. I will give examples from cardiac magnetic resonance imaging (MRI), photoacoustic tomography (PAT) and non-linear image diffusion, among other applications.
Joint work with: Marta Betcke, Andreas Hauptmann, Felix Lucka.
04-06-2019 : Dr Hao Ni (Mathematics, UCL)
Learning to predict the effects of data streams using the Logsig-RNN model
Supervised learning problems using streamed data (a path) as input are important due to various applications in computer vision, e.g. automatic character identification based on the pen trajectory (online handwritten character recognition) and gesture recognition in videos. Recurrent neural networks (RNNs) are a very popular class of neural networks; they are well suited to supervised learning on the path space and have been successful in various computer vision applications such as gesture recognition. Stochastic differential equations (SDEs) are the foundational building blocks of derivatives pricing theory, an area of huge financial impact. Motivated by the numerical approximation theory of SDEs, we propose a novel and effective algorithm (the Logsig-RNN model) to tackle this problem by combining the log-signature feature set and an RNN. The log-signature serves as a top-down description of a data stream that captures its effects economically, and as a feature set it improves the performance of the RNN significantly. Compared with an RNN based on raw data alone, the proposed method achieves better accuracy, efficiency and robustness on various data sets (synthetic data generated by an SDE, the UCI Pen-Digit data and the ChaLearn 2013 gesture recognition data). On the ChaLearn 2013 data (skeleton data only), the proposed method achieves state-of-the-art classification accuracy.
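To make the feature set concrete, here is a minimal NumPy computation of the depth-2 log-signature of a piecewise-linear path: the level-1 terms are the total increments, and the level-2 terms are the Lévy areas. This is an illustrative sketch only; in practice one would use a signature library and feed the features of successive windows into the RNN.

```python
import numpy as np

def logsig_level2(path):
    """Depth-2 log-signature of a piecewise-linear d-dimensional path:
    total increment (level 1) concatenated with Levy areas (level 2)."""
    inc = np.diff(path, axis=0)                     # segment increments
    level1 = inc.sum(axis=0)                        # total displacement
    mid = 0.5 * (path[:-1] + path[1:]) - path[0]    # segment midpoints, from start
    d = path.shape[1]
    areas = [0.5 * np.sum(mid[:, i] * inc[:, j] - mid[:, j] * inc[:, i])
             for i in range(d) for j in range(i + 1, d)]
    return np.concatenate([level1, np.array(areas)])

# An L-shaped pen stroke: right one unit, then up one unit.
feats = logsig_level2(np.array([[0.0, 0.0], [1.0, 0.0], [1.0, 1.0]]))
```

This stroke and the straight line from (0,0) to (1,1) have the same level-1 increment but different Lévy areas, which is exactly the order information that raw increments discard.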
21-05-2019 : Shirley Ho (Flatiron Institute)
Machine Learning the Universe: Opening Pandora's Box
Scientists have always attempted to identify and document analytic laws that underlie physical phenomena in nature. The process of finding natural laws has always been a challenge that requires not only experimental data, but also theoretical intuition. Oftentimes, these fundamental physical laws are derived from many years of hard work over many generations of scientists. Automated techniques for generating, collecting, and storing data have become increasingly precise and powerful, but the automated discovery of natural laws, in the form of analytical laws or mathematical symmetries, has so far been elusive. Over the past few years, the application of deep learning to the domain sciences, from biology to chemistry and physics, has raised the exciting possibility of a data-driven approach to automated science, one that could make the laborious hand-coding of semantics and instructions still necessary in most disciplines seem irrelevant. The opaque nature of deep models, however, poses a major challenge. For instance, while several recent works have successfully designed deep models of physical phenomena, the models do not give any insight into the underlying physical laws. This requirement for interpretability across a variety of domains has received diverse responses. In this talk, I will present our analysis, which suggests a surprising alignment between the representation in the scientific model and the one learned by the deep model.
14-05-2019 : Sofia Olhede (Statistics, UCL)
Detecting spatial and point process associations
Point processes are challenging to analyse because, of all spatial processes, they contain the least information. Understanding their pattern then becomes an exercise in balancing the complexity of any model against the tractability of evaluating any proposed likelihood function. Testing for associations is equally challenging, and if many tests need to be implemented, it becomes difficult to balance different types of errors. I will discuss both likelihood approximations and the intricacies of testing in this setting.
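One simple and widely used device in this setting is the Monte Carlo test: compare an observed summary statistic against its distribution under simulated complete spatial randomness (CSR). The sketch below does this for the mean nearest-neighbour distance on the unit square; the pattern, statistic and simulation count are illustrative choices, not the speaker's method.

```python
import numpy as np

rng = np.random.default_rng(1)

def mean_nn_dist(pts):
    """Average nearest-neighbour distance of a point pattern."""
    d = np.linalg.norm(pts[:, None] - pts[None, :], axis=-1)
    np.fill_diagonal(d, np.inf)
    return d.min(axis=1).mean()

def csr_mc_pvalue(pts, n_sim=199):
    """Two-sided Monte Carlo p-value for CSR on the unit square."""
    obs = mean_nn_dist(pts)
    sims = np.array([mean_nn_dist(rng.random(pts.shape)) for _ in range(n_sim)])
    lo = (np.sum(sims <= obs) + 1) / (n_sim + 1)
    hi = (np.sum(sims >= obs) + 1) / (n_sim + 1)
    return 2.0 * min(lo, hi)

# A tightly clustered pattern should be flagged as strongly non-CSR.
clustered = np.clip(rng.normal(0.5, 0.03, size=(60, 2)), 0.0, 1.0)
p = csr_mc_pvalue(clustered)
```

When many such tests are run, say one per pair of point types, the resulting p-values need a multiplicity correction (e.g. Bonferroni or Benjamini-Hochberg), which is exactly the balance of error types mentioned above.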
16-04-2019 : Sofia Vallecorsa (CERN openlab)
Generative Models in High Energy Physics
Theoretical and algorithmic advances, availability of data, and computing power are driving AI. Specifically, in the Deep Learning (DL) domain, these advances have opened the door to exceptional perspectives for application in the most diverse fields of science, business and society at large, and notably in High Energy Physics (HEP). The HEP community has a long tradition of using Machine Learning methods to solve tasks mostly related to the efficient selection of interesting events against the overwhelming background produced at colliders. Today, many HEP experiments are working on integrating Deep Learning into their workflows for different applications: from data quality assurance, to real-time selection of interesting collision events, simulation and data analysis. In particular, Generative Models are being developed as fast alternatives to Monte Carlo-based simulation. Generative models are among the most promising approaches to analysing and understanding the vast amount of information that next-generation detectors will produce.
Training of such models has been made tractable thanks to algorithmic improvements and the advent of dedicated hardware, well adapted to tackle the highly parallelizable task of training neural networks. High-performance storage and computing (HPC) technologies are often required by these kinds of projects, together with the availability of HPC multi-architecture frameworks (ranging from large multi-core systems to hardware accelerators like GPUs and FPGAs). Thanks to its unique role as a catalyst for collaborations between our community, leading ICT companies and other research organisations, CERN openlab is involved in a large set of Deep Learning and AI projects within the HEP community and beyond. This talk will present an overview of these activities.
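As a minimal illustration of the idea, here is a hand-rolled one-dimensional GAN in which the generator learns to reproduce samples from a toy "detector response" N(2, 0.5) instead of drawing them by Monte Carlo. The architecture (one affine generator, one logistic discriminator) and all constants are illustrative assumptions, nothing like the convolutional models used in practice.

```python
import numpy as np

rng = np.random.default_rng(0)
sig = lambda t: 1.0 / (1.0 + np.exp(-t))

w, b = 0.0, 0.0        # discriminator D(x) = sigmoid(w*x + b)
a, c = 1.0, 0.0        # generator     G(z) = a*z + c,  z ~ N(0, 1)
lr, n_batch = 0.05, 64

for _ in range(2000):
    xr = rng.normal(2.0, 0.5, n_batch)            # "real" Monte Carlo events
    xf = a * rng.normal(size=n_batch) + c         # generated events
    sr, sf = sig(w * xr + b), sig(w * xf + b)
    # Discriminator: gradient ascent on log D(real) + log(1 - D(fake))
    w += lr * ((1.0 - sr) * xr - sf * xf).mean()
    b += lr * ((1.0 - sr) - sf).mean()
    # Generator: non-saturating loss, push D(G(z)) towards 1
    z = rng.normal(size=n_batch)
    sf = sig(w * (a * z + c) + b)
    a += lr * ((1.0 - sf) * w * z).mean()
    c += lr * ((1.0 - sf) * w).mean()
```

Once trained, drawing a generated event is a single affine map of Gaussian noise; that replacement of an expensive simulation chain by one cheap forward pass is the source of the speed-up over full Monte Carlo.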