UCL

Centre for Doctoral Training in Data Intensive Science

24 Apr 2024

STFC review of DIS CDTs

STFC recently conducted a national review of the DIS CDTs, and the outcome has now been communicated. We are pleased to announce that the UCL DIS CDT was one of only two CDTs to be ranked in the highest band (A).

Midterm review of Industry Group Projects

On 8th March, the first-year students gave mid-term presentations on their industry group projects. This year's industry partners are ASOS, UKAEA, The Economist, ValTech and WorldRemit. These were all interesting updates on the findings so far and on the aims for the final month, until the projects finish in April 2019.

The ASOS project is to develop a machine learning algorithm that classifies whether a query received through customer care is of the type "Where is my order" (WISMO). Using natural language processing techniques and a neural network architecture, the students have so far achieved an accuracy of 91% and a precision of 85% in identifying WISMO queries. They aim to improve on this, with the end goal of delivering a fine-tuned, state-of-the-art transformer architecture cast as a multiclass intent classifier for a set of 20 popular customer care queries, using a smaller version of the GPT-2 model released by OpenAI in February 2019.

For the project with the UKAEA, our students have been working on automatically calibrating images from monitoring cameras that operate in the very challenging (and shaky!) conditions of nuclear fusion tokamaks. So far, they have found an algorithm that can reliably identify common features in pairs of images, and they are able to map the transformations between them. How well this algorithm works, however, depends on the quality of the images, so a lot of work is going into pre-processing them. The group will spend the next month working on this pre-processing and on stabilising videos from inside the tokamaks.
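To illustrate the idea of mapping the transformation between matched features in a pair of images, as described for the UKAEA project above, here is a minimal sketch. It is not the students' algorithm: it assumes the features have already been matched, and it assumes the camera motion can be approximated by a similarity transform (rotation, uniform scale and shift). The function names `fit_similarity`, `apply_similarity` and `describe` are hypothetical, chosen for this example only.

```python
import cmath
import math


def fit_similarity(src, dst):
    """Least-squares fit of dst ~ a * src + b, with 2D points as complex numbers.

    src and dst are equal-length lists of (x, y) pairs for matched features.
    The complex number a encodes rotation and scale, b encodes the shift.
    """
    if len(src) != len(dst) or len(src) < 2:
        raise ValueError("need at least two matched point pairs")
    p = [complex(x, y) for x, y in src]
    q = [complex(x, y) for x, y in dst]
    n = len(p)
    p_mean = sum(p) / n
    q_mean = sum(q) / n
    # Closed-form solution of the complex linear regression.
    num = sum((qi - q_mean) * (pi - p_mean).conjugate() for pi, qi in zip(p, q))
    den = sum(abs(pi - p_mean) ** 2 for pi in p)
    if den == 0:
        raise ValueError("source points must not all coincide")
    a = num / den
    b = q_mean - a * p_mean
    return a, b


def apply_similarity(a, b, point):
    """Map one (x, y) point through the fitted transform."""
    z = a * complex(*point) + b
    return (z.real, z.imag)


def describe(a, b):
    """Return (scale, rotation in degrees, shift as (dx, dy))."""
    return abs(a), math.degrees(cmath.phase(a)), (b.real, b.imag)


if __name__ == "__main__":
    # Made-up matched features: the second frame is the first one
    # rotated by 90 degrees and shifted by (2, 1).
    frame_1 = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
    frame_2 = [(2.0, 1.0), (2.0, 2.0), (1.0, 1.0)]
    a, b = fit_similarity(frame_1, frame_2)
    print(describe(a, b))
```

Applying the inverse of the fitted transform to each frame is one simple way to stabilise a shaky video. With noisy or mismatched features, a robust fitting step would normally be needed on top of this plain least-squares fit.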

CDT seminars

The DIS CDT is rolling out a series of CDT seminars. The first is confirmed for Tuesday 16th April, 4pm to 5pm, in the Pearson Building G22 Lecture Theatre. We are pleased to announce that the first speaker will be Sofia Vallecorsa (CERN openlab), speaking on "Generative Models in High Energy Physics".

Industry Group Project Presentations & networking event

The Industry Group Project Presentations and networking event will take place on 12 Apr 2019. Project presentations will start at 1:45pm in the Harrie Massey Lecture Theatre, and the networking event will start at 5pm in the Jeremy Bentham Room. Everyone is welcome to join; please register on the Eventbrite link.

STFC UCL Summer School 2018 Guest Lecture videos

We are pleased to announce that the Guest Lecture videos are now available on our CDT website and can be found here.

Profile of current CDT student: Davide Piras

My research project is in cosmology, with a sprinkling of machine learning. The idea behind my research is to run simulations of our universe. The simplest ones are called N-body simulations, which is where you get a box, some particles, and see what happens when you switch on gravity. At some point, some structures will form - there will be a peculiar shape to them. This is what we're looking for, because it's what our universe looks like. Since we need as many of these simulations as we can get, the idea is to introduce machine learning; this is the novel approach to the problem. The idea is to say "I have an algorithm that can look at your simulations and learn how they are done, what their structure is".

We are entering the era of big data. We know the traditional methods and we know they work, but we need to boost them - and this is why data intensive science is probably the most promising way to deal with this amount of data, especially in cosmology and astrophysics. Even people doing traditional PhDs are dealing with this problem, but the chance to train specifically in this area, and to attend conferences at the intersection of physics, computer science, and statistics, has really given me the opportunity to look at these problems from many different perspectives. Only the CDT in Data Intensive Science could have given me these opportunities.

For people who want to start a PhD, I would say - especially if you have an interest in cosmology, astronomy, physics - we need new tools, we need new ideas, we need new minds. Having a PhD in this field is probably the best way to help and to gain knowledge of data science in general.
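For readers curious about the N-body idea mentioned in the profile above ("a box, some particles, and see what happens when you switch on gravity"), here is a minimal sketch of one way such a simulation step can look. It is a toy illustration, not the code used in Davide's research: it works in two dimensions, uses direct pairwise summation with a softening length to avoid infinite forces, and advances time with a leapfrog (kick-drift-kick) scheme. The function names, the units (`g=1.0`) and the `softening` value are assumptions made for this example.

```python
def accelerations(positions, masses, g=1.0, softening=0.05):
    """Gravitational acceleration on each particle from all the others."""
    n = len(positions)
    acc = [[0.0, 0.0] for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            dx = positions[j][0] - positions[i][0]
            dy = positions[j][1] - positions[i][1]
            # Softened squared distance keeps close encounters finite.
            r2 = dx * dx + dy * dy + softening ** 2
            inv_r3 = r2 ** -1.5
            acc[i][0] += g * masses[j] * dx * inv_r3
            acc[i][1] += g * masses[j] * dy * inv_r3
    return acc


def leapfrog_step(positions, velocities, masses, dt, g=1.0, softening=0.05):
    """Advance all particles by one time step dt (kick-drift-kick)."""
    acc = accelerations(positions, masses, g, softening)
    # Half kick: update velocities by half a step.
    v_half = [[v[0] + 0.5 * dt * a[0], v[1] + 0.5 * dt * a[1]]
              for v, a in zip(velocities, acc)]
    # Drift: move particles with the half-step velocities.
    new_pos = [[p[0] + dt * v[0], p[1] + dt * v[1]]
               for p, v in zip(positions, v_half)]
    # Second half kick, using accelerations at the new positions.
    new_acc = accelerations(new_pos, masses, g, softening)
    new_vel = [[v[0] + 0.5 * dt * a[0], v[1] + 0.5 * dt * a[1]]
               for v, a in zip(v_half, new_acc)]
    return new_pos, new_vel


if __name__ == "__main__":
    # Three particles starting at rest; gravity pulls them together.
    pos = [[-1.0, 0.0], [1.0, 0.0], [0.0, 1.5]]
    vel = [[0.0, 0.0], [0.0, 0.0], [0.0, 0.0]]
    mass = [1.0, 1.0, 0.5]
    for _ in range(100):
        pos, vel = leapfrog_step(pos, vel, mass, dt=0.01)
    print(pos)
```

Direct summation costs grow with the square of the number of particles, which is one reason large simulations are expensive to run in bulk.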

Upcoming networking event - 12 April

We would like to invite you to our CDT event on the afternoon of 12 April. It will comprise student-industry group project presentations from the first-year cohort, pitches from industry partners on placement project offerings, and drinks, pizza and networking in the JBR from 5pm to 7pm. Please register here.