
Example reports from group & individual coursework during my Master's in Data Science & Machine Learning 2020/21 at UCL


ciaran-coleman/MSc-DSML-coursework


UCL MSc Data Science & Machine Learning Coursework

This repository contains a selection of coursework (group and individual) from my Master's course at UCL in 2020/2021. These pieces were selected because each had a formal(ish) report as a main component of the marking criteria.

COMP0087 (Statistical Natural Language Processing)

This group coursework had a self-selected research topic. We chose to look at NLP in citation recommendation - specifically, the use of pre-trained BERT language models to encode the citation context.

We proposed to improve upon the (then) current state of the art through simple ideas such as using a domain-adapted BERT, SciBERT, to encode both the citation context and the metadata. This showed significant improvement in precision and recall metrics.
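As a rough illustration of the kind of metric referred to above, the sketch below computes precision and recall over the top `k` recommended papers. The cut-off `k`, the function name `precision_recall_at_k` and the paper IDs are all hypothetical; the report itself defines the exact metrics that were used.

```python
def precision_recall_at_k(ranked, relevant, k):
    """Precision and recall over the top-k items of a ranked recommendation list.

    ranked   -- list of candidate paper IDs, best first (hypothetical IDs)
    relevant -- set of paper IDs that were actually cited
    k        -- cut-off rank (an assumption; not taken from the report)
    """
    top_k = ranked[:k]
    hits = sum(1 for paper in top_k if paper in relevant)
    precision = hits / k
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Toy example: one of the top three recommendations is a true citation.
ranked = ["p3", "p1", "p7", "p2", "p9"]
relevant = {"p1", "p2", "p4"}
p_at_3, r_at_3 = precision_recall_at_k(ranked, relevant, k=3)
p_at_5, r_at_5 = precision_recall_at_k(ranked, relevant, k=5)
```

Raising `k` can only keep or increase recall, while precision may move either way, which is why the two are usually reported together.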

I contributed significantly to adapting the codebase from previous work, running the experiments, and writing and proof-reading the report. I also provided a short demo notebook on Google Colab comparing our results to the previous SoTA.

Code was written in Python.

COMP0118 (Computational Modelling for Biomedical Imaging)

This coursework looked at respiratory motion modelling through the use of skin surface surrogate signals. The aim was to investigate which correspondence models could best map the movement of the surrogate signal to the internal motion of the lungs.

The coursework had both group and individual components to it. The simple tasks were discussed and worked on as a group, with further 'stretch' goals worked on individually. The report is completely individual.

I investigated a number of models, including a rarely seen (or perhaps novel) idea: including the surrogate's acceleration as an additional model input. Of the models I compared, this one gave the best results.
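To make the idea concrete, here is a minimal sketch of a linear correspondence model that uses the surrogate's value, velocity and acceleration as inputs. It is written in Python with NumPy purely for illustration (the coursework code was in Matlab), and the linear form, the inclusion of a velocity term, the finite-difference derivatives and the synthetic data are all assumptions rather than the models from the report.

```python
import numpy as np


def design_matrix(surrogate, dt):
    """Stack [1, s, ds/dt, d2s/dt2] column-wise for a 1-D surrogate signal."""
    velocity = np.gradient(surrogate, dt)      # finite-difference first derivative
    acceleration = np.gradient(velocity, dt)   # finite-difference second derivative
    return np.column_stack(
        [np.ones_like(surrogate), surrogate, velocity, acceleration]
    )


def fit_correspondence(surrogate, motion, dt):
    """Least-squares fit of internal motion against the surrogate features."""
    coef, *_ = np.linalg.lstsq(design_matrix(surrogate, dt), motion, rcond=None)
    return coef


def predict_motion(surrogate, coef, dt):
    """Estimate internal motion from a surrogate trace and fitted coefficients."""
    return design_matrix(surrogate, dt) @ coef


# Synthetic example: the "internal motion" is generated from known
# coefficients so that the fit can be checked against them.
dt = 0.01
t = np.arange(0.0, 10.0, dt)
s = np.sin(t) + 0.5 * np.sin(2.3 * t)          # made-up surrogate signal
true_coef = np.array([1.0, 2.0, 0.5, -0.3])    # made-up model coefficients
m = design_matrix(s, dt) @ true_coef
coef = fit_correspondence(s, m, dt)
```

A surrogate with a single sinusoidal frequency would make the acceleration column proportional to the signal itself, so the example mixes two frequencies to keep the coefficients identifiable.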

Code was written in Matlab.

STAT0029 (Design of Experiments)

This was a group coursework in which we had to conceptualise, run and analyse an experiment of our choosing. We focussed on a few (of the many) possible variables to determine which of them affected actual acidity, rather than just perceived acidity.

This coursework emphasised the basic principles of sound experimental design and of checking that the modelling was suitable for the design.

Code was written in R.

STAT0032 (Introduction to Statistical Data Science)

This was a group coursework in which we were assigned the task of helping a hypothetical small wine shop understand how wine acidity may impact sales.

The project drilled in the importance of testing for normality and of selecting the appropriate hypothesis test based on the result.
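The workflow described above can be sketched roughly as follows. This is an illustration in Python with SciPy rather than the R code used in the coursework, and the specific choices (Shapiro-Wilk for normality, Welch's t-test versus Mann-Whitney U, a 0.05 threshold, and the made-up samples) are assumptions, not the tests from the report.

```python
from statistics import NormalDist

from scipy import stats


def compare_groups(a, b, alpha=0.05):
    """Pick a two-sample test based on a normality check of both samples.

    Returns (test_name, p_value). Uses Welch's t-test if neither sample
    rejects normality under Shapiro-Wilk, otherwise Mann-Whitney U.
    """
    looks_normal = True
    for sample in (a, b):
        _, p_normal = stats.shapiro(sample)
        if p_normal <= alpha:
            looks_normal = False
    if looks_normal:
        _, p_value = stats.ttest_ind(a, b, equal_var=False)
        return "welch-t", p_value
    _, p_value = stats.mannwhitneyu(a, b, alternative="two-sided")
    return "mann-whitney-u", p_value


# Made-up samples: evenly spaced normal quantiles (very close to normal)
# and a heavily right-skewed sequence.
normal_like = [NormalDist().inv_cdf((i + 0.5) / 20) for i in range(20)]
shifted = [x + 0.5 for x in normal_like]
skewed = [float(2 ** i) for i in range(12)]

name_normal, p_normal_case = compare_groups(normal_like, shifted)
name_skewed, p_skewed_case = compare_groups(skewed, normal_like)
```

Note that a normality test failing to reject is not proof of normality, so in practice the choice is usually backed up with a Q-Q plot or similar diagnostic.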

Code was written in R.
