In this project, you will demonstrate your understanding and mastery of programming in Python using data science tools, as well as your understanding of the different research methods that use data science.

What you have learnt so far should cover almost everything you will need, so if you are stuck, read through the notebooks again. Some of your tutorials and exercises may also be relevant. For Problem 4, you may need to consult the “Bit by Bit” book¹. If you are still unsure, have a look online. Google and Stack Overflow are your friends.

You must use Python only; MATLAB and Excel are not acceptable. For Problem 3, using the command line for preprocessing is acceptable, but you will need to provide your script in a .sh file!

The grade for this project will be calculated out of 100, and it will contribute 70% towards your overall grade in the module. You need to show the following:

• Manipulation of different data structures, including dictionaries and data frames.
• Preparing and preprocessing data.
• Producing a basic plot and changing plot markers, colors, etc.
• Improving and extending analysis.
• An understanding of different research methods and designs for using data science.
• The ability to critique some scientific approaches.

Your submission will be a compressed file (.zip) containing:

1. A copy of your Python script named solution_code.ipynb (done in Jupyter Notebook). Your scripts must be sufficient to reproduce your answers to Problems 1-3.
2. A PDF file named P4_answers.pdf that contains your answers to Problem 4.

The deadline is Tuesday 28th April at 3:00 PM (UTC).