The Sustainable Development Goals (SDGs), also known as the Global Goals, are a collection of 17 interlinked global goals designed to be a "blueprint to achieve a better and more sustainable future for all". The United Nations General Assembly adopted the SDGs in 2015 as a universal call to action to end poverty, protect the planet, and ensure that by 2030 all people enjoy peace and prosperity.
You are required to select a dataset from any reliable source that is related to any of these SDGs and contains at least 1000 observations and at least FIVE (5) attributes. From the chosen dataset, identify and use the attributes that are suitable for developing a Multiple Linear Regression (MLR) model. Justify your choice of attributes by citing material from reliable sources (journals, books, conference papers, or other online information). Perform a detailed analysis that considers the assumptions, attribute criteria, and characteristics of MLR, along with anything else relevant to developing the model. Also demonstrate the model's ability to predict the dependent variable using values chosen from your dataset.
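
As a rough illustration of the workflow described above, the following sketch fits an MLR model by ordinary least squares, checks one common assumption (low multicollinearity, via variance inflation factors), and predicts the dependent variable for one row. The data is synthetic and the helper names (`fit_mlr`, `predict`, `r_squared`, `vif`) are hypothetical. A real submission would use the attributes of the chosen SDG dataset and might prefer a statistics library.

```python
import numpy as np

# Synthetic stand-in data (an assumption for illustration only); replace it
# with the attributes selected from the chosen SDG dataset.
rng = np.random.default_rng(42)
n = 1000
X = rng.normal(size=(n, 3))                  # three hypothetical predictors
true_beta = np.array([2.0, -1.0, 0.5])       # coefficients used to simulate y
y = 4.0 + X @ true_beta + rng.normal(scale=0.5, size=n)


def fit_mlr(X, y):
    """Ordinary least squares fit; returns [intercept, b1, ..., bp]."""
    design = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef


def predict(coef, X):
    """Predicted values for one row (1-D) or many rows (2-D) of predictors."""
    return coef[0] + np.atleast_2d(X) @ coef[1:]


def r_squared(y, y_hat):
    """Coefficient of determination."""
    ss_res = np.sum((y - y_hat) ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    return 1.0 - ss_res / ss_tot


def vif(X):
    """Variance inflation factor of each predictor (multicollinearity check)."""
    factors = []
    for j in range(X.shape[1]):
        others = np.delete(X, j, axis=1)
        coef_j = fit_mlr(others, X[:, j])
        r2_j = r_squared(X[:, j], predict(coef_j, others))
        factors.append(1.0 / (1.0 - r2_j))
    return np.array(factors)


coef = fit_mlr(X, y)
y_hat = predict(coef, X)
residuals = y - y_hat            # inspect these for linearity and constant variance
r2 = r_squared(y, y_hat)
vifs = vif(X)

# Demonstrate prediction for one observation taken from the dataset.
row = 10
pred_row = predict(coef, X[row])[0]

print("coefficients:", np.round(coef, 3))
print("R^2:", round(r2, 3))
print("VIF:", np.round(vifs, 3))
print(f"row {row}: actual={y[row]:.3f}, predicted={pred_row:.3f}")
```

Plotting `residuals` against `y_hat` and inspecting a histogram or Q-Q plot of the residuals would cover the remaining usual checks (linearity, homoscedasticity, normality of errors).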
NOTES:
• The link and the description of the selected dataset should be provided, and the dataset should NOT have been used in the lectures or labs of the course.
• Describe the dataset, including the number of features/attributes/columns, the number of instances/rows, the area/domain/field, and any missing values.
• Any preprocessing method (e.g. removal or filling of empty cells) performed on the original data needs to be fully described and shown.
• Your analysis shall include descriptions of your Python code and plots.
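
The dataset description and preprocessing steps asked for in the notes above could be documented along the following lines. This is a minimal sketch using pandas on a tiny made-up table. The column names, the values, and the choice of dropping rows with a missing dependent variable and median-filling the predictors are all assumptions for illustration, not a prescribed method.

```python
import numpy as np
import pandas as pd

# Tiny hypothetical table standing in for the real dataset. In practice the
# data would be loaded from the chosen source, e.g. with pd.read_csv(...).
raw = pd.DataFrame({
    "country": ["A", "B", "C", "D", "E"],
    "gdp_per_capita": [1200.0, np.nan, 5400.0, 8100.0, 3000.0],
    "life_expectancy": [61.2, 70.5, np.nan, 79.1, 68.0],
    "target": [0.31, 0.52, 0.66, np.nan, 0.47],   # hypothetical dependent variable
})

# Describe the dataset: number of rows, number of columns, missing values.
n_rows, n_cols = raw.shape
missing_before = raw.isna().sum()
print(f"rows={n_rows}, columns={n_cols}")
print(missing_before)

# Preprocessing choice 1 (assumption): drop rows whose dependent variable is
# missing, since they cannot be used to fit or evaluate the model.
clean = raw.dropna(subset=["target"]).copy()

# Preprocessing choice 2 (assumption): fill missing predictor values with the
# column median computed on the remaining rows.
predictors = ["gdp_per_capita", "life_expectancy"]
clean[predictors] = clean[predictors].fillna(clean[predictors].median())

missing_after = clean.isna().sum()
print(clean)
print(missing_after)
```

Reporting the counts before and after each step, as done here, makes it easy to show exactly what was removed or filled.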