Describe the data you plan to work with. Make sure to provide a link to the original data source.
We plan to work with the World Bank's World Development Indicators: https://databank.worldbank.org/source/world-development-indicators. The dataset contains a wide range of development indicators, from common measures such as fertility rates to more detailed ones (such as the percentage of children attending educational institutions). It is a vast dataset covering many developed and developing countries.
Describe why and how the data was originally collected.
Our main objective was to explore a dataset that can help us link various economic measures to the development of countries, particularly developing economies. Naturally, our primary source became the World Bank database, which contains an extensive list of economic metrics measuring development levels in countries across the globe. The World Bank collects this data to better understand the economic state of the world and to inform policy decisions.
Are you able to load/clean the data?
We are able to load the data. Given the size of the dataset, we anticipate that a large part of our report will consist of data cleaning and wrangling, combined with some exploratory data analysis. After this, we will produce visualisations that help us answer some of our questions, for example the impact of increased foreign direct investment (FDI) on the development (growth rate) of a developing country.
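As a rough sketch of the wrangling step, the wide export could be reshaped into one row per country, indicator and year. This assumes the DataBank download is a CSV with one row per country and series, one column per year (headers such as "2000 [YR2000]") and ".." marking missing values. The sample rows, their values and the helper name wide_to_long are made up for illustration, and the sketch is written in Python only for compactness; the same reshaping would be done in R for the project itself.

```python
import csv
import io

# Made-up sample in the layout we ASSUME a DataBank CSV export has:
# one row per country and series, one column per year, ".." for missing.
# The numbers are placeholders, not real World Bank values.
SAMPLE = """Country Name,Country Code,Series Name,Series Code,2000 [YR2000],2001 [YR2001]
Exampleland,EXL,GDP growth (annual %),NY.GDP.MKTP.KD.ZG,4.2,..
Exampleland,EXL,"Foreign direct investment, net inflows (% of GDP)",BX.KLT.DINV.WD.GD.ZS,1.5,2.0
"""

# Columns that identify a row; every other column is treated as a year.
ID_COLUMNS = ("Country Name", "Country Code", "Series Name", "Series Code")


def wide_to_long(handle):
    """Yield one dict per (country, series, year), skipping missing values."""
    for row in csv.DictReader(handle):
        for column, raw in row.items():
            if column in ID_COLUMNS:
                continue
            raw = (raw or "").strip()
            if raw in ("", ".."):
                continue  # drop missing observations
            yield {
                "country": row["Country Code"],
                "series": row["Series Code"],
                "year": int(column.split()[0]),  # "2000 [YR2000]" -> 2000
                "value": float(raw),
            }


long_rows = list(wide_to_long(io.StringIO(SAMPLE)))
for observation in long_rows:
    print(observation)
```

In this long layout, filtering to a handful of indicators or countries becomes a simple row filter, which is what makes the later exploratory analysis manageable.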
What are the main questions you hope to address?
We hope to explore the main determinants of development levels in developing countries. As we explore the dataset, we hope to narrow the many available measures down to a concise list of metrics, which in turn will lead us to analyse the main drivers of growth. In particular, we would like to focus on the extent to which added investment expenditure in a developing country affects its growth. To do this, we anticipate using measures of GDP (of which investment is a large component) and values of FDI into the country in focus. We hope to present a thorough analysis that highlights the main drivers of growth in a developing economy.
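One simple first look at the FDI question could be the correlation between FDI inflows and GDP growth for a country over time. The sketch below computes a Pearson correlation coefficient by hand. The numbers are made up purely for illustration, the helper name pearson is our own, and the choice of correlation as the measure is an assumption rather than a settled method for the report.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of products of deviations, and the two sums of squared deviations.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


# Made-up values for illustration only, NOT taken from the World Bank data.
fdi_share = [1.0, 2.0, 3.0, 4.0]    # FDI net inflows, % of GDP
gdp_growth = [2.0, 2.5, 3.5, 4.0]   # GDP growth, annual %

r = pearson(fdi_share, gdp_growth)
print(f"r = {r:.3f}")
```

A high coefficient would only indicate that the two series move together; it would not by itself show that FDI drives growth, so any such figure would need to be read alongside the other indicators.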
What challenges do you foresee?
We are fortunate to have a dataset that covers nearly all the measures we may want to explore. However, our main concern stems from this very breadth: dealing with a large dataset involves a lot of time spent in the preparation stage. We hope that our knowledge of R so far will help us filter the dataset and focus on the relevant values; however, our limited experience with datasets of this size may be a problem.