List of Data Sets and Challenge Problems
Resources
Software:
We will use Stata, Exploratory, and RStudio for this course. Stata will be used in the first, second, and third sessions, and we will learn how to clean data sets using Stata, RStudio, and Exploratory in the second session. Here is more information about the software.
Follow these steps to install Exploratory on your computer
Stata:
We will use Stata for small data sets in this class. To purchase Stata, do the following (see the instructions from the Stata office):
Order via http://www.surveydesign.com.au/buygradplan.html. You will need to pay upfront (direct deposit, PayPal, or credit card). Delivery takes 1-2 business days from receipt of your order and payment (usually only overnight, except for weekend orders, which are not sent out until Tuesday morning). This lag arises because all orders are handled via the US company.
Exploratory and RStudio
For the exploratory data analysis and preprocessing class, we will use Exploratory. You can download the free Exploratory software from the following website:
RStudio will be hosted on the University of Canterbury servers, and we will provide you with the URL. You will need to log in with the UC username provided to you by the UC IT department. You will not need to purchase or download RStudio for this class.
Visit the RStudio website to learn more about RStudio.
Websites, Journal Articles and Books
Articles
Wickham, H. (2014). Tidy data. Journal of Statistical Software. (Full Text Link)
Block I:
Harris, A., Reeder, R. N., & Hyun, J. K. (2009). Common statistical and research design problems in manuscripts submitted to high-impact public health journals. … Open Public Health Journal. (Full Text Link)
Ellis, T. J., & Levy, Y. (2008). Framework of problem-based research: A guide for novice researchers on the development of a research-worthy problem. Informing Science: International …. (Full Text Link)