Objectives and competences
The objective of this course is for students to learn statistical methods and how to apply them in data science.
Content (Syllabus outline)
• Introduction to data (Research study design, hypothesis, research questions, types of variables, exploratory data analysis, statistical inference)
• Probability and distributions (Defining probability, Conditional probability, Normal distribution, Binomial distribution)
• Foundations for inference (Variability in estimates and the Central Limit Theorem, Confidence intervals, Hypothesis tests, Decision errors, significance, and confidence)
• Inference for numerical variables (t-inference, Power, Comparing three or more means (ANOVA), Repeated measures (MANOVA))
• Inference for categorical variables (Comparing two proportions, Comparing three or more proportions (Chi-square))
• Introduction to linear regression (Relationship between two numerical variables, Linear regression with a single predictor, Outliers in linear regression, Inference for linear regression)
• Multiple linear regression (Regression with multiple predictors, Inference for multiple linear regression, Model selection and diagnostics)
• Bayesian and frequentist inference (Bayes probability, influence of prior beliefs, sequential statistics)
Learning and teaching methods
• lectures,
• computer lab exercises.
Intended learning outcomes - knowledge and understanding
Knowledge and understanding:
• understand the principles of statistical analysis and its use in data science
• select appropriate statistical methods and tools and employ them for both data analysis and validation of data science analysis outputs
Transferable/Key skills and other attributes:
• Communication skills: effectively present the results of statistical analysis and their interpretation to target groups
• Use of programming tools: use statistical software and programming languages
• Problem-solving in data science: design a research study, formulate and test a hypothesis, and pose and answer a research question
Readings
• Peter Bruce, Andrew Bruce, Practical Statistics for Data Scientists: 50 Essential Concepts. O'Reilly Media, May 2017
• Bradley Efron and Trevor Hastie, Computer Age Statistical Inference: Algorithms, Evidence, and Data Science, Institute of Mathematical Statistics Monographs, Jul 2016
• Douglas A. Wolfe and Grant Schneider, Intuitive Introductory Statistics, Springer, Oct 2017
• Tina Štemberger, Univariatne in bivariatne statistične metode v edukaciji, Univerza na Primorskem, 2016
Prerequisites
No prerequisites.
Additional information on implementation and assessment
The exam may be replaced by written midterm examinations with a weight of 50%.