Faculty of Science Handbook, Academic Session 2021/2022
SIT3016 GENERALIZED LINEAR MODELS

Introduction to generalized linear models based on the exponential family. For example, multiple linear regression for normal data, logistic regression for binary data, Poisson regression for counts, log-linear models for contingency tables, and gamma regression for continuous non-normal data.

Study the theory of GLM including estimation and inference.

Introduction to fitting GLM in R.

Focus on the analysis of data: binary, count and continuous, model selection, model evaluation, interpretation, prediction and residual analysis.

Assessment:
Continuous Assessment: 40%
Final Examination: 60%

References:
1. Dobson, A.J. & Barnett, A.G. (2008). An Introduction to Generalized Linear Models. 3rd Ed., Chapman & Hall/CRC.
2. McCullagh, P. & Nelder, J.A. (1989). Generalized Linear Models. 2nd Ed., Chapman & Hall.
3. Myers, R.H., Montgomery, D.C., Vining, G.G. & Robinson, T.J. (2010). Generalized Linear Models: with Applications in Engineering and the Sciences. 2nd Ed., John Wiley & Sons.
4. Dunn, P. & Smyth, G. (2018). Generalized Linear Models with Examples in R. Springer-Verlag.

SIT3017 STATISTICAL LEARNING AND DATA MINING

This course prepares students for applied work in data science by building on students’ foundations of data science skills. Students will learn advanced methods in statistical learning and data mining, using appropriate computing tools such as R. The strengths of the diversity of approaches are illustrated through analyses of real-world data sets covering commonly encountered data types.

Exploratory analyses: dimension reduction methods such as principal components analysis and linear discriminant analysis. Feature selection.

Supervised learning: artificial neural networks, k-nearest neighbours, logistic regression, naïve Bayes, classification and regression trees, or support vector machines. Ensemble methods: bagging, random forests, and boosting.

Unsupervised learning: K-means and hierarchical clustering.

Assessment:
Continuous Assessment: 50%
Final Examination: 50%

References:
1. Flach, P. (2012). Machine Learning: The Art and Science of Algorithms that Make Sense of Data. Cambridge: Cambridge University Press.
2. Irizarry, R. (2019). Introduction to Data Science: Data Analysis and Prediction Algorithms with R. Boca Raton, FL: CRC Press.
3. Witten, I.H., Frank, E., Hall, M.A. & Pal, C.J. (2017). Data Mining: Practical Machine Learning Tools and Techniques (4th ed.), Cambridge, MA: Morgan Kaufmann.
4. Hand, D., Mannila, H. & Smyth, P. (2001). Principles of Data Mining. Cambridge, MA: MIT Press.

SIT3018 NON-PARAMETRIC STATISTICS

Introduction to hypothesis testing, sign test and signed rank test, Mann-Whitney test, Kruskal-Wallis test, runs test, contingency tables, median test, goodness of fit test, Kolmogorov-Smirnov test, Spearman's rank test, permutation test, kernel density estimation, spline regression estimation.

Assessment:
Continuous Assessment: 40%
Final Examination: 60%

References:
1. Sprent, P. & Smeeton, N.C. (2007). Applied Nonparametric Statistical Methods, 4th Edition, Chapman & Hall/CRC.
2. Hollander, M., Wolfe, D. A. & Chicken, E. (2014). Nonparametric Statistical Methods, 3rd Edition, John Wiley & Sons.
3. Daniel, W. W. (1990). Applied Nonparametric Statistics, 2nd Edition, Boston: PWS-Kent Publishing Company.
4. Alvo, M. & Yu, P. L. H. (2018). A Parametric Approach to Nonparametric Statistics, 1st Edition, Springer.

SIT3019 INTRODUCTION TO BAYESIAN STATISTICS

Bayes' Theorem. Bayesian framework and terminology. Bayesian inference. Prior formulation. Implementation via posterior sampling. Bayesian decision theory. Hierarchical models. Application to real-world problems.

Assessment:
Continuous Assessment: 40%
Final Examination: 60%

References:
1. Gelman, A., Carlin, J. B., Stern, H. S., Dunson, D. B., Vehtari, A., & Rubin, D. B. (2014). Bayesian data analysis. Chapman and Hall/CRC.
2. Hoff, P. D. (2009). A first course in Bayesian statistical methods. Springer.
3. Turkman, M. A. A., Paulino, C. D., & Müller, P. (2019). Computational Bayesian statistics: an introduction (Vol. 11). Cambridge University Press.
4. Lee, P. M. (1997). Bayesian statistics: an introduction. Oxford University Press.

SIT3020 PYTHON FOR DATA SCIENCE

Description: Introduction to Python programming; Control statements and program development; Python data structures, strings and files; Functions; Lists and Tuples; Dictionaries and sets; Array-oriented programming with NumPy; Pandas Series and DataFrame; Data wrangling; Object-oriented programming; Python libraries for data analysis such as Jupyter Notebook, SciPy, mglearn and matplotlib.

Data science: Basic descriptive statistics; Simulation and static/dynamic visualisation; data mining tools such as principal component analysis and discriminant analysis.

Big Data and Cloud case study: Deep learning; convolutional and recurrent neural networks; Reinforcement learning; Network analysis.

Assessment:
Continuous Assessment: 50%
Final Examination: 50%