Faculty of Science Handbook, Session 2016/2017
SIT3008 INTRODUCTION TO SURVEY SAMPLING

Techniques of statistical sampling with applications in the
analysis of sample survey data. Topics include simple
random sampling, stratified sampling, systematic sampling,
cluster sampling, two-stage sampling, and ratio and
regression estimates.

Assessment:
Continuous Assessment: 40%
Final Examination: 60%

Medium of Instruction:
English

Humanity Skill:
CT4, LL2

References:
1. Scheaffer, R. L. (2006), Elementary Survey Sampling,
Duxbury (6th ed.).
2. Thompson, S. K. (2002), Sampling, Wiley (2nd ed.).
3. Lohr, Sharon L. (2010), Sampling: Design and
Analysis, Cengage Learning (2nd ed.).
4. Cochran, W. (1977), Sampling Techniques, Wiley
(3rd ed.).

SIT3009 STATISTICAL PROCESS CONTROL

Methods and philosophy of statistical process control.
Control charts for variables and attributes. CUSUM and
EWMA charts. Process capability analysis. Multivariate
control charts. Acceptance sampling by attributes and
variables.

Assessment:
Continuous Assessment: 40%
Final Examination: 60%

Medium of Instruction:
English

Humanity Skill:
CS3, CS3, TS2, LL2

References:
1. D. C. Montgomery, Introduction to Statistical Quality
Control, 6th ed., Wiley, 2009.
2. R. S. Kenett and S. Zacks, Modern Industrial Statistics:
Design and Control of Quality and Reliability, Duxbury
Press, 1998.
3. A. J. Duncan, Quality Control and Industrial Statistics,
5th ed., Irwin, 1986.

SIT3010 INTRODUCTION TO DATA MINING

Description: Introduction to statistical methods and tools for
analysis of very large data sets and discovery of interesting
and unexpected relationships in the data.
Data preprocessing and exploration: data quality and data
cleaning. Data exploration: summarizing and visualizing
data; principal component, multidimensional scaling. Data
analysis and uncertainty: handling uncertainty; statistical
inference; sampling.

Statistical approach to data mining and data mining
algorithms: Regression, Validation; classification and
clustering: k-means, CART, decision trees; Artificial Neural
Network; boosting; support vector machine; association
rules mining. Modelling: descriptive and predictive
modelling. Data organization.

Assessment:
Continuous Assessment: 40%
Final Examination: 60%

Medium of Instruction:
English

Humanity Skill:
CS3, CT3, LL2

References:
1. Adriaans, P. and Zantinge, D. (1996). Data Mining.
Addison-Wesley.
2. Hand, D., Mannila, H. and Smyth, P. (2001). Principles
of Data Mining. MIT Press.
3. Cios, K. J. et al. (2010). Data Mining: A Knowledge
Discovery Approach. New York: Springer-Verlag.

SIT3011 BIOINFORMATICS

Statistical modelling of DNA/protein sequences:
Assessing statistical significance in BLAST using the
Gumbel distribution; DNA substitution models; Poisson and
negative binomial models for gene counts; Hidden Markov
Model.

Algorithms for sequence analysis and tree
construction: Dynamic programming for sequence
alignment and Viterbi decoding; neighbour-joining,
UPGMA, parsimony and maximum likelihood tree-building
methods.

Analysis of high-dimensional microarray / RNA-Seq
gene expression data: Statistical tests for detecting
differential expression, feature selection, visualization, and
phenotype classification.

Assessment:
Continuous Assessment: 40%
Final Examination: 60%

Medium of Instruction:
English

Humanity Skill:
CS3, CT3, LL2

References:
1. Jones, N.C. & Pevzner, P.A. (2004). An Introduction to
Bioinformatics Algorithms. Massachusetts: MIT Press.
2. Durbin, R., Eddy, S., Krogh, A. & Mitchison, G. (1998).
Biological Sequence Analysis: Probabilistic Models of
Proteins and Nucleic Acids. Cambridge: Cambridge
University Press.
3. Ewens, W.J. & Grant, G.R. (2005). Statistical Methods
in Bioinformatics: An Introduction (2nd ed.). New York:
Springer.
4. Pevsner, J. (2009). Bioinformatics and Functional
Genomics (2nd ed.). New York: Wiley-Blackwell.

SIT3012 DESIGN AND ANALYSIS OF EXPERIMENTS

Philosophy related to statistically designed experiments.
Analysis of variance. Experiments with blocking factors.
Factorial experiments. Two-level factorial designs. Blocking
and confounding system for two-level factorials. Two-level
fractional factorial designs.

Assessment:
Continuous Assessment: 40%
Final Examination: 60%