The Master of Data Science (MDS) is a 12-month coursework program offered by the Department of Mathematics and Statistics that trains individuals to become computationally skilled and ethically minded data analysts. Students become well versed in key technologies in data science, including data wrangling, data mining, data integrity, visualization, machine learning, predictive modelling, and spatial-temporal methods. Through hands-on training, students analyze big data independently and collaboratively such that graduates are primed to help organizations translate data into knowledge and actionable insights. The program features in-class experiential learning opportunities, including how to address and describe complex problems relevant to industry partners, as well as how to explore ethical considerations of privacy, data security, objective analysis and visualization.
Director and Graduate Program Coordinator
Ayesha Ali (509 MacNaughton, Ext. 53896)
Graduate Program Assistant
Susan McCormick (440 MacNaughton, Ext. 56553/52155)
R. Ayesha Ali
B.Sc. Western Ontario, M.Sc. Toronto, PhD Washington - Associate Professor
Daniel A. Ashlock
B.Sc. Kansas, PhD CalTech - Professor and Chair
B.Sc. Guelph, M.A.Sc. Waterloo, PhD York - Associate Professor
BA, M.Sc. Bucharest, PhD Queen's - Professor
B.Sc. Shahid Beheshti, M.Sc. Guelph, PhD Waterloo - Associate Professor
B.Sc., M.Sc., PhD Guelph - Assistant Professor
BSE Azad, M.Sc., PhD Putra Malaysia - Associate Professor
Hermann J. Eberl
Dipl. Math (M.Sc.), PhD Munich Univ. of Tech. - Professor
B.Sc. York, MMath., PhD Waterloo - Professor
B.Sc., M.Sc., PhD Guelph - Associate Professor
B.Sc., M.Sc. Guelph, PhD Waterloo - Associate Professor
B.Sc. Mount Allison, BFA Nova Scotia College of Art & Design, MMath, PhD Waterloo - Professor
B.Sc. Western, MMath, PhD Waterloo - Professor
Anna T. Lawniczak
M.Sc. Wroclaw, PhD Southern Illinois - Professor
B.A.Sc. Nanjing, M.Sc. East China Normal, PhD Beijing, PhD Waterloo - Professor
B.Sc., M.Sc. Karachi, PhD Alberta - Assistant Professor
B.Math. Waterloo, PhD Courant Institute NYU - Assistant Professor
B.Sc. Dalhousie, PhD Calgary - Professor
B.Sc. Jilin (China), M.Sc. Academia Sinica (China), PhD Waterloo - Associate Professor
BE Changsha, M.Sc. Peking, PhD Waterloo - Professor
B.Sc., M.Sc. BUAA (Beijing), PhD British Columbia - Professor
Upon recommendation by the Department of Mathematics and Statistics, admission to the Master of Data Science may be granted to applicants who have completed an Honours Bachelor’s degree or equivalent from an accredited institution with a minimum overall average of 70% (B-) in the last four semesters of study with:
1) a major or minor in data science, computer science, mathematics, statistics, or a related field; or
2) working knowledge of statistics and computer programming, as demonstrated through completion of university- or college-level degree-credit courses equivalent to the U of G courses STAT*3240 Applied Regression Analysis and CIS*2500 Intermediate Programming.
Please note: prospective students with an Honours Bachelor’s degree in an unrelated field who do not meet the above requirements may gain entry to the program after completing the Diploma in Applied Statistics (or equivalent) with a minimum overall average of 70% (B-).
Successful applicants must also meet the University of Guelph’s English Proficiency requirements for admission. If an applicant’s first language is not English, an English Language Proficiency test will be required during the application phase.
All applications will be received and reviewed by the Data Science Program Committee. The program especially encourages applications from qualified members of under-represented groups, particularly from those who self-identify as women, visible minorities and Indigenous peoples.
Upon successful completion of the Master of Data Science program, graduates will have the capacity to:
- Exhibit a solid understanding of statistics and competency in computer programming;
- Demonstrate an in-depth understanding of the key technologies in data science: visualization, data mining, machine learning, and predictive modelling;
- Develop advanced skills in data acquisition, processing, and manipulation;
- Apply statistical methods and predictive modelling to answer queries, predict trends, and model real-world problems;
- Analyze big data using state-of-the-art software tools to draw meaningful conclusions;
- Communicate and translate data into actionable insights for diverse audiences;
- Create compelling narratives/presentations of data analysis results using appropriate data visualization and non-technical language; and
- Recognize, analyze and apply ethical practices in data science related to intellectual property, data security, integrity, and privacy throughout the full data life cycle, including collection, storage, processing, analysis, and deployment.
Students in the Master of Data Science program are required to complete a minimum of 4.00 graduate credits, consisting of four core courses (2.00 credits), two electives (1.00 credits), and either the two capstone courses or DATA*6700 Data Science Project (1.00 credits).
|Course|Title|Credits|
|DATA*6100|Introduction to Data Science|0.50|
|DATA*6200|Data Manipulation and Visualization|0.50|
|DATA*6300|Analysis of Big Data|0.50|
|DATA*6400|Machine Learning for Sequential Data Processing|0.50|
|DATA*6500|Analysis of Spatial-Temporal Data|0.50|
|DATA*6600|Applications of Data Science|0.50|
|CIS*6320|Image Processing Algorithms and Applications|0.50|
|ENGG*6140|Optimization Techniques for Engineering|0.50|
|ENGG*6400|Mobile Devices Application Development|0.50|
|PHIL*6400|Ethics of Data Science|0.50|
|STAT*6802|Generalized Linear Models and Extensions|0.50|
|STAT*6841|Computational Statistical Inference|0.50|
|STAT*6950|Statistical Methods for the Life Sciences|0.50|
This course introduces methods of modern statistics such as splines, generalized additive models, principal components analysis, and classifiers. Students learn resampling and ensemble methods such as the bootstrap, cross-validation, boosting, and bagging. Methods of model selection include search-and-score and regularization. Students also practice communicating technical ideas to a non-technical audience, including via data visualization.
This course provides a hands-on introduction to the manipulation and visualization of complex data sets using a programming language. Efficient techniques for importing and exporting data in various formats, data acquisition, data integrity, and good analysis practices are discussed. Several programming tools and libraries are introduced to restructure, transform and fuse disparate data types for visualization and data summaries in table format. The basics of manipulating space-time data are also covered.
This course introduces software tools and data science techniques for analyzing big data. It covers big data principles, state-of-the-art methodologies for large data management and analysis, and their applications to real-world problems. Modern and traditional machine learning techniques and data mining methods are discussed and ethical implications of big data analysis are examined. May be offered in conjunction with CIS*6180.
This course emphasizes machine learning for sequential data processing. It covers common challenges and pre-processing techniques for sequential data such as text, biological sequences, and time series data. Students are exposed to machine learning techniques, including classical methods and more recent deep learning models, so that they obtain the background and skills needed to confront real-world applications of sequential data processing. May be offered in conjunction with CIS*6190.
This course introduces software tools and data science techniques for analyzing big geospatial data. It provides an overview of raster-based geographic information systems (GIS) for identifying patterns and clusters in spatial-temporal data using state-of-the-art software and programming languages. Concepts such as kriging/Gaussian processes, variograms and autoregressive correlation structures are discussed. Data summaries and visualizations specific to spatial-temporal problems are introduced.
This interdisciplinary team-taught seminar course provides students with the opportunity to synthesize information, research methods, and present cutting-edge applications of data science. Learning outcomes include identifying reliable sources, understanding and presenting relevant contemporary data science methods, thinking critically about practical implementations of data science, and collaborating effectively with peers. Emphasis is placed on effectively communicating technical content and insights to a non-technical audience.
This is a one-semester research project course for students in the Master of Data Science program. Students plan, develop, and write a faculty- or industry-led research paper and present their work. The project should advance knowledge or practice in data science or a closely related area, and address a real-world problem faced by industry. The project should focus on data science in the spatial and temporal dimension(s) and is subject to approval by the course instructor.