Objectives and competences
The aim of the course is for students to gain insight into the challenges of big data processing and to learn approaches for analysing and visualizing big data.
Content (Syllabus outline)
· Introduction: definitions, history, practical examples, challenges
· Basic data structures for representing big data (arrays, lists, hash tables, graph representations using edge lists and adjacency matrices)
· Big Data analysis algorithms (clustering, community detection, principal component analysis, complex networks, classification, regression)
· Space and time complexity in Big Data processing
· Algorithms and search data structures to speed up the processing of large volumes of data (binary search trees, quadtrees, octrees, k-d trees, sampling)
· Distributed processing of large amounts of data: the MapReduce programming model
· Big Data visualization: automatic placement of graph nodes, heatmaps, and radial, count, violin, residual, regression, and interactive charts
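As an illustration of the MapReduce programming model listed above, the following is a minimal single-machine sketch of a word count in Python. It is not taken from the course materials: the function names (`map_phase`, `shuffle_phase`, `reduce_phase`, `word_count`) and the sample documents are assumptions chosen for illustration. A real MapReduce framework would run the map and reduce steps in parallel across many machines.

```python
from collections import defaultdict
from itertools import chain


def map_phase(document):
    """Map: emit a (word, 1) pair for every word in one document."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle_phase(pairs):
    """Shuffle: group all emitted values by their key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups):
    """Reduce: combine the values of each key into a single count."""
    return {key: sum(values) for key, values in groups.items()}


def word_count(documents):
    """Run the three phases in sequence (hypothetical helper)."""
    mapped = chain.from_iterable(map_phase(doc) for doc in documents)
    return reduce_phase(shuffle_phase(mapped))


# Hypothetical input: each string stands in for one input split.
documents = ["big data is big", "data processing"]
counts = word_count(documents)
print(counts)
```

Because each document is mapped independently and each key is reduced independently, both phases could in principle be distributed across workers; only the shuffle step requires moving data between them.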
Learning and teaching methods
· Lectures
· Computer exercises
· Individual work
Intended learning outcomes
Knowledge and understanding:
Upon completion of this course, the student will be able to:
· Understand the meaning of Big Data
· Understand the challenges of Big Data processing
· Use appropriate algorithms and tools for processing big data
· Analyse large amounts of data
Transferable/Key skills and other attributes:
· Communication skills: oral expression in laboratory exercises, writing reports on lab work, written expression in the written examination.
· Use of information technology: use of appropriate algorithms and software tools to analyse and process large amounts of data.
· Numeracy skills: solving numerical problems.
· Problem solving: selecting appropriate tools and algorithms for analysing and processing large amounts of data, evaluating the suitability of the tools and algorithms used in terms of time and space complexity.
Readings
· B. Baesens: Analytics in a big data world: The essential guide to data science and its applications, John Wiley & Sons, New Jersey, USA, 2014.
· A. Bahga and V. Madisetti: Big Data Science & Analytics: A Hands-On Approach, Bahga & Madisetti, Georgia, India, 2016.
· N. Marz and J. Warren: Big Data: Principles and best practices of scalable realtime data systems, Manning Publications Co., New York, USA, 2015.