Course: ODL - Big Data Analytics & Technologies

Topic outline

BDAT - MODULE OVERVIEW
- This section contains IMPORTANT information about the Module as well as your Module lecturer. Please read BEFORE attempting the course.
- Welcome - Click here to know more Page
- Now that you have a broad understanding of the Course and its requirements, you can proceed with the lessons.
  You will start with Topic 1 and proceed sequentially after completing each topic.
  I wish you all the best 😊
TOPIC 1 - Big Data Analytics and Applications

Hello everyone, I hope you are feeling hyped about starting the module! To kick things off, let's immerse ourselves (gently) in the subject matter by establishing a baseline of understanding. We first need to define a few key terms and technologies, so that's exactly where we will start. As with all the topics in this module, I am available via Teams Chat for any further clarification!
- LEARNING OUTCOMES
  Completing this topic will enable you to:
  Describe the key concepts of Big Data
  Explain the role of Big Data in Research
  Explain the difference between BI and Data Science
  CONTENTS
- TOPIC 1 - Big Data Analytics and Applications SCORM package
- TOPIC 1 - DISCUSSION Forum
- SUMMARY Page
TOPIC 2 - Big Data Types and Characteristics

Now that we have established an understanding of the key terms and concepts, it's time to go a little deeper and explore the characteristics of Big Data.
TOPIC 3 - Big Data Technologies & Tools - HADOOP

When people talk about Big Data tools and technologies, the first name that comes to most people's minds is Hadoop! Why is that, you ask? Well, you're about to find out!
- LEARNING OUTCOMES
  
  Completing this topic will enable you to:
  
  Explain the Hadoop Features and assumptions
  
  Explain the core components of Hadoop
  
  Differentiate the categories of NoSQL
  
  Explain the difference between NoSQL and relational databases
- Topic 3 - Big Data Technologies & Tools - HADOOP SCORM package
- TOPIC 3 - DISCUSSION Forum
- SUMMARY Page
TOPIC 4 - Big Data Storage - Hadoop Distributed File System

I imagine most of us are accessing this content on a Windows, Mac, iPad or Android device. Each platform has its own way of storing and organising data (e.g. FAT and NTFS on Windows, or ext4 and Btrfs on Linux), and Hadoop is no different. Let's investigate the particular characteristics of Hadoop which facilitate the storage of (very) large volumes of data.
- LEARNING OUTCOMES
  
  Completing this topic will enable you to:
  
  Explain the concept of Big Data Storage
  
  Explain the mechanisms behind data storage
  
  Explain the design concept of HDFS
  
  Evaluate the architecture of HDFS
  
  CONTENTS
- Topic 4 - Big Data Storage - Hadoop Distributed File System SCORM package
- TOPIC 4 - DISCUSSION Forum
- SUMMARY Page
TOPIC 5 - Big Data Processing - MapReduce

We now know how very large volumes of data are stored, so let's investigate the tools available to process this data. First up is MapReduce.
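As a small taster before the lesson, here is a minimal sketch of the map, shuffle and reduce steps using a word count in plain Python. It runs on a single machine and does not use Hadoop at all; the function names (`map_phase`, `shuffle_phase`, `reduce_phase`, `word_count`) are illustrative assumptions, not part of any Hadoop API.

```python
from collections import defaultdict


def map_phase(line):
    """Map: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle_phase(pairs):
    """Shuffle: group all emitted values by their key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce: combine all values for one key into a single result."""
    return key, sum(values)


def word_count(lines):
    """Run the three phases in sequence on a list of text lines."""
    pairs = [pair for line in lines for pair in map_phase(line)]
    grouped = shuffle_phase(pairs)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    print(word_count(["big data big tools", "data tools data"]))
    # prints {'big': 2, 'data': 3, 'tools': 2}
```

In a real cluster the map and reduce steps would run in parallel on many nodes; the sketch only shows how the data flows between the phases.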
- LEARNING OUTCOMES
  
  Completing this topic will enable you to:
  
  Explain the concept of Big Data Processing
  
  Explain the design concept of MapReduce
  
  Describe the services involved in MapReduce
  
  CONTENTS
- Topic 5 - Big Data Processing - MapReduce SCORM package
- TOPIC 5 - DISCUSSION Forum
- SUMMARY Page
TOPIC 6 - HBase

A file system on its own is not enough: we also need a mechanism to access, manage and manipulate the data it holds. HBase is one such mechanism, a column-oriented database management system that runs on top of HDFS. Our next topic explores HBase in more detail, so as to build up our understanding of Hadoop.
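To give a feel for what "column-oriented" means here, the sketch below models a table as row key, then `family:qualifier` column, then timestamped versions of a value. It is a toy in plain Python under that assumption only; `ToyTable`, `put` and `get` are hypothetical names for illustration and are not the HBase client API.

```python
from collections import defaultdict


class ToyTable:
    """Toy in-memory model: row key -> 'family:qualifier' -> {timestamp: value}."""

    def __init__(self, families):
        # Column families are fixed when the table is created.
        self.families = set(families)
        self.rows = defaultdict(lambda: defaultdict(dict))

    def put(self, row_key, column, value, timestamp):
        """Store one versioned cell; the column is written as 'family:qualifier'."""
        family, _, _qualifier = column.partition(":")
        if family not in self.families:
            raise KeyError(f"unknown column family: {family}")
        self.rows[row_key][column][timestamp] = value

    def get(self, row_key, column):
        """Return the newest version of a cell, or None if it does not exist."""
        versions = self.rows.get(row_key, {}).get(column)
        if not versions:
            return None
        return versions[max(versions)]


if __name__ == "__main__":
    table = ToyTable(["info"])
    table.put("user1", "info:name", "Ada", timestamp=1)
    table.put("user1", "info:name", "Ada L.", timestamp=2)
    print(table.get("user1", "info:name"))  # prints Ada L.
```

Rows only hold the columns that were actually written, which is why this kind of layout suits sparse data.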
- LEARNING OUTCOMES
  
  Completing this topic will enable you to:
  
  Explain the key concepts of HBase
  
  Explain the features of HBase
  
  Differentiate between HBase and HDFS
  
  Explain the storage mechanism in HBase
  
  Explain the major components of HBase
  
  Describe the HBase Architecture
  
  CONTENTS
- Topic 6 - HBASE SCORM package
- TOPIC 6 - DISCUSSION Forum
- SUMMARY Page
TOPIC 7 - Spark

As with any technology, there are always alternatives to consider. MapReduce, for all its capabilities, is not perfect and has some key limitations of which we need to be aware. An alternative solution is available in the form of Spark, so let's investigate exactly what it has to offer!
- LEARNING OUTCOMES
  
  Completing this topic will enable you to:
  
  Explain the concepts of Spark
  
  Differentiate between Hadoop MapReduce and Spark
  
  Explain the features of Spark
  
  Explain the concepts of Resilient Distributed Datasets
  
  Analyse Spark Internal Architecture
  
  CONTENTS
- Topic 7 - Spark SCORM package
- TOPIC 7 - DISCUSSION Forum
- SUMMARY Page
TOPIC 8 - Hive

Wow! We've almost reached the end of our journey through big data analytics and technologies. Only one topic remains: Hive, the engine that powers SQL-style queries in Hadoop.
- LEARNING OUTCOMES
  
  Completing this topic will enable you to:
  
  Explain the core concepts of Hive
  
  Explain the importance of Hive
  
  Implement a Time Series Analysis and Forecast on a sample dataset
  
  Analyse Hive architecture
  
  Differentiate between Hive managed and external tables
  
  CONTENTS
- Topic 8 - Hive SCORM package
- TOPIC 8 - DISCUSSION Forum
- SUMMARY Page