The Data Science and Big Data Analytics course provides practical, foundation-level training that enables immediate and effective participation in Big Data and other analytics projects. It includes an introduction to Big Data and the Data Analytics lifecycle to address business challenges that leverage Big Data. The course provides grounding in basic and advanced analytic methods and an introduction to Big Data analytics technology and tools. Lab sessions offer opportunities to see how these methods and tools may be applied to real-world business challenges by a practicing Data Scientist. The course offers an industry certification for business analysts, data warehouse specialists, and other professionals with similar backgrounds, to help them transition into the world of Data Science and Big Data Analytics, which has unique challenges and opportunities.


Introduction to NoSQL

  • What Is Meant By NoSQL?
  • Distributed and Decentralized
  • Elastic Scalability
  • High Availability and Fault Tolerance
  • Brewer's CAP Theorem
  • Row-Oriented
  • Schema-Free
  • High Performance
  • Introduction to Cassandra

Cassandra: Introduction, Installation and Configuration

  • Describe Apache Cassandra
  • Common use cases - large deployments, lots of writes, statistics
  • Cassandra architecture
  • Select and install a Cassandra version
  • Configure Cassandra for single-node and multi-node clusters
  • Start and stop a Cassandra instance
  • Installing Cassandra on Windows, Mac, Ubuntu
  • Basic CLI Commands

Cassandra Data Model

  • Understand basics of data modeling
  • Keyspace
  • Column Family, Column Family Options
  • Wide Rows, Skinny Rows
  • Column Sorting
  • Super Columns
  • Counter Column Family
  • Composite Keys and Columns
  • Time To Live
  • Indexing in Cassandra: primary, secondary and custom
  • Secondary Indexes in Cassandra
  • Difference between Custom and Secondary Indexes
  • Difference between Relational Modeling and Cassandra Modeling
  • Patterns and Anti-Patterns in Cassandra Modeling

Understanding Cassandra Architecture

  • Understand replication
  • Understanding data partitioners
  • How nodes communicate - Peer-to-Peer Model
  • Anatomy of Read/Write operations
  • How Deletes are handled in Cassandra
  • Gossip and Failure Detection
  • Anti-Entropy and Read Repair
  • Memtables, SSTables, Commit Logs, Flushing, Row Merging, Cache (Key, Row)
  • Hinted Handoff
  • Compaction: Choose and implement compaction strategies
  • Bloom Filters, Tombstones
  • Managers and Services
  • VNodes
  • Indexes and Caches

Cassandra Monitoring and Administration

  • Tuning Cassandra
  • Backup and Recovery methods
  • Balancing
  • Bootstrapping
  • Nodetool Commands
  • Monitoring critical metrics
  • Configure nodes and clusters using CCM
  • Bulk Loading Data to Cassandra
  • Bulk Export of Data from Cassandra
  • Populate and test nodes using cassandra-stress
  • Cassandra Security: Authentication, Authorization, Physical Security

MongoDB: Introduction, Installation and Configuration

  • Different deployment models
  • Installing MongoDB on Windows
  • Installing MongoDB on Mac
  • Installing MongoDB on Ubuntu
  • Starting and stopping MongoDB server
  • How the drivers work in general
  • Driver APIs with examples
  • Installing the Java Driver

CRUD and the MongoDB Shell

  • Introduction to the MongoDB API
  • Performing Queries: Overview
  • Performing Queries: Using the Cursor
  • Performing Queries: Query Modifications
  • Adding Information: Database, Collection and Document
  • Adding Information: Arrays
  • Adding Information: Objects
  • Adding Information: The _id Field
  • Performing Modifications: Basic Document Updates
  • Performing Modifications: Updating Arrays and Fields
  • Performing Modifications: Deleting Documents