Course Outline

Quick Overview

  • Data Sources
  • Data Strategy and Curation
  • Recommender Systems
  • Target Marketing

Data Types

  • Structured vs. Unstructured Data
  • Static vs. Streaming Data
  • Attitudinal, Behavioral, and Demographic Data
  • Data-Driven vs. User-Driven Analytics
  • Data Validity
  • Volume, Velocity, and Variety of Data

Models

  • Model Construction
  • Statistical Models
  • Machine Learning

Data Classification

  • Clustering Techniques
  • k-Groups, k-means, and Nearest Neighbors
  • Biomimetic Approaches: Ant Colonies and Bird Flocking

Predictive Models

  • Decision Trees
  • Support Vector Machines
  • Naive Bayes Classification
  • Neural Networks
  • Markov Models
  • Regression Analysis
  • Ensemble Methods

ROI

  • Benefit-to-Cost Ratio
  • Software Costs
  • Development Costs
  • Potential Benefits
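
As a rough illustration of how the ROI items above fit together, the sketch below computes a benefit-to-cost ratio and a simple ROI from software costs, development costs, and potential benefits. The function names and the figures in the usage lines are hypothetical and chosen only for illustration; a real assessment would typically also account for timing, discounting, and ongoing operating costs.

```python
def benefit_cost_ratio(benefits, software_costs, development_costs):
    """Return total benefits divided by total costs (hypothetical helper)."""
    total_costs = software_costs + development_costs
    if total_costs <= 0:
        raise ValueError("total costs must be positive")
    return benefits / total_costs


def simple_roi(benefits, software_costs, development_costs):
    """Return net benefit as a fraction of total costs (hypothetical helper)."""
    total_costs = software_costs + development_costs
    if total_costs <= 0:
        raise ValueError("total costs must be positive")
    return (benefits - total_costs) / total_costs


# Illustrative figures only (assumed, not taken from the course material).
bcr = benefit_cost_ratio(benefits=150000, software_costs=20000, development_costs=80000)
roi = simple_roi(benefits=150000, software_costs=20000, development_costs=80000)
print(f"Benefit-to-cost ratio: {bcr:.2f}")
print(f"Simple ROI: {roi:.0%}")
```

A ratio above 1 means the estimated benefits exceed the combined costs; the simple ROI expresses the same comparison as a net return on the amount spent.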

Building Models

  • Data Preparation (MapReduce)
  • Data Cleansing
  • Method Selection
  • Model Development
  • Model Testing
  • Model Evaluation
  • Model Deployment and Integration

Overview of Open Source and Commercial Software

  • Selection of R-Project Packages
  • Python Libraries
  • Hadoop and Mahout
  • Selected Apache Projects Related to Big Data and Analytics
  • Selected Commercial Solutions
  • Integration with Existing Software and Data Sources

Requirements

Participants should have a working understanding of traditional data management and analysis methods, including SQL, data warehousing, business intelligence, and OLAP. Familiarity with basic statistics and probability (mean, variance, probability, and conditional probability) is also required.

Duration: 21 Hours