Course Outline
Introduction to Data Analysis and Big Data
- What Makes Big Data "Big"?
  - Velocity, Volume, Variety, Veracity (VVVV)
- Limits to Traditional Data Processing
- Distributed Processing
- Statistical Analysis
- Types of Machine Learning Analysis
- Data Visualization
Big Data Roles and Responsibilities
- Administrators
- Developers
- Data Analysts
Languages Used for Data Analysis
- R Language
  - Why R for Data Analysis?
  - Data manipulation, calculation and graphical display
- Python
  - Why Python for Data Analysis?
  - Manipulating, processing, cleaning, and crunching data
Approaches to Data Analysis
- Statistical Analysis
  - Time Series analysis
  - Forecasting with Correlation and Regression models
  - Inferential Statistics (estimating)
  - Descriptive Statistics in Big Data sets (e.g. calculating mean)
- Machine Learning
  - Supervised vs unsupervised learning
  - Classification and clustering
  - Estimating cost of specific methods
  - Filtering
- Natural Language Processing
  - Processing text
  - Understanding the meaning of the text
  - Automatic text generation
  - Sentiment analysis / topic analysis
- Computer Vision
  - Acquiring, processing, analyzing, and understanding images
  - Reconstructing, interpreting and understanding 3D scenes
  - Using image data to make decisions
Big Data Infrastructure
- Data Storage
  - Relational databases (SQL)
    - MySQL
    - Postgres
    - Oracle
  - Non-relational databases (NoSQL)
    - Cassandra
    - MongoDB
    - Neo4j
  - Understanding the nuances
    - Hierarchical databases
    - Object-oriented databases
    - Document-oriented databases
    - Graph-oriented databases
    - Other
- Distributed Processing
  - Hadoop
    - HDFS as a distributed filesystem
    - MapReduce for distributed processing
  - Spark
    - All-in-one in-memory cluster computing framework for large-scale data processing
    - Structured streaming
    - Spark SQL
    - Machine Learning libraries: MLlib
    - Graph processing with GraphX
- Scalability
  - Public cloud
    - AWS, Google, Aliyun, etc.
  - Private cloud
    - OpenStack, Cloud Foundry, etc.
  - Auto-scalability
Choosing the Right Solution for the Problem
The Future of Big Data
Summary and Next Steps
Requirements
- A general understanding of math
- A general understanding of programming
- A general understanding of databases
Audience
- Developers / programmers
- IT consultants
Testimonials (7)
How big data works, data programs, and greater knowledge of how our current world works using data
Ozayr Hussain - Vodacom
Course - A Practical Introduction to Data Analysis and Big Data
The practical side of the training.
Patrick - Vodacom PTy Ltd
Course - A Practical Introduction to Data Analysis and Big Data
Interactive topics and the style the lecturer used to simplify the topics for the students
Miran Saeed - Sulaymaniyah Asayish Agency
Course - A Practical Introduction to Data Analysis and Big Data
the trainer and his ability to lecture
ibrahim hamakarim - Sulaymaniyah Asayish Agency
Course - A Practical Introduction to Data Analysis and Big Data
Practical exercises
JOEL CHIGADA - University of the Western Cape
Course - A Practical Introduction to Data Analysis and Big Data
R programming
Osden Jokonya - University of the Western Cape
Course - A Practical Introduction to Data Analysis and Big Data
Overall the Content was good.
 
                    