Course Outline
Introduction
- The Data Science Lifecycle
- Roles and responsibilities of a Data Scientist
Setting Up the Development Environment
- Libraries, frameworks, languages, and tools
- Local development setup
- Collaborative web-based development
Data Acquisition
- Types of Data
  - Structured
    - Local databases
    - Database connectors
    - Common formats: XLSX, XML, JSON, CSV, ...
  - Unstructured
    - Clicks, sensors, smartphones
    - APIs
    - Internet of Things (IoT)
    - Documents, pictures, videos, sounds
- Case study: Continuously collecting large volumes of unstructured data
Data Storage Solutions
- Relational databases
- Non-relational databases
- Hadoop: Distributed File System (HDFS)
- Spark: Resilient Distributed Dataset (RDD)
- Cloud storage
Data Preparation
- Ingestion, selection, cleansing, and transformation
- Ensuring data quality: correctness, meaningfulness, and security
- Exception handling and reporting
Languages for Preparation, Processing, and Analysis
- R language
  - Introduction to R
  - Data manipulation, calculation, and graphical display
- Python
  - Introduction to Python
  - Manipulating, processing, cleaning, and crunching data
Data Analytics
- Exploratory analysis
  - Basic statistics
  - Draft visualizations
  - Gaining insights from data
  - Causality
  - Features and transformations
- Machine Learning
  - Supervised vs unsupervised learning
  - Selecting the appropriate model
  - Natural Language Processing (NLP)
Data Visualization
- Best practices
- Choosing the right chart for the data
- Color palettes
- Enhancing visualization
  - Dashboards
  - Interactive visualizations
  - Data storytelling
Summary and Conclusion
Requirements
- A general understanding of database concepts
- A basic understanding of statistics
35 Hours