Course Outline
Foundations of Audio Classification
- Types of sound events: environmental, mechanical, and human-generated
- Overview of use cases: surveillance, monitoring, and automation
- Differences between audio classification, detection, and segmentation
Audio Data and Feature Extraction
- Audio file types and formats
- Considerations for sampling rate, windowing, and frame size
- Extraction of MFCCs, chroma features, and mel-spectrograms
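The relationship between sampling rate, frame size, and hop size can be sketched with a little arithmetic. The example below is an illustrative sketch only, not course material: the 16 kHz rate, 25 ms frame, and 10 ms hop are assumed values often seen in audio work, and the helper names (`ms_to_samples`, `frame_count`, `hann_window`) are hypothetical.

```python
import math


def ms_to_samples(duration_ms, sample_rate):
    """Convert a duration in milliseconds to a whole number of samples."""
    return int(round(duration_ms * sample_rate / 1000))


def frame_count(num_samples, frame_size, hop_size):
    """Number of full frames that fit in the signal without padding."""
    if num_samples < frame_size:
        return 0
    return 1 + (num_samples - frame_size) // hop_size


def hann_window(size):
    """Symmetric Hann window, applied to each frame before the FFT."""
    if size == 1:
        return [1.0]
    return [0.5 - 0.5 * math.cos(2 * math.pi * n / (size - 1)) for n in range(size)]


# Assumed settings: 16 kHz audio, 25 ms frames, 10 ms hop.
sample_rate = 16000
frame_size = ms_to_samples(25, sample_rate)
hop_size = ms_to_samples(10, sample_rate)

# Frames available in one second of audio.
frames_per_second = frame_count(sample_rate, frame_size, hop_size)
window = hann_window(frame_size)
```

A shorter hop gives finer time resolution at the cost of more frames to process, while a longer frame gives finer frequency resolution at the cost of blurring short events.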
Data Preparation and Annotation
- Working with UrbanSound8K, ESC-50, and custom datasets
- Labeling sound events and defining temporal boundaries
- Balancing datasets and applying audio augmentation
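One common way to compensate for an imbalanced dataset is to weight each class inversely to its frequency. The sketch below assumes that approach; the label counts are made up for illustration, and `inverse_frequency_weights` is a hypothetical helper rather than part of any library.

```python
from collections import Counter


def inverse_frequency_weights(labels):
    """Weight each class by total / (num_classes * class_count).

    Rare classes get weights above 1, frequent classes below 1,
    and a perfectly balanced dataset gives every class a weight of 1.
    """
    counts = Counter(labels)
    total = len(labels)
    num_classes = len(counts)
    return {label: total / (num_classes * count) for label, count in counts.items()}


# Made-up label distribution for illustration only.
labels = ["siren"] * 6 + ["dog_bark"] * 3 + ["drilling"] * 1
weights = inverse_frequency_weights(labels)
```

Weights like these can be passed to a weighted loss function or used to drive oversampling; augmentation of the rare classes is a complementary option.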
Building Audio Classification Models
- Using convolutional neural networks (CNNs) for audio
- Model inputs: raw waveforms versus features
- Loss functions, evaluation metrics, and handling overfitting
Event Detection and Temporal Localization
- Frame-based and segment-based detection strategies
- Post-processing detections using thresholds and smoothing techniques
- Visualizing predictions on audio timelines
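Frame-based detection with threshold and smoothing post-processing can be sketched as follows. This is a minimal illustration under stated assumptions: the per-frame probabilities are invented, median filtering is one of several possible smoothing choices, and the function names `median_smooth` and `frames_to_events` are hypothetical.

```python
from statistics import median


def median_smooth(values, kernel=3):
    """Median filter with edge replication; kernel should be odd."""
    n = len(values)
    half = kernel // 2
    smoothed = []
    for i in range(n):
        # Clamp indices so the first and last values are repeated at the edges.
        window = [values[min(max(j, 0), n - 1)] for j in range(i - half, i + half + 1)]
        smoothed.append(median(window))
    return smoothed


def frames_to_events(probs, threshold=0.5, hop_seconds=0.01, kernel=3):
    """Turn per-frame probabilities into (onset, offset) pairs in seconds."""
    active = [p >= threshold for p in median_smooth(probs, kernel)]
    events = []
    start = None
    for i, is_on in enumerate(active):
        if is_on and start is None:
            start = i  # event begins at this frame
        elif not is_on and start is not None:
            events.append((start * hop_seconds, i * hop_seconds))
            start = None
    if start is not None:
        # Event still active when the clip ends.
        events.append((start * hop_seconds, len(active) * hop_seconds))
    return events


# Invented frame probabilities: one isolated spike, then a sustained event.
probs = [0.1, 0.9, 0.1, 0.1, 0.8, 0.9, 0.7, 0.2, 0.1]
events = frames_to_events(probs, threshold=0.5, hop_seconds=0.5)
```

With these inputs the isolated spike at the second frame is removed by the median filter, so only the sustained run of high probabilities is reported as an event.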
Advanced Topics and Real-Time Processing
- Transfer learning for low-data scenarios
- Model deployment using TensorFlow Lite or ONNX
- Streaming audio processing and managing latency
Project Development and Application Scenarios
- Designing a complete pipeline from ingestion to classification
- Developing a proof-of-concept for surveillance, quality control, or monitoring
- Implementing logging, alerting, and integration with dashboards or APIs
Summary and Next Steps
Requirements
- A solid understanding of machine learning concepts and model training
- Proficiency in Python programming and data preprocessing
- Familiarity with the fundamentals of digital audio
Audience
- Data scientists
- Machine learning engineers
- Researchers and developers specializing in audio signal processing
Duration: 21 hours