Workflow Element Store

  1. Databases - SQL
  2. Web Scraping
  3. Mobile Applications or IoT Applications
  4. Data Collaboration and Partnerships
  5. APIs and Data Feeds
  6. Experiments (DoE)
  7. Surveys and Questionnaires
  8. Feedback Data
  9. Public Datasets
  10. Databases - NoSQL
  11. Flat files

  1. Azure Blob Storage
  2. MongoDB
  3. ETL/ELT pipeline
  4. AWS RDS
  5. GCP BigQuery
  6. AWS Glue
  7. GCS
  8. Azure Stream Analytics
  9. AWS Kinesis
  10. MySQL
  11. Azure Data Factory (ADF)
  12. MS SQL Server
  13. GCP Data Fusion
  14. PostgreSQL
  15. Azure Synapse
  16. RDBMS
  17. AWS Redshift
  18. GCP Dataflow
  19. Oracle DB
  20. Apache Kafka
  21. AWS S3
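The flat-file, ETL/ELT pipeline, and RDBMS elements listed above can be tied together in a small extract-transform-load sketch. This is a minimal illustration using only the Python standard library, not a recommended production setup: the CSV content, the `orders` table, and the function names `extract`, `transform`, and `load` are all hypothetical, and an in-memory SQLite database stands in for whichever relational store is actually used.

```python
import csv
import io
import sqlite3

# Hypothetical flat-file content; in practice this would be read from disk
# or from an object store such as a blob container or bucket.
RAW_CSV = """order_id,amount,country
1,19.99,us
2,,de
3,5.00,US
"""

def extract(text):
    """Extract: read CSV text into a list of dicts, one per row."""
    return list(csv.DictReader(io.StringIO(text)))

def transform(rows):
    """Transform: drop rows with a missing amount, fix types and casing."""
    cleaned = []
    for row in rows:
        if not row["amount"]:
            continue  # skip incomplete records
        cleaned.append(
            (int(row["order_id"]), float(row["amount"]), row["country"].upper())
        )
    return cleaned

def load(conn, records):
    """Load: write the cleaned records into a relational table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS orders "
        "(order_id INTEGER PRIMARY KEY, amount REAL, country TEXT)"
    )
    conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", records)
    conn.commit()

# SQLite in memory is an assumption made to keep the sketch self-contained.
conn = sqlite3.connect(":memory:")
load(conn, transform(extract(RAW_CSV)))
rows = conn.execute(
    "SELECT order_id, amount, country FROM orders ORDER BY order_id"
).fetchall()
print(rows)
```

In an ELT variant the raw rows would be loaded first and the cleaning step would run inside the warehouse instead.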

  1. Feature Extraction from Images
  2. Auto-Preprocessing libraries
  3. Polynomial Features
  4. Augmentation
  5. Handling Time-Series Data
  6. Textual Feature Extraction
  7. Handling Noisy Data
  8. Annotation
  9. Data Transformations
  10. Handling Categorical Data
  11. Handling Missing Data
  12. Feature Selection
  13. AutoEDA libraries
  14. Interaction Features
  15. Binning / Discretization
  16. Domain-Specific Feature Engineering
  17. Dimensionality Reduction
  18. Dealing with Outliers
  19. Handling Imbalanced Classes
  20. Data Scaling and Normalization
  21. Data Partitioning - Train, Validation, & Test
  22. Time-Based Features
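Several of the preprocessing elements above (handling missing data, data scaling and normalization, handling categorical data, and train/validation/test partitioning) can be illustrated with a short standard-library sketch. The column values, the 60/20/20 split, and all function names are assumptions chosen for illustration; libraries such as scikit-learn offer more complete equivalents.

```python
import random
import statistics

def impute_median(values):
    """Replace None entries with the median of the observed values."""
    observed = [v for v in values if v is not None]
    median = statistics.median(observed)
    return [median if v is None else v for v in values]

def min_max_scale(values):
    """Rescale values linearly to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]  # constant column: nothing to scale
    return [(v - lo) / (hi - lo) for v in values]

def one_hot(categories):
    """Encode each category as a 0/1 vector over the sorted distinct labels."""
    labels = sorted(set(categories))
    return [[1 if c == label else 0 for label in labels] for c in categories]

def partition(rows, train=0.6, validation=0.2, seed=0):
    """Shuffle rows and split them into train, validation, and test sets."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    n_train = int(len(shuffled) * train)
    n_val = int(len(shuffled) * validation)
    return (
        shuffled[:n_train],
        shuffled[n_train:n_train + n_val],
        shuffled[n_train + n_val:],
    )

# Hypothetical toy columns with missing values and a categorical feature.
ages = [20, None, 40, 60, None, 30, 50, 20, 40, 60]
cities = ["Pune", "Delhi", "Pune", "Mumbai", "Delhi",
          "Pune", "Mumbai", "Delhi", "Pune", "Mumbai"]

scaled_ages = min_max_scale(impute_median(ages))
encoded_cities = one_hot(cities)
rows = list(zip(scaled_ages, encoded_cities))
train_set, val_set, test_set = partition(rows)
print(len(train_set), len(val_set), len(test_set))
```

For brevity the sketch computes the median and the min/max on the full column. To avoid leaking information from the validation and test sets, those statistics would normally be computed on the training split only and then reused for the other splits.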

  1. Hyperparameter Tuning
  2. Batch Size Selection
  3. Natural Language Processing
  4. Weight Initialization
  5. Performance Visualization
  6. Association Rules
  7. Word Embeddings
  8. AutoML
  9. Transfer Learning
  10. Regression Analysis
  11. GridSearchCV, RandomizedSearchCV, BayesSearchCV
  12. Cross-Validation
  13. Regularization
  14. Evaluation Metrics
  15. Network Analytics / Geospatial Analytics
  16. Regularization Techniques
  17. Data Augmentation
  18. Multiclass Classification Techniques
  19. Batch Normalization
  20. Reinforcement Learning
  21. Clustering
  22. Ensemble Techniques
  23. Model Comparison
  24. Recommendation Engine
  25. Model Interpretability
  26. Learning Rate Scheduling
  27. Blackbox - Neural Network Models
  28. Regular Monitoring and Logging
  29. External Validation
  30. Early Stopping
  31. Binary Classification Techniques
  32. Forecasting Techniques
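Hyperparameter tuning, grid search, and cross-validation from the list above fit together as follows: every candidate hyperparameter value is scored by k-fold cross-validation, and the best-scoring value is kept. The sketch below shows the idea in plain Python with a one-dimensional k-nearest-neighbours regressor. The synthetic data, the candidate grid, and all function names are assumptions for illustration; this is not how any particular library implements its search classes.

```python
import random
import statistics

def knn_predict(train, x, k):
    """Predict y at x as the mean target of the k nearest training points."""
    nearest = sorted(train, key=lambda point: abs(point[0] - x))[:k]
    return statistics.fmean([y for _, y in nearest])

def k_fold_indices(n, folds):
    """Yield (train_idx, val_idx) pairs for contiguous, near-equal folds."""
    bounds = [round(i * n / folds) for i in range(folds + 1)]
    for i in range(folds):
        val = list(range(bounds[i], bounds[i + 1]))
        train = [j for j in range(n) if j < bounds[i] or j >= bounds[i + 1]]
        yield train, val

def cv_mse(data, k, folds=5):
    """Mean squared error of k-NN, averaged over the validation folds."""
    fold_errors = []
    for train_idx, val_idx in k_fold_indices(len(data), folds):
        train = [data[j] for j in train_idx]
        errors = [
            (knn_predict(train, data[j][0], k) - data[j][1]) ** 2
            for j in val_idx
        ]
        fold_errors.append(statistics.fmean(errors))
    return statistics.fmean(fold_errors)

def grid_search(data, grid, folds=5):
    """Score every candidate k and return the best one plus all scores."""
    scores = {k: cv_mse(data, k, folds) for k in grid}
    best = min(scores, key=scores.get)
    return best, scores

# Synthetic data (an assumption): a noisy quadratic on [0, 6).
rng = random.Random(42)
data = [(x / 10, (x / 10) ** 2 + rng.gauss(0, 0.5)) for x in range(60)]
rng.shuffle(data)  # shuffle so the contiguous folds are not ordered by x

best_k, scores = grid_search(data, grid=[1, 3, 5, 9, 15])
print("best k:", best_k)
for k, mse in scores.items():
    print(f"k={k:2d}  cv_mse={mse:.3f}")
```

A randomized search would sample candidates from the grid instead of trying all of them, and a Bayesian search would choose the next candidate based on the scores seen so far.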

  1. Data Warehouse
  2. Data Preprocessing Pipeline Models
  3. Databases
  4. Code Repository
  5. Model Registry

  1. Data Drift Monitoring
  2. Feedback Collection
  3. Model Health Monitoring
  4. Alerting and Notification
  5. Cloud Deployment
  6. Flask
  7. Edge Deployment
  8. Model Versioning
  9. Performance Metrics
  10. Model Drift
  11. Containerization
  12. Model Serialization
  13. FastAPI
  14. Streamlit
  15. Concept Drift Detection
  16. Serverless Computing
  17. Bias and Fairness Assessment
  18. Prediction Logging
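Data drift monitoring and alerting from the list above can be sketched with the Population Stability Index (PSI), one common way to compare the distribution of a feature at training time with its distribution in production. The sketch below is a minimal standard-library version. The equal-width binning, the smoothing constant `eps`, the synthetic samples, and the alert threshold of 0.2 are all assumptions for illustration, not values taken from this document.

```python
import math
import random

def histogram_shares(values, edges):
    """Share of values per bin; out-of-range values fall into the end bins."""
    counts = [0] * (len(edges) - 1)
    for v in values:
        idx = 0
        while idx < len(counts) - 1 and v >= edges[idx + 1]:
            idx += 1
        counts[idx] += 1
    return [c / len(values) for c in counts]

def population_stability_index(reference, current, bins=10, eps=1e-6):
    """PSI between a reference sample and a current sample of one feature."""
    lo, hi = min(reference), max(reference)
    width = (hi - lo) / bins
    edges = [lo + i * width for i in range(bins + 1)]
    ref = histogram_shares(reference, edges)
    cur = histogram_shares(current, edges)
    # eps keeps the logarithm finite when a bin is empty.
    return sum(
        (c - r) * math.log((c + eps) / (r + eps)) for r, c in zip(ref, cur)
    )

rng = random.Random(0)
reference = [rng.gauss(0.0, 1.0) for _ in range(1000)]  # training-time data
same = [rng.gauss(0.0, 1.0) for _ in range(1000)]       # no drift
shifted = [rng.gauss(1.5, 1.0) for _ in range(1000)]    # mean has drifted

psi_same = population_stability_index(reference, same)
psi_shifted = population_stability_index(reference, shifted)

ALERT_THRESHOLD = 0.2  # hypothetical threshold; tune per feature
for name, psi in [("same", psi_same), ("shifted", psi_shifted)]:
    status = "ALERT" if psi > ALERT_THRESHOLD else "ok"
    print(f"{name}: PSI={psi:.3f} -> {status}")
```

The same comparison can be run on logged predictions rather than on input features to watch for drift in the model's output.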

ML Workflow - Architecture
  • Element belongs to model
  • Element does not belong to model
Training Pipeline

Data Collection

API Stream

Web crawler

Selenium

Data Ingestion

Data Landing Zone

Store Data from all the Sources

Data Cleaning / Preprocessing

Derived & Base features

Data Training & Modelling

Inference Pipeline

Input Data for Forecasting

Input Data

Cleaned & Processed Data

Inference
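
The split between the training pipeline and the inference pipeline outlined above can be sketched in a few lines: training fits both the preprocessing statistics and the model, stores them, and inference reloads exactly those stored values to process new input data. This is a minimal sketch under stated assumptions. The single-feature linear model, the JSON string standing in for a model store, and the names `train_pipeline` and `inference_pipeline` are hypothetical.

```python
import json
import statistics

def train_pipeline(xs, ys):
    """Fit preprocessing statistics and a one-feature linear model."""
    mean = statistics.fmean(xs)
    std = statistics.pstdev(xs)
    zs = [(x - mean) / std for x in xs]  # cleaned and processed feature
    y_mean = statistics.fmean(ys)
    # Ordinary least squares on a standardised feature: because z has
    # mean 0, the intercept is simply the mean of y.
    slope = sum(z * (y - y_mean) for z, y in zip(zs, ys)) / sum(
        z * z for z in zs
    )
    return {"mean": mean, "std": std, "slope": slope, "intercept": y_mean}

def inference_pipeline(params, x):
    """Apply the stored preprocessing, then the stored model, to new input."""
    z = (x - params["mean"]) / params["std"]
    return params["slope"] * z + params["intercept"]

# Training pipeline: toy data following y = 2x + 1.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [3.0, 5.0, 7.0, 9.0, 11.0]
stored = json.dumps(train_pipeline(xs, ys))  # stand-in for a model store

# Inference pipeline: reload the stored parameters and score new input.
params = json.loads(stored)
prediction = inference_pipeline(params, 6.0)
print(round(prediction, 6))
```

Reusing the stored mean and standard deviation at inference time, rather than recomputing them from the incoming data, keeps the input to the model consistent with what it saw during training.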