Workflow Element Store

  1. Mobile Applications or IoT Applications
  2. Public Datasets
  3. Databases - NoSQL
  4. Surveys and Questionnaires
  5. Data Collaboration and Partnerships
  6. Flat files
  7. Experiments (DoE)
  8. Feedback Data
  9. Databases - SQL
  10. Web Scraping
  11. APIs and Data Feeds
  1. MySQL
  2. Azure Data Factory (ADF)
  3. GCP BigQuery
  4. Oracle DB
  5. RDBMS
  6. MS SQL Server
  7. GCS
  8. AWS Glue
  9. GCP Data Fusion
  10. S3
  11. MongoDB
  12. Azure blob storage
  13. AWS Kinesis
  14. GCP Dataflow
  15. Azure Synapse
  16. Azure Stream Analytics
  17. PostgreSQL
  18. AWS RDS
  19. ETL/ELT pipeline
  20. Apache Kafka
  21. AWS Redshift
  1. Feature Extraction from Images
  2. Data Partitioning - Train, Validation, & Test
  3. Textual Feature Extraction
  4. Time-Based Features
  5. AutoEDA libraries
  6. Interaction Features
  7. Data Transformations
  8. Domain-Specific Feature Engineering
  9. Feature Selection
  10. Handling Imbalanced Classes
  11. Dimensionality Reduction
  12. Handling Time-Series Data
  13. Data Scaling and Normalization
  14. Handling Categorical Data
  15. Annotation
  16. Handling Noisy Data
  17. Binning / Discretization
  18. Dealing with Outliers
  19. Augmentation
  20. Auto-Preprocessing libraries
  21. Polynomial Features
  22. Handling Missing Data
  1. Hyperparameter Tuning
  2. Regularization Techniques
  3. Cross-Validation
  4. Word Embeddings
  5. Data Augmentation
  6. Weight Initialization
  7. Regular Monitoring and Logging
  8. Model Comparison
  9. Natural Language Processing
  10. GridSearchCV, RandomizedSearchCV, BayesSearchCV
  11. External Validation
  12. Performance Visualization
  13. Model Interpretability
  14. Association Rules
  15. Forecasting Techniques
  16. Batch Normalization
  17. Binary Classification Techniques
  18. Recommendation Engine
  19. AutoML
  20. Blackbox - Neural Network Models
  21. Regression Analysis
  22. Ensemble Techniques
  23. Early Stopping
  24. Clustering
  25. Reinforcement Learning
  26. Transfer Learning
  27. Batch Size Selection
  28. Multiclass Classification Techniques
  29. Learning Rate Scheduling
  30. Network Analytics / Geospatial Analytics
  31. Evaluation Metrics
  32. Regularization
  1. Databases
  2. Code Repository
  3. Data Preprocessing Pipeline Models
  4. Data Warehouse
  5. Model Registry
  1. Data Drift Monitoring
  2. Model Health Monitoring
  3. Containerization
  4. Model Serialization
  5. Model Drift
  6. Cloud Deployment
  7. Flask
  8. Bias and Fairness Assessment
  9. FastAPI
  10. Feedback Collection
  11. Alerting and Notification
  12. Streamlit
  13. Model Versioning
  14. Edge Deployment
  15. Concept Drift Detection
  16. Serverless Computing
  17. Prediction Logging
  18. Performance Metrics
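
As an illustration of the "Data Drift Monitoring" and "Alerting and Notification" elements listed above, the following is a minimal sketch of one common drift score, the Population Stability Index (PSI). The function names, the bin count, and the 0.2 alert threshold are assumptions chosen for this example, not values taken from this workflow.

```python
import math


def psi(reference, current, bins=4):
    """Population Stability Index between two numeric samples.

    Bin edges are derived from the reference sample; values in the
    current sample that fall outside that range go to the edge bins.
    """
    lo, hi = min(reference), max(reference)
    width = (hi - lo) / bins or 1.0  # avoid zero width for constant data

    def shares(values):
        counts = [0] * bins
        for v in values:
            idx = int((v - lo) / width)
            idx = min(max(idx, 0), bins - 1)  # clip to the edge bins
            counts[idx] += 1
        # A small floor avoids log(0) when a bin is empty.
        return [max(c / len(values), 1e-6) for c in counts]

    ref, cur = shares(reference), shares(current)
    return sum((c - r) * math.log(c / r) for r, c in zip(ref, cur))


def drift_alert(reference, current, threshold=0.2):
    """Return True when the PSI exceeds the (assumed) alert threshold."""
    return psi(reference, current) > threshold


training_feature = list(range(100))                # stand-in for training data
live_feature = [v + 50 for v in training_feature]  # shifted live data

print(drift_alert(training_feature, training_feature))
print(drift_alert(training_feature, live_feature))
```

In a deployed setting the same check would typically run on logged prediction inputs at a fixed interval, with the boolean result feeding whatever alerting channel is in use.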
ML Workflow - Architecture
  • Element belongs to model
  • Element does not belong to model
Training Pipeline

Data Collection

API Stream

Web crawler

Selenium

Data Ingestion

Data Landing Zone

Store Data from all the Sources

Data Cleaning / Preprocessing

Derived & Base features

Data Training & Modelling

Inference Pipeline

Input Data for Forecasting

Input Data

Cleaned & Processed Data

Inference
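
The two pipelines above can be sketched as plain functions that share one preprocessing step, so that inference sees data cleaned the same way as training. This is a minimal sketch under stated assumptions: the record fields `x` and `y`, the function names, the single-feature least-squares model, and the use of `pickle` for serialization are all hypothetical choices made for illustration. Feature derivation is omitted for brevity.

```python
import pickle


def land(*sources):
    """Data landing zone: store records from all sources in one place."""
    return [record for source in sources for record in source]


def preprocess(records):
    """Data cleaning: drop records with a missing input and cast it to float."""
    return [{**r, "x": float(r["x"])} for r in records if r.get("x") is not None]


def train(records):
    """Fit y = intercept + slope * x by ordinary least squares."""
    xs = [r["x"] for r in records]
    ys = [r["y"] for r in records]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return {"slope": slope, "intercept": mean_y - slope * mean_x}


def training_pipeline(*sources):
    """Collection -> landing zone -> cleaning -> training -> serialized model."""
    model = train(preprocess(land(*sources)))
    return pickle.dumps(model)


def inference_pipeline(model_bytes, input_data):
    """Input data -> cleaned data -> predictions from the stored model."""
    model = pickle.loads(model_bytes)
    cleaned = preprocess(input_data)
    return [model["intercept"] + model["slope"] * r["x"] for r in cleaned]


# Two hypothetical sources standing in for the API stream and the web crawler.
api_stream = [{"x": 1, "y": 3.0}, {"x": 2, "y": 5.0}]
web_crawler = [{"x": 3, "y": 7.0}, {"x": None, "y": 9.0}]

model_bytes = training_pipeline(api_stream, web_crawler)
predictions = inference_pipeline(model_bytes, [{"x": 4}, {"x": None}])
print(predictions)
```

Reusing `preprocess` in both pipelines is the design point: any cleaning rule changed for training is automatically applied at inference time as well.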