Workflow Element Store

Data Sources
  1. Pre-existing Data
  2. Web Scraping
  3. Data Logging
  4. Public Datasets
  5. APIs and Data Feeds
  6. Surveys and Questionnaires
  7. Data Collaboration and Partnerships
  8. Crowdsourcing
  9. Unstructured Data (Audio)
  10. Unstructured Data (Images / Videos)
  11. Data Generation
  12. Mobile Applications or IoT Applications
  13. Structured Data (Tabular)
Data Warehouse / Data Lake
  1. GCP BigQuery
  2. GCS
  3. Azure Blob Storage
  4. Informatica
  5. Azure Data Warehouse
  6. MS SQL Server
  7. AWS Redshift
  8. Oracle DB
  9. RDBMS
  10. PostgreSQL
  11. MySQL
  12. S3
  13. NoSQL DB
EDA, Data Preprocessing & Feature Engineering
  1. Dealing with Outliers
  2. Logarithmic Transform
  3. Polynomial Features
  4. Auto-Preprocessing libraries
  5. Time-Based Features
  6. Handling Imbalanced Classes
  7. Feature Selection
  8. Handling Categorical Data
  9. Feature Extraction from Images
  10. Handling Missing Data
  11. Data Scaling and Normalization
  12. Binning
  13. Textual Feature Extraction
  14. Interaction Features
  15. Handling Time-Series Data
  16. Dimensionality Reduction
  17. Domain-Specific Feature Engineering
  18. Encoding Categorical Variables
  19. AutoEDA libraries
  20. Handling Noisy Data
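A few of the preprocessing elements listed above (handling missing data, logarithmic transform, data scaling and normalization, encoding categorical variables) can be sketched in plain Python. This is only a minimal illustration using the standard library; the function names and the toy `incomes` and `cities` data are hypothetical, and a real pipeline would more likely rely on a dedicated preprocessing library.

```python
import math
from statistics import median


def impute_median(values):
    """Handling missing data: replace None with the median of observed values."""
    observed = [v for v in values if v is not None]
    fill = median(observed)
    return [fill if v is None else v for v in values]


def log_transform(values):
    """Logarithmic transform: log(1 + x) compresses a right-skewed, non-negative feature."""
    return [math.log1p(v) for v in values]


def min_max_scale(values):
    """Data scaling: rescale values linearly to the [0, 1] range."""
    lo, hi = min(values), max(values)
    if hi == lo:  # constant feature, avoid division by zero
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def one_hot(categories):
    """Encoding categorical variables: one 0/1 column per distinct label (sorted)."""
    labels = sorted(set(categories))
    rows = [[1 if c == label else 0 for label in labels] for c in categories]
    return labels, rows


# Hypothetical toy data, for illustration only.
incomes = [30_000, None, 45_000, 1_200_000, 52_000]
cities = ["Pune", "Delhi", "Pune", "Mumbai", "Delhi"]

filled = impute_median(incomes)                  # missing value becomes the median
scaled = min_max_scale(log_transform(filled))    # log first, then scale to [0, 1]
labels, encoded = one_hot(cities)
```

Applying the log transform before scaling keeps the single very large income from squeezing all other values toward zero, which is one reason these two elements often appear together.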
Model Selection
  1. Ensemble Techniques
  2. Unsupervised Learning
  3. Data Partitioning
  4. Blackbox Techniques
  5. Supervised Learning - Regression
  6. Supervised Learning - Multiclass Classification
  7. Time Series Analysis
  8. Forecasting
  9. Supervised Learning - Binary Classification
  10. Train-Test Split
Model Training & Hyperparameter Tuning
  1. Batch Normalization
  2. Train-Test Split
  3. Ensemble Methods
  4. Transfer Learning
  5. Gradient Clipping
  6. Hyperparameter Tuning
  7. Early Stopping
  8. Learning Rate Scheduling
  9. Data Partitioning - Sequential
  10. Data Augmentation
  11. Regular Monitoring and Logging
  12. Regularization
  13. Cross-Validation
  14. Batch Size Selection
  15. Weight Initialization
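Two of the training elements above, early stopping and learning rate scheduling, are simple enough to sketch without any framework. The `EarlyStopping` class, the `step_decay` function, their parameters, and the validation losses below are all hypothetical names and numbers chosen for illustration; deep learning frameworks ship their own equivalents with different interfaces.

```python
def step_decay(initial_lr, epoch, drop=0.5, every=10):
    """Learning rate scheduling: multiply the rate by `drop` once every `every` epochs."""
    return initial_lr * (drop ** (epoch // every))


class EarlyStopping:
    """Early stopping: stop when validation loss has not improved for `patience` epochs."""

    def __init__(self, patience=3, min_delta=0.0):
        self.patience = patience
        self.min_delta = min_delta
        self.best = float("inf")
        self.bad_epochs = 0

    def should_stop(self, val_loss):
        if val_loss < self.best - self.min_delta:
            # Improvement: remember it and reset the counter.
            self.best = val_loss
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
        return self.bad_epochs >= self.patience


# Hypothetical validation losses standing in for a real training loop.
val_losses = [0.90, 0.70, 0.60, 0.61, 0.62, 0.63, 0.50]

stopper = EarlyStopping(patience=3)
stopped_at = None
for epoch, loss in enumerate(val_losses):
    lr = step_decay(0.1, epoch, drop=0.5, every=2)  # rate a real optimizer would use
    if stopper.should_stop(loss):
        stopped_at = epoch
        break
```

In this toy run the loop ends before the final loss of 0.50 is ever seen, which illustrates the usual trade-off: a small `patience` saves compute but can stop ahead of a late improvement.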
Model Evaluation
  1. Performance Visualization
  2. Data Partitioning
  3. Model Interpretability
  4. Model Comparison
  5. External Validation
  6. Evaluation Metrics
  7. Hyperparameter Tuning
  8. Train-Test Split
  9. Regularization Techniques
  10. Cross-Validation
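The train-test split, cross-validation, and evaluation metrics elements above can be illustrated with a short standard-library sketch. The helper names (`train_test_split`, `k_fold_indices`, `binary_metrics`) and the sample labels are hypothetical and written here only for illustration; they are not the API of any particular library.

```python
import random


def train_test_split(items, test_fraction=0.2, seed=0):
    """Train-test split: shuffle reproducibly, then hold out a fraction for testing."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


def k_fold_indices(n_samples, k):
    """Cross-validation: yield (train_idx, val_idx) for k contiguous folds."""
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in sizes:
        stop = start + size
        val_idx = list(range(start, stop))
        train_idx = [i for i in range(n_samples) if i < start or i >= stop]
        yield train_idx, val_idx
        start = stop


def binary_metrics(y_true, y_pred):
    """Evaluation metrics: accuracy, precision, and recall for 0/1 labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    correct = sum(1 for t, p in pairs if t == p)
    return {
        "accuracy": correct / len(pairs),
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "recall": tp / (tp + fn) if tp + fn else 0.0,
    }


# Hypothetical labels and predictions, for illustration only.
train, test = train_test_split(range(10), test_fraction=0.2)
folds = list(k_fold_indices(10, 3))
metrics = binary_metrics([1, 0, 1, 1, 0], [1, 0, 0, 1, 1])
```

Each sample lands in exactly one validation fold, so averaging a metric over the folds uses every observation for validation once, whereas a single train-test split evaluates on the held-out fraction only.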
Model Deployment
  1. Alerting and Notification
  2. Concept Drift Detection
  3. Feedback Collection
  4. Model Versioning
  5. Serverless Computing
  6. Containerization
  7. Model Drift
  8. Web APIs - Flask, FastAPI, etc.
  9. Model Health Monitoring
  10. Documentation and Reporting
  11. Monitoring and Logging
  12. A/B Testing
  13. Prediction Logging
  14. Cloud Deployment
  15. Bias and Fairness Assessment
  16. Model Registry
  17. Continuous Integration and Deployment (CI/CD)
  18. Performance Metrics
  19. Edge Deployment
  20. Model Serialization
  21. Security Considerations
  22. Documentation and API Documentation
  23. Model Monitoring and Maintenance
  24. Error Analysis
  25. Model Retraining and Updating
  26. Streamlit
  27. Data Drift Monitoring
End User Device
  1. Mobile
  2. End User Machine
ML Workflow - Architecture
  • Element belongs to the model
  • Element does not belong to the model

Feature Store (Online / Offline)

Data Sources

Data Warehouse / Data Lake

EDA, Data Preprocessing & Feature Engineering

Model Selection

Model Training & Hyperparameter Tuning

Model Evaluation

Model Deployment

End User Device

Model Registry