Senior ML Engineer
ML engineering skill for productionizing models, building MLOps pipelines, and integrating LLMs. Covers model deployment, feature stores, drift monitoring, R...
Production ML engineering patterns for model deployment, MLOps infrastructure, and LLM integration.
Deploy a trained model to production with monitoring:
```dockerfile
FROM python:3.11-slim

# curl is needed for the HEALTHCHECK below and is not included in the slim image
RUN apt-get update && apt-get install -y --no-install-recommends curl \
    && rm -rf /var/lib/apt/lists/*

COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

COPY model/ /app/model/
COPY src/ /app/src/

# Run from /app so the src.server module is importable
WORKDIR /app
EXPOSE 8080
HEALTHCHECK CMD curl -f http://localhost:8080/health || exit 1
CMD ["uvicorn", "src.server:app", "--host", "0.0.0.0", "--port", "8080"]
```
| Option | Latency | Throughput | Use Case |
|---|---|---|---|
| FastAPI + Uvicorn | Low | Medium | REST APIs, small models |
| Triton Inference Server | Very Low | Very High | GPU inference, batching |
| TensorFlow Serving | Low | High | TensorFlow models |
| TorchServe | Low | High | PyTorch models |
| Ray Serve | Medium | High | Complex pipelines, multi-model |
Establish automated training and deployment:
```python
from datetime import timedelta

# Note: this uses the pre-0.22 Feast API (Feature, ValueType);
# newer Feast releases use Field and dtypes from feast.types instead.
from feast import Entity, Feature, FeatureView, FileSource, ValueType

user = Entity(name="user_id", value_type=ValueType.INT64)

user_features = FeatureView(
    name="user_features",
    entities=["user_id"],
    ttl=timedelta(days=1),
    features=[
        Feature(name="purchase_count_30d", dtype=ValueType.INT64),
        Feature(name="avg_order_value", dtype=ValueType.FLOAT),
    ],
    online=True,
    source=FileSource(path="data/user_features.parquet"),
)
```
| Trigger | Detection | Action |
|---|---|---|
| Scheduled | Cron (weekly/monthly) | Full retrain |
| Performance drop | Accuracy < threshold | Immediate retrain |
| Data drift | PSI > 0.2 | Evaluate, then retrain |
| New data volume | X new samples | Incremental update |
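The "Data drift" trigger in the table uses the Population Stability Index. A minimal numpy sketch (bucketing by reference-distribution quantiles is one common choice, not the only one):

```python
import numpy as np

def population_stability_index(reference, current, bins=10):
    """Compute PSI between a reference sample and a current sample.

    Bucket edges are quantiles of the reference distribution, so each
    bucket holds roughly 1/bins of the reference. A small epsilon
    avoids log(0) and division by zero in empty buckets.
    """
    eps = 1e-6
    edges = np.quantile(reference, np.linspace(0, 1, bins + 1))
    edges[0], edges[-1] = -np.inf, np.inf  # catch out-of-range values
    ref_frac = np.histogram(reference, edges)[0] / len(reference) + eps
    cur_frac = np.histogram(current, edges)[0] / len(current) + eps
    return float(np.sum((cur_frac - ref_frac) * np.log(cur_frac / ref_frac)))
```

A common reading, consistent with the table: PSI below 0.1 is stable, 0.1 to 0.2 is a moderate shift worth evaluating, and above 0.2 is a retrain candidate.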
Integrate LLM APIs into production applications:
```python
from abc import ABC, abstractmethod

from tenacity import retry, stop_after_attempt, wait_exponential


class LLMProvider(ABC):
    """Uniform interface so concrete providers are swappable."""

    @abstractmethod
    def complete(self, prompt: str, **kwargs) -> str:
        ...


@retry(stop=stop_after_attempt(3), wait=wait_exponential(min=1, max=10))
def call_llm_with_retry(provider: LLMProvider, prompt: str) -> str:
    return provider.complete(prompt)
```
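The provider abstraction makes retry behavior testable without calling a real API. A self-contained sketch: tenacity's decorator is swapped for a plain stdlib backoff loop, and `FlakyProvider` is a hypothetical stand-in that fails twice before answering:

```python
import time
from abc import ABC, abstractmethod

class LLMProvider(ABC):
    @abstractmethod
    def complete(self, prompt: str, **kwargs) -> str: ...

class FlakyProvider(LLMProvider):
    """Hypothetical provider that raises twice, then succeeds."""
    def __init__(self):
        self.calls = 0

    def complete(self, prompt: str, **kwargs) -> str:
        self.calls += 1
        if self.calls < 3:
            raise TimeoutError("simulated transient failure")
        return f"echo: {prompt}"

def call_with_retry(provider: LLMProvider, prompt: str,
                    attempts: int = 3, base_delay: float = 0.01) -> str:
    """Exponential backoff; re-raise after the final attempt."""
    for attempt in range(attempts):
        try:
            return provider.complete(prompt)
        except Exception:
            if attempt == attempts - 1:
                raise
            time.sleep(base_delay * (2 ** attempt))
```

The same structure lets unit tests exercise timeout and rate-limit paths deterministically, which is hard to do against a live endpoint.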
| Provider | Input (per 1K tokens) | Output (per 1K tokens) |
|---|---|---|
| GPT-4 | $0.03 | $0.06 |
| GPT-3.5 | $0.0005 | $0.0015 |
| Claude 3 Opus | $0.015 | $0.075 |
| Claude 3 Haiku | $0.00025 | $0.00125 |

Prices are indicative and change frequently; confirm against the provider's current pricing page.
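Per-token pricing translates directly into a request cost estimate. A small sketch; the rates below mirror the table above and should be treated as illustrative, not current:

```python
# Illustrative per-1K-token USD prices mirroring the table above;
# verify against each provider's pricing page before relying on them.
PRICING = {
    "gpt-4":          {"input": 0.03,    "output": 0.06},
    "gpt-3.5":        {"input": 0.0005,  "output": 0.0015},
    "claude-3-opus":  {"input": 0.015,   "output": 0.075},
    "claude-3-haiku": {"input": 0.00025, "output": 0.00125},
}

def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the estimated USD cost of a single request."""
    p = PRICING[model]
    return (input_tokens / 1000) * p["input"] + (output_tokens / 1000) * p["output"]
```

For example, a 1,500-token prompt with a 500-token reply on GPT-4 costs 1.5 × $0.03 + 0.5 × $0.06 = $0.075, which is why high-volume pipelines often route simple requests to the cheaper tiers.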
Build retrieval-augmented generation pipeline:
| Database | Hosting | Scale | Latency | Best For |
|---|---|---|---|---|
| Pinecone | Managed | High | Low | Production, managed |
| Qdrant | Both | High | Very Low | Performance-critical |
| Weaviate | Both | High | Low | Hybrid search |
| Chroma | Self-hosted | Medium | Low | Prototyping |
| pgvector | Self-hosted | Medium | Medium | Existing Postgres |
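Before committing to any of the databases above, exact cosine-similarity search over a numpy matrix makes a useful correctness baseline for small corpora (function and variable names here are illustrative):

```python
import numpy as np

def top_k_cosine(query: np.ndarray, corpus: np.ndarray, k: int = 3):
    """Exact nearest-neighbour search by cosine similarity.

    corpus: (n_docs, dim) embedding matrix; query: (dim,) vector.
    Returns (indices, scores) of the k most similar corpus rows.
    """
    q = query / np.linalg.norm(query)
    c = corpus / np.linalg.norm(corpus, axis=1, keepdims=True)
    scores = c @ q                      # cosine similarity per row
    idx = np.argsort(-scores)[:k]      # highest similarity first
    return idx, scores[idx]
```

Approximate indexes (HNSW, IVF) trade a little recall for large latency wins; comparing their results against this exact baseline is a standard way to measure that recall loss.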
| Strategy | Chunk Size | Overlap | Best For |
|---|---|---|---|
| Fixed | 500-1000 tokens | 50-100 | General text |
| Sentence | 3-5 sentences | 1 sentence | Structured text |
| Semantic | Variable | Based on meaning | Research papers |
| Recursive | Hierarchical | Parent-child | Long documents |
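The "Fixed" strategy from the table is a few lines of Python. A minimal sketch over a pre-tokenized document (tokenization itself is out of scope here):

```python
def chunk_fixed(tokens: list[str], size: int = 500, overlap: int = 50) -> list[list[str]]:
    """Fixed-size chunking with overlap (first row of the table).

    Consecutive chunks share `overlap` tokens, so a sentence split at
    a chunk boundary still appears whole in at least one chunk.
    """
    if overlap >= size:
        raise ValueError("overlap must be smaller than chunk size")
    step = size - overlap
    return [tokens[i:i + size] for i in range(0, max(len(tokens) - overlap, 1), step)]
```

The overlap is what preserves retrieval quality at boundaries; without it, a fact straddling two chunks may match neither chunk's embedding well.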
Monitor production models for drift and degradation:
```python
from scipy.stats import ks_2samp


def detect_drift(reference, current, threshold=0.05):
    """Two-sample Kolmogorov-Smirnov test for feature drift.

    A p-value below `threshold` means the current distribution
    differs significantly from the training-time reference.
    """
    statistic, p_value = ks_2samp(reference, current)
    return {
        "drift_detected": p_value < threshold,
        "ks_statistic": statistic,
        "p_value": p_value,
    }
```
| Metric | Warning | Critical |
|---|---|---|
| p95 latency | > 100ms | > 200ms |
| Error rate | > 0.1% | > 1% |
| PSI (drift) | > 0.1 | > 0.2 |
| Accuracy drop | > 2% | > 5% |
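The thresholds above map naturally onto a small classifier that alerting code can call. A sketch with stdlib only; the metric names and `classify` helper are illustrative, not part of any particular monitoring tool:

```python
import statistics

# (warning, critical) thresholds mirroring the table above
THRESHOLDS = {
    "p95_latency_ms": (100, 200),
    "error_rate":     (0.001, 0.01),
}

def classify(metric: str, value: float) -> str:
    """Map a metric value to ok / warning / critical per the table."""
    warn, crit = THRESHOLDS[metric]
    if value > crit:
        return "critical"
    if value > warn:
        return "warning"
    return "ok"

def p95(latencies_ms: list[float]) -> float:
    """95th-percentile latency via linear interpolation."""
    return statistics.quantiles(latencies_ms, n=100)[94]
```

In practice these checks run on sliding windows (for example, the last 5 minutes of requests) so a single slow request does not trip the critical alert.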
references/mlops_production_patterns.md contains:
references/llm_integration_guide.md contains:
references/rag_system_architecture.md contains:
```shell
python scripts/model_deployment_pipeline.py --model model.pkl --target staging
```

Generates deployment artifacts: Dockerfile, Kubernetes manifests, health checks.

```shell
python scripts/rag_system_builder.py --config rag_config.yaml --analyze
```

Scaffolds RAG pipeline with vector store integration and retrieval logic.

```shell
python scripts/ml_monitoring_suite.py --config monitoring.yaml --deploy
```

Sets up drift detection, alerting, and performance dashboards.
| Category | Tools |
|---|---|
| ML Frameworks | PyTorch, TensorFlow, Scikit-learn, XGBoost |
| LLM Frameworks | LangChain, LlamaIndex, DSPy |
| MLOps | MLflow, Weights & Biases, Kubeflow |
| Data | Spark, Airflow, dbt, Kafka |
| Deployment | Docker, Kubernetes, Triton |
| Databases | PostgreSQL, BigQuery, Pinecone, Redis |