vLLM

Inference haute perf PagedAttention — cloné /opt/

→ Lancer vLLM ← Hub

Prod

Status

Wired

WEVAL

✓

Tested

24/7

Uptime

À propos

vLLM est un outil intégré dans l'écosystème WEVAL. Inference haute perf PagedAttention — cloné /opt/

Capacités

vLLM en production WEVAL — wired, monitored, prêt.

🔌

Intégration

vLLM est wired dans WEVAL ecosystem

📊

Production

Déployé et testé sur infra S204/S95

📚

Docs

Documentation officielle complète

👥

Communauté

Support communauté active

🌐

API

API REST + webhooks disponibles

🔄

Updates

Mises à jour régulières trackées

Spécifications

Catégorie

AI/ML Training & Modèles

Licence

OSS

GitHub

—

Status

Production

Intégration WEVAL

Comment vLLM est wired

vLLM est intégré dans l'écosystème souverain WEVAL: cascade WEVIA Master, mémoire Qdrant, observability Langfuse, monitoring Grafana. Accessible directement depuis services-hub sans authentification supplémentaire.

Prêt à utiliser ?

vLLM est déployé.

→ Ouvrir vLLM ← Retour Services Hub