Integrations.

AI providers, NVIDIA NIM voice, and YouTube / SoundCloud playback.


AI providers

Jarvis talks to a pool of providers behind the scenes. You don't choose one - the bot picks the best available response for each turn.

The pool currently includes NVIDIA NIM, Mistral, Google Gemini, Ollama Cloud, OpenAI, and DeepSeek. Models and availability change over time. Requests sent to an external provider are processed under that provider's terms; locally hosted services remain on the Jarvis host.
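The selection logic itself is not documented on this page. As a rough illustration only, one simple reading of "best available response" is ordered fallback: try each provider in turn and keep the first usable reply. The `Provider` type, its `complete` callable, and `pick_response` are hypothetical names invented for this sketch, not part of Jarvis.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Callable, Optional, Sequence


@dataclass
class Provider:
    name: str
    # Hypothetical callable: takes a prompt, returns reply text, raises on failure.
    complete: Callable[[str], str]
    available: bool = True


def pick_response(providers: Sequence[Provider], prompt: str) -> Optional[tuple[str, str]]:
    """Return (provider_name, reply) from the first provider that answers."""
    for provider in providers:
        if not provider.available:
            continue
        try:
            reply = provider.complete(prompt)
        except Exception:
            # Treat any failure as "unavailable for this turn" and move on.
            continue
        if reply.strip():
            return provider.name, reply
    # Nobody in the pool produced a usable reply.
    return None
```

A real pool might instead race several providers or score their replies; ordered fallback is just the smallest version of the idea.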

Voice stack - NVIDIA NIM

Voice is its own pipeline. When you run /voice, Jarvis joins the channel and starts streaming audio chunks through:

Wake-word detection - a local model listens for "jarvis" / "garmin" / your custom word
Speech-to-text - NVIDIA NIM Parakeet, post-wake-word audio only
AI response - same provider pool used for text chat
Text-to-speech - NVIDIA NIM TTS, streamed back into the voice channel
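The four stages above can be sketched as a small gated loop. This is a simplified model, not the actual implementation: it assumes each stage is a plain callable and that the single chunk following a wake word is the whole utterance. `VoiceStages` and `run_voice_turns` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Iterator


@dataclass
class VoiceStages:
    detect_wake_word: Callable[[bytes], bool]  # local wake-word model (stand-in)
    transcribe: Callable[[bytes], str]         # speech-to-text stage (stand-in)
    respond: Callable[[str], str]              # AI provider pool (stand-in)
    synthesize: Callable[[str], bytes]         # text-to-speech stage (stand-in)


def run_voice_turns(chunks: Iterable[bytes], stages: VoiceStages) -> Iterator[bytes]:
    """Yield synthesized audio for each utterance that follows a wake word."""
    awake = False
    for chunk in chunks:
        if not awake:
            # Pre-wake-word audio is only inspected by the local detector.
            awake = stages.detect_wake_word(chunk)
            continue
        # Only audio after the wake word reaches the speech-to-text stage.
        text = stages.transcribe(chunk)
        reply = stages.respond(text)
        yield stages.synthesize(reply)
        awake = False  # wait for the next wake word
```

The point of the gate is that `transcribe` is never called on audio that arrives before the wake word.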

Raw audio is never persisted. Only post-wake-word transcripts ever leave the box, and those follow the same 30-day memory retention as text chat.

Self-hosters: NIM access is optional. The bot runs without voice if you don't configure NIM credentials - it just won't respond to /voice.
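A minimal sketch of that optional-voice behavior, assuming credentials arrive through an environment variable. The variable name `NIM_API_KEY` is a placeholder chosen for this example; check your own deployment's configuration for the real setting.

```python
import os
from typing import Mapping


def voice_enabled(env: Mapping[str, str] = os.environ) -> bool:
    """Voice is on only when a non-empty NIM credential is configured."""
    # NIM_API_KEY is a hypothetical name used for illustration.
    return bool(env.get("NIM_API_KEY", "").strip())


def enabled_commands(env: Mapping[str, str] = os.environ) -> list[str]:
    """Text chat always works; /voice is added only when NIM is configured."""
    commands = ["/play"]
    if voice_enabled(env):
        commands.append("/voice")
    return commands
```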

Music sources

YouTube - searches, direct URLs, and playlists resolve through the local Lavalink node and begin streaming without a download step.
SoundCloud - public searches and track URLs resolve through the same local Lavalink node. Private or login-walled tracks are not available.
Direct file upload - attach up to 9 audio files (30 MB each) to one /play call. MP3, WAV, OGG, FLAC, M4A, and OPUS are accepted.

Uploaded codecs may be normalized with ffmpeg; YouTube and SoundCloud playback stays inside the local Lavalink pipeline.
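The upload limits can be expressed as a small pre-flight check. This is an illustrative sketch, not the bot's own validator: it assumes "30 MB" means 30 × 1024 × 1024 bytes and that file type is judged by extension. `validate_uploads` is a hypothetical helper.

```python
from __future__ import annotations

from pathlib import PurePath
from typing import Sequence

MAX_FILES = 9
MAX_BYTES = 30 * 1024 * 1024  # assumption: "30 MB" read as 30 MiB
ALLOWED_EXTENSIONS = {".mp3", ".wav", ".ogg", ".flac", ".m4a", ".opus"}


def validate_uploads(files: Sequence[tuple[str, int]]) -> list[str]:
    """Check (filename, size_in_bytes) pairs; an empty result means acceptable."""
    problems: list[str] = []
    if len(files) > MAX_FILES:
        problems.append(f"too many files: {len(files)} > {MAX_FILES}")
    for name, size in files:
        ext = PurePath(name).suffix.lower()
        if ext not in ALLOWED_EXTENSIONS:
            problems.append(f"{name}: unsupported type {ext or '(none)'}")
        if size > MAX_BYTES:
            problems.append(f"{name}: larger than 30 MB")
    return problems
```

Returning a list of problems rather than raising on the first one lets a caller report every rejected attachment in a single reply.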

Storage & portal

Per-server config, blacklists, warning logs, and encrypted memory records live in the local SQLite store. Memory payloads are encrypted with the configured master key.

The web portal uses Discord OAuth for sign-in. Sessions are stored in HTTP-only cookies. Once signed in, you can manage automod lists, role/channel restrictions, and feature toggles from a UI instead of slash commands.

Webhooks & API

Jarvis exposes a small JSON API at /api/stats (public guild and user counts plus 24-hour request volume) and an internal webhook receiver. The webhook endpoint verifies the signature on incoming bodies and is intended for internal release automation, not third-party integrations.
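The signature scheme used by the webhook receiver is not specified here. A common way for a receiver to authenticate incoming bodies is an HMAC over the raw request bytes with a shared secret; the sketch below assumes HMAC-SHA256 with a hex-encoded signature. Both function names are hypothetical.

```python
import hashlib
import hmac


def sign_body(secret: bytes, body: bytes) -> str:
    """Hex HMAC-SHA256 of the raw body (assumed scheme, for illustration)."""
    return hmac.new(secret, body, hashlib.sha256).hexdigest()


def verify_body(secret: bytes, body: bytes, signature_hex: str) -> bool:
    """Recompute the signature and compare it in constant time."""
    expected = sign_body(secret, body)
    # Compare as bytes so unexpected non-ASCII input fails cleanly.
    return hmac.compare_digest(expected.encode(), signature_hex.encode())
```

The comparison uses `hmac.compare_digest` rather than `==` so that timing differences do not leak how much of a forged signature was correct.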

Endpoints worth knowing:

GET /api/stats - public, returns { guildCount, userCount, requests24h }
GET /health - bearer-gated liveness check
GET /metrics/commands - bearer-gated command usage breakdown
GET /sitemap.xml, GET /robots.txt - SEO surface
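For anyone scripting against these endpoints, here is a small client-side sketch. It relies on the `{ guildCount, userCount, requests24h }` shape listed above and assumes the bearer-gated routes expect a standard `Authorization: Bearer <token>` header. The `Stats` class and both helpers are hypothetical, and the base URL is whatever your deployment uses.

```python
import json
import urllib.request
from dataclasses import dataclass


@dataclass(frozen=True)
class Stats:
    guild_count: int
    user_count: int
    requests_24h: int


def parse_stats(payload: str) -> Stats:
    """Parse the JSON body returned by GET /api/stats."""
    data = json.loads(payload)
    return Stats(
        guild_count=int(data["guildCount"]),
        user_count=int(data["userCount"]),
        requests_24h=int(data["requests24h"]),
    )


def bearer_request(base_url: str, path: str, token: str) -> urllib.request.Request:
    """Build a GET request for a bearer-gated route such as /health."""
    return urllib.request.Request(
        base_url.rstrip("/") + path,
        headers={"Authorization": f"Bearer {token}"},  # assumed header scheme
    )
```

Pass the request to `urllib.request.urlopen` to send it; `parse_stats` takes the decoded response body as a string.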

Still stuck?

Drop a question in the AGIS support channel. Most things get answered within the day - the maintainer reads every message.