Kinect mode: expects a TD WebSocket depth feed on window.audioData.depthMap. MediaPipe mode: uses in-browser ML loaded from a CDN; approximates the result of the NVIDIA Remove Background node.
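
A minimal sketch of how the two modes might be told apart at runtime. The function name `selectMaskSource`, the mode strings, and the returned object shape are all hypothetical and not taken from the original code; the only name taken from the note above is `audioData.depthMap`. Nothing is assumed about the depth map's format, only whether it is present.

```javascript
// Hypothetical helper: decides which mask source a frame should use.
// `root` stands in for `window` so the function can also run outside a browser.
function selectMaskSource(mode, root) {
  if (mode === 'kinect') {
    // Kinect mode relies on an external feed having populated audioData.depthMap.
    const depthMap = root && root.audioData ? root.audioData.depthMap : undefined;
    if (!depthMap) {
      return { ready: false, source: 'kinect', reason: 'no depth feed on audioData.depthMap yet' };
    }
    return { ready: true, source: 'kinect', depthMap };
  }
  if (mode === 'mediapipe') {
    // MediaPipe mode needs no external feed; segmentation runs in the browser.
    return { ready: true, source: 'mediapipe' };
  }
  throw new Error(`Unknown mode: ${mode}`);
}
```

In a browser this would be called as `selectMaskSource('kinect', window)`; checking `ready` before drawing avoids reading a depth map that the WebSocket feed has not delivered yet.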