"Silence" is a problematic term. For me, that word encompasses: squeaky chairs, typing on a loud keyboard, moving objects around on my table, etc. In a perfect world, Whisper —like a human— can easily distinguish a human voice from the din of my office, and only try and transcribe my voice.
Does anyone have solutions for clearing "silence" out of an audio file that work off something a bit more accurate than just "<= decibel x"?