Frequently Asked Questions
- Tap "Choose Source" on the main screen
- Select "Browse Files" to pick from the Files app, or "Photo Library" for videos
- Choose your audio/video file
- VoxScript transcribes it automatically (typically 1-2 minutes of processing per minute of audio)
- View the transcription with timestamps and audio playback
VoxScript supports most common audio and video formats:
Audio
MP3, WAV, M4A, AAC, FLAC, OGG
Video
MP4, MOV, AVI, MKV (audio extracted automatically)
Transcription speed depends on:
- File length — Generally 1-2 minutes of processing per minute of audio
- Device model — Newer iPhones (iPhone 13+) are faster
- Model selection — Tiny model is faster, Base model is more accurate but slower
- First use — The AI model downloads on first transcription (40-140 MB)
Processing happens entirely on your device for privacy, which takes longer than cloud services but keeps your data secure.
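As a rough illustration of the 1-2 minutes per minute of audio rule of thumb above, the short Python sketch below turns a recording length into a low/high processing estimate. The function name is hypothetical and this is not part of VoxScript; real times will also vary with device and model choice.

```python
def estimate_processing_minutes(audio_minutes):
    """Return a (low, high) processing-time estimate in minutes.

    Assumes the FAQ's rule of thumb: 1-2 minutes of processing
    per minute of audio. This is an estimate, not a guarantee.
    """
    if audio_minutes < 0:
        raise ValueError("audio_minutes must be non-negative")
    return (audio_minutes * 1.0, audio_minutes * 2.0)


low, high = estimate_processing_minutes(12.5)
print(f"A 12.5-minute recording: roughly {low:g} to {high:g} minutes")
```

For a 12.5-minute recording this gives a range of 12.5 to 25 minutes.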
- Tap the gear icon in the top-right corner
- Select your preferred model:
- Tiny — Fast, smaller download (~40MB), good accuracy
- Base — Slower, larger download (~140MB), better accuracy
- The model will download on your next transcription if not already installed
After transcription completes:
- Export as Text — Creates a .txt file with the transcription and timestamps
- Export as SRT — Creates a subtitle file (.srt) for video editing software
Both options use the iOS share sheet — you can save to Files, send via email, or share to other apps.
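For readers curious about what an SRT file contains, here is a minimal Python sketch that builds SRT text from timestamped segments. It follows the common SubRip layout (a cue number, a `HH:MM:SS,mmm --> HH:MM:SS,mmm` line, then the caption text). The `(start, end, text)` tuple shape and the function names are assumptions for illustration, not VoxScript's actual export code.

```python
def srt_timestamp(seconds):
    """Format a time in seconds as HH:MM:SS,mmm (the SRT timestamp form)."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def to_srt(segments):
    """Build SRT text from (start_seconds, end_seconds, text) tuples."""
    blocks = []
    for index, (start, end, text) in enumerate(segments, start=1):
        timing = f"{srt_timestamp(start)} --> {srt_timestamp(end)}"
        blocks.append(f"{index}\n{timing}\n{text.strip()}\n")
    # Cues are separated by a blank line.
    return "\n".join(blocks)


# Hypothetical segments, for illustration only.
print(to_srt([(0.0, 2.5, "Hello"), (2.5, 5.0, "World")]))
```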
No internet required after setup. On your first transcription, VoxScript downloads the selected AI model (one-time download). After that, everything works completely offline. Your audio files never leave your device.
Yes, completely. VoxScript:
- Processes everything on your device
- Never uploads audio files to any server
- Doesn't collect any user data
- Doesn't require an account or login
- Has no analytics or tracking
See our Privacy Policy for full details.
VoxScript supports 80+ languages through the Whisper AI model, including:
- English, Spanish, French, German, Italian, Portuguese
- Chinese (Mandarin), Japanese, Korean
- Arabic, Hindi, Russian
- And many more...
Language is detected automatically from the audio.
Transcription accuracy depends on:
- Audio quality — Clear audio with minimal background noise works best
- Speaker clarity — Distinct speech is easier to transcribe
- Accents — Strong accents may reduce accuracy
- Technical terms — Specialized vocabulary may be misinterpreted
- Model choice — Try the Base model for better accuracy
For best results, use clear recordings in quiet environments.
No. VoxScript only transcribes pre-recorded audio and video files. It does not support:
- Live recording and transcription
- Real-time speech-to-text
- Phone call recording
You must first create an audio recording using another app (like Voice Memos), then import it into VoxScript.
- Export your transcription as SRT format
- Import the SRT file into your video editing software:
- Final Cut Pro — File → Import → Captions
- Premiere Pro — Import SRT as a caption track
- DaVinci Resolve — Import SRT in the subtitle panel
- YouTube — Upload SRT when adding subtitles
- Adjust timing if needed in your editor
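If every subtitle is off by the same amount, the timing can also be shifted in the SRT file itself before importing. The Python sketch below is one possible way to do that on a computer; the function name is hypothetical and it is not a VoxScript feature. It rewrites anything that looks like an SRT timestamp, so caption text that happens to contain the same pattern would be shifted too.

```python
import re

# Matches SRT timestamps such as 00:01:02,345
_TIMESTAMP = re.compile(r"(\d{2,}):(\d{2}):(\d{2}),(\d{3})")


def shift_srt(srt_text, offset_ms):
    """Shift every timestamp in SRT text by offset_ms (negative = earlier)."""

    def shift(match):
        h, m, s, ms = (int(g) for g in match.groups())
        total = (h * 3600 + m * 60 + s) * 1000 + ms + offset_ms
        total = max(total, 0)  # clamp so no cue starts before 00:00:00,000
        h, rem = divmod(total, 3_600_000)
        m, rem = divmod(rem, 60_000)
        s, ms = divmod(rem, 1000)
        return f"{h:02d}:{m:02d}:{s:02d},{ms:03d}"

    return _TIMESTAMP.sub(shift, srt_text)


# Delay all subtitles by half a second.
sample = "1\n00:00:01,000 --> 00:00:03,500\nHi\n"
print(shift_srt(sample, 500))
```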
Troubleshooting
Try these solutions:
- Restart the app
- Free up storage space (transcription models need 40-140MB)
- Try a shorter audio file first to test
- Switch to the Tiny model if using Base
- Restart your iPhone
- Ensure iOS is up to date (iOS 17.0 or later required)
- Check your internet connection
- Ensure you have enough free storage (40MB for Tiny, 140MB for Base)
- Try restarting the app and attempting transcription again
- If the problem persists, delete and reinstall the app
- Make sure the Files app has permission to save files
- Check that the receiving app (email, etc.) is properly installed
- Try exporting again after restarting VoxScript
- Free up storage space on your device
Still Need Help?
Can't find what you're looking for? Get in touch with our team.
When contacting support, please include:
- iPhone model
- iOS version
- VoxScript version
- Description of issue