Voice-dictate into notes
When typing is the wrong tool — walking, driving, brainstorming with your hands free — dictate into ScaiWave.
1. Start recording
In the notes panel:
- Open or create a note.
- Click the 🎤 microphone button below the editor (or the Voice panel header if it's collapsed).
- Speak. The recorder shows a level meter and elapsed time.
The recorder uses your browser's MediaRecorder API. The browser asks
for microphone permission once per origin; you can revoke it in your
browser's site settings.
2. Stop and review
Click Stop. Three things happen:
- The audio is uploaded to MinIO/S3 (transient — purged after processing).
- The transcript is computed via ScaiGrid's speech-to-text model.
- A Confirm transcript modal opens showing the text.
You can edit the transcript in the modal. If you said:
"todo email alice the Q3 numbers high priority due Tuesday"
…the modal will show it as a single todo line, with the priority and
due date already parsed. (The parser maps "high priority" → !high and
"due Tuesday" → the date of the next Tuesday.)
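The mapping described above can be sketched in a few lines of Python. This is an illustrative sketch only, not ScaiWave's actual parser: the function names, the returned dictionary, the support for "medium" and "low" priorities, and the rule that "next Tuesday" means the first Tuesday strictly after today are all assumptions.

```python
import re
from datetime import date, timedelta

WEEKDAYS = ["monday", "tuesday", "wednesday", "thursday",
            "friday", "saturday", "sunday"]


def next_weekday(today: date, weekday: int) -> date:
    """Return the first date strictly after `today` that falls on `weekday` (Monday=0)."""
    days_ahead = (weekday - today.weekday() - 1) % 7 + 1
    return today + timedelta(days=days_ahead)


def parse_spoken_todo(text: str, today: date) -> dict:
    """Extract todo marker, priority, and due date from a dictated phrase (hypothetical rules)."""
    result = {"is_todo": False, "title": "", "priority": None, "due": None}
    rest = text.strip()

    # A leading "todo" marks the phrase as a task.
    m = re.match(r"todo\b\s*", rest, flags=re.IGNORECASE)
    if m:
        result["is_todo"] = True
        rest = rest[m.end():]

    # "high priority" -> "!high" (medium/low are assumed to work the same way).
    m = re.search(r"\b(high|medium|low) priority\b", rest, flags=re.IGNORECASE)
    if m:
        result["priority"] = "!" + m.group(1).lower()
        rest = rest[:m.start()] + rest[m.end():]

    # "due Tuesday" -> the date of the next Tuesday after `today`.
    m = re.search(r"\bdue (" + "|".join(WEEKDAYS) + r")\b", rest, flags=re.IGNORECASE)
    if m:
        result["due"] = next_weekday(today, WEEKDAYS.index(m.group(1).lower()))
        rest = rest[:m.start()] + rest[m.end():]

    # Whatever is left, with whitespace collapsed, is the todo title.
    result["title"] = " ".join(rest.split())
    return result
```

With `today` set to Sunday 2026-05-17, the example sentence yields the title "email alice the Q3 numbers", priority `!high`, and a due date of 2026-05-19. Because you can edit the transcript in the modal, a wrong guess by the parser can be corrected before insertion.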
3. Insert
- Insert at cursor — lands wherever the editor cursor was.
- Append — at the end of the note.
- As todos only — if the transcript was a list, only the checkbox lines are inserted (skips narrative).
- Cancel — discard.
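To make the "As todos only" option concrete, here is a minimal sketch of the filtering it implies. It assumes todos appear in the transcript as Markdown-style checkbox lines (`- [ ] ...` or `- [x] ...`); that syntax and the function name `todos_only` are assumptions for illustration, not confirmed ScaiWave behaviour.

```python
import re

# Assumption: a todo is a Markdown-style checkbox line, e.g. "- [ ] Email Alice".
CHECKBOX_LINE = re.compile(r"^\s*[-*]\s+\[[ xX]\]\s+\S")


def todos_only(transcript: str) -> str:
    """Keep only checkbox lines, dropping narrative text and blank lines."""
    kept = [line for line in transcript.splitlines() if CHECKBOX_LINE.match(line)]
    return "\n".join(kept)
```

A transcript that mixes narrative sentences with checkbox lines comes back with only the checkbox lines, in their original order. A transcript with no checkbox lines comes back empty.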
4. Audio is preserved
The original audio recording is attached to the note (under Audio in the note detail). Click to replay; download to keep a local copy.
This is useful for:
- Verifying you said what you meant.
- Long brainstorms where the transcript is a sketch and the audio has nuance.
- Compliance — having a primary source.
5. Bulk note creation
In the Voice panel, a recording can also become a new note instead of being inserted into an existing one. This is useful for walking-meeting capture:
- Record for 5 minutes.
- Modal shows the transcript.
- Click New note. ScaiWave creates a note, prefills the transcript, and (with AI assist) suggests a title + tags.
6. AI follow-up
After dictating, ask the AI in the chat:
- "Summarise the note I just recorded."
- "Extract action items from [[Voice 2026-05-17]] and add them as todos."
- "Clean up the transcript: remove ums and false starts."
Limits
- Max recording length: 30 minutes per file (longer recordings are split into chunks).
- Audio formats: webm/opus (most browsers), m4a (Safari), wav (fallback).
- Transcription languages: whatever ScaiGrid's speech model supports — typically the major locales. Pass a language hint via the panel's language picker if auto-detection picks wrong.
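As a rough illustration of the 30-minute limit, the sketch below splits a recording's duration into consecutive segments of at most 30 minutes each. It reads "chunked above" as "recordings longer than 30 minutes are split into several files"; that reading, the function name, and the segment layout are assumptions, not a description of ScaiWave's internals.

```python
MAX_CHUNK_SECONDS = 30 * 60  # the 30-minute per-file limit listed above


def chunk_ranges(total_seconds: int,
                 max_seconds: int = MAX_CHUNK_SECONDS) -> list[tuple[int, int]]:
    """Split [0, total_seconds) into consecutive (start, end) ranges of at most max_seconds."""
    if total_seconds < 0 or max_seconds <= 0:
        raise ValueError("total_seconds must be >= 0 and max_seconds must be > 0")
    return [
        (start, min(start + max_seconds, total_seconds))
        for start in range(0, total_seconds, max_seconds)
    ]
```

Under these assumptions, a 75-minute recording maps to three segments: two full 30-minute segments followed by a 15-minute remainder.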
Where to go next
- Todos from notes — what happens after the bulk-add.
- API: Media — programmatic upload + transcribe.