Voice-dictate into notes

```json
{
  "title": "Voice-dictate into notes",
  "audience": "power_user",
  "summary": "Talk; ScaiWave transcribes and lets you insert into the active note.",
  "sort_order": 4
}
```

When typing is the wrong tool — walking, driving, brainstorming with your hands free — dictate into ScaiWave.

1. Start recording

In the notes panel:

1. Open or create a note.
2. Click the 🎤 microphone button below the editor (or the Voice panel header if it's collapsed).
3. Speak. The recorder shows a level meter and elapsed time.

The recorder uses your browser's MediaRecorder API. The browser asks for microphone permission once per origin; you can revoke it at any time in your browser's site settings.

2. Stop and review

Click Stop. Three things happen:

1. The audio is uploaded to MinIO/S3 (transient: purged after processing).
2. The transcript is computed via ScaiGrid's speech-to-text model.
3. A Confirm transcript modal opens showing the text.
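
The upload, transcribe, and purge sequence above can be sketched as follows. This is a minimal illustration and not ScaiWave's implementation: `TransientStore`, `process_recording`, and the `transcribe` callable are hypothetical names, and an in-memory dictionary stands in for MinIO/S3. The sketch also assumes the transient copy is purged even when transcription fails, which the text above does not state.

```python
from typing import Callable, Dict


class TransientStore:
    """In-memory stand-in for the object store (hypothetical, not a real client)."""

    def __init__(self) -> None:
        self.objects: Dict[str, bytes] = {}

    def put(self, key: str, data: bytes) -> None:
        self.objects[key] = data

    def get(self, key: str) -> bytes:
        return self.objects[key]

    def delete(self, key: str) -> None:
        # Deleting a missing key is a no-op so the purge step never raises.
        self.objects.pop(key, None)


def process_recording(
    audio: bytes,
    key: str,
    store: TransientStore,
    transcribe: Callable[[bytes], str],
) -> str:
    """Upload the audio, transcribe it, and purge the transient copy."""
    store.put(key, audio)
    try:
        # The returned text is what the confirmation modal would display.
        return transcribe(store.get(key))
    finally:
        # Assumption: purge whether or not transcription succeeded.
        store.delete(key)
```

Passing the speech-to-text step in as a callable keeps the sketch independent of any particular model or client library.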

You can edit the transcript in the modal. If you said:

"todo email alice the Q3 numbers high priority due Tuesday"

…the modal will show:

```markdown
- [ ] Email Alice the Q3 numbers !high 📅 2026-05-19
```

(The parser maps "high priority" → !high, "due Tuesday" → the next Tuesday's date.)
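
A rule-based sketch of that mapping is shown below. It is an illustration under stated assumptions, not ScaiWave's parser: `parse_todo` and `next_weekday` are hypothetical names, only the "high priority" phrase from the example is handled, and "next Tuesday" is taken to mean the first Tuesday strictly after the recording date. The sketch capitalises only the first letter, so proper nouns such as "Alice" would have to come from the speech model.

```python
import re
from datetime import date, timedelta

WEEKDAYS = ["monday", "tuesday", "wednesday", "thursday",
            "friday", "saturday", "sunday"]
DUE_PATTERN = re.compile(r"\s*\bdue (" + "|".join(WEEKDAYS) + r")\b", re.IGNORECASE)
# Only the phrase documented above; other priorities would be assumptions.
PRIORITY_PHRASES = {"high priority": "!high"}


def next_weekday(today: date, weekday: int) -> date:
    """First date strictly after `today` that falls on `weekday` (Monday = 0)."""
    days_ahead = (weekday - today.weekday() - 1) % 7 + 1
    return today + timedelta(days=days_ahead)


def parse_todo(spoken: str, today: date) -> str:
    """Turn a spoken 'todo ...' phrase into a Markdown checkbox line."""
    text = re.sub(r"^todo\s+", "", spoken.strip(), flags=re.IGNORECASE)

    # Pull out the due date, if a weekday was spoken.
    due = None
    match = DUE_PATTERN.search(text)
    if match:
        due = next_weekday(today, WEEKDAYS.index(match.group(1).lower()))
        text = text[:match.start()] + text[match.end():]

    # Pull out the priority marker, if a known phrase was spoken.
    marker = None
    for phrase, tag in PRIORITY_PHRASES.items():
        found = re.search(r"\s*\b" + re.escape(phrase) + r"\b", text, re.IGNORECASE)
        if found:
            marker = tag
            text = text[:found.start()] + text[found.end():]
            break

    text = text.strip()
    line = "- [ ] " + text[:1].upper() + text[1:]
    if marker:
        line += " " + marker
    if due:
        line += " 📅 " + due.isoformat()
    return line
```

Called with the example phrase and a recording date of Sunday 2026-05-17, `parse_todo` returns `- [ ] Email alice the Q3 numbers !high 📅 2026-05-19`.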

3. Insert

The modal offers four actions:

- Insert at cursor: the text lands wherever the editor cursor was.
- Append: the text is added at the end of the note.
- As todos only: if the transcript contains a list, only the checkbox lines are inserted and the narrative is skipped.
- Cancel: discard.
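
The "As todos only" behaviour amounts to a line filter. The sketch below assumes checkbox lines follow the Markdown task-list form (`- [ ]` or `- [x]`); `todos_only` is a hypothetical helper and may not match how ScaiWave detects todos.

```python
import re

# A task-list line: optional indent, "-" or "*", a "[ ]" or "[x]" box, then text.
CHECKBOX_LINE = re.compile(r"^\s*[-*] \[[ xX]\]\s+\S")


def todos_only(transcript: str) -> str:
    """Keep the checkbox lines of a transcript and drop the narrative."""
    kept = [line for line in transcript.splitlines() if CHECKBOX_LINE.match(line)]
    return "\n".join(kept)
```

A transcript with no checkbox lines yields an empty string, so a caller would likely want to fall back to a plain insert in that case.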

4. Audio is preserved

The original audio recording is attached to the note (under Audio in the note detail). Click to replay; download to keep a local copy.

This is useful for:

- Verifying you said what you meant.
- Long brainstorms where the transcript is a sketch and the audio has nuance.
- Compliance: having a primary source.

5. Bulk note creation

In the Voice panel, a recording can also become a new note instead of being inserted into an existing one. This is useful for capturing walking meetings:

1. Record for 5 minutes.
2. The modal shows the transcript.
3. Click New note. ScaiWave creates a note, prefills it with the transcript, and (with AI assist) suggests a title and tags.

6. AI follow-up

After dictating, ask the AI in the chat:

"Summarise the note I just recorded."
"Extract action items from [[Voice 2026-05-17]] and add them as todos."
"Clean up the transcript — remove ums and false starts."

Limits

- Max recording length: 30 minutes per file (longer recordings are split into chunks).
- Audio formats: webm/opus (most browsers), m4a (Safari), wav (fallback).
- Transcription languages: whatever ScaiGrid's speech model supports, typically the major locales. If auto-detection picks the wrong language, set a language hint in the panel's language picker.
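
To make the 30-minute limit concrete, the sketch below plans how a longer recording could be divided into files. It assumes chunks are cut at fixed 30-minute boundaries with no overlap, which is a guess; `plan_chunks` is a hypothetical helper and the actual chunking strategy may differ.

```python
from typing import List, Tuple

MAX_CHUNK_SECONDS = 30 * 60  # the documented 30-minute per-file limit


def plan_chunks(
    total_seconds: int, max_seconds: int = MAX_CHUNK_SECONDS
) -> List[Tuple[int, int]]:
    """Return (start, end) offsets in seconds, each span at most `max_seconds` long."""
    if total_seconds < 0 or max_seconds <= 0:
        raise ValueError("durations must be non-negative and the limit positive")
    return [
        (start, min(start + max_seconds, total_seconds))
        for start in range(0, total_seconds, max_seconds)
    ]
```

Under these assumptions a 75-minute recording (4500 seconds) yields three files: two of 30 minutes and one of 15.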

Where to go next

- Todos from notes (what happens after the bulk-add).
- API: Media (programmatic upload and transcription).