Research Study

Same audio. Four systems.
Very different records.

We ran a blind experiment to test which transcript services can actually handle real-world legal recordings. The results reveal critical differences in speaker accuracy, name capture, and meaning preservation.


Systems tested: 4

Test recordings: 2

Minutes analyzed: 80+

Unique speakers

The Problem

Transcript errors don't just lose words.
They lose people.

In legal research, jury studies, and focus groups, you're not just capturing what was said — you're tracking who said it. When a transcript system merges two speakers into one, or flips the meaning of a statement, the entire analysis becomes unreliable.

Speaker Identity

Each participant needs a consistent identity throughout the session. Merging speakers makes it impossible to track individual perspectives.

Name Capture

When participants introduce themselves, their names must be transcribed correctly. Mangled names break searchability and attribution.

Meaning Preservation

Legal positions must be captured accurately. "Wasn't negligent" vs "was negligent" is not a typo — it's a reversal of position.

Key Concepts

Understanding the Metrics

Before diving into results, here's what we measured and why each metric matters for legal and research transcripts.

Speaker Collision

Critical

When a transcript system incorrectly assigns statements from two or more different people to the same speaker ID. This makes it impossible to track individual perspectives.

2 people → 1 speaker ID
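One way to check a transcript for speaker collisions is to compare each line's system-assigned speaker ID against a human-verified speaker label. The sketch below is illustrative only: the pair format, the function name, and the sample data are assumptions, not the procedure used in this study.

```python
from collections import defaultdict


def find_speaker_collisions(utterances):
    """Return {system_speaker_id: set of true speakers} for IDs shared by 2+ people.

    `utterances` is an iterable of (true_speaker, system_speaker_id) pairs,
    one per transcript line. The true speaker is assumed to come from a
    human-verified reference transcript.
    """
    people_per_id = defaultdict(set)
    for true_speaker, system_id in utterances:
        people_per_id[system_id].add(true_speaker)
    # A collision is any system ID that covers more than one real person.
    return {sid: people for sid, people in people_per_id.items() if len(people) > 1}


# Hypothetical example data, not taken from the study recordings.
lines = [
    ("Juror A", "Speaker 1"),
    ("Juror B", "Speaker 1"),  # two people share one ID: a collision
    ("Juror C", "Speaker 2"),
]
collisions = find_speaker_collisions(lines)
```

Counting the lines attributed to each colliding ID would give a "lines contaminated" figure for a transcript.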

Diarization

The process of partitioning an audio stream into segments according to speaker identity. Good diarization means each person gets their own consistent speaker ID.

Meaning Reversal

Critical

When a transcription error changes the fundamental meaning of a statement. "Was not negligent" becoming "was negligent" flips a defense position to a plaintiff position.

wasn't → was
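A simple screen for this kind of error compares how many negation markers appear in a verified reference sentence and in the system's version of it. The sketch below is a rough heuristic under stated assumptions: the negation list is short and English-only, and the function names are hypothetical. It flags candidates for human review and does not prove a reversal.

```python
import re

# Assumed, non-exhaustive set of English negation words.
NEGATION_WORDS = {"not", "no", "never"}


def count_negations(text):
    """Count negation words and "n't" contractions in a sentence."""
    # Normalize curly apostrophes so "wasn’t" and "wasn't" are treated alike.
    tokens = re.findall(r"[a-z']+", text.lower().replace("\u2019", "'"))
    return sum(1 for t in tokens if t in NEGATION_WORDS or t.endswith("n't"))


def possible_meaning_reversal(reference, hypothesis):
    """Flag a pair when one side has an odd and the other an even negation count."""
    return count_negations(reference) % 2 != count_negations(hypothesis) % 2


flagged = possible_meaning_reversal("He wasn't negligent", "He was negligent")
same_meaning = possible_meaning_reversal("He was not negligent", "He wasn't negligent")
```

Here `flagged` is true because the negation was dropped, while `same_meaning` is false because both sentences negate once.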

Name Capture

Accurately transcribing participant names when they introduce themselves. Essential for searchability and attribution in the final transcript.

Shwankar → Shawan Wanka
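Name capture can be spot-checked by searching the transcript for each expected name and, when it is missing, listing the closest words that did appear. This is a minimal sketch using Python's standard `difflib`. The function name, the 0.6 similarity cutoff, and the sample sentence are assumptions chosen for illustration.

```python
import difflib
import re


def check_name_capture(expected_name, transcript_text, cutoff=0.6):
    """Return (captured, near_misses) for one expected participant name.

    captured is True when the name appears as a whole word (case-insensitive).
    near_misses lists similar words found instead, which helps a reviewer
    spot a mangled name.
    """
    words = {w.lower() for w in re.findall(r"[A-Za-z']+", transcript_text)}
    target = expected_name.lower()
    if target in words:
        return True, []
    near = difflib.get_close_matches(target, sorted(words), n=3, cutoff=cutoff)
    return False, near


# Hypothetical transcript line illustrating the "Ingrid" -> "grid" error.
captured, near_misses = check_name_capture("Ingrid", "Hello, my name is grid.")
```

A single-word check like this would miss a name split across two words, so multi-word variants need a separate pass.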

Methodology

How We Ran the Experiment

A blind comparison using identical source audio across four leading transcript systems.

Source Audio

We selected two real multi-speaker recordings: a 35-minute US mock trial deliberation and a 45-minute UK jury deliberation over Zoom.

Blind Processing

The same audio files were submitted to four transcript services without any pre-processing or enhancement. No system received an advantage.

Analysis

We compared outputs for speaker collision, name capture accuracy, and meaning preservation — with timestamps for every finding.
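Tying each finding to timestamps makes it straightforward to jump back to the source audio. The sketch below shows one possible record shape for a finding. The `Finding` class and its field names are hypothetical, and the sample values are the Competitor 2 entry reported for Test 1 in this study.

```python
from dataclasses import dataclass


def to_seconds(timestamp):
    """Convert an HH:MM:SS timestamp into seconds from the start of the recording."""
    hours, minutes, seconds = (int(part) for part in timestamp.split(":"))
    return hours * 3600 + minutes * 60 + seconds


@dataclass(frozen=True)
class Finding:
    system: str    # which transcript system produced the error
    category: str  # e.g. speaker collision, name capture, meaning reversal
    start: str     # HH:MM:SS where the affected passage begins
    end: str       # HH:MM:SS where it ends

    def duration_seconds(self):
        return to_seconds(self.end) - to_seconds(self.start)


finding = Finding("Competitor 2", "Meaning Reversal + Speaker Merge", "00:04:57", "00:07:10")
span = finding.duration_seconds()  # length of audio a reviewer would re-listen to
```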

Test 1: US Mock Trial

Controlled conditions

35 minutes
Clean room audio
American accents
11 speakers
Formal introductions

Test 2: UK Virtual Deliberation

Challenging conditions

45 minutes
Zoom audio
British accents
12 jurors
Compressed audio
Crosstalk

No system is expected to be perfect; the goal was to compare intelligence-readiness for mock trial analysis.

Results

The Scorecard

We scored 4 categories across 2 test conditions, giving 8 checks per system. Here's how each system performed.

Secure Stream

Winner

8.7/10

Best for Mock Trial Intelligence

87% pass rate

Strongest overall performance across reviewed test samples.

Speaker count aligned with room structure
Persistent identities across the session
<1% observed speaker collisions
Names and introductions preserved
Built-in manual speaker cleanup

Competitor 2

1/8

Critical failures

13% pass rate

Speaker collision in both tests
Dropped juror name entirely
Reversed "wasn't negligent" to "was negligent"
Merged foreperson with another speaker

Competitor 1

4/8

Mixed results

50% pass rate

51 lines contaminated by speaker collision
"Shwankar" became "Shawan Wanka"
Created ghost speaker that didn't exist

Competitor 3

3/8

Inconsistent

38% pass rate

2 speaker collisions in Test 2
"Shwankar" became "Juanka"
"Ingrid" became "grid"
Only detected 10 of 12+ speakers
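The competitor pass rates above follow directly from their scores out of 8, rounded to the nearest whole percent with halves rounded up. The short sketch below reproduces that arithmetic. The function name is an assumption, and Secure Stream is left out because its score is reported on a 10-point scale. Python's built-in `round` rounds halves to the nearest even number, so `round(12.5)` gives 12; the integer formula here avoids that.

```python
def pass_rate_percent(passed, total):
    """Whole-number percentage of checks passed, rounding halves up."""
    # floor(100 * passed / total + 0.5), computed with integers only.
    return (200 * passed + total) // (2 * total)


# Scores as reported in the scorecard: (checks passed, checks total).
scores = {
    "Competitor 1": (4, 8),
    "Competitor 2": (1, 8),
    "Competitor 3": (3, 8),
}
rates = {name: pass_rate_percent(p, t) for name, (p, t) in scores.items()}
```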

Why This Matters

Any system can perform well on clean audio. The real test is consistency under challenging conditions — Zoom compression, multiple accents, crosstalk. That's where most systems fail, and where the difference becomes critical for research validity.

Evidence

Detailed Findings

Each finding below is listed with the system involved and the timestamps where it occurs.

Test 1

US Mock Trial — Clean Audio

Competitor 2 · 00:04:57–00:07:10

Meaning Reversal + Speaker Merge

Competitor 1 · 00:06:45–00:08:58

Ghost Speaker + Name Failure

Competitor 3 · 00:06:49 + 00:08:29

Strong Speaker Separation, Broken Names

Test 2

UK Zoom Deliberation — Challenging Audio

Competitor 2 · 00:00:35 + 00:07:11 + 00:32:04

Foreperson Contaminated + Juror Fragmented

Competitor 3 · 00:02:21 + 00:29:37

Juror Merged with Court Staff

Reviewable Confidence

Built for review, not blind trust.

Transcript systems perform differently depending on audio quality, crosstalk, accents, room acoustics, microphone placement, and source recordings. Secure Stream is built to expose uncertainty, preserve speaker structure, and keep findings tied back to source audio, so teams can review the moments that matter instead of blindly trusting a flat transcript.

Lower-confidence segments can be reviewed
Speaker identities can be corrected in workflow
Findings stay linked to timestamps and source audio
Source quality limitations are surfaced, not hidden

Confidence Map

Example

Clear speaker, full sentence: High confidence
Single speaker, low background noise: High confidence
Two speakers overlapping: Review recommended
Crosstalk + accent + low gain: Manual review

Confidence is exposed, not buried. Reviewers know which moments deserve a second listen.
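To make the idea concrete, the example map can be read as a rule that escalates review as risk factors stack up. The sketch below is purely illustrative and is not Secure Stream's actual scoring. The function name, the three boolean inputs, and the rule that a single risk factor yields "Review recommended" are all assumptions chosen to match the four example rows.

```python
def review_tier(overlapping_speakers=False, strong_accent=False, low_gain=False):
    """Map a segment's risk factors to a review tier (hypothetical rule)."""
    risk_factors = sum([overlapping_speakers, strong_accent, low_gain])
    if risk_factors == 0:
        return "High confidence"
    if risk_factors == 1:
        return "Review recommended"
    # Two or more stacked factors: send the segment to a human reviewer.
    return "Manual review"


clear_segment = review_tier()
overlap_segment = review_tier(overlapping_speakers=True)
hard_segment = review_tier(overlapping_speakers=True, strong_accent=True, low_gain=True)
```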

Reviewability matters most when audio conditions are challenging.

When the verdict depends on who said what, you need a transcript you can trust.

Every timestamp in this study is auditable. Every finding is reproducible. See for yourself how Secure Stream performs on your recordings.

