Research Study

Same audio. Four systems.
Very different records.

We ran a blind experiment to test which transcript services can actually handle real-world legal recordings. The results reveal critical differences in speaker accuracy, name capture, and meaning preservation.

4 systems tested
2 test recordings
80+ minutes analyzed
23 unique speakers

The Problem

Transcript errors don't just lose words.
They lose people.

In legal research, jury studies, and focus groups, you're not just capturing what was said — you're tracking who said it. When a transcript system merges two speakers into one, or flips the meaning of a statement, the entire analysis becomes unreliable.

Speaker Identity

Each participant needs a consistent identity throughout the session. Merging speakers makes it impossible to track individual perspectives.

Name Capture

When participants introduce themselves, their names must be transcribed correctly. Mangled names break searchability and attribution.

Meaning Preservation

Legal positions must be captured accurately. "Wasn't negligent" vs "was negligent" is not a typo — it's a reversal of position.

Key Concepts

Understanding the Metrics

Before diving into results, here's what we measured and why each metric matters for legal and research transcripts.

Speaker Collision

Critical

When a transcript system incorrectly assigns statements from two or more different people to the same speaker ID. This makes it impossible to track individual perspectives.

[Diagram: two people, A and B, assigned to one speaker ID]

Diarization

The process of partitioning an audio stream into segments according to speaker identity. Good diarization means each person gets their own consistent speaker ID.
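A speaker collision can be checked mechanically once diarized output has been aligned with a human-verified reference. The sketch below is one possible approach, not the procedure used in this study. It assumes each transcript line has already been paired with the true speaker and the speaker ID the system assigned. The `find_collisions` helper and the sample labels are hypothetical.

```python
from collections import defaultdict

def find_collisions(lines):
    """Return system speaker IDs that were assigned to more than one real person.

    `lines` is an iterable of (true_speaker, system_speaker_id) pairs,
    one per transcript line (an assumed, pre-aligned format).
    """
    people_by_id = defaultdict(set)
    for true_speaker, system_id in lines:
        people_by_id[system_id].add(true_speaker)
    # A collision is any system ID that covers two or more distinct people.
    return {
        system_id: sorted(people)
        for system_id, people in people_by_id.items()
        if len(people) > 1
    }

# Hypothetical sample: Juror B's line was merged into Juror A's speaker ID.
sample_lines = [
    ("Juror A", "SPEAKER_1"),
    ("Juror B", "SPEAKER_1"),
    ("Juror C", "SPEAKER_2"),
    ("Juror A", "SPEAKER_1"),
]

print(find_collisions(sample_lines))  # {'SPEAKER_1': ['Juror A', 'Juror B']}
```

Good diarization, in these terms, is an empty result: every system ID maps to exactly one person.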

Meaning Reversal

Critical

When a transcription error changes the fundamental meaning of a statement. "Was not negligent" becoming "was negligent" flips a defense position to a plaintiff position.

"wasn't" became "was"
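A reversal like this can be flagged automatically by comparing negation words between a verified reference sentence and the system's version. The following is a rough heuristic sketch, assuming a human-checked reference is available. The negation list is deliberately small and not exhaustive, and `is_possible_reversal` is a hypothetical helper, not part of any system tested here.

```python
import re

# Assumed, non-exhaustive set of negation markers.
NEGATIONS = {"not", "no", "never"}

def negation_count(text):
    """Count negation words, expanding contractions such as "wasn't" to "was not"."""
    normalized = text.lower().replace("\u2019", "'").replace("n't", " not")
    return sum(token in NEGATIONS for token in re.findall(r"[a-z]+", normalized))

def is_possible_reversal(reference, hypothesis):
    """Flag a pair when the negation counts differ by an odd number.

    An odd difference suggests the polarity of the statement flipped.
    This is a heuristic: it flags candidates for human review only.
    """
    return (negation_count(reference) - negation_count(hypothesis)) % 2 == 1

print(is_possible_reversal("He wasn't negligent", "He was negligent"))      # True
print(is_possible_reversal("He was not negligent", "He wasn't negligent"))  # False
```

A flagged pair still needs a reviewer to listen to the source audio, since the heuristic cannot tell which version is correct.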

Name Capture

Accurately transcribing participant names when they introduce themselves. Essential for searchability and attribution in the final transcript.

"Shwankar" became "Shawan Wanka"
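Name capture can be scored as a strict pass or fail, with a similarity ratio added to show how far off a miss was. This is a minimal sketch using Python's standard `difflib` module. The strict case-insensitive match is an assumed pass criterion, not one stated in this study, and the helper names are hypothetical.

```python
from difflib import SequenceMatcher

def name_captured(expected, transcribed):
    """Strict check: the name passes only on a case-insensitive exact match."""
    return expected.strip().lower() == transcribed.strip().lower()

def name_similarity(expected, transcribed):
    """Similarity between 0.0 and 1.0, useful for ranking near misses."""
    return SequenceMatcher(None, expected.lower(), transcribed.lower()).ratio()

# Name pairs reported in this study: (spoken name, transcribed name).
pairs = [
    ("Shwankar", "Shawan Wanka"),
    ("Shwankar", "Juanka"),
    ("Ingrid", "grid"),
]

for expected, got in pairs:
    status = "pass" if name_captured(expected, got) else "fail"
    score = name_similarity(expected, got)
    print(f"{expected!r} -> {got!r}: {status} (similarity {score:.2f})")
```

The strict check matters for searchability: a near miss such as "grid" for "Ingrid" still fails a search for the juror's name.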

Methodology

How We Ran the Experiment

A blind comparison using identical source audio across four leading transcript systems.

01

Source Audio

We selected two real multi-speaker recordings: a 35-minute US mock trial deliberation and a 45-minute UK jury deliberation over Zoom.

02

Blind Processing

The same audio files were submitted to four transcript services without any pre-processing or enhancement. No system received an advantage.

03

Analysis

We compared outputs for speaker collision, name capture accuracy, and meaning preservation — with timestamps for every finding.

Test 1: US Mock Trial

Controlled conditions

  • 35 minutes
  • Clean room audio
  • American accents
  • 11 speakers
  • Formal introductions

Test 2: UK Virtual Deliberation

Challenging conditions

  • 45 minutes
  • Zoom audio
  • British accents
  • 12 jurors
  • Compressed audio
  • Crosstalk

No system is expected to be perfect; the goal was to compare intelligence-readiness for mock trial analysis.

Results

The Scorecard

We tested 4 categories across 2 test conditions. Here's how each system performed.

Secure Stream
Winner
8.7/10

Best for Mock Trial Intelligence

87% pass rate

Strongest overall performance across reviewed test samples.

  • Speaker count aligned with room structure
  • Persistent identities across the session
  • <1% observed speaker collisions
  • Names and introductions preserved
  • Built-in manual speaker cleanup

Competitor 2
1/8

Critical failures

13% pass rate

  • Speaker collision in both tests
  • Dropped juror name entirely
  • Reversed "wasn't negligent" to "was negligent"
  • Merged foreperson with another speaker

Competitor 1
4/8

Mixed results

50% pass rate

  • 51 lines contaminated by speaker collision
  • "Shwankar" became "Shawan Wanka"
  • Created ghost speaker that didn't exist

Competitor 3
3/8

Inconsistent

38% pass rate

  • 2 speaker collisions in Test 2
  • "Shwankar" became "Juanka"
  • "Ingrid" became "grid"
  • Only detected 10 of 12+ speakers
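The competitor pass rates are consistent with a simple calculation: passes divided by eight checks, rounded to the nearest whole percent. The sketch below assumes the eight checks are the 4 categories multiplied by the 2 test conditions, which is an inference from the scorecard and not a stated formula. It rounds halves up, because Python's built-in `round()` rounds 12.5 down to 12.

```python
import math

CATEGORIES = 4        # categories tested, per the scorecard introduction
TEST_CONDITIONS = 2   # Test 1 and Test 2
TOTAL_CHECKS = CATEGORIES * TEST_CONDITIONS  # assumed denominator of 8

def pass_rate_percent(passed, total=TOTAL_CHECKS):
    """Pass rate as a whole percent, rounding halves up (12.5 becomes 13)."""
    return math.floor(100 * passed / total + 0.5)

# Scores as shown on the scorecard.
scores = {"Competitor 1": 4, "Competitor 2": 1, "Competitor 3": 3}

for name, passed in scores.items():
    print(f"{name}: {passed}/{TOTAL_CHECKS} = {pass_rate_percent(passed)}% pass rate")
# Competitor 1: 4/8 = 50% pass rate
# Competitor 2: 1/8 = 13% pass rate
# Competitor 3: 3/8 = 38% pass rate
```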

Why This Matters

Any system can perform well on clean audio. The real test is consistency under challenging conditions — Zoom compression, multiple accents, crosstalk. That's where most systems fail, and where the difference becomes critical for research validity.

Evidence

Detailed Findings


Test 1

US Mock Trial — Clean Audio

Competitor 2 (00:04:57 to 00:07:10)

Meaning Reversal + Speaker Merge

Competitor 1 (00:06:45 to 00:08:58)

Ghost Speaker + Name Failure

Competitor 3 (00:06:49 + 00:08:29)

Strong Speaker Separation, Broken Names

Test 2

UK Zoom Deliberation — Challenging Audio

Competitor 2 (00:00:35 + 00:07:11 + 00:32:04)

Foreperson Contaminated + Juror Fragmented

Competitor 3 (00:02:21 + 00:29:37)

Juror Merged with Court Staff

Reviewable Confidence

Built for review, not blind trust.

Transcript systems perform differently depending on audio quality, crosstalk, accents, room acoustics, microphone placement, and the source recording. Secure Stream is built to expose uncertainty, preserve speaker structure, and keep findings tied back to source audio, so teams can review the moments that matter instead of blindly trusting a flat transcript.

  • Lower-confidence segments can be reviewed
  • Speaker identities can be corrected in workflow
  • Findings stay linked to timestamps and source audio
  • Source quality limitations are surfaced, not hidden

Confidence Map

Example
Clear speaker, full sentence: High confidence
Single speaker, low background noise: High confidence
Two speakers overlapping: Review recommended
Crosstalk + accent + low gain: Manual review

Confidence is exposed, not buried. Reviewers know which moments deserve a second listen.
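The tiers in the map can be pictured as a simple triage rule that counts the risk factors present in a segment. This is a purely illustrative sketch and does not describe how Secure Stream scores confidence, which this study does not specify. The `review_tier` function, its flags, and the thresholds (one factor for review, two or more for manual review) are all assumptions chosen to reproduce the example rows.

```python
def review_tier(overlap=False, accent=False, low_gain=False, background_noise=False):
    """Map hypothetical per-segment risk flags to a review tier.

    Thresholds are assumed for illustration:
    0 factors -> high confidence, 1 -> review recommended, 2+ -> manual review.
    """
    risk_factors = sum([overlap, accent, low_gain, background_noise])
    if risk_factors == 0:
        return "High confidence"
    if risk_factors == 1:
        return "Review recommended"
    return "Manual review"

# The example rows from the confidence map, expressed as flags.
print(review_tier())                                          # High confidence
print(review_tier(overlap=True))                              # Review recommended
print(review_tier(overlap=True, accent=True, low_gain=True))  # Manual review
```

A real system would more likely derive tiers from model confidence scores than from hand-set flags. The point of the sketch is only that each tier corresponds to a review action.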

Reviewability matters most when audio conditions are challenging.

When the verdict depends on who said what, you need a transcript you can trust.

Every timestamp in this study is auditable. Every finding is reproducible. See for yourself how Secure Stream performs on your recordings.