AphasiaBank English — Data for the PSST Challenge
The PSST: Post-Stroke Speech Transcription shared task was held as part of RaPID-4, at the 13th Language Resources and Evaluation Conference (LREC 2022) in Marseille, France. More background is described at the official PSST website.
The PSST Data
Conditions for using this dataset are described in the call for participation.
Archive files are in .tar.gz format. See below for a description of the contents, as well as a version history.
- Train Data Pack (260MB)
- Valid Data Pack (29.2MB)
- Test Data Pack (47.8MB)
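As a minimal sketch of unpacking one of these archives with the Python standard library: the function name `extract_data_pack` and the example file name in the comment are hypothetical, since the actual archive file names are not listed here.

```python
import tarfile
from pathlib import Path


def extract_data_pack(archive_path, dest_dir):
    """Unpack a gzip-compressed tar archive into dest_dir.

    Returns the list of member names found in the archive.
    """
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    with tarfile.open(archive_path, mode="r:gz") as archive:
        names = archive.getnames()
        archive.extractall(dest)
    return names


# Example call (the archive name below is a placeholder, not the real one):
# extract_data_pack("train-data-pack.tar.gz", "psst-data/train")
```

Only extract archives obtained from the official download links, since `extractall` writes whatever paths the archive contains.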
Additional Resources
- psstdata: a set of Python scripts for automatically downloading and loading this dataset
- psstbaseline: the code to download, use, and reproduce the baseline model from the PSST challenge
Publications
Dimitrios Kokkinakis, Charalambos K. Themistocleous, Kristina Lundholm Fors, Athanasios Tsanas, and Kathleen C. Fraser. 2022. Proceedings of the 4th RaPID Workshop: Resources and ProcessIng of linguistic, para-linguistic and extra-linguistic Data from people with various forms of cognitive/psychiatric/developmental impairments. European Language Resources Association, Marseille, France.
- The Post-Stroke Speech Transcription (PSST) Challenge. Robert C. Gale, Mikala Fleegle, Gerasimos Fergadiotis, and Steven Bedrick (pages 41–55)
- Post-Stroke Speech Transcription Challenge (Task B): Correctness Detection in Anomia Diagnosis with Imperfect Transcripts. Trang Tran (pages 56–61)
- Speech Data Augmentation for Improving Phoneme Transcriptions of Aphasic Speech using wav2vec 2.0 for the PSST Challenge. Birger Moël, Jim O’Regan, Shivam Mehta, Ambika Kirkland, Harm Lameris, Joakim Gustafsson, and Jonas Beskow (pages 62–70)
- Data Augmentation for the Post-Stroke Speech Transcription (PSST) Challenge: Sometimes Less is More. Jiahong Yuan, Xingyu Cai, and Kenneth Church (pages 71–79)
PSST Data Pack Contents
The data packs contain audio files and labels for the PSST Challenge. The contents of the data packs are organized as follows:
The ./audio directory
The "audio" directory contains sub-directories for the BNT and VNT naming tasks (see task description for more details)
- Within each task directory, there is a subdirectory for each session (e.g. "elman11a")
- Within each session directory, there is a .wav file for each test item (e.g. "elman11a-BNT01-house.wav")
- The naming scheme is consistent across instances, but not all items are present for all speakers
- The audio files are mono audio recordings in standard PCM format, at a sampling rate of 16 kHz and a bitrate of 256 kb/s
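The stated audio format can be checked with Python's standard `wave` module. The sketch below is illustrative only: the helper name `describe_wav` is hypothetical, and it assumes the files are uncompressed PCM WAV as described above. A mono, 16-bit recording at 16 kHz works out to 1 × 16000 × 16 = 256,000 bits per second, which matches the 256 kb/s figure.

```python
import wave


def describe_wav(path):
    """Read the header of a PCM .wav file and summarize its format."""
    with wave.open(str(path), "rb") as wav:
        channels = wav.getnchannels()
        sample_rate = wav.getframerate()
        bits_per_sample = 8 * wav.getsampwidth()
        n_frames = wav.getnframes()
    return {
        "channels": channels,
        "sample_rate": sample_rate,
        # bitrate = channels * samples per second * bits per sample
        "bitrate_kbps": channels * sample_rate * bits_per_sample / 1000,
        "duration_seconds": n_frames / sample_rate,
    }
```

For a file matching the description, the result should report 1 channel, a sample rate of 16000, and a bitrate of 256.0 kb/s.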
The ./utterances.tsv file
The labels are in the file "utterances.tsv", which is a UTF-8 encoded, tab-separated file with the following fields:
- utterance_id is a unique identifier for each production, of the form {session}-{test}{item}-{prompt} (e.g. "ACWT02a-BNT01-house")
- session is the name of the AphasiaBank session from which the production was taken
- test indicates which test each utterance comes from, either BNT (Boston Naming Test) or VNT (Verb Naming Test)
- prompt is an orthographic rendering of the target word
- transcript is the phonemic transcription of the production, in ARPAbet.
  - Silence is marked using <sil>
  - Spoken noise is marked using <spn>
- correctness is marked as TRUE if the production is "correct" according to the clinical scoring rules of the BNT/VNT, FALSE otherwise
  - For task 2 (correctness), this is the outcome label
- aq_index is the participant's Aphasia Quotient (AQ). AQ is the Western Aphasia Battery - Revised Aphasia Quotient (Kertesz, 2007), a standardized total score that reflects overall aphasia severity. Values can fall between 0.0 and 100.0. A lower number indicates higher severity.
- duration_frames is the number of audio frames in each recording, or the duration in seconds times 16000
- filename contains the relative path within the data pack to the file containing the audio recording for this production
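The fields above could be read with the standard `csv` module, as in the rough sketch below. It rests on several assumptions that are not confirmed here: that the file starts with a header row whose column names match the field names, that ARPAbet symbols in the transcript are separated by whitespace, and that empty cells may occur (for example, for withheld labels). The names `parse_row` and `load_utterances` are hypothetical and are not part of psstdata.

```python
import csv

SAMPLE_RATE = 16000  # frames per second, per the audio description


def parse_row(row):
    """Convert the string values of one TSV row into Python types."""
    parsed = dict(row)
    flag = row["correctness"].strip().upper()
    # TRUE/FALSE become booleans; anything else (e.g. empty) becomes None
    parsed["correctness"] = {"TRUE": True, "FALSE": False}.get(flag)
    parsed["aq_index"] = float(row["aq_index"]) if row["aq_index"] else None
    parsed["duration_frames"] = int(row["duration_frames"])
    # duration_frames is the duration in seconds times 16000
    parsed["duration_seconds"] = parsed["duration_frames"] / SAMPLE_RATE
    # Assumption: phoneme symbols are whitespace-separated
    parsed["transcript"] = row["transcript"].split()
    return parsed


def load_utterances(handle):
    """Read all rows from an open, UTF-8, tab-separated text handle."""
    reader = csv.DictReader(handle, delimiter="\t")
    return [parse_row(row) for row in reader]


# Example call:
# with open("utterances.tsv", encoding="utf-8", newline="") as fh:
#     utterances = load_utterances(fh)
```

Dividing `duration_frames` by 16000 recovers the duration in seconds, so a recording with 32000 frames lasts 2.0 seconds.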
For any questions about the contents of this data pack, please contact Robert Gale (galer@ohsu.edu) and Steven Bedrick (bedricks@ohsu.edu).
Version History
- Data version 2022-03-02-full (latest): The data as used during the PSST Challenge, including all test labels.
- Older Versions
  - Data version 2022-03-02: The data as used during the PSST Challenge.