Spanish PerLA Data

Beatriz Gallardo-Paúls
Theory of Languages
University of Valencia

Maite Fernández Urquiza
Theory of Languages
University of Oviedo

Participants: 13 fluent, 19 nonfluent, 11 RHD
Type of Study: AphasiaBank Discourse Protocol -- Spanish
Location: USA
Media type: video
Browsable transcripts
Downloadable transcripts
Media folder

Citation information

Gallardo-Paúls, B., & Sanmartín, J. (2005). Afasia fluente. Materiales para su estudio (Vol. 1 del Corpus PerLA). Valencia: Universitat de València.
Gallardo-Paúls, B., & Moreno-Campos, V. (2005). Afasia no fluente. Materiales y análisis pragmático (Vol. 2 del Corpus PerLA). Valencia: Universitat de València.

In accordance with TalkBank rules, any use of data from this corpus must be accompanied by at least one corpus reference. If none is given, please use the primary AphasiaBank reference: MacWhinney, B., Fromm, D., Forbes, M., & Holland, A. (2011). AphasiaBank: Methods for studying discourse. Aphasiology, 25, 1286-1307.

Aim of the Project and Methodological Framework

This file contains a set of transcripts from the PerLA corpus. The acronym PerLA stands for Perception, Language, and Aphasia. Data collection began in 2002 at the University of Valencia (UV) under the direction of Professor Beatriz Gallardo-Paúls, and the corpus continues to grow with contributions from members of the PerLA Clinical Linguistics Research Group.

The corpus was initially conceived as a means of assessing the pragmatic efficacy and communicative success of aphasic speakers during spontaneous conversation. Our theoretical framework assumes that linguistic impairment is evidenced first of all through language use in colloquial conversational settings, where it causes problems in social interaction and seriously affects the construction of the impaired speaker’s self-image.

Our methodology draws on conversation analysis (CA) techniques in order to obtain ecologically valid data. Our interest was therefore focused on observing the real performance of aphasic speakers during natural conversation with their key conversational partner. The purpose was to determine whether their communicative failures were due mainly to some kind of grammatical impairment (that is, an impairment in phonology, morphology, syntax, or semantics), or whether they were experiencing pragmatic disabilities related to other cognitive functions (e.g. inability to contribute adequately to conversation, unawareness of the rules of the turn-taking system, inability to maintain the topic and move the conversation forward, difficulty inferring implicit content, inability to represent the other’s mental state, etc.). In carrying out this analysis, our main goal was to achieve a better understanding of how linguistic impairment is displayed during real everyday interactions, and to contribute this specifically linguistic knowledge to the design of better therapies for aphasic speakers.

Research related to the PerLA corpus has been supported by grants from the Spanish ministries responsible for science (MICINN, MEC, and MINECO) and from the University of València (UV), with Professor Gallardo-Paúls as the principal investigator of the projects.

File Naming Format and Coding Conventions

Because our methodology draws on CA, we have selected and adapted a set of CHAT coding conventions to transcribe our data. As stated in the CHAT Manual, CHAT conventions are not compulsory; they allow for a wide range of variations depending on the nature of the data, the scope of the study for which the data have been collected, and the kinds of phenomena that are considered relevant for coding. Our adaptation is designed to allow the transcription and coding of data showing a wide range of language and communication disorders, in order to carry out further studies from a pragmatic perspective (Fernández-Urquiza & Gallardo-Paúls, 2015). Accordingly, we do not code morphological, phonological, lexical, or syntactic errors, except for paraphasias. From the wide range of symbols proposed for retracings, repetitions, reformulations, and false starts, we have decided to simplify the coding conventions: we use the symbol [/-] for false starts of any kind (with or without retracing, including reformulations), and [/] for repetitions. We also make use of the satellite markers ‡ for the vocative and „ for tag questions or dislocations. Lengthened syllables, pauses, and pauses between syllables are indicated with the conventional CHAT symbols, while emphatic pronunciation is coded with [!].
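As a rough illustration of how the simplified markers described above could be tallied in a transcript, the following Python sketch counts [/-], [/], [!], ‡, and „ on main speaker tiers. It is not part of the PerLA or TalkBank tooling: the function name, the marker labels, and the sample utterances are all hypothetical and invented for this example, and the sketch assumes only the basic CHAT layout in which main tiers start with "*", headers with "@", dependent tiers with "%", and continuation lines with a tab.

```python
from collections import Counter

# Symbols taken from the conventions described in the text; the labels on the
# left are hypothetical names chosen for this illustration only.
MARKERS = {
    "false_start": "[/-]",
    "repetition": "[/]",
    "emphasis": "[!]",
    "vocative": "‡",
    "tag_or_dislocation": "„",
}


def count_markers(chat_text: str) -> Counter:
    """Count each marker on main tiers (lines starting with '*') only."""
    counts = Counter({label: 0 for label in MARKERS})
    in_main_tier = False
    for line in chat_text.splitlines():
        if line.startswith("*"):
            in_main_tier = True       # a new speaker tier begins
        elif line.startswith(("@", "%")):
            in_main_tier = False      # header or dependent tier: ignore
        # Lines starting with a tab continue the previous tier, so the
        # current state is kept for them.
        if in_main_tier:
            for label, symbol in MARKERS.items():
                counts[label] += line.count(symbol)
    return counts


# Invented sample, not taken from the corpus.
SAMPLE = "\n".join([
    "@Begin",
    "*PAR:\tyo [/] yo quería [/-] fui al médico .",
    "%com:\tinvented comment with a marker [/] that is not counted",
    "*INV:\tMaría ‡ ven aquí [!] .",
    "@End",
])

if __name__ == "__main__":
    for label, n in count_markers(SAMPLE).items():
        print(f"{label}: {n}")
```

Plain substring counting is enough here because none of the bracketed codes contains another one ("[/-]" does not contain "[/]"). A fuller analysis would more likely rely on the CLAN programs distributed with TalkBank.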

Data collection protocol

All the informants were contacted through collaboration agreements with Spanish National Health System institutions, private health institutions, and some public associations, such as the National Treacher Collins Association and GERNA Association in Navarra (Spain), the National Centre for People with Rare Diseases and their Relatives in Burgos (Spain), the Galician Ataxy Association (Galicia, Spain), Asturian Asperger’s Syndrome Association (Asturias, Spain), Valencian Asperger’s Syndrome Association (Valencia, Spain), and Alzheimer’s Disease Association (Valencia, Spain). All informants gave their explicit consent for the collection and use of the data for research and teaching purposes, as long as their anonymity was preserved. For this reason, pseudonyms have been used and no video files are attached to the transcripts.

The general protocol for collecting the data comprises a first contact between the researcher and the informant, in which the aims of the recording are explained. If the informant gives their consent, a date is agreed for a subsequent meeting at the informant’s home, where a conversation among the informant, their conversational partner(s), and the researcher is recorded, although the researcher’s contributions tend to be minimal. When for any reason the recording could not take place at the informant’s home, it was made either in the informant’s hospital room (if the stroke was very recent) or in the speech-language therapist’s office. At the transcription stage, the first five minutes of the recording are usually discarded in order to ensure that the informants are focusing on the conversation and not on the presence of the camera.

Furthermore, the collection of the Asperger’s syndrome data involved a graphic support for pragmatic training. The aim of this instrument was to prompt semi-directed, elicited discourse from participants through a narrative elicitation task based on a picture story. The graphic support consisted of six illustrated cards with specific contents and independent questions. The images thus served the same function as a conversational script. The interviews were carried out in various multi-purpose rooms at the Asperger’s associations of Asturias and Valencia. Before the recordings, the following instructions were given to participants: “I am going to show you six illustrated cards and ask you different questions about what you see in each picture. For example, I will ask you to put yourself in the place of some of the characters or to tell me a short story based on the drawings. Please always try to refer to the card number.”