BCI Meeting 2023 | Carlos Valle Araya

Sixty healthy individuals participated in this study. EEG data was gathered using a 64-channel system while the participants performed a task involving silent speech production and perception. This task was composed of four segments, wherein each participant was presented with a set of 30 everyday Spanish sentences, repeated approximately six times per subject.

Graphical description of the task

The task performed by the participants was a silent speech production and perception exercise. They were presented with 30 common Spanish sentences. The EEG data was collected 0.5 seconds after the onset of each word in these sentences. The participants repeated this task multiple times, providing a rich dataset for our deep learning models to analyze and learn from. The goal was to explore the potential of translating these brain signals into recognizable text.

Here are the sentences featured in our research:

recién me dijeron que si
La inteligencia artificial es real
era terrible para mi estomago
soy flojo para hacer ejercicios
y yo voy al gimnasio
voy caminando todos los días
Estuve todo el invierno resfriado
Cuando vaya saliendo te llamo
Nunca hay que decir nunca
pero mi abuela tiene siete hermanos
No se puede posponer el plazo
pero este temblor fue más fuerte
Eso te pasa por estar nervioso
yo no vivo con mi esposa
Me dan miedo los edificios altos
Antes todos querían comprar una tele
yo soy la mayor de los hermanos
la música para mi es la vida
Gracias a Dios tuve una buena educación
No fui a ningun lado de vacaciones
Ellos dijeron que el vino estaba malo
Faltan pocos días para salir de vacaciones
Me gusta con dos cucharadas de azúcar
Tengo una muy buena convivencia con mis vecinos
Esa mochila no es igual a la mia
Anda temprano a la tienda para que puedas comprar
y todavía estoy esperando que me llamen para ir
Si no pasa esa micro me voy en metro
Mi marido y yo jubilamos en el mismo año
Cada día me pongo más nervioso con la prueba

The unique 126 classes are:

a	convivencia	el	estuve	metro	mi	pero	saliendo	todavia
abuela	cuando	ellos	faltan	ir	mia	plazo	salir	todo
al	cucharadas	en	flojo	jubilamos	micro	pocos	se	todos
altos	dan	era	fue	la	miedo	pongo	si	tuve
anda	de	es	fuerte	lado	mismo	por	siete	una
antes	decir	esa	fui	llamen	mochila	posponer	soy	vacaciones
artificial	dia	eso	gimnasio	llamo	musica	prueba	te	vaya
azucar	dias	esperando	gracias	los	muy	puedas	tele	vecinos
año	dijeron	esposa	gusta	malo	nervioso	puede	temblor	vida
buena	dios	estaba	hacer	marido	ningun	que	temprano	vino
cada	dos	estar	hay	mas	no	querian	tengo	vivo
caminando	edificios	este	hermanos	mayor	nunca	real	terrible	voy
*comprar*	*educacion*	*estomago*	*igual*	me	*para*	*recien*	*tienda*	y
*con*	*ejercicio*	*estoy*	*inteligencia*	*metro*	*pasa*	*resfriado*	*tiene*	yo

We carried out an analysis of the word frequency in our dataset to check for imbalances that might bias the networks. The most frequent word was the Spanish article “la”, which accounted for 3.5% of the data. Even if the networks were biased towards this specific label, the maximum accuracy would only reach 3.5%, far below the 11.7% accuracy achieved by DeepSpeech 2.

Word Frequency for the 126 classes across all 60 sessions

Channel location diagram for the decoding analysis.

Note: The dataset will be publicly available after the acceptance of the publication Subject-independent decoding of perceived sentences from EEG signals using artificial neural networks.