Write down what is heard

Kortabitarte Egiguren, Irati

Elhuyar Zientzia

Searching written text on the web is easy: we just type the word we want to look up into a search engine. Such searches, however, miss, among other things, the content of audio files, unless what is said in those files has also been captured as written text.
01/03/2008 | Kortabitarte Egiguren, Irati | Elhuyar Zientzia Komunikazioa

ETB's Gaur Egun programmes are used, among other things, to train speech processing systems.
(Photo: EITB)
Recognizing spoken language and converting it into text is not an easy task. Words are not clearly separated from one another, intonation must be taken into account and, in addition, noise in the physical signal is an obstacle. As a result, a broad market has opened up for systems that process and understand speech, that is, for tools that convert it into written text for us.

These systems are currently built mainly into telephone services, such as making appointments, ordering products or booking tickets for shows. But there are other applications, such as automatic dictation. The latter is being worked on, among other places, in the Systems and Automatic Engineering department of the UPV/EHU.

Speech processing requires a great deal of good training. That is, the system must first be trained, a process known as machine learning. This requires, on the one hand, audio recordings from television and radio and, on the other, reference texts of what was said in those broadcasts. UPV/EHU researchers, for example, frequently use ETB's Gaur Egun and Teleberri programmes to train the system. The reference text does not have to be a word-for-word transcript; a summary of what was said is enough. In short, the system tries to learn the relationship between sounds and words.

Once the learning process is complete, the system should be able to understand what is said in any Gaur Egun or Teleberri broadcast. Although learning is a slow process, once the system has internalized the rules or information, that is, once it has the right reference material, it produces its result fairly quickly: in this case, the written text of what was spoken. In short, the goal is to obtain text from audio.
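The two phases described above, slow training on paired examples followed by quick recognition, can be illustrated with a deliberately tiny sketch. It is not the researchers' system: the two-number "sound features", the function names `train` and `recognise`, and the nearest-average rule are all assumptions chosen for illustration, and the only link to the article is the idea of learning from sounds paired with reference words.

```python
def train(examples):
    """Learn one average feature vector (centroid) per word.

    `examples` is a list of (features, word) pairs: the made-up
    "sound" measurements and the reference word that was spoken.
    """
    grouped = {}
    for features, word in examples:
        grouped.setdefault(word, []).append(features)
    return {
        word: [sum(column) / len(vectors) for column in zip(*vectors)]
        for word, vectors in grouped.items()
    }


def recognise(model, features):
    """Return the word whose learned centroid is closest to `features`."""
    def squared_distance(centroid):
        return sum((a - b) ** 2 for a, b in zip(centroid, features))
    return min(model, key=lambda word: squared_distance(model[word]))


# Hypothetical training data: invented feature values paired with the
# Basque words "bai" (yes) and "ez" (no).
training_data = [
    ([0.9, 0.1], "bai"),
    ([0.8, 0.2], "bai"),
    ([0.1, 0.9], "ez"),
    ([0.2, 0.7], "ez"),
]

model = train(training_data)          # done once, on reference material
print(recognise(model, [0.7, 0.3]))   # applied to a new, unseen sound
```

Training happens once over all the reference material, while recognition only compares a new sound with what was already learned, which is why the second step is the faster one.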

Small and large languages

It is true that most applications of this kind on the market target "large" languages, especially English. However, researchers from the Universidad Politécnica de Donostia-San Sebastián, in collaboration with the IXA, GTTS and Computational Intelligence groups of the UPV/EHU, work with Basque. The obvious difference between "large" and "small" languages lies in the amount of reference data. English tools of this kind have a great deal of data, while the reference material in Basque is much scarcer. Researchers are therefore looking for new techniques to make better and more accurate use of this limited data.

The frequency and intonation of speech help distinguish the type of information the system is receiving.
(Photo: UPV/EHU)
To achieve this degree of precision they use a number of mathematical equations. They try to find the most relevant features of the data sets and audio files, those that provide useful information. Making this selection, that is, deciding which information to keep and which to discard, is quite difficult. They usually work with frequency and intonation to distinguish the type of information the system is receiving at any given moment (for example, whether it is a question or a declarative sentence).
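As a rough illustration of how frequency and intonation can be turned into numbers, the sketch below estimates the pitch at the start and at the end of a sound and reports whether it rises or falls. Everything in it is an assumption made for the example: the article does not say which equations the researchers use, autocorrelation is simply one common way to estimate pitch, the "utterances" are synthetic tones rather than real speech, and treating a rising contour as question-like is a simplification.

```python
import math

SAMPLE_RATE = 8000  # samples per second (assumed for this sketch)


def synth_tone(f_start, f_end, duration=0.5, rate=SAMPLE_RATE):
    """Synthesize a sine whose frequency glides linearly from f_start to f_end."""
    n = int(duration * rate)
    samples, phase = [], 0.0
    for i in range(n):
        freq = f_start + (f_end - f_start) * i / n
        phase += 2 * math.pi * freq / rate
        samples.append(math.sin(phase))
    return samples


def estimate_pitch(frame, rate=SAMPLE_RATE, f_min=80, f_max=400):
    """Estimate the fundamental frequency (Hz) from the autocorrelation peak."""
    lag_min = rate // f_max  # shortest period considered, in samples
    lag_max = rate // f_min  # longest period considered, in samples
    best_lag, best_score = lag_min, float("-inf")
    for lag in range(lag_min, lag_max + 1):
        # Similarity between the frame and a copy of itself shifted by `lag`.
        score = sum(frame[i] * frame[i + lag] for i in range(len(frame) - lag))
        if score > best_score:
            best_lag, best_score = lag, score
    return rate / best_lag


def contour_direction(samples, frame_len=400, tolerance=0.1):
    """Compare pitch in the first and last frame: 'rising', 'falling' or 'flat'."""
    start = estimate_pitch(samples[:frame_len])
    end = estimate_pitch(samples[-frame_len:])
    if end > start * (1 + tolerance):
        return "rising"
    if end < start * (1 - tolerance):
        return "falling"
    return "flat"


# Synthetic stand-ins for a question-like and a statement-like contour.
print(contour_direction(synth_tone(120, 200)))
print(contour_direction(synth_tone(200, 120)))
```

A real system would work on many short frames of recorded speech and combine pitch with other features; the point here is only that intonation can be measured and compared.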

These systems depend entirely on the language, and each language has its own tool. UPV/EHU researchers, however, work not only with Basque but also with Spanish and French. Using the Teleberri programme or the Infozazpi broadcasts, for example, they pursue two main objectives: on the one hand, to have the system understand Spanish and French together with Basque and, on the other, to look for similarities between Basque and the other two languages in order to improve the training of the Basque tools.

In this regard, a series of trials is currently under way to analyze the possibility of handling several languages in a single tool. This is the future challenge for the UPV/EHU researchers: to develop a system capable of understanding Basque, Spanish and French.

Project overview
This research group works in the field of multilingual speech recognition for Basque and the languages around it. In particular, it develops tools and resources for automatic access to information in the news broadcasts of the Basque media. To do this, the researchers investigate techniques for obtaining this information as effectively as possible and, above all, develop methods for minority languages such as Basque.
Director
Dr. Miren Karmele López de Ipiña.
Working team
C.M. López de Ipiña (1), N. Barroso (1), N. Gilisagasti (1), I. Ariztimuño (1), A. Nov (1), N. Ezeiza (2) and M. Hernández (2).
Department
Systems and Automatic Engineering.
Faculty
(1) Universidad Politécnica de Donostia-San Sebastián and (2) Facultad de Informática.
From left to right: Ixabel Ariztimuño, Nora Barroso, Aitzol Ezeiza, Karmele López de Ipiña and Nerea Ezeiza.
(Photo: UPV)