To study sign languages – natural languages that use hand, facial and body movements to convey meaning – large data collections are needed. Transcribing videos, however, is very time-consuming and therefore very expensive. The Max Planck Institute for Psycholinguistics in Nijmegen set up a project to automate sign language transcription.
A difficult job, Binyam explains: ‘Signers differ a lot in how they use their language. Clothes, hand size, skin colour and personal style all have to be taken into account. Differences between recordings – where does the light come from, how bright is the video, how far away is the subject – all this complicates the learning task for the computer.’ He teamed up with the machine learning group of the Institute for Computing and Information Sciences at Radboud University.
The first problem he tackled was automatic language recognition. ‘A machine needs to know what language it is dealing with before translation can be done.’ There are many sign languages: just as there are Dutch, German, and British sign languages, Ethiopia, where Binyam originally comes from, has its own sign language too. Binyam succeeded in building a program that can discriminate between six sign languages with an accuracy of 84%.
‘This is a high success rate, given that the machine learned to do so from only four signers per language. The accuracy will improve when we feed the program more data,’ the computer scientist says. ‘We solved this by generating a dictionary of pixel patterns that appear in the videos and then matching each video against the patterns typical of the known languages.’
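The ‘dictionary of pixel patterns’ idea resembles a bag-of-visual-words pipeline: cut frames into small patches, cluster them into a codebook, describe each video by how often each codebook pattern occurs, and compare that histogram to per-language profiles. A minimal sketch of that general technique on synthetic data – not the thesis's actual method; all names (`build_codebook`, `identify`, the toy ‘languages’) are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)

def extract_patches(video, size=4, step=4):
    """Cut each frame into small pixel blocks (the raw 'patterns')."""
    patches = []
    for frame in video:
        for i in range(0, frame.shape[0] - size + 1, step):
            for j in range(0, frame.shape[1] - size + 1, step):
                patches.append(frame[i:i + size, j:j + size].ravel())
    return np.array(patches)

def build_codebook(patches, k=8, iters=10):
    """Plain k-means: learns the 'dictionary' of recurring patterns."""
    centers = patches[rng.choice(len(patches), size=k, replace=False)].copy()
    for _ in range(iters):
        dists = np.linalg.norm(patches[:, None] - centers[None], axis=2)
        labels = dists.argmin(axis=1)
        for c in range(k):
            members = patches[labels == c]
            if len(members):
                centers[c] = members.mean(axis=0)
    return centers

def histogram(video, centers):
    """Describe a video by how often each dictionary pattern occurs."""
    patches = extract_patches(video)
    labels = np.linalg.norm(patches[:, None] - centers[None], axis=2).argmin(axis=1)
    h = np.bincount(labels, minlength=len(centers)).astype(float)
    return h / h.sum()

def make_video(mean, frames=4, size=16):
    """Hypothetical stand-in for a real recording in one sign language."""
    return rng.normal(mean, 0.1, size=(frames, size, size))

# Two toy 'languages' whose videos have different pixel statistics
train = {"lang_A": [make_video(0.2) for _ in range(3)],
         "lang_B": [make_video(0.8) for _ in range(3)]}

all_patches = np.vstack([extract_patches(v) for vs in train.values() for v in vs])
codebook = build_codebook(all_patches)

# Each language is summarised by the mean histogram of its training videos
profiles = {lang: np.mean([histogram(v, codebook) for v in vs], axis=0)
            for lang, vs in train.items()}

def identify(video):
    """Pick the language whose pattern profile is closest to this video's."""
    h = histogram(video, codebook)
    return min(profiles, key=lambda lang: np.linalg.norm(h - profiles[lang]))

print(identify(make_video(0.2)))  # a new lang_A-style video
```

In a real system the patches would be replaced by richer visual features and the classifier trained on labelled recordings, but the pipeline shape – dictionary, histogram, nearest profile – is the same.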
Video search engines
Other issues that Binyam addressed in his thesis are the recognition by a computer of role taking in conversations (who is signing?) and of the meaningful part of a gesture (is the signer ‘saying’ something, or just scratching his nose?) – which may be easy for a human, but not for a machine. His research will help sign language researchers in the painstaking job of transcribing videos of signed stories, and can be used for a sign-based search system or for real-time translation of signed languages into spoken or written languages. ‘But searching video content is of great interest to search engine makers like Google as well.’
Who is Binyam Gebrekidan Gebre?
Binyam Gebrekidan Gebre (1983, Mekelle, Ethiopia) earned a B.Sc. degree in Computer Science and Engineering at the Mekelle Institute of Technology, with very great distinction. In 2008, he won an Erasmus Mundus Masters scholarship in Natural Language Processing and Human Language Technology, taught in two countries: France and the United Kingdom. There, he wrote his thesis on part-of-speech tagging for Amharic, a Semitic language spoken in Ethiopia. Currently, he is working as a data scientist for the Rechenzentrum Garching (RZG), a computing centre of the Max Planck Society (MPS).