iFLYTEK multilingual speech technologies won awards for three tasks at the 16th International Workshop on Semantic Evaluation (SemEval 2022). The event was hosted by SIGLEX, the Special Interest Group on the Lexicon of the Association for Computational Linguistics.
iFLYTEK competed against leading research institutions and teams, including Dartmouth College, the University of Sheffield, Huawei, and Alibaba DAMO Academy, and won awards in the following tasks:
- Task 8: Multilingual News Article Similarity
- Task 2 Subtask A One-Shot: Multilingual Idiomaticity Detection and Sentence Embedding
- Task 11: Multilingual Complex Named Entity Recognition (MultiCoNER)
In SemEval 2022 Task 8: Multilingual News Article Similarity, the Joint Laboratory of HIT and iFLYTEK Research (HFL) won gold by developing a system that compares pairs of multilingual news articles and rates each pair on a 4-point scale from most to least similar.
The competing teams were required to develop systems that could identify the similarities and differences between highly similar multilingual paragraphs, as demonstrated in the picture above. The systems had to analyze elements including geography, narrative style, entities, tone, time, and writing style. The task also emphasized cross-language comprehension as a way to assess accuracy and potential bias and to help prevent the spread of false information.
HFL also won SemEval 2022 Task 2 Subtask A One-Shot: Multilingual Idiomaticity Detection and Sentence Embedding, in which competitors' systems had to determine whether sentences in different languages contained idiomatic expressions. Despite having no previous experience in translating Galician, one of the languages tested, iFLYTEK researchers were able to develop the capability by drawing on their experience with other languages.
In the example above, systems had to determine whether “big fish” is literal or idiomatic in the two sentences. Idiomatic analysis and cross-language comprehension have broad applications in improving writing and translation accuracy.
iFLYTEK also won three sub-tasks within SemEval 2022 Task 11: MultiCoNER. This shared task challenged NLP enthusiasts to develop complex Named Entity Recognition systems for 11 languages and 2 code-mixed tasks. On the code-mixed, Chinese, and Bangla tracks, iFLYTEK outperformed all other competitors with F1 scores of 92.9%, 81.6%, and 84.2%, respectively.
Rank of players in Track 13 code-mixed
Rank of players in Track 9 Chinese
Voice interaction systems like those utilized at SemEval 2022 will play an increasingly important role in industries such as education and healthcare. iFLYTEK has already developed speech recognition for 71 languages, with 90% accuracy for at least half of them so far. The company has also deployed its technologies overseas, offering developers services that include speech recognition, speech synthesis, machine translation, and picture recognition. iFLYTEK believes that technological innovation and thoughtful application of its AI technologies can meet critical needs and build a better world.