SADiLaR raises global visibility from Poland to Italy

Prof Menno van Zaanen, professor in digital humanities at the South African Centre for Digital Language Resources (SADiLaR) at the North-West University (NWU), recently spent a productive two weeks at the University of Gdańsk in Poland to conduct teaching activities and interdisciplinary research in digital humanities.

He was invited by Dr Karolina Rudnicka from the Faculty of Languages as part of the university's fourth edition of the Visiting Professors programme, which is aimed at increasing the internationalisation of education by exposing students and researchers to excellent researchers from around the world.

During his visit, Prof Van Zaanen gave two guest lectures, collaborated with students on a small joint research project, and worked with Dr Rudnicka on an interdisciplinary research publication.

“I first met Menno in May 2023 during a visit to SADiLaR where we discovered our shared research interests,” recalls Dr Rudnicka, who is an assistant professor at the Institute of Applied Linguistics within the Faculty of Languages.

“Knowing he had never been to Gdańsk or Poland, I saw an opportunity through our university's Visiting Professors programme. Menno was excited to visit, so we applied and successfully secured the funding. It was a productive and enjoyable visit for everyone,” she adds.

In his first guest lecture at the Institute of Applied Linguistics, Prof Van Zaanen shared his personal journey from computer science to digital humanities, recounting some examples of his initial research in digital humanities and highlighting the pitfalls he experienced.

His second guest lecture, given at the University of Gdańsk’s Institute of Computer Science, covered formal means of describing natural language learning. “My lecture was about how we can design formal models – essentially using mathematics – to describe how we can learn languages. This is mostly focused on syntax, which describes the rules of how sentences can be put together from words,” he explains.

Exploring the social networks of an Oscar Wilde novel

For the small joint research project with the students, Prof Van Zaanen actively took part in the research activities and supervised the students’ outputs.

“Together with the natural language processing students, we explored the social networks (characters and their relationships) in translations of Oscar Wilde's novel The Picture of Dorian Gray. We compared the original English text to translations in German, Polish and Dutch, with the expectation that the social networks would be the same. However, they were not. We now need to figure out exactly why they are not the same – it could be due to computational issues, translation preferences or language preferences.”

One of the goals of the small joint research project was to present the research at a conference and publish the results in a journal. “Following a week of very focused work, we submitted abstracts to two conferences – one has already been accepted while the other is still awaiting an outcome,” says Prof Van Zaanen.

Dr Rudnicka says the students had a great time meeting Menno and collaborating with him on the joint research project. “We're still working on it with them, as they will present the results at a Young Science Congress in July, and we will also be writing the article.”

Prof Van Zaanen also made time to work on an interdisciplinary research publication with Dr Rudnicka concerning the influence of AI-powered writing assistants on the English language.

Fifth workshop on Resources for African Indigenous Languages (RAIL)

Following his visit to Poland, Prof Van Zaanen travelled to Torino, Italy, where he attended the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024), along with SADiLaR’s digital humanities researcher in Siswati, Dr Muzi Matfunjwa.

The five-day hybrid conference brought together researchers and practitioners in computational linguistics, speech, multimodality and natural language processing, with special attention to evaluation and the development of resources that support work in these areas.

Furthermore, SADiLaR hosted the Fifth Resources for African Indigenous Languages (RAIL) workshop on the last day of the LREC-COLING 2024 conference.

The theme for this year's RAIL workshop was “Creating resources for less-resourced languages”.

“Many African languages are under-resourced. These languages often share interesting properties such as writing systems or tone, making them different from most high-resourced languages,” Prof Van Zaanen explains.

“From a computational perspective, these languages lack enough linguistic resources to undertake high-level development of Human Language Technologies (HLT) and Natural Language Processing (NLP) tools, which in turn impedes the development of African languages in these areas.”

He says past workshops made it clear that the problems and solutions presented are not only applicable to African languages, but also relevant to many other low-resource languages.

“Because these languages share similar challenges, this workshop provided researchers with opportunities to work collaboratively on issues of language resource development and learn from each other.”


Prof Menno van Zaanen and Dr Karolina Rudnicka.

Submitted on Wed, 07/17/2024 - 14:20