dc.contributor.author: García-Díaz, José Antonio
dc.contributor.author: Colomo-Palacios, Ricardo
dc.contributor.author: Valencia-García, Rafael
dc.date.accessioned: 2021-11-23T13:27:23Z
dc.date.available: 2021-11-23T13:27:23Z
dc.date.created: 2021-09-20T13:59:13Z
dc.date.issued: 2021
dc.identifier.citation: CEUR Workshop Proceedings. 2021, 2943, 512-521. [en_US]
dc.identifier.issn: 1613-0073
dc.identifier.uri: https://hdl.handle.net/11250/2831044
dc.description.abstract: Sexism is a harmful behaviour that can make women feel worthless, promoting self-censorship and gender inequality. In the digital era, misogynists have found in social networks a place where they can spread their oppressive discourse towards women. Although this particular form of oppressive speech is banned and punished on most social networks, identifying it is quite challenging due to the large number of messages posted every day. Moreover, sexist comments can go unnoticed when they take the form of condescending or friendly statements, which hinders their identification even for humans. With the aim of improving automatic sexism identification on social networks, we participate in EXIST-2021. This shared task involves the identification and categorisation of sexist language in Spanish and English documents compiled from micro-blogging platforms. Specifically, two tasks were proposed: one concerning a binary classification of sexist utterances and another regarding multi-class identification of sexist traits. Our proposal for solving both tasks is grounded in the combination of linguistic features and state-of-the-art transformers by means of ensembles and multi-input neural networks. To address the multi-language problem, we handle each language independently and put the results together at the end. Our best results were an accuracy of 75.14% for task 1 and 61.70% for task 2. [en_US]
dc.language.iso: eng [en_US]
dc.publisher: Technical University of Aachen [en_US]
dc.relation.uri: http://ceur-ws.org/Vol-2943/exist_paper19.pdf
dc.rights: Attribution 4.0 International [*]
dc.rights.uri: http://creativecommons.org/licenses/by/4.0/deed.no [*]
dc.subject: sexism identification [en_US]
dc.subject: document classification [en_US]
dc.subject: feature engineering [en_US]
dc.subject: natural language processing [en_US]
dc.title: UMUTeam at EXIST 2021: Sexist Language Identification based on Linguistic Features and Transformers in Spanish and English [en_US]
dc.type: Peer reviewed [en_US]
dc.type: Journal article [en_US]
dc.description.version: publishedVersion [en_US]
dc.rights.holder: © 2021 for this paper by its authors. [en_US]
dc.subject.nsi: VDP::Humanities: 000::Linguistics: 010 [en_US]
dc.subject.nsi: VDP::Technology: 500::Information and communication technology: 550::Computer technology: 551 [en_US]
dc.source.pagenumber: 512-521 [en_US]
dc.source.volume: 2943 [en_US]
dc.source.journal: CEUR Workshop Proceedings [en_US]
dc.identifier.cristin: 1936075
cristin.ispublished: true
cristin.fulltext: original
cristin.qualitycode: 1

Unless otherwise stated, this item is licensed under Attribution 4.0 International