Projects
GeNTE
Info: GeNTE is a natural, bilingual (English-Italian) corpus designed to benchmark the ability of MT systems to generate gender-neutral translations. It comprises 1,500 parallel sentences enriched with manual annotations, with a balanced distribution of phenomena that call for either a gender-neutral or a gendered translation. GeNTE is available on Hugging Face.
Role: Data collection, selection, editing, and annotation. Coordination of the professional translators who produced the gender-neutral Italian references. Dataset validation.
Neo-GATE
Info: Neo-GATE is a bilingual corpus designed to benchmark the ability of machine translation systems to translate from English into Italian using gender-inclusive neomorphemes. Thanks to its extensive annotation, Neo-GATE can be adapted to any Italian neomorpheme paradigm. Neo-GATE is available on Hugging Face.
Role: Data collection, selection, editing, and annotation. Dataset validation. Development of the adaptation and evaluation scripts.
