Abstract
Pronunciation dictionaries allow computational modelling of the pronunciation of words in a given language and are widely used in speech technologies, especially in speech recognition and synthesis. A grapheme-to-phoneme (G2P) tool, by contrast, generalizes a pronunciation dictionary and is not limited to a fixed, finite vocabulary. In this paper, we present a set of standardized phonological rules for the Faroese language; we introduce FARSAMPA, a machine-readable character set suitable for phonetic transcription of Faroese; and we present a set of grapheme-to-phoneme models for Faroese, which are publicly available and shared under a Creative Commons license. We present the G2P converter and evaluate its performance. The evaluation shows reliable results that demonstrate the quality of the data.
Original language | English |
---|---|
Pages | 308-317 |
Number of pages | 10 |
Publication status | Published - May 2023 |
Event | Nodalida 2023: Nordic Conference on Computational Linguistics, Tórshavn, Faroe Islands. Duration: 22 May 2023 → 24 May 2023. Conference number: 24. http://nodalida2023.fo |
Conference
Conference | Nodalida 2023 |
---|---|
Abbreviated title | Nodalida |
Country/Territory | Faroe Islands |
City | Tórshavn |
Period | 22/05/23 → 24/05/23 |
Internet address | http://nodalida2023.fo |
Keywords
- pronunciation dictionaries
- computational modeling
- speech technologies
- grapheme-to-phoneme tool
- machine-readable character set
- Faroese language