Speaker
Description
Identifying suitable datasets is a common challenge for data scientists working in domains with scarce data. For research on sign languages, this usually involves an extensive literature review or word of mouth. Information on individual datasets may be distributed across different publications, data repositories and (potentially defunct) project websites. We introduce the Sign Language Dataset Compendium, an extensive overview of linguistic resources for sign languages. It covers corpora, lexical resources, and commonly used data collection tasks. Special attention is paid to covering many different languages from around the globe. All information is provided in a standardised format to make entries comparable, but kept flexible enough to allow for differences in content. The compendium is a growing resource that is updated regularly.
Keywords
Sign languages
Corpora
Lexical resources
Survey
Metadata
Find me @ my poster: 2, 3