Linguistic Linked Data Challenge

Co-located with LREC 2014

Reykjavik, Iceland, 27th May 2014

Please submit datasets via SoftConf

Results

We would like to congratulate our two joint winners of the challenge this year:

Gilles Sérasset and Andon Tchechmedjiev: Dbnary: Wiktionary as Linked Data for 12 Language Editions with Enhanced Translation Relations
Gabriela Vulcu, Raul Lario Monje, Mario Munoz, Paul Buitelaar and Carlos A. Iglesias: Linked-Data based Domain-Specific Sentiment Lexicons

Furthermore, we would also like to highly commend the work of:

Maud Ehrmann, Francesco Cecconi, Daniele Vannella, John Philip McCrae, Philipp Cimiano and Roberto Navigli: A Multilingual Semantic Network as Linked Data: lemon-BabelNet

Call for Datasets

The explosion of information technology has led to substantial growth in the quantity, diversity and complexity of linguistic data accessible on the Web. The lack of interoperability between linguistic and language resources represents a major challenge that needs to be addressed, in particular if information from different sources, such as machine-readable lexicons, corpus data and terminology repositories, is to be combined. The Linked Data in Linguistics (LDL) workshop series provides a forum to discuss these types of resources, strategies to address issues of interoperability between them, protocols to distribute, access and integrate this information, and technologies and infrastructures developed on this basis.

This year, there is a data challenge associated with the Linguistic Linked Data Workshop. In addition to regular workshop papers, we will accept dataset descriptions of 4-6 pages describing linguistically or NLP-relevant datasets published on the web as linked data. These linguistic datasets include, but are not limited to, lexica, terminologies, semantic networks, annotated and parallel corpora, multimodal resources, typological resources and linguistic metadata. The data challenge committee will review and evaluate the data according to the following criteria, with prizes of up to €700, funded by the LIDER project, awarded to the highest-scoring datasets:

Availability
- Use of Linked Data and RDF.
- Hosted on a publicly accessible server and available both during the evaluation period and beyond.
- Use of an open license.
Quality of Resource
- Represents useful linguistically or NLP-relevant information.
- Reuses relevant standards and models.
- Contains complex, non-trivial information, e.g., multiple levels of annotation.
Linking
- Links to external resources.
- Reuse of existing properties and categories.
Impact/usefulness of the resource
- Relevant and likely to be reused by many researchers in NLP and wider fields.
- Uses linked data to improve the quality of and access to the resource.
Originality
- Represents a type of resource or a community currently underrepresented in (L)LOD cloud activities.
- Facilitates novel and unforeseen applications or use cases (as described by the authors) enabled through Linked Data technology.

Submission and Publication

We accept dataset descriptions of 4-6 pages, which must include a URL under which the data is available. The papers of the workshop will be published as online proceedings. All submissions can be presented as lightning talks or posters, although such a presentation is optional. In addition, we aim for a journal special issue as post-conference proceedings, in which a selection of the papers presented at the workshop will be published. When submitting a dataset description, authors will be asked to provide essential information about resources (in a broad sense, i.e. also technologies, standards, evaluation kits, etc.) that have been used for the work described in the paper or are a new result of their research. Moreover, ELRA encourages all LREC authors to share the described LRs (data, tools, services, etc.) to enable their reuse and the replicability of experiments, including evaluation experiments.

Please submit all datasets via SoftConf. Select “Make a new submission” and then “Click HERE to make a new dataset challenge submission”.

Timeline

Submission deadline: Fri, Feb 28, 2014
Notification of acceptance: Fri, Mar 14, 2014
Camera-ready paper: Fri, Mar 28, 2014
Workshop: Tue, May 27, 2014

Please note that due to synchronization with the main conference, NO EXTENSIONS can be given.

Organizers

Christian Chiarcos (Goethe-Universität Frankfurt am Main, Germany)
John McCrae (Universität Bielefeld, Germany)
Philipp Cimiano (Universität Bielefeld, Germany)

3rd Workshop on Linked Data in Linguistics: Multilingual Knowledge Resources and Natural Language Processing

Reykjavik, Iceland, 27th May 2014. Co-located with LREC 2014