PANACEA Annotated Italian Labour Legislation Corpus v.2

PANACEA Annotated Italian Labour Legislation Corpus Version 2 consists of Italian texts in the Labour Legislation (LAB) domain that were collected and automatically annotated within PANACEA, an EU FP7-funded project (Grant Agreement 248064). The texts are crawled web pages that were automatically identified as Italian and automatically classified as relevant to the LAB domain. Data collection took place in the summer of 2011. The automatically assigned annotations cover sentence and token segmentation, part-of-speech (POS) tags and lemmas, dependency relations, and named entities.

Size information:

  • tokens: 70 million
  • sentences: 2,975,818
  • Download location

    DISCLAIMER: “The right to use the sentences contained in this data set has been granted by their copyright holders. This usage is exclusive for research purposes and no profit can be made out of it. We are grateful to all sources for their kind and generous contribution. For further information on these sources, please see: Acknowledgements”

    This resource is distributed under the following licence: CC-BY-SA