Building a Corpus
Definition: Corpus (translated from the TLF, plus Wikipedia) - A large and structured set of texts collected according to study criteria that may include exhaustiveness, domain or genre specificity, etc.

Theoretical considerations:
Relevance: The framework of the study (aims, approaches, ...) must be defined precisely.
Compliance: Define the requirements that ensure the corpus gives a realistic representation, i.e., one that both matches the reality of the phenomenon under study and exhibits enough regularity to yield results.
Usability: The corpus needs to be structured, and it must be large enough to represent the phenomenon under study adequately. The issue of statistical representativeness must be considered. If several phenomena are studied and compared, they need to be equally represented within the corpus.

Practical Work:
Always ask yourself:
If building a corpus, keep the following issues in mind:

Bibliography (in French):
Définir un Corpus (général): excerpt from the thesis of B. Pincemin (1999)
Construction et gestion des corpus (point de vue terminologique) - E. Marshman, OLST 2003
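The usability criterion above asks that phenomena being compared be equally represented within the corpus. One way to check this is sketched below in Python, assuming each text in the corpus carries a single phenomenon label. The function name `representation_report`, the `tolerance` threshold, and the toy labels are all hypothetical choices made for illustration; they do not come from the course material, and a real study would pick its own balance criterion.

```python
from collections import Counter


def representation_report(labels, tolerance=0.2):
    """Summarise how evenly phenomena are represented in a corpus.

    labels    -- list with one phenomenon label per text in the corpus
    tolerance -- hypothetical threshold: maximum allowed relative deviation
                 of any label count from the mean count
    Returns (counts, shares, balanced).
    """
    counts = Counter(labels)
    if not counts:
        raise ValueError("empty corpus")
    # Mean number of texts per phenomenon.
    mean = sum(counts.values()) / len(counts)
    # The corpus counts as balanced if every count stays close to the mean.
    balanced = all(abs(n - mean) / mean <= tolerance for n in counts.values())
    # Share of the whole corpus taken by each phenomenon.
    shares = {label: n / len(labels) for label, n in counts.items()}
    return counts, shares, balanced


# Toy corpus, invented for illustration: one label per text.
labels = ["news"] * 40 + ["fiction"] * 38 + ["forum"] * 12
counts, shares, balanced = representation_report(labels)
print(counts)
print(balanced)
```

In this toy corpus the "forum" texts are far fewer than the others, so the check reports the corpus as unbalanced. Equal counts are only a first approximation of statistical representativeness; text length and sampling method matter as well.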
Aug. 2005