Analyzing Formulaic Patterns in Historical Corpora
Claudine Moulin, Iryna Gurevych, Natalia Filatkina, Richard Eckart de Castilho, "Analyzing Formulaic Patterns in Historical Corpora", in Jost Gippert, Ralf Gehrke (eds.), Historical Corpora. Challenges and Perspectives. Korpuslinguistik und interdisziplinäre Perspektiven auf Sprache (CLIP), no. 5, Tübingen, March 2015, pp. 51-63.
This paper draws attention to a linguistic phenomenon that, at the current stage of research, can be analysed only inadequately with the help of an electronic text corpus. It thereby adds a new aspect to the discussion about historical corpora by asking how they should be designed in order to be useful for linguistic research on so-called formulaic patterns. The question is novel because no such historical corpora exist at present. In section 1, we define the term formulaic pattern, because a clear understanding of this phenomenon is a prerequisite for collaborative research on it by historians of language, corpus linguists and computational linguists. Section 2 gives a brief outline of the state of the art in research on modern formulaic language within the framework of corpus and computational linguistics. Section 3 shows that some well-known problems in this area are exacerbated when the methods are applied to historical texts. Section 4 presents a possible solution that has been implemented by the HiFoS Researchers' Group at the University of Trier (Germany). Joint research efforts planned with the UKP Lab at the TU Darmstadt (section 5) demonstrate that the restrictions posed by historical formulaic patterns are challenges to be overcome rather than insurmountable obstacles.