Online text and concordancer sites for French
Compiled by Betsy Kerr, University of Minnesota, <bjkerr@umn.edu>
Written texts only:
- 'The Compleat Lexical Tutor': UQAM (Université du Québec à Montréal) Web Concordancer. http://www.lextutor.ca/concordancers/
Easy-to-use and fairly versatile concordancer, with a choice of several different corpora, including all of Le Monde from 1998. Maximum hits: 2001. Includes composite written and spoken corpora of equal size (Français parlé and Français écrit), which are useful for frequency comparisons. Allows consultation of larger text segments.
- Lexiqum: Concordancier québécois de RALI (Recherche appliquée en linguistique informatique, U. de Montréal). http://retour.iro.umontreal.ca/cgi-bin/lexiqum
Maximum of 500 examples from a corpus of written texts of Quebec French of various genres. Be sure to read the page 'Aide' ('Help') for more information on the corpora, and for more search options. Maximum context size = 200 words.
- ARTFL. http://humanities.uchicago.edu/ARTFL/ARTFL.html
Project for American and French Research on the Treasury of the French Language, University of Chicago. A huge searchable database of texts from 15th-20th century French literature, philosophy, arts, and sciences. (U of M students can access this controlled-access site through the U of M Libraries' site.) Supports simple or sophisticated searches, but requires a little learning.
Spoken corpora only:
- ELICOP. http://bach.arts.kuleuven.ac.be/elicop/
Etude LInguistique de la COmmunication Parlée, Département de Linguistique, Université Catholique de Louvain (Belgium). Concordancer and extensive transcripts of several oral corpora, notably 80 hours of the Orléans corpus (1968-71). This and the other sub-corpora of the ELILAP corpus are all spontaneous conversation. The LANCOM corpus presents smaller samples of role-plays by Flemish-speaking learners of French (27 hrs.), and a very small sample (3-4 hrs.) of the same by Belgian Francophones and French Francophones.
Go to 'Consultation des corpus' for complete instructions on how to use the concordancer.
Use the link 'Recherches KWIC' to obtain a simple KWIC (KeyWord In Context) concordance, i.e. a list of occurrences with just one line of context (maximum hits: 500).
The link entitled 'Nouveau formulaire de recherches' allows one to set the length of context accompanying each occurrence, as well as other settings (maximum hits: 999).
From the homepage, click on 'Corpus étiqueté' ('Tagged corpus') for another search engine that allows one to specify the syntactic category of each word in a string of words to be searched for. Easy to use. Maximum hits: 999.
All three of these search engines allow the use of what are called 'regular expressions'; see instructions for their use on the 'Help' page accessible from the 'Nouveau formulaire de recherches'.
- DELIC. http://www.up.univ-mrs.fr/delic/crfp
Corpus de Référence du Français Parlé. Project of the Equipe de recherche DELIC (DEscription Linguistique Informatisée sur Corpus) under the direction of Jean Véronis at the Université de Provence. This is a demonstration concordancer, which will give a maximum of 300 examples from the CRFP corpus. The corpus includes a variety of different oral genres, varying in degree of formality. For more information about the corpus, see the related pages: Composition, Enregistrement, and Transcription (click in the menu on the left).
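For readers unfamiliar with the terms used in the ELICOP entry, the sketch below illustrates what a KWIC concordance driven by a regular expression looks like: each hit is printed on one line with a fixed amount of context on either side. It is only an illustration in Python, not the code behind any of the sites listed. The function name kwic, its parameters width and max_hits, and the sample sentence are all invented for this example, and the pattern syntax shown is that of Python's re module, which may differ from the syntax each site accepts (check the site's own 'Help' page).

```python
import re

def kwic(text, pattern, width=30, max_hits=500):
    """Return one line per hit: left context, [keyword], right context.

    Hypothetical helper for illustration; max_hits simply caps the
    number of lines returned, in the spirit of the hit limits above.
    """
    # Collapse all whitespace so every hit fits on a single line.
    flat = " ".join(text.split())
    lines = []
    for m in re.finditer(pattern, flat, flags=re.IGNORECASE):
        if len(lines) >= max_hits:
            break
        left = flat[max(0, m.start() - width):m.start()]
        right = flat[m.end():m.end() + width]
        # Right-align the left context so the keywords line up in a column.
        lines.append(f"{left:>{width}} [{m.group(0)}] {right}")
    return lines

# Invented sample text, not taken from any of the corpora listed here.
sample = (
    "Alors moi je parle français à la maison, mais on parlait "
    "flamand à l'école. Elle parlera demain."
)

# \bparl\w* matches any word beginning with "parl" (parle, parlait, ...).
hits = kwic(sample, r"\bparl\w*", width=20)
for line in hits:
    print(line)
```

A single pattern such as \bparl\w* retrieves several inflected forms of parler at once, which is the main reason to learn regular expressions before running larger searches.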
NOTE: The author of this site will make available to other researchers, on request, digitized text and audio files of the Minnesota Corpus, a 10-hour corpus of spontaneous conversation by two groups of three native speakers. Write to Betsy Kerr at bjkerr@umn.edu.
Updated, March 2009