TEACHING VOCABULARY USING NATURALLY OCCURRING DATA: 10 ideas on the use of language corpora in the EFL classroom

    Further to my previous blog post on the importance of adapting materials to foster language learning, I will now focus on the use of raw corpus data for the design of vocabulary activities that promote the acquisition of lexis.

What do language corpora have to offer to an EFL classroom?

    Activities based on corpus data can be very beneficial to language learners, as they are exposed to real-life language and language chunks that will eventually be stored in their mental lexicon. Such activities raise students’ lexical awareness through the study of the different idiomatic uses and meanings a word can adopt.

Focus on commonly confused words

    Exposure to corpus language provides students with useful lexical input and can also be used to address learners’ specific needs by focusing on language issues that impede learning (such as pairs of easily confused words, near-synonyms and commonly confused phrasal verbs). By constructing materials based on concordance lines, teachers can demonstrate certain features of the target language (TL) and guide learners to notice problematic or important structures of the language.

    Depending on their level and their specific needs, students can either be given access to the raw, unedited corpus, where they can learn to manipulate linguistic data and make their own discoveries about language, or be provided with prepared printouts containing corpus-based tasks.

Constructing vocabulary exercises from corpus data

    Below you can find a selection of vocabulary exercises derived from language corpora that focus on the different contextual uses of the near-synonyms broad and wide. These activities are examples of exercises teachers can create based on corpora. Specific concordance lines are deliberately selected in order to direct learners’ attention to the collocates of the two adjectives and their distinct contexts of use.


Students have to guess the KWIC (Key Word In Context) by studying the words in the immediate context. They focus on the collocations the words form and discover how words are used and how meaning changes in context.
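For teachers who are comfortable with a little scripting, concordance lines for a guessing task like this one can be produced with a short Python script. The sketch below is only an illustration: the function name `kwic_lines`, its parameters and the two sample sentences are my own assumptions (the sentences are invented, not taken from a real corpus), and in practice the text would come from a corpus such as those listed at the end of this post.

```python
import re

def kwic_lines(sentences, keywords, window=4, mask=None):
    """Return one concordance line per keyword hit.

    sentences: iterable of plain-text strings (a stand-in for a corpus).
    keywords:  words to search for, matched case-insensitively.
    window:    number of context words kept on each side of the keyword.
    mask:      if given, shown instead of the keyword (for guessing tasks).
    """
    targets = {k.lower() for k in keywords}
    lines = []
    for sentence in sentences:
        tokens = re.findall(r"[A-Za-z']+", sentence)
        for i, token in enumerate(tokens):
            if token.lower() in targets:
                left = " ".join(tokens[max(0, i - window):i])
                right = " ".join(tokens[i + 1:i + 1 + window])
                node = mask if mask is not None else token
                # Right-align the left context so the keywords line up.
                lines.append(f"{left:>30}  {node}  {right}")
    return lines

# Invented sample sentences, used here only to show the layout.
sample = [
    "The course covers a broad range of issues in applied linguistics.",
    "The shop stocks a wide variety of teas from around the world.",
]

for line in kwic_lines(sample, ["broad", "wide"], window=4, mask="_____"):
    print(line)
```

Leaving out the `mask` argument prints the same lines with the keyword visible, which can double as the answer sheet.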


    Students have to pay attention to the way the two adjectives are used in context and then choose the appropriate adjective for each group of collocates. The concordances used in this activity are carefully selected and grouped in order to highlight the different meanings the words adopt according to context. Students learn the collocations formed by the adjectives broad and wide and begin to notice lexical patterns in the target language.

3. MIXED UP CONCORDANCE LINES (Fill in the gaps)

   This activity contains mixed-up concordances where learners have to fill in the omitted adjective. The exercise highlights the possible confusion between the two adjectives, and students learn to discriminate between the contexts of use of broad and wide.
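A gap-fill of this kind can also be generated automatically once the lines have been selected. The following is a minimal sketch under my own assumptions: the function name `gap_fill`, the fixed `seed` and the four example phrases are hypothetical and chosen only for illustration, not drawn from a corpus.

```python
import random
import re

def gap_fill(lines, keywords, seed=0, gap="_____"):
    """Blank out the first target word in each line and shuffle the lines.

    Returns (worksheet, answer_key), where answer_key[i] is the word
    that was removed from worksheet[i].
    """
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(k) for k in keywords) + r")\b",
        re.IGNORECASE,
    )
    items = []
    for line in lines:
        match = pattern.search(line)
        if match:  # lines without a target word are skipped
            gapped = pattern.sub(gap, line, count=1)
            items.append((gapped, match.group(0).lower()))
    # A fixed seed gives the same "mixed-up" order every time.
    random.Random(seed).shuffle(items)
    worksheet = [text for text, _ in items]
    answer_key = [answer for _, answer in items]
    return worksheet, answer_key

# Invented example phrases, for illustration only.
examples = [
    "a broad range of issues",
    "a wide variety of conditions",
    "a broad smile",
    "a wide gap",
]

worksheet, answer_key = gap_fill(examples, ["broad", "wide"])
for number, text in enumerate(worksheet, start=1):
    print(f"{number}. {text}")
```

Keeping the answer key separate from the worksheet means the same output can be used for the students' handout and for the teacher's notes.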


    Students must spot the erroneous word and replace it with the appropriate one. By focusing on the meaning of these language patterns, they learn that even a small mistake can change the meaning of an expression or make it meaningless.


    Students focus on the collocations that broad and wide form. They have to brainstorm and delete the words that do not collocate with the two adjectives.


This activity is made up of selected expanded examples that highlight the different collocations the two adjectives form. Students have to guess the missing collocate by paying attention to the surrounding context.

7. SPAGHETTI exercise

The “spaghetti” exercise that follows and activity no. 8 are simple matching exercises but visually very interesting. Learners have to find the pairs and match the collocates in activities that are more engaging and motivating than the simple instruction of the L1/L2 equivalents of the collocations.



Learners are exposed to phrases like a broad range of issues and a wide variety of conditions and to the contexts in which they are used. The purpose of this exercise is not merely to provide learners with the collocates of broad and wide, but also to ensure that the language of the exercise itself forms useful lexical input.


Students have to think carefully in order to rearrange the words and put them in the correct order. They form the expressions themselves, and any misconceptions about the order of the collocates are cleared up.

Learning to notice TL patterns

Activities derived from corpus data should be designed to teach rather than test learners. Their main purpose is to aid vocabulary acquisition and meet our learners’ specific needs by providing examples of, and practice with, confusing sets of words and similar problem areas. Students learn to notice the patterns of the target language. They gain a deeper understanding of the nature of lexis and of the links that words establish in discourse. They not only pick up the language chunks but also realize their importance and begin to recognize them as single units.

Materials writing using language corpora

Some useful links:

  • The British National Corpus (BNC): a 100-million-word collection of samples of written and spoken language

  • The Oxford Curated Corpora: providing useful corpus data for linguists, lexicographers, materials writers and language teachers https://languages.oup.com/products/corpora/

  • The TIME Magazine Corpus: a collection of 100 million words of text from 275,000 articles from TIME magazine

  • The Michigan Corpus of Academic Spoken English (MICASE): a collection of transcripts of academic speech


Published by Joanna Nifli

Greek-Canadian ELT teacher and freelance translator with work experience at the United Nations and the European Parliament. Holder of an MA in Teaching English as a Foreign Language (MA TEFL), the Cambridge CELTA and an MA in Applied Translation Studies from the University of Leeds. Interested in innovative pedagogies in language education, TESOL, teacher training, applied linguistics and related topics
