My Learning Log Week 3: Learning Corpus Linguistics
WEEK 3: Learning Corpus Linguistics
Corpus linguistics is the study of language based on text corpora: it
investigates linguistic phenomena by analysing data obtained from a
corpus. So, what is a corpus? A corpus is a collection of machine-readable,
authentic texts, chosen to characterize or represent a state or variety of a
language. Corpus linguistics combines scientific methods with IT
tools, which makes the study of language easier and more reliable.
Our teacher suggested several corpus websites, such as the
British National Corpus (BNC), Corpus Leeds, and Lextutor. When designing a
specialized corpus, there are five things to consider. The first is corpus size. There
are no fixed rules; the size depends on the research purpose, the availability of data, and
the time available. However, a corpus that is ‘too small’ has limitations, e.g. not enough hits to
make decent generalizations, or not covering enough of the concepts, terms, or patterns
under investigation. The second is text extracts vs. full texts. Full texts
offer more coverage because the words or terms of interest may be distributed randomly
throughout a text, while specific sections may be helpful if we are looking for
particular words or phrases. The third is the number of texts. We can choose between
collecting a few large texts or many smaller texts, depending
on the research focus. The fourth is medium. A corpus can consist of
spoken texts, written texts, or a mix of both. The fifth is subject and text type. The corpus should
focus mainly on the specialized texts under investigation, although this is less
clear-cut in multidisciplinary subjects.
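
To make the point about corpus size more concrete, here is a small Python sketch of how one might check the number of texts, the number of word tokens, and the number of hits for a term before deciding whether a corpus is big enough. The function names (`tokenize`, `corpus_summary`) and the two sample sentences are made up for illustration; they are not taken from any real corpus or corpus program.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def corpus_summary(texts, term):
    """Return basic size figures for a list of texts and the hits for one term."""
    all_tokens = [tok for t in texts for tok in tokenize(t)]
    counts = Counter(all_tokens)
    return {
        "texts": len(texts),          # number of texts in the corpus
        "tokens": len(all_tokens),    # running words
        "types": len(counts),         # distinct word forms
        "hits": counts[term.lower()], # occurrences of the search term
    }


# Invented sample sentences, only to show the idea.
sample_texts = [
    "Christmas is a time for family and for reflection.",
    "At Christmas we think of family and friends far away.",
]

summary = corpus_summary(sample_texts, "family")
print(summary)
```

If the number of hits for the words under investigation is very low, that may be a sign that the corpus is too small to make decent generalizations.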
After that, our
teacher assigned us to build our own corpus on any topic. We can gather
texts from many sources, such as printed materials, Word documents,
CD-ROMs, texts on the web, or online
databases. For my corpus, the topic I chose is Queen Elizabeth’s
Christmas messages. The texts can be gathered from the British
royal website. I will have to collect 68 messages from the royal website and load
them into the corpus program.
Corpus linguistics is a way to study how words are used
and which words they collocate with in authentic texts. From a corpus, we can learn how words and
grammatical patterns are actually used.
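
As a rough illustration of what looking for collocations can mean, the Python sketch below counts the words that appear next to a chosen node word. It only uses raw counts within a small window; corpus programs usually offer statistical association measures as well, which this sketch does not attempt. The function name `collocates` and the sample sentence are invented for the example.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def collocates(tokens, node, window=2):
    """Count words found within `window` tokens to the left or right of `node`."""
    counts = Counter()
    for i, tok in enumerate(tokens):
        if tok == node:
            left = tokens[max(0, i - window):i]
            right = tokens[i + 1:i + 1 + window]
            counts.update(left + right)
    return counts


# Invented sample sentence, only to show the idea.
sample = "We wish you a happy Christmas and a happy new year with a happy family"

result = collocates(tokenize(sample), "happy", window=1)
print(result.most_common())
```

With a window of 1, only the words directly before and after the node word are counted; a larger window picks up looser associations.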