Thursday, June 21, 2012

Corpus Linguistics -- Why It's Useful (雙語字典不可靠)

Free is good, right? Many lazy students like to use free online dictionaries to learn English. This is a good way to harm yourself by memorizing mistakes.

Here is a page from a well-known online English-Chinese dictionary:

Is this so-called "dictionary" reliable? Notice: It's not really a dictionary. It's only a glossary. Please think: If it's free, who will check for mistakes? If you find a mistake, can you complain? Can you get your money back?

Let's check COCA for sentences with lover:

Did you notice how lover collocates with attack, beat up, kill, murder, shoot? Is that what people do to 情侶? Please remember this English proverb: "You get what you pay for" (一分錢一分貨)! There are many excellent English-English learner's dictionaries to help you learn English. Spend a little money and time to learn how to use them. Please don't waste your time with unreliable bilingual dictionaries (不可靠的雙語字典).

Wednesday, June 20, 2012

Corpus Linguistics With COCA: 'Eat Dinner' vs 'Have Dinner'

Each time a word appears in a concordance, it is called a token (= an example). Which is more common, "eat dinner" or "have dinner"? Use COCA to find the answer:

Make sure these 3 settings are correct



DIY Corpus Linguistics--Using AntConc

AntConc is free, very powerful, and easy-to-use software. Twenty-two years ago, when I did my MA in the UK, I paid 75 pounds (maybe 200 dollars in today's money) for DOS software from Longman that could do only a few of the many things that AntConc does. What a blessing it is that Laurence Anthony and Waseda University are willing to give away this marvelous software for free.


Let's see how we can use AntConc to analyze C. Collodi's The Adventures of Pinocchio, a public domain (free, no longer under copyright) novel.


This is AntConc's startup screen

1. Click on Open File (or Ctrl-F)

2. Choose Pinocchio

#1 Make sure you've loaded the correct file; #2 Click on the Word List tab; #3 Click on Sort by Frequency; #4 Click on Start to make a list of all the words in this story

There are more than 40,000 words (tokens) in this story, but only 3,790 of them are different. The most common word is the, which is used 1,941 times. Pinocchio is used 454 times (of course: this story is all about Pinocchio).
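If you are curious about what a word list does behind the scenes, here is a small Python sketch. It is not AntConc's own code: the function name word_list, the sample sentence, and the simple rule for what counts as a word are all assumptions made for this illustration, so the numbers for a real novel may differ a little from what AntConc shows.

```python
import re
from collections import Counter


def word_list(text):
    """Count how many times each word occurs in a text."""
    # Assumption: a "word" is a run of letters, possibly with an
    # apostrophe inside it (don't, it's). Case is ignored.
    tokens = re.findall(r"[a-z]+(?:'[a-z]+)*", text.lower())
    return Counter(tokens)


# Hypothetical sample text; replace it with the text of a real story.
sample = "The puppet ran. The carpenter ran after the puppet."
counts = word_list(sample)

print("All words (tokens):", sum(counts.values()))
print("Different words:", len(counts))

# Show the most frequent words first, like Sort by Frequency.
for word, freq in counts.most_common(3):
    print(freq, word)
```

To try this on a whole story, read the file into a string first and pass that string to word_list instead of the sample sentence.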

Go down the list to find words which are only used 9 times. Carpenter is one of these words (Pinocchio is a wooden boy who was made by a carpenter). Next, click on the Concordance Plot tab to see where the word carpenter is used.

There are 9 hits, almost all of them at the beginning of the story. This is when Geppetto the carpenter made Pinocchio out of a piece of wood. There is one more hit at the 2/3 point.
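The idea behind a concordance plot can be sketched in a few lines of Python. This is only an illustration, not how AntConc itself is written: the function names hit_positions and plot_line, the sample text, and the width of 40 characters are all assumptions. Each hit is turned into a number between 0 (the beginning of the text) and 1 (the end), and then drawn as a mark on one line.

```python
import re


def hit_positions(text, word):
    """Return where each hit is, as a fraction of the way through the text."""
    # Assumption: same simple definition of a word as a run of letters.
    tokens = re.findall(r"[a-z]+(?:'[a-z]+)*", text.lower())
    if not tokens:
        return []
    target = word.lower()
    return [i / len(tokens) for i, tok in enumerate(tokens) if tok == target]


def plot_line(positions, width=40):
    """Draw one line of text: '|' marks a hit, '.' marks no hit."""
    cells = ["."] * width
    for p in positions:
        cells[min(int(p * width), width - 1)] = "|"
    return "".join(cells)


# Hypothetical sample: the word appears near the start and near the end.
sample = ("the carpenter made a puppet "
          + "and it ran away " * 5
          + "the carpenter cried")
positions = hit_positions(sample, "carpenter")
print(len(positions), "hits")
print(plot_line(positions))
```

A cluster of marks on the left of the line means the word is used mostly at the beginning of the text, which is the pattern described above for carpenter.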

Click on the line to see the last sentence in this story with the word carpenter.

Here is that sentence.

Click on the Collocations tab, type the word fairy in the search box, and click on Start to see which words collocate with fairy. Good is used 11 times, 10 times on the left "Freq (L)" and 1 time on the right "Freq (R)." The word little collocates with fairy (= is used together with fairy) 7 times, always on the left (the writer only says "little fairy"; he never says "fairy little").
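Here is a rough Python sketch of how left and right collocate counts can be made. It is an illustration under stated assumptions, not AntConc's real method: the function name collocates, the sample sentences, and the window of one word on each side (span=1) are all hypothetical choices, and AntConc lets you set its own window size.

```python
import re
from collections import Counter


def collocates(text, node, span=1):
    """Count the words found within `span` words to the left and right of `node`."""
    # Assumption: words are runs of letters, and sentence breaks are ignored.
    tokens = re.findall(r"[a-z]+(?:'[a-z]+)*", text.lower())
    node = node.lower()
    left, right = Counter(), Counter()
    for i, tok in enumerate(tokens):
        if tok != node:
            continue
        left.update(tokens[max(0, i - span):i])      # words before the hit
        right.update(tokens[i + 1:i + 1 + span])     # words after the hit
    return left, right


# Hypothetical sample sentences, not taken from the novel.
sample = ("The good fairy smiled. The little fairy laughed. "
          "A fairy good and kind helped the little fairy.")
left, right = collocates(sample, "fairy", span=1)

for word in ("good", "little"):
    print(word, "Freq (L):", left[word], "Freq (R):", right[word])
```

In this sample, little is counted only on the left of fairy, which is the same kind of pattern as "little fairy" in the story.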





Semantic relations--Gradable Opposites

Dead and alive are complementary opposites. A living thing is either dead or alive. It can't be both (except maybe viruses 濾過性病毒: Are they dead or alive?). Except as a joke, we can't say that an animal is very dead or slightly dead.  

Gradable opposites are different. Wet and dry are a pair of gradable adjectives. A thing can be soaking wet, very wet, or slightly wet; it can also be bone dry, parched, extremely dry, or drier.

In American English, delicious is usually not gradable, but tasty is gradable. We don't usually say "very delicious"; this is typical of Chinese English. If you do a COCA search for "very delicious," you will find very few examples (probably written by non-native speakers), but "very tasty" is quite common. That's why we don't say "Is it delicious?" (this sounds rather strange to English speakers' ears), but it's OK to say "Is it tasty?" Very tasty, extremely tasty, not so tasty, and tasteless are also OK.

Semantic relations--Complementary Opposites

The word antonym is made of two parts:

ant- (anti-) means opposite
-onym means name
so antonyms are words which have opposite meanings.

If we think about antonyms, however, we see that we have a problem: What does "opposite" mean? Some words seem to fit together: if you have one, you must have the other. This is called complementarity. The Yin Yang symbol on the South Korean flag is a beautiful example of complementarity. The cat picture below looks similar (So cute!). Do you see how they seem to fit together?

Flag of South Korea (Wikimedia)

Semantic Relations--Synonyms

Synonyms are words which have the same meaning. Of course, this is not completely true. There is almost always some kind of difference between two words. In the diagram at the bottom of this post, speak, say, and tell are synonyms of each other.


Semantic Relations--Hyponyms

Semantics deals with word meanings. There are many ways in which words can be semantically related. One of these is hyponymy.

Hypo- in Greek means "under"
and -onym means "name"
so a hyponym is an "under name."

is-a shows a semantic relationship.

"X" is-a "Z" and "Y" is-a "Z"

"X" and "Y" are hyponyms of "Z."
= The meanings of X and Y are included in the meaning of Z.

Here are some examples:

{ABC...XYZ} and {abc...xyz} are all letters, so {ABC...XYZ} and {abc...xyz} are hyponyms of "letter." Capital letters and lower-case letters, vowel letters and consonant letters are all letters, and none of them is more important than any other letters, so they are all drawn with the same shapes and arranged in a circle:

Hyponyms--Letters

Mothers and fathers are parents.  So are grandmothers and grandfathers, stepmothers and stepfathers. However, the words "mother" and "father" are closer to the typical, everyday meaning of "parent," so I didn't use the same shapes for all of these words.

In the picture below, octagons (= 8-sided figures) are closest in shape to circles, so these shapes represent "mother" and "father." Triangles (= 3-sided figures) are much farther away from the typical, everyday meaning of "parent," so they represent the words for stepparents.
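The is-a relationship in this post can also be written down as a small lookup table. The Python sketch below is only one simple way to do it: the names IS_A, is_a, and hyponyms are made up for this illustration, and the table only goes one level up (it does not show which words are closer to the typical meaning of "parent").

```python
# Each word on the left is-a kind of the word on the right.
# The words come from the examples in this post.
IS_A = {
    "mother": "parent",
    "father": "parent",
    "grandmother": "parent",
    "grandfather": "parent",
    "stepmother": "parent",
    "stepfather": "parent",
    "capital letter": "letter",
    "lower-case letter": "letter",
    "vowel letter": "letter",
    "consonant letter": "letter",
}


def is_a(word, category):
    """Return True if `word` is listed as a kind of `category`."""
    return IS_A.get(word) == category


def hyponyms(category):
    """Return all the words listed under `category`, in alphabetical order."""
    return sorted(word for word in IS_A if is_a(word, category))


print(is_a("stepmother", "parent"))   # stepmother is-a parent
print(is_a("mother", "letter"))       # mother is not a kind of letter
print(hyponyms("letter"))
```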

Sunday, June 17, 2012

Corpus Linguistics Introduction: COCA and The Babel English-Chinese Parallel Concordancer

Corpus linguistics uses computer software (concordancers) to look at very large samples of real language. These samples are called corpora (singular = corpus). A corpus is a collection of texts. Some corpora only contain one genre: spoken English, newspaper English, scientific English. Other corpora try to use samples from many different types of language use. Some corpora are bilingual: two languages side by side.

Balanced corpus of American English

A balanced corpus contains texts from many different genres. A good example of a balanced corpus is COCA, the Corpus of Contemporary American English.



Parts of the Brain (Typographic Art)

Here is a very interesting way to present the parts of the brain.
Compare this picture with the blank drawing of the brain lobes.

The Brain Typography (labguest, CC-BY-SA)

Stroke: What Should You Do?

[Updated April 17, 2015]
Remember this: ANYONE can have a stroke, even young people. MOST important: rush to get treated with special stroke medicine (unfortunately, not every hospital can do this correctly).

Strokes happen when a blood vessel in the brain is blocked or breaks. If oxygen stops reaching a part of the brain, that part will die. This can cause paralysis (part of the body can't move), aphasia (trouble speaking or understanding language), or other problems. Many of these problems can be greatly reduced if a stroke patient gets the correct treatment very soon (within the first few hours). This is what FAST is about.

What does FAST stand for? Use this 3-minute video to help you remember:

http://www.youtube.com/watch?v=YHzz2cXBlGk
The words to Stroke Heroes Act FAST appear below