back of class debate

So I have the feeling that might not have been a great question. Perhaps my classmates will roll their eyes and say, “That’s a long-standing debate covered on page 72 of that article you tl;dr-ed,” or “We discussed that 20 minutes ago while you were skulking at the back of the virtual classroom, updating your Facebook status and dressing your Bitstrips avatar.”

Never mind, I have another question (I usually do). How does corpus study differentiate between collocations and cliches? Presumably they would both show up as high-frequency word clusters.

I think this is a better question.  I am pretty sure that quantitative analysis cannot determine that.  In fact, I am not sure that a human analyst can answer that question.  After all, when does a collocation become a cliche?

There’s an article in Slate that charts the progress of the phrase “fail better.” Apparently the phrase has become almost ubiquitous in Silicon Valley. It is overused and used inappropriately, to the point that the words have lost their power. When did the phrase become a cliche? Not when Beckett first wrote it: the oxymoron was a raw and powerful expression of Beckett’s pessimism. Not when a canny publicist first used it as an inspirational slogan: that was a true work of marketing genius, turning the meaning of the original on its head without losing the potency of the inherent contradiction. Was it when the first person copied this repurposed mantra? The 25th? What if I have not read Beckett’s work or followed Silicon Valley motivational literature? If I am reading it for the first time, is it still a cliche?

So if a human analyst is to answer this question, he or she needs to navigate through layers of context and assumptions and personal judgement.  If I use a corpus to analyse the data, I will just come up with the finding that “fail better” is now a frequent collocation.  In the world of the corpus, all collocations are presented as equally valid uses of language.
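To make that concrete, here is a minimal sketch of what a frequency count actually sees. It is not how any particular corpus tool works; the sample text is a toy I invented for illustration, and `bigram_counts` is a hypothetical helper name. It simply counts adjacent word pairs.

```python
from collections import Counter
import re


def bigram_counts(text):
    """Count adjacent word pairs in text (hypothetical helper for illustration)."""
    # Lowercase the text and keep only runs of letters and apostrophes as tokens.
    tokens = re.findall(r"[a-z']+", text.lower())
    # Pair each token with the one that follows it and tally the pairs.
    return Counter(zip(tokens, tokens[1:]))


# Toy text, invented for this example: one phrase I would call a cliche,
# one I would call an ordinary collocation.
sample = (
    "Try again. Fail again. Fail better. "
    "Our startup motto is fail better. "
    "Investors love founders who fail better. "
    "She drank strong tea. He prefers strong tea. "
    "Nothing beats strong tea."
)

counts = bigram_counts(sample)
for pair in [("fail", "better"), ("strong", "tea")]:
    print(" ".join(pair), counts[pair])
```

In this toy text the two pairs get exactly the same count. Nothing in the tally records that one phrase feels worn out and the other does not; that judgement still has to come from the reader.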

Just as with the usage question, there is a difference between “everyone is doing it” and “it’s the right thing to do.”  Again, the corpus does not distinguish between the two.