(Stefanowitsch & Gries 2005)
What is CA?
- Theoretical prerequisites
- Methodological prerequisites
Why do we need CCA?
Little attention is paid to possible interactions between (sets of) lexemes occurring in two (or more) slots of the same construction
What is CCA?
Aim: to identify the association strength between pairs of lexical items occurring in two different slots of the same construction; that is, to look at the way in which lexical items in one slot covary with those in another slot
Contingency table for covarying collexeme analysis*
| ||+Wslot2||-Wslot2||total|
|+Wslot1||freq (+Wslot1, +Wslot2)||freq (+Wslot1, -Wslot2)||row total (frequency of the lexeme in slot 1)|
|-Wslot1||freq (-Wslot1, +Wslot2)||freq (-Wslot1, -Wslot2)||row total|
|total||column total (frequency of the lexeme in slot 2)||column total||grand total|
*Note: bold freq counts can be obtained directly from the corpus. The prerequisite for the contingency table is that the construction must be determined and given beforehand.
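The table above can be filled in from a list of slot pairs. The following is a minimal sketch, assuming one (slot 1 lexeme, slot 2 lexeme) pair has already been extracted for every token of the construction. The function name `contingency_table` and the toy pairs are invented for illustration and are not taken from the source.

```python
from collections import Counter

def contingency_table(pairs, w1, w2):
    """Return the 2x2 table ((a, b), (c, d)) for lexeme w1 in slot 1 and w2 in slot 2.

    pairs: one (slot1_lexeme, slot2_lexeme) tuple per token of the construction.
    """
    pair_freq = Counter(pairs)
    slot1_freq = Counter(p[0] for p in pairs)
    slot2_freq = Counter(p[1] for p in pairs)
    n = len(pairs)                    # grand total: all tokens of the construction

    a = pair_freq[(w1, w2)]           # +Wslot1, +Wslot2: counted directly
    b = slot1_freq[w1] - a            # +Wslot1, -Wslot2: row total minus a
    c = slot2_freq[w2] - a            # -Wslot1, +Wslot2: column total minus a
    d = n - a - b - c                 # -Wslot1, -Wslot2: whatever remains
    return (a, b), (c, d)

# Invented toy data, not corpus counts.
tokens = [("trick", "believe"), ("trick", "believe"), ("trick", "buy"),
          ("force", "sign"), ("force", "sign"), ("talk", "buy"),
          ("talk", "believe"), ("force", "buy")]
table = contingency_table(tokens, "trick", "believe")
print(table)
```

Only the pair frequency, the two marginal frequencies and the grand total are counted; the other three cells follow by subtraction.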
Dual perspective: paradigmatic + syntagmatic (the set of choices available in a given position of a syntagmatic structure in relation to the set of choices available in another position in the same structure)
Statistics: Fisher-Yates Exact test + logarithmic transformation
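A minimal sketch of this step, using only the standard library. It computes a one-tailed Fisher exact p-value from the hypergeometric distribution and log-transforms it. The one-tailed choice, the sign convention (positive for attraction, negative for repulsion) and the name `collostruction_strength` are assumptions made for illustration; the cell counts are invented.

```python
from math import comb, log10

def hypergeom_pmf(k, row1, col1, n):
    """Probability of exactly k co-occurrences given fixed row and column totals."""
    return comb(row1, k) * comb(n - row1, col1 - k) / comb(n, col1)

def collostruction_strength(a, b, c, d):
    """Return (p, signed log10 strength) for the 2x2 table ((a, b), (c, d))."""
    row1, col1, n = a + b, a + c, a + b + c + d
    expected = row1 * col1 / n
    if a >= expected:
        # Attraction: probability of observing a or more co-occurrences.
        p = sum(hypergeom_pmf(k, row1, col1, n)
                for k in range(a, min(row1, col1) + 1))
        sign = 1
    else:
        # Repulsion: probability of observing a or fewer co-occurrences.
        p = sum(hypergeom_pmf(k, row1, col1, n) for k in range(0, a + 1))
        sign = -1
    p = min(p, 1.0)                   # guard against floating-point overshoot
    return p, sign * -log10(p)

p_attr, s_attr = collostruction_strength(2, 1, 1, 4)   # invented counts
p_rep, s_rep = collostruction_strength(0, 3, 3, 2)     # invented counts
print(p_attr, s_attr)
print(p_rep, s_rep)
```

With real corpus counts the p-values can become extremely small, which is one reason the log transformation is convenient: the resulting strengths are easier to rank and compare.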
How are different slots in a construction related semantically?
Principle of Semantic Coherence: a word in any slot of a construction must be compatible with the semantics provided by the construction for that slot, and there should be overall coherence among all slots.
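"According to the principles of Construction Grammar, a lexeme can occur in a given slot of a construction without causing ambiguity because its meaning is compatible with the meaning of the construction. The fact that the lexemes in two or more slots can co-occur with the construction shows that the lexemes in these slots are semantically coherent, and that this coherent meaning is compatible with the constructional meaning." (胡建 & 张佳易 2012)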
*NOTE: this is not the Semantic Coherence Principle by Goldberg (1995: 50).
What kind of semantic coherence should be expected for any given construction?
Case study 1 (The into-causative)
SUBJ_causer V_causing.event OBJ_causee [OBL into V-ing_resulting.event]
Semantic constraints (Wierzbicka 1998): the causee initially does not want to perform the resulting event, but the causer overcomes this resistance, typically by persuasion or trickery.
Assumption: The causing-event slot should prefer verbs denoting actions that are suited to overcoming resistance; the resulting-event slot should prefer verbs denoting actions that causees are likely not to want to perform
Results: The covarying collexemes show a high degree of semantic coherence; the sets of covarying collexemes are also highly systematic
Case study 2 (Possessive constructions)
a. NP_possessor’s N_possessee
b. det N_whole of NP_part
Semantic constraints: s-genitive –> possession (ownership, kinship, body-part relations); of-construction –> partitive (part-whole, quantity relations)
Assumptions: The semantic constraints above should have semantic coherence effects on these two possessive constructions
Results: ICE-GB data –> poor (no comparably clear pattern); input-to-acquisition data (caretaker language from the Manchester Corpus) –> a clear semantic prototype of possession
Case study 3 (The way-construction)
SUBJ_theme V_move POSS way [OBL P NP]_path
Semantic constraints: (motion) verb <–> (path) preposition
Assumptions: should have semantic coherence effects
Results: verb-prep pairs in the way-construction display image-schematic coherence
- verbs of circumvention/forcibly creating a path + OBSTACLE prepositions
- verbs of forcibly creating a path/moving through a small opening + CONTAINER prepositions
Drawback of the previous analysis: it restricts the investigation of the covariance of collexemes to one specific context (the constructions in question), disregarding the frequencies of the construction and the collexemes in the remainder of the corpus
The version of covarying-collexeme analysis introduced above treats covarying-collexeme pairs as bigrams and investigates them in the subcorpus made up of the tokens of the construction in question –> item-based covarying-collexeme analysis
Corrected version: system-based
- Research question: Is the association of a given collexeme1-collexeme2-construction trigram stronger than any of the possible associations between just two of its elements in the absence of the third?
- 2×2×2 (three-dimensional) frequency table (p. 23)
- Statistics: configural frequency analysis (binomial test + log10 transformation) to identify the overall degree of attraction/repulsion among the three elements
- Results: similar to the results from item-based version; stricter in the identification of repelled collexemes
- Issue: Is the association of a given collexeme1-collexeme2-construction trigram (target trigram) stronger than any of the possible associations between just two of its elements in the absence of the third (elsewhere trigram)?
- Possibility: two of three elements are so strongly associated with each other that this association strength alone also accounts for the significant association of the whole trigram.
- Relevant to both the loosely and the strictly constructional view
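The system-based computation for one cell of the 2×2×2 table can be sketched as follows. This is a minimal illustration of a first-order configural frequency analysis with a one-tailed binomial test, and it rests on several assumptions: the expected frequency is derived from the three marginal frequencies under complete independence, `n` stands for whatever total number of observation units the corpus provides, the sign convention is positive for attraction and negative for repulsion, and all names and counts are invented.

```python
from math import exp, lgamma, log, log10

def binom_pmf(k, n, p):
    """Binomial probability of exactly k successes in n trials (log space)."""
    log_coeff = lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)
    return exp(log_coeff + k * log(p) + (n - k) * log(1 - p))

def trigram_strength(observed, f_clm1, f_clm2, f_cx, n):
    """Return (expected frequency, signed log10 strength) for one trigram.

    f_clm1, f_clm2, f_cx: overall frequencies of collexeme 1, collexeme 2
    and the construction; all three must be greater than 0 and less than n.
    """
    # Probability of the trigram cell if the three elements were independent.
    p_cell = (f_clm1 / n) * (f_clm2 / n) * (f_cx / n)
    expected = n * p_cell
    if observed >= expected:
        # Attraction: probability of observing this many trigrams or more.
        p_value = sum(binom_pmf(k, n, p_cell) for k in range(observed, n + 1))
        sign = 1
    else:
        # Repulsion: probability of observing this many trigrams or fewer.
        p_value = sum(binom_pmf(k, n, p_cell) for k in range(0, observed + 1))
        sign = -1
    p_value = min(p_value, 1.0)       # guard against floating-point overshoot
    return expected, sign * -log10(p_value)

# Invented counts: 5 target trigrams among 1000 observation units.
expected, strength = trigram_strength(observed=5, f_clm1=20, f_clm2=30,
                                      f_cx=50, n=1000)
print(expected, strength)
```

The explicit summation is only practical for toy counts. For corpus-sized data, a library routine for the binomial test would be the more realistic choice.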
System-based corrections under a strictly and loosely constructional view
Log10-transformed p-values of the trigrams below; subtracting them –> distinctiveness
- Strictly constructional view:
- clm1+clm2+¬cx - ¬clm1+clm2+cx
- Loosely constructional view:
- problematic data (specifically with respect to the repelled trigrams)
- simple string search (all -ing words, because the POS tagging in the BNC is unreliable) –> maximal recall but reduced precision –> inflates the frequency of the items in question in the elsewhere context
- potential for application of this method (system-based CCA) is severely limited
- nonetheless promising and a valuable addition to CA
- CCA: a method for investigating the relationship between lexical items occurring in different slots of the same construction and, more generally, associations between triplets of linguistic signs
- Two specific theoretical issues:
- semantic compatibility between constructions and lexical items
- semantic coherence between lexical items occurring in different slots of the same construction
Two variants of CCA
- Item-based (pairs of covarying collexemes ONLY in the construction under scrutiny)
- System-based (considers the overall single and joint frequencies of the words and the construction)
- loosely constructional view (the co-occurrence of two lexical items within a construction VS. their co-occurrence outside of this construction)
- strictly constructional view (the co-occurrence of all three elements VS. the co-occurrence of any two of all three elements)
The item-based method has considerably higher precision and recall –> preferable
Raw frequency data < collostruction strength (association strength is more informative than raw frequency)
Future work: register or dialect; statistical clustering techniques (objective identification of semantic classes)