The Parallel Architecture and its Lexicon: Is There Anything Useful for LFG?

Ray Jackendoff

Abstract

Goal of the talk

To present the framework I've developed over the last 30 years: the Parallel Architecture, incorporating Conceptual Semantics and Simpler Syntax (the latter in collaboration with Peter Culicover).

PA shares many features with LFG, but diverges in some respects. The goal today is to summarize the approach and see whether there's anything in it that's useful to LFG practitioners.

Shared aspiration of the frameworks: To characterize the human language capacity in a way that's true to the details of human languages and in a way that can be incorporated into theories of language processing and language acquisition.

A further goal of the Parallel Architecture (I don't know whether LFG shares it): a graceful integration of linguistic theory into a larger theory of the mind/brain.

Basic organization of the Parallel Architecture

Classical generative grammar: Combinatorial structure of language arises through the syntactic component; phonology and semantics are “interpretive”. This approach is shared by (much of) formal semantics.

Already in the 1970s, phonology was shown to have its own characteristic structure: segmental/syllabic, prosodic, and tone structure have their own autonomous characteristics and are not derivable from syntax.

Every approach to semantics since the 1970s—formal semantics, cognitive semantics, approaches from computer science/artificial intelligence, and my Conceptual Semantics—attributes to meaning formal structures that don't resemble syntactic structure.

The Parallel Architecture: Phonology, syntax, and semantics are independent generative systems, linked by sets of interface rules.

Each component may be composed of subcomponents or tiers—smaller independent generative systems linked together by interface rules, e.g. in phonology, relation of metrical structure to syllabic structure; in semantics, thematic structure (who did what to whom) and information structure (topic, focus, etc.).

A well-formed sentence has well-formed structures in all three domains (and their subdomains), connected by well-formed links among them. All rules are treated as licensing constraints—there is no intrinsic ordering within or among components.
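
There is no official implementation of the Parallel Architecture, but as a rough analogy, this organization can be pictured as a data structure: three independent structures whose links are shared indices, with well-formedness checked as a conjunction of licensing constraints rather than by ordered derivation. A minimal sketch in Python (all names here are illustrative, not part of the formalism):

```python
# A minimal, hypothetical sketch (no official PA implementation exists):
# a sentence is three independent structures plus links expressed as
# shared indices, and all rules are licensing constraints over the whole.

from dataclasses import dataclass

@dataclass
class Analysis:
    phonology: dict   # index -> phonological material
    syntax: dict      # index -> syntactic category
    semantics: dict   # index -> conceptual-structure fragment

def well_formed(analysis, constraints):
    """An analysis is well formed iff every constraint licenses it;
    there is no intrinsic ordering within or among components."""
    return all(c(analysis) for c in constraints)

# Toy licensing constraint: every syntactic node must be linked
# (by a shared index) to phonology or semantics.
def no_orphan_syntax(a):
    return all(i in a.phonology or i in a.semantics for i in a.syntax)

cow = Analysis(phonology={3: "kaw"}, syntax={3: "N"}, semantics={3: "COW"})
print(well_formed(cow, [no_orphan_syntax]))  # True
```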

Fundamental feature of LFG is two-tiered syntax: c-structure and f-structure. Simpler Syntax has a syntactic structure parallel to c-structure—no movement, no distinction between deep and surface structure, therefore characterizable in terms of phrase structure rules.

Simpler Syntax also has a Grammatical Function tier, motivated on many of the same grounds as LFG's f-structure. The work of f-structure in LFG is divided between GF-structure and CS. The work of LFG's a-structure is divided between CS and syntactic argument structure. (Always an important empirical question in such architectures: In what component do particular phenomena belong?)

This approach integrates nicely into an overall architecture of mind, so we can relate language on one hand to audition and articulation, and on the other to visual perception and action on the world. Spatial Structure integrates inputs from vision, the haptic faculty (sense of touch), and proprioception (body sense), and feeds the formulation of action. The interface between Spatial Structure and Conceptual Structure is what allows us to talk about what we see and to formulate actions based on instructions, i.e. to connect language to the (perceived/conceived) world.

The place of morphology

Where does morphology fit into the Parallel Architecture? Classical generative grammar often attempts to treat it as phrasal syntax plus "low-level adjustment rules." In common-practice LFG (as I understand it), it's part of "(sub-)lexical rules," though there have been more structured proposals.

In the Parallel Architecture, it's possible to recognize that morphology, like phrasal structures, has three independent parts: morphophonology, morphosyntax, and its own contribution to meaning (Conceptual Structure), linked by interface rules.

Result: Phonology and syntax are split "vertically" into phrasal and morphological components, with somewhat different grammars. Morphosyntax doesn't allow freedom of order and is often templatic; phrasal syntax inside of morphosyntax is very rare. But in the Parallel Architecture, the mechanisms of combination are the same (see below).

This differs in spirit from LFG, where morphology is (often) taken to be "in the lexicon," and Lexical Integrity demands that morphological structure is invisible to c-structure (though not to f-structure).

In PA, corresponding to Lexical Integrity: Morphosyntax does not (normally) incorporate phrasal nodes, but morphosyntax can contribute to CS in some of the same ways as phrasal syntax. Upshot is that different languages (or the same language!) can convey the same meaning either through morphology or phrasal syntax (in very much the same ways as in LFG).

Illustrating the formalism

Parallel Architecture encoding of those purple cows

Interface linking is expressed by coindexation between the structures.

Contrast with traditional notation:

Cow is not syntactic information; it's phonological. So it doesn't belong in syntactic structure: only N belongs in syntax. (We can still use traditional syntactic trees as long as we understand that they're shorthand for the more formal treatment.)
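
As a hedged illustration of coindexation (the actual PA notation is richer than this), here is a toy rendering of those purple cows in which the string cow lives only in phonology, the category N only in syntax, and the concept COW only in semantics, with shared indices doing the linking:

```python
# Toy rendering of "those purple cows" (illustrative only): orthographic
# strings stand in for phonological structure, glosses stand in for
# Conceptual Structure. Shared indices are the interface links.

phonology = {1: "those", 2: "purple", 3: "cow", 4: "-s"}
syntax    = {1: "Det", 2: "A", 3: "N", 4: "Plur", 5: "NP"}  # 5 = whole phrase
semantics = {1: "DEM", 2: "PURPLE", 3: "COW", 4: "PLUR",
             5: "[DEM; PLUR (COW); MOD (PURPLE)]"}          # rough CS gloss

# The word cow is just whatever index 3 links across the three structures:
print(phonology[3], syntax[3], semantics[3])  # cow N COW
```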

The GF-tier: a great deal more skeletal than f-structure, in that it deals only with NP arguments of verbs; PP arguments, clausal arguments, and adjuncts are linked directly from syntax (c-structure) to Conceptual Structure. It is used in the account of (inter alia) passive, raising, agreement, and syntactic anaphora (behave/avail X-self).

Illustration: The meaning SEEM takes a propositional semantic argument, but the verb seem takes a subject and an infinitival syntactic argument. Binding of GF3 in the subordinate clause to GF1 in the main clause of (3) gives the effect of subject-to-subject raising. Strong parallel to the LFG account.

Words and other lexical items in the Parallel Architecture

In PA, a word is an interface rule, stored in long-term memory, that links small parts of the three structures.

(4) Long-term memory encoding of cow
Phonology Syntax Semantics
kaw3 N3 COW3

Words can contain contextual features that specify their environment (notated in italics):

(5) Long-term memory encoding of devour
Phonology Syntax Semantics
dəvawr4 V4 DEVOUR4 (X, Y)

Not all words contain all three components.

(6) a. Phonology and meaning, no syntax
    hello, ouch, upsy-daisy, allakazam, wow, shhh, gee whiz, dammit
b. Phonology and syntax, no meaning
    do (do-support), it (pleonastic), of (N of NP)
c. Phonology only
    fiddle-de-dee, inka-dinka-doo

A psycholinguistic grounding for the structure of the lexicon: What has to be learned and stored, and what can be built online? (Note: Anything built online can also be stored.) Lexical insertion = accessing a lexical item in long-term memory and incorporating it into a larger structure in working memory.
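
On this conception, a lexical entry is a partial triple, and lexical insertion is unification of that triple with the structure being built in working memory. A speculative sketch (the entries and the toy unification routine are mine, not part of the formalism):

```python
# Hypothetical sketch: entries are partial triples; a missing level is None
# (cf. hello with no syntax, do-support do with no meaning).

LEXICON = {
    "cow":   {"phon": "kaw", "syn": "N", "sem": "COW"},
    "hello": {"phon": "hello", "syn": None, "sem": "GREETING"},  # no syntax
    "do":    {"phon": "do", "syn": "Aux", "sem": None},          # no meaning
}

def insert(entry, working_memory):
    """'Lexical insertion': unify a stored entry with the structure under
    construction. Unification here is bare consistency checking."""
    for level, value in entry.items():
        if value is None:
            continue  # the entry places no constraint at this level
        if working_memory.get(level, value) != value:
            raise ValueError(f"unification failure at {level}")
        working_memory[level] = value
    return working_memory

wm = {"syn": "N"}                    # the syntax being built expects a noun
print(insert(LEXICON["cow"], wm))    # {'syn': 'N', 'phon': 'kaw', 'sem': 'COW'}
```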

A lexicon conceived in these terms must contain entries that are more than single words.

Idioms and fixed expressions

(7) a. Idioms
kick the bucket, a breath of fresh air, right on the money, the jig is up, day in day out, clean as a whistle, pie in the sky, ...
b. Fixed expressions (clichés, etc.)
baby-blue eyes, take it from me, weapons of mass destruction, whistle while you work, leave a message at the tone, money can't buy me love, ...

In the Parallel Architecture, idioms can be stored as whole phonological/syntactic units, linked to a noncompositional semantics (this parallels the HPSG treatment); fixed expressions can be stored with compositional semantics.

Two interpretations of (8):

  1. (8) is accessed in long-term memory and "inserted" by Unification into all three structures at once in working memory, creating a link between them.
  2. (8) is a constraint that licenses the linking of all three structures.

This approach is possible in LFG as well, but my impression is that it's not the standard one.
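
Since (8) itself is not reproduced in this abstract, here is a hypothetical stand-in for what such a stored idiom might look like, using kick the bucket: a whole phonological/syntactic unit linked, as one piece, to a noncompositional meaning.

```python
# Hypothetical idiom entry (illustrative, not (8) itself): the whole VP is
# stored; link index 10 attaches it, as a single unit, to a meaning that is
# not composed from 'kick', 'the', and 'bucket'.

kick_the_bucket = {
    "phon": ["kick", "the", "bucket"],   # stored as one unit
    "syn":  "[VP V [NP Det N]]10",       # perfectly canonical VP syntax
    "sem":  "DIE10 (X)",                 # noncompositional semantics
}

# A fixed expression differs only in that its stored semantics is
# compositional, e.g. 'weapons of mass destruction'.
```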

Regular morphology:

Can be computed online, and must be computed online in languages with massive morphology, e.g. Turkish or Mohawk: you don't learn and store every one of 10,000+ verb forms. Rather, regular affixes have their own lexical entries.

(9) Long-term memory encoding of the English regular past tense
Phonology Syntax Semantics
X8-[Aff d/t/əd]7 [V V8 - Tense7] [PAST7 (Y8)]

Combines with a verb in working memory exactly the way a transitive verb combines with its object (cf. (5)), except that it's morphosyntactic combination, not phrasal.

(10) Construction of devoured in working memory by Unification of (5) and (9):
Phonology Syntax Semantics
dəvawr4 -[Aff d]7 [V V4 - Tense7] [PAST7 (DEVOUR4 (X, Y))]

So a regular inflection is a lexical item but not a word, and a regularly inflected verb is a word, but not necessarily a lexical item.
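
A toy reconstruction of how (9) combines with (5) to yield (10): the affix entry has a variable at each of the three levels, and unification instantiates each variable with the verb's material in parallel. This is only a sketch under my own encoding (ASCII stands in for the phonology, and the d/t/ed allomorphy is ignored):

```python
# Toy composition of devour (5) with the regular past tense (9) to give (10).
# The affix's variables are modeled as functions over the verb's structure.

devour = {"phon": "devawr", "syn": "V", "sem": "DEVOUR (X, Y)"}

past = {                                       # regular past: its own entry
    "phon": lambda stem: stem + "-d",          # allomorphy (d/t/ed) omitted
    "syn":  lambda cat:  f"[V {cat} - Tense]",
    "sem":  lambda arg:  f"PAST ({arg})",
}

def inflect(verb, affix):
    """Combine verb and affix the same way as any other unification,
    instantiating the affix's variable at every level in parallel."""
    return {level: affix[level](verb[level]) for level in verb}

print(inflect(devour, past))
# {'phon': 'devawr-d', 'syn': '[V V - Tense]', 'sem': 'PAST (DEVOUR (X, Y))'}
```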

Irregular morphology:

Must be stored. Syntactically and semantically parallel to regular morphology (cf. (10)), but phonology is noncompositional. Note parallelism to idioms (cf. (8)).

(11) Long-term memory encoding of caught
Phonology Syntax Semantics
kɔt9 [V V - Tense]9 [PAST (CATCH (X, Y))]9

Constructional idioms:

Pieces of syntactic structure that carry meaning beyond that carried by the words. (Cf. Construction Grammar)

(12) Bill belched his way down the street.

Problems this construction raises:

A. Belch does not license an object (other than cognate object)

(13) Bill belched a loud belch/*noise.

B. Any intransitive verb of appropriate semantics can be substituted for belched.

(14) Bill joked/laughed/twirled his way down the street.

C. Transitive verbs cannot be substituted, with or without their object.

(15) *Bill told jokes his way down the street.

D. Meaning includes a sense of subject's motion, without a word that expresses this.

Conclusion: There is an idiom consisting of a VP including way, which combines productively with verbs to mean 'go PP, while/by V-ing.'

(16) Phonology Syntax Semantics
way11 [VP V12 [NP pro+poss N11] PP13]14 [Event GO (X, [Path Y]13); BY/WHILE (F12 (X))]14

Totally noncanonical linking of syntax to semantics: the head of the VP in syntax is a manner/means modifier in CS; the head in CS (GO) is unexpressed but licenses the PP. Easy to state in the PA formalism.

Parallel noncanonical linking, in which syntactic head of NP is understood as modifier:

(17) that travesty of a theory (= 'that theory, which is a travesty')
a gem of a paper (= 'a paper that is a gem')

Other VP constructional idioms:

(18) a. rumble around the corner (V PP = 'go PP in such a way as to make a V-ing sound')
b. knit the afternoon away (V NP away = 'spend NP[time] V-ing')
c. paint me a picture (V NP1 NP2 = 'V NP2 for the benefit of NP1')
d. water the plants flat (V NP AP = 'cause NP to become AP by V-ing it')
e. sing your head off (V [NP pro's head] off = 'V excessively')

Constructional idioms with noncanonical syntactic structure:

(19) a. The more you eat, the fatter you get (the more S, the more S)
b. One more beer and I'm leaving (NP and S)
c. How about some lunch? (How about XP?)
d. student after student (N-P-N)

All of these phenomena can be captured with a uniform formalism. How does LFG do these things?

Generalizations in the lexicon

How does the grammar express the generalization that VP idioms and VP constructional idioms are syntactically canonical VPs?

Standard approach in LFG, HPSG, Construction Grammar, Cognitive Grammar—and PA: Lexical items falling under a generalization are linked to a lexical entry that expresses the generalization, forming an inheritance hierarchy. Intuition: An item is easier to learn if it can fall into an inheritance hierarchy, perhaps easier to access/process as well.

Implication: A purely syntactic phrase structure rule for VP is a node in an inheritance hierarchy whose daughters are VP idioms and VP constructions. Therefore, the phrase structure rule for VP is a lexical item. Instead of traditional notation (20a), substitute the notation in (20b). (My impression is that LFG tries to keep a rule like (20a) formally distinct from the structure (20b) that it generates. I'm not sure I understand why.) Parallel to words that have only phonological structure (hidy-ho).

In other words, rules of grammar are lexical items as well! There is a continuum from stereotypical words, which specify fully linked phonology, syntax, and semantics, through constructions, to phrase structure rules, which consist entirely of variables. (21a,b) fall under (21c), which falls under (21d), which falls under (21e).

(21) a. VP idiom—no variables: [VP [V kick] [NP [Det the] [N bucket]]]
b. VP idioms with a variable: [VP [V take] NP [PP [P to] [NP task]]]
[VP V [NP pro's head] [Prt off]]
c. VP structure with more variables: [VP V (NP) (PP)]
d. Head parameter for VP: [VP V ....]
e. X-bar theory: [XP ... X ...]

(21c,d,e) are lexical items that consist only of syntactic structure. Similarly, phonotactic rules can be conceived of as lexical items (templates) that consist only of phonological structure.
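
A toy picture of the continuum in (21) as an inheritance hierarchy, with each stored item pointing to the more general lexical item it instantiates (the schematic strings are shorthand for real structures):

```python
# Toy inheritance hierarchy over (21): child -> parent. Each structure is
# schematic shorthand; None marks the top of the hierarchy.

HIERARCHY = {
    "[XP ... X ...]":             None,                 # X-bar theory (21e)
    "[VP V ...]":                 "[XP ... X ...]",     # head parameter (21d)
    "[VP V (NP) (PP)]":           "[VP V ...]",         # VP rule (21c)
    "[VP kick [NP the bucket]]":  "[VP V (NP) (PP)]",   # idiom, no variables (21a)
    "[VP take NP [PP to task]]":  "[VP V (NP) (PP)]",   # idiom with a variable (21b)
}

def ancestors(item):
    """Everything a stored item inherits from, bottom to top."""
    while (item := HIERARCHY[item]) is not None:
        yield item

print(list(ancestors("[VP kick [NP the bucket]]")))
# ['[VP V (NP) (PP)]', '[VP V ...]', '[XP ... X ...]']
```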

In short, knowledge of language = repertoire of stored pieces of structure + Unification

Productivity and semiproductivity

The standard notion of a rule in generative grammar is that it is freely productive.

In phrasal grammar, structure of English VP is productive: one can learn a new verb and immediately know how to compose it into VPs.

In morphology, the English regular plural is productive: one can learn a new noun and immediately know how to form its plural (cf. the wug test). The same holds for many Turkish affixes, and also for English expletive infixation, for which one presumably does not store forms (manufuckinfacturer).

Semiproductive phenomena: There is a generalization, but one must learn applicable instances one by one, and often one has a sense that new forms are novel.

(22) a. English ablaut past tense:
sing-sang, ring-rang vs. swing-swung, wring-wrung vs. think-thought, bring-brought
b. English zero denominal verbs:
butter/*mustard the bread, carpet/*rug the floor, pocket the money/*bucket the water.
Meanings are only partially predictable, e.g. mother vs. father.

There may be pockets of true productivity inside a semiproductive pattern:

(23) a. verbs of attachment: nail, screw, tape, glue, velcro
b. verbs of instrument of communication: phone, email, fax, skype
(But note: no denominal verbs for vehicles: *trainV, *carV, *wagonV)

The stereotype from traditional grammar is that semiproductivity happens in morphology, hence "in the lexicon." But semiproductivity is not confined to morphology:

(24) N-P-N construction
a. Idioms: head over heels, arm in arm, tongue in cheek
b. Productive prepositions
i. by: day by day, snake by snake
ii. for: dollar for dollar, student for student
iii. after: fossil after fossil, book after book
iv. (up)on: argument upon argument, dissertation upon dissertation
c. Semi-productive with to
face to face, toe to toe, shoulder to shoulder, eye to eye, cheek to cheek, hand to hand
?foot to foot, ?finger to finger, ?front to front
(25) Sluice-stranding: John went to NY with someone, but I couldn't find out who with
a. who with/to/from/for/*next to/*about/*beside
b. what with/for/from/of/on/in/about/at/*before/*into/*near/*beside
c. how much for/*by/*with (*how many for)
d. where to/from/*near
e.*which (book) with/to/from/next to/about/beside

The same phenomenon can have lots of listed instances and still be productive. E.g. there are thousands of lexicalized compounds—but compounding is also freely productive:

(26) [winter weather] [skin troubles]
[[[health management] cost] containment] services
[Fresh Pond Parkway] [[sewer separation and surface enhancement] project]

An old treatment of semiproductivity (going back at least to Chomsky's "Remarks on Nominalization"): Productive phenomena are "rules of grammar," semiproductive phenomena are "lexical rules." This was more or less meant to coincide with the distinction between morphosyntax ("lexical") and phrasal syntax ("grammar"). (Are there some relics of this thinking in LFG's treatment of the lexicon?)

But such a distinction is impossible. Both morphosyntax and phrasal syntax have both productive and semiproductive phenomena, and many phenomena cross back and forth between semiproductivity and productivity. So we want an account that says the two kinds of rules are formally the same, except for something that marks whether they're productive or not.

The distinction could be a feature on the rule. A better solution: it's a feature on the variable. The reason: there are rules that are productive on one variable and semiproductive on another.

(27) a. Beaver/Wissahickon Creek (also lake, bay, island, pond, mountain, hill, street)
b. Lake Michigan/Superior/Geneva (also mount)
c. the Atlantic/Pacific/Arctic Ocean (also sea, river, desert, fault, turnpike)
d. the Bay of Fundy/Biscay (also gulf, isle)

The variable for the name is productive, but you have to learn which pattern a particular geographical feature fits into, therefore this variable is semiproductive.

(28) a. Xprod Ysemiprod
b. Ysemiprod Xprod
c. the Xprod Ysemiprod
d. the Ysemiprod of Xprod

(Instances of lake and bay also have to be learned one by one.)

Overall story: A lexical item/rule L can contain variables which may be marked either productive or semiproductive. In either case, instances of the variable may be listed and linked to L in the inheritance hierarchy. But in addition, a productive variable may combine freely with anything that meets its conditions.
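
As a sketch of "productivity as a feature on the variable," take pattern (27a), X Creek-type names: the name variable is productive (any novel name unifies), while the generic-term variable is semiproductive (its instances must be listed). Under my own illustrative encoding:

```python
# Sketch of productive vs. semiproductive variables in pattern (27a), "X Y"
# place names. X (the name) is productive; Y (the generic term) is
# semiproductive, so only stored instances license the pattern.

LISTED_GENERICS_27A = {"Creek", "Lake", "Bay", "Island", "Pond",
                       "Mountain", "Hill", "Street"}

def licenses_27a(name, generic):
    name_ok = bool(name)                         # productive: anything meeting
                                                 # the variable's conditions unifies
    generic_ok = generic in LISTED_GENERICS_27A  # semiproductive: learned one by one
    return name_ok and generic_ok

print(licenses_27a("Wissahickon", "Creek"))  # True: novel name, listed generic
print(licenses_27a("Pacific", "Ocean"))      # False: 'ocean' is listed under
                                             # the 'the X Y' pattern (27c) instead
```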

Summary of Parallel Architecture

This seems to cut language at better joints than traditional approaches—and at the same time it is more responsive to psycholinguistic concerns and integration of the language faculty into the larger architecture of the mind.

I leave it to you to decide how much of this might be appropriately incorporated into LFG and how to do it.

A major shortcoming of the approach: It deals almost exclusively with English (albeit a lot of offbeat aspects of English!), with only occasional allusions to other languages. I welcome discussion of how it might be applied to typologically different languages.
