This is a big one.

It goes something like this:

Internal semantic representation <-> Linguistic structure <-> Phonetic realization (or visual realization: signed or orthographic)

A language has a lexicon and a grammar. The lexicon is a collection of lexemes. Each lexeme has a number of forms.
- Now we get a complication: do we include the prediction of forms within the grammar? We are forced to discriminate between syntax and morphology, but it will also be a cornerstone of our approach that the boundary between the two can be ambiguous.
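One way to make the question concrete is a minimal sketch in which the lexicon lists only the forms that cannot be predicted, and the grammar supplies the rest. Everything here (the names Lexeme, RULES, form, and the feature label "past") is a hypothetical illustration, not a settled design.

```python
from dataclasses import dataclass, field


@dataclass
class Lexeme:
    lemma: str
    # Forms stored in the lexicon, keyed by a feature label such as "past".
    listed: dict = field(default_factory=dict)


# Forms predicted by the grammar: one rule per feature label (assumed, toy rule).
RULES = {"past": lambda lemma: lemma + "ed"}


def form(lexeme, feature):
    """Prefer a form listed in the lexicon; otherwise predict it by rule."""
    if feature in lexeme.listed:
        return lexeme.listed[feature]
    return RULES[feature](lexeme.lemma)


walk = Lexeme("walk")                  # nothing listed: the form is predicted
go = Lexeme("go", {"past": "went"})    # irregular form is listed

walk_past = form(walk, "past")
go_past = form(go, "past")
```

In this sketch the lexicon/grammar boundary is just the choice of what goes in `listed`, which is one way of seeing why the boundary can be moved.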

We are forced to be more flexible than a purely concatenative grammar, in which constructions are built by substitution: 'walk' is substituted into 'X-ed' to give 'walked', and then 'I', 'walked', and 'to the park' are substituted into 'NP VP PP' to give 'I walked to the park'. There are also regular processes which are not concatenative, such as the Semitic triconsonantal root system, where the consonants of a root are interleaved with a vowel pattern.
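The contrast can be sketched in a few lines. The function names (fill, interdigitate) and the convention of writing consonant slots as a capital C are assumptions made for illustration; the Arabic root k-t-b is used as a standard example of a triconsonantal root.

```python
def fill(template, **slots):
    """Concatenative substitution: put fillers into the named slots of a template."""
    return template.format(**slots)


def interdigitate(root, pattern):
    """Non-concatenative: replace each C slot in the pattern with the next root consonant."""
    remaining = list(root)
    out = []
    for ch in pattern:
        out.append(remaining.pop(0) if ch == "C" else ch)
    return "".join(out)


# 'walk' into 'X-ed', then the result into 'NP VP PP'.
walked = fill("{X}ed", X="walk")
sentence = fill("{NP} {VP} {PP}", NP="I", VP=walked, PP="to the park")

# The same root gives different words under different vowel patterns.
kataba = interdigitate("ktb", "CaCaCa")   # 'he wrote'
kitaab = interdigitate("ktb", "CiCaaC")   # 'book'
```

Both operations are substitutions into a template; the difference is only whether the filler stays contiguous in the output.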

There are also phonetic processes which may obscure the tree structure. Can we say that this only concerns phonetic encoding (speaking) and decoding (hearing), which produces purified structures which are passed down to lower layers? (Assimilation, sandhi, etc.)

It is worth looking at equivalent descriptions. For example, 'to be' could be treated as multiple deficient verbs (so 'am' is a verb (or not?) which can only be used with 'I' as a subject).
- Another complication. 'NP VP' is a general pattern. It is a pattern both of analysis and of production. In the case of "I am", the general pattern cannot be the productive one, although the hearer will still identify it as occurring, along with the specific, productive template 'I-unit am-unit'.

The same goes for non-productive suffixes which retain meaning, such as '-th' in 'warmth'.

Particular grammatical structures encode multiple semantic possibilities. A complete specification of semantics would truly be a philosophical achievement of gigantic proportions. Before then we would be describing classes of semantics (various ways of dividing up the spatial field, near, far, etc.).

We need to think about discourse: how a sentence is processed as it is heard, and how meaning is built up.


There is a basic way of describing a language grammar as a tree structure. However, this is deficient in several ways:

  • Discourse - The hearer doesn't get the whole structure at once, but receives it piece by piece as it is built up. If we hear "It is" then we don't know whether a predicative complement like "good" will follow, or if we are in the middle of a passive construction like "It is being fixed".
  • Semantics - The verb is semantically the most important part of speech but you might not guess this from the constituent grammar. In fact, there's no limit on the number of parts of speech you could have in such a grammar.
  • Which variations in structure are likely, or easy for people to say? Assimilation raises a similar question. Also consider the pronunciation of Russian consonants.
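The discourse point above can be sketched as prefix matching: after each word, the hearer keeps every construction still compatible with what has been heard so far. The two constructions, the tiny lexicon, and the category labels ADJ and PARTICIPLE are assumptions chosen to match the "It is" example, not a proposed inventory.

```python
# Hypothetical flat patterns; a real grammar would use trees, not word lists.
CONSTRUCTIONS = {
    "predicative": ["it", "is", "ADJ"],
    "passive": ["it", "is", "being", "PARTICIPLE"],
}

# Words that are matched by category rather than literally.
CATEGORIES = {"good": "ADJ", "fixed": "PARTICIPLE"}


def candidates(words):
    """Return the constructions whose beginning matches the words heard so far."""
    heard = [CATEGORIES.get(w.lower(), w.lower()) for w in words]
    return sorted(
        name
        for name, pattern in CONSTRUCTIONS.items()
        if pattern[: len(heard)] == heard
    )


after_it_is = candidates(["It", "is"])
after_being = candidates(["It", "is", "being"])
after_good = candidates(["It", "is", "good"])
```

After "It is" both analyses are still live; the next word removes one of them. A static tree only records the final state, not this narrowing.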


We spontaneously generate thoughts. Thoughts are supposed to represent reality. There are two forms of language: one which refers to the subject, and another which refers to the exposition itself, using words like 'but'. It is interesting to see whether you can exclude such thoughts from your mind. For example, replace 'Whether this is true, I couldn't say' with 'This is true with low probability.'

