4.1. Introductory

Morphology is the part of grammar that deals with the form of words. Lojban's morphology is fairly simple compared to that of many languages, because Lojban words don't change form depending on how they are used. English has only a small number of such changes compared to languages like Russian, but it does have changes like boys as the plural of boy, or walked as the past-tense form of walk. To make plurals or past tenses in Lojban, you add separate words to the sentence that express the number of boys, or the time when the walking was going on.

However, Lojban does have what is called derivational morphology: the capability of building new words from old words. In addition, the form of words tells us something about their grammatical uses, and sometimes about the means by which they entered the language. Lojban has very orderly rules for the formation of words of various types, both the words that already exist and new words yet to be created by speakers and writers.

A stream of Lojban sounds can be uniquely broken up into its component words according to specific rules. These so-called morphology rules are summarized in this chapter. (A detailed algorithm for breaking sounds into words is part of the PEG grammar at the end of the book.) First, here are some conventions used to talk about groups of Lojban letters, including vowels and consonants.

  1. V represents any single Lojban vowel except y; that is, it represents a, e, i, o, or u.

  2. VV represents a (falling) diphthong, one of the following:

    ai ei oi au

  3. (Vowel as used in this chapter means any V or VV.)

  4. V'V represents a two-syllable vowel pair with an apostrophe separating the vowels, one of the following:

    a'a a'e a'i a'o a'u
    e'a e'e e'i e'o e'u
    i'a i'e i'i i'o i'u
    o'a o'e o'i o'o o'u
    u'a u'e u'i u'o u'u

  5. C represents a single Lojban consonant, not including the apostrophe: one of b, c, d, f, g, j, k, l, m, n, p, r, s, t, v, x, or z. Syllabic l, m, n, and r count as consonants for the purposes of this chapter.

  6. G represents a single Lojban on-glide, either i or u.

  7. CC represents two adjacent consonants of type C which constitute one of the 48 permissible initial consonant pairs:

    pl pr fl fr
    bl br vl vr

    cp cf ct ck cm cn cl cr
    jb jv jd jg jm
    sp sf st sk sm sn sl sr
    zb zv zd zg zm

    tc tr ts kl kr
    dj dr dz gl gr

    ml mr xl xr

  8. An onset is a single C or G, a CC string as shown above, or a permissible initial consonant triple (CCC string), one of:

    cfr cfl sfr sfl jvr jvl zvr zvl
    cpr cpl spr spl jbr jbl zbr zbl
    ctr str jdr zdr
    ckr ckl skr skl jgr jgl zgr zgl
    cmr cml smr sml jmr jml zmr zml

  9. C/C represents two adjacent consonants which constitute one of the permissible consonant pairs (not necessarily a permissible initial consonant pair). The permissible consonant pairs are explained in Section 4.1. In brief, any consonant pair is permissible unless it contains two identical letters, contains both a voiced consonant (excluding r, l, m, n) and an unvoiced consonant, or is one of certain specified forbidden pairs.

  10. C/CC represents a consonant triple. The first two consonants must constitute a permissible consonant pair; the last two consonants must constitute a permissible initial consonant pair.
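
The conventions above can be written down as plain sets and membership tests. The following Python sketch is illustrative only: every name in it (V, VV, VHV, C, G, CC, CCC, is_onset, is_permissible_pair, is_permissible_triple) is invented for this example, and it is not the PEG grammar's algorithm. The set ASSUMED_FORBIDDEN is an assumption: this section only says "certain specified forbidden pairs", so the list is filled in from the fuller statement of the rule elsewhere in the grammar and should be checked against it.

```python
# Letter classes from conventions 1-6. All names are illustrative.
V = set("aeiou")                       # single vowels, excluding y
VV = {"ai", "ei", "oi", "au"}          # falling diphthongs
VHV = {a + "'" + b for a in "aeiou" for b in "aeiou"}  # the 25 V'V pairs
C = set("bcdfgjklmnprstvxz")           # consonants, apostrophe excluded
G = set("iu")                          # on-glides

# Convention 7: the 48 permissible initial consonant pairs.
CC = set("""
    pl pr fl fr  bl br vl vr
    cp cf ct ck cm cn cl cr  jb jv jd jg jm
    sp sf st sk sm sn sl sr  zb zv zd zg zm
    tc tr ts kl kr  dj dr dz gl gr
    ml mr xl xr
""".split())

# Convention 8: the permissible initial consonant triples.
CCC = set("""
    cfr cfl sfr sfl jvr jvl zvr zvl
    cpr cpl spr spl jbr jbl zbr zbl
    ctr str jdr zdr
    ckr ckl skr skl jgr jgl zgr zgl
    cmr cml smr sml jmr jml zmr zml
""".split())

# Voicing classes for convention 9; l, m, n, r belong to neither set.
VOICED = set("bdgjvz")
UNVOICED = set("cfkpstx")

# ASSUMPTION: not listed in this section. Pairs drawn entirely from
# c, j, s, z, plus cx, kx, xc, xk, and mz.
ASSUMED_FORBIDDEN = {a + b for a in "cjsz" for b in "cjsz" if a != b}
ASSUMED_FORBIDDEN |= {"cx", "kx", "xc", "xk", "mz"}


def is_onset(s):
    """Convention 8: a single C or G, a CC pair, or a CCC triple."""
    return s in C or s in G or s in CC or s in CCC


def is_permissible_pair(pair, forbidden=ASSUMED_FORBIDDEN):
    """Convention 9 (C/C), applying the three exclusions in order."""
    if len(pair) != 2 or pair[0] not in C or pair[1] not in C:
        return False
    a, b = pair
    if a == b:                                   # two identical letters
        return False
    if (a in VOICED and b in UNVOICED) or (a in UNVOICED and b in VOICED):
        return False                             # mixed voicing
    return pair not in forbidden                 # specified forbidden pairs


def is_permissible_triple(triple):
    """Convention 10 (C/CC): first two form a C/C, last two form a CC."""
    return (len(triple) == 3
            and is_permissible_pair(triple[:2])
            and triple[1:] in CC)
```

Under these assumptions, every initial pair in CC also passes is_permissible_pair, and every triple in CCC passes is_permissible_triple, which matches the text's remark that a permissible pair is "not necessarily a permissible initial consonant pair" while the reverse inclusion does hold.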

Lojban has three basic word classes – parts of speech – in contrast to the eight that are traditional in English. These three classes are called cmavo, brivla, and cmevla. Each of these classes has uniquely identifying properties – an arrangement of letters that allows the word to be uniquely and unambiguously recognized as a separate word in a string of Lojban, upon either reading or hearing, and as belonging to a specific word-class.

They are also functionally different: cmavo are the structure words, corresponding to English words like and, if, the, and to; brivla are the content words, corresponding to English words like come, red, doctor, and freely; cmevla correspond to English proper names like James, Afghanistan, and Pope John Paul II.