Updated: 3/19/2012

Micronesian Comparative Dictionary

Proto-Micronesian Reconstructions—1

Oceanic Linguistics Vol. 42, Num. 1, June 2003, 1-110.

Byron W. Bender, Ward H. Goodenough, Frederick H. Jackson, Jeffrey C. Marck, Kenneth L. Rehg, Ho-min Sohn, Stephen Trussel, and Judith W. Wang

One name, that of Robert W. Hsu, really belongs here with ours, but he has demurred. Certainly, without his initiative and assistance at every turn, this article and many of the dictionaries upon which it is based would not have appeared in their present form — if they appeared at all.

University of Hawaiʻi and University of Pennsylvania

Part 1 presents some 980 reconstructions for Proto-Micronesian, Proto-Central Micronesian, and Proto-Western Micronesian. Part 2 (to appear in volume 42 [2]) gives reconstructions for two additional subgroups within Proto-Micronesian: Proto-Pohnpeic and Proto-Chuukic, and their immediate ancestor, Proto-Pohnpeic-Chuukic. A handful of putative loans are also identified, and a single English finder list is provided for all of the reconstructions.


Lexical data for a number of Micronesian languages began to be collected systematically in the mid-1960s as part of the development of language lessons for the U.S. Peace Corps and in connection with other Micronesian language projects that followed at the University of Hawai‘i. These data were stored on a mainframe computer using programs then being developed (Hsu and Peters 1984), and eventually dictionaries were published for a number of the languages included in this study (Elbert 1972, Abo et al. 1976, Lee 1976, Sohn and Tawerilmang 1976, Harrison and Albert 1977, Jensen 1977, Rehg and Sohl 1979, Goodenough and Sugita 1980, Jackson and Marck 1991). Comparative work using these data began with Marck 1977, focusing on the group of languages referred to as Nuclear Micronesian. In the next several years, the authors of the current study put the initial data on computer and substantially added to them by directly eliciting information from speakers of Micronesian languages who were students in the Bilingual Education Project for Micronesia at the University of Hawai‘i. Hsu (1976) was especially helpful in cognate searches. This early activity culminated in a printout identified as Bender et al. 1984. Preliminary findings and some of the computer programs being used in the comparative work are summarized in Bender and Wang 1985. More recent work and the initial compilation of this presentation of the data have been done primarily by the second author.


In this etymological dictionary we attribute to Proto-Micronesian (PMc) an inherited lexical item shared between a Chuukic or a Pohnpeic language, or Marshallese or Kiribati (Gilbertese), on the one hand, and Kosraean (Kusaiean), on the other. We have also attributed to PMc some lexical items in only one Micronesian language that are shared with some other Austronesian language and cannot be attributed to borrowing, although a systematic search for such items has not been made. Following Jackson (1983), we attribute to Proto–Central Micronesian (PCMc) items shared by Kiribati and any other Micronesian language lower in the tree of figure 1, but not found in any other Austronesian language, and we attribute to Proto–Western Micronesian (PWMc) items shared by Marshallese and any other language lower in the tree, but not found in any other Austronesian language. (Micronesian languages not shown in figure 1 include Sonsorolese [Sns] and Tobi [Tob] [which should be positioned on a par with PuA], the Tanapag dialect of Carolinian [Crn] [which should be included with Pul, Chk, and Mrt within PECk], and Pingelapese [Png] [which should be included with Pon and Mok under PPC]). Protoforms from all three languages—PMc, PCMc, and PWMc—are interspersed in a single alphabetized list in part 1 of this dictionary. Part 2 (in a future issue) will give protoforms for Proto-Chuukic (PCk), Proto–Pohnpeic (PPon), and Proto–Pohnpeic-Chuukic (PPC) for which no higher-level reconstructions are now possible. Consonant correspondences are given in tables 2 and 3. Vowel correspondences receive preliminary discussion in Jackson 1983:321–323 and Goodenough 1992.
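The attribution criteria in the preceding paragraph can be read as a small decision procedure. The Python sketch below is illustrative only. The function name `attribute_level`, the group labels, and the nesting assumed for the tree of figure 1 (Kosraean outside PCMc, Kiribati outside PWMc, Marshallese outside the Pohnpeic and Chuukic languages) are assumptions drawn from the wording of the text, not from the figure itself. The sketch also presupposes that borrowing has already been ruled out for every witness.

```python
# Hypothetical labels for the five witness groups named in the text.
KOSRAEAN = "Kosraean"
KIRIBATI = "Kiribati"
MARSHALLESE = "Marshallese"
POHNPEIC = "Pohnpeic"
CHUUKIC = "Chuukic"


def attribute_level(witnesses, outside_cognate=False):
    """Return 'PMc', 'PCMc', 'PWMc', or None for a set of witness groups.

    witnesses: group labels in which the inherited item is attested.
    outside_cognate: True if a non-borrowed cognate exists in some
        other (non-Micronesian) Austronesian language.
    """
    witnesses = set(witnesses)
    if not witnesses:
        return None
    # One Micronesian witness plus an outside cognate is attributed to PMc.
    if outside_cognate:
        return "PMc"
    # Kosraean plus any other Micronesian witness: PMc.
    if KOSRAEAN in witnesses and witnesses & {KIRIBATI, MARSHALLESE, POHNPEIC, CHUUKIC}:
        return "PMc"
    # Kiribati plus any witness assumed to sit lower in the tree: PCMc.
    if KIRIBATI in witnesses and witnesses & {MARSHALLESE, POHNPEIC, CHUUKIC}:
        return "PCMc"
    # Marshallese plus a Pohnpeic or Chuukic witness: PWMc.
    if MARSHALLESE in witnesses and witnesses & {POHNPEIC, CHUUKIC}:
        return "PWMc"
    # Anything else is left to the lower-level reconstructions of part 2.
    return None
```

For example, under these assumptions `attribute_level({"Kiribati", "Pohnpeic"})` yields `"PCMc"`, while an item attested only in Pohnpeic and Chuukic languages returns `None` and would fall to part 2.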

Work for the future includes a careful review of evidence of loanwords from Marshallese and Kiribati not only in Kosraean but also in the Pohnpeic and Chuukic languages (see Rehg and Bender 1990). The seafaring atoll dwellers of the Marshall and Kiribati islands were frequent voyagers to the high islands of Kosrae and Pohnpei to their west. There were Polynesian-speaking settlers there as well—or at least there were Polynesian contacts—attested to by loanwords such as Pohnpeian sakaw (‘kava’ < Pn ta-kawa) and Sangaro (‘a god’ < Pn Tangaroa). There has been, moreover, massive borrowing into Kiribati from Samoan and to a lesser extent, more recently, from Tuvaluan. Kiribati speakers have expanded their settlements both southward into northern Tuvalu and northward into the southern Marshall Islands in the past few hundred years. Marshallese shows what are clearly loanwords from Kiribati (Bender 1981). Marck (1994) has reconstructed some Proto-Chuukic and even a few Proto-Micronesian forms where irregularities of sound correspondences raise the possibility that they result from a chain of borrowings rather than being inherited cognates. Such differences are noted under the forms in question. Complicating the problem are some forms that appear to be preserved as inherited cognates in some Micronesian languages and to have been reintroduced from them as loans into other Micronesian languages. Doublets, related forms such that one of them fits the pattern of sound correspondences in the inherited vocabulary and the other does not, attest to such internal borrowing as well.


In order to facilitate comparison, the velar nasal symbol (ŋ) is used in the protolanguages PMc, POc, PCk, and UAn (Dempwolff’s Uraustronesisch), and in the nuclear languages being compared in lieu of the ng, g, §, and so forth used in their orthographies. The w of labiovelars in the various languages is made superscript, and this is substituted for the primes of p' and m' used by Marck (1977, 1994) and Jackson (1983). We use ñ to represent the palatal nasal of PPC, PMc, POc, and UAn; we use R to represent the retroflex continuant of Puluwatese, Crn Carolinian, and Satawalese, but otherwise as standardly used in PEO, POc, PMP or UAn, and PAn. We use á, é, ó, and ú to represent the low front unrounded, the mid central unrounded, the low back rounded, and the high central unrounded or rounded vowels of the various Chuukic languages. We use ε for the lax mid front vowel of Pohnpeian and Pingelapese; Mokilese e and ε are both written as e, following Harrison and Salich (1977). We use ɔ for the lax mid back rounded vowel that is written oa in all of the Pohnpeic languages. For Marshallese, we use the phonemic transcription of Abo et al. (1976), but substitute for b, for , ŋ for g, for q, and we use superscript w to show the rounding of ṃʷ, ŋʷ, ḷʷ, and . For Kosraean, we have rewritten the digraph vowels to make their phonetic value more transparent, as shown in table 1. Our orthography for POc is that of Ross (1988), to which we add S to his s. Our orthography for PCk is that of Jackson (1983), to which we add y. Our orthography for PMc is that of Jackson (1986), except that we substitute s for his d and S for his z, and we add y and Z. The symbol # marks an entry that we consider to be an analogical.

             Lee 1976                     our replacements
             front   central   back       front   central   back
high         i       ih        u          i       ɨ         u
upper mid    e       uc        o          e       ə         o
lower mid    ac      uh        oh         ε       ʌ         ɔ
low          ah      a         oa         æ       a         ɒ
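The Kosraean vowel replacements of table 1 amount to a fixed mapping from the digraphs of Lee 1976 to single symbols, and could be applied mechanically along the following lines. This is a minimal sketch, not the procedure actually used for the dictionary. The function name `rewrite_kosraean_vowels` is hypothetical, and the left-to-right digraph matching is an assumption that will misparse any sequence in which a plain vowel is followed by the first letter of another vowel spelling (for example, o followed by ah would be read as oa plus h).

```python
import re

# Digraph vowels of Lee 1976 and the replacement symbols of table 1.
# The single-letter vowels (i, u, e, o, a) are left unchanged.
LEE_TO_REPLACEMENT = {
    "ih": "ɨ",
    "uc": "ə",
    "ac": "ε",
    "uh": "ʌ",
    "oh": "ɔ",
    "ah": "æ",
    "oa": "ɒ",
}

# One alternation over all digraphs; every key has length two,
# so no ordering by length is needed.
_DIGRAPH = re.compile("|".join(re.escape(d) for d in LEE_TO_REPLACEMENT))


def rewrite_kosraean_vowels(form):
    """Replace each Lee 1976 digraph vowel in `form`, scanning left to right."""
    return _DIGRAPH.sub(lambda m: LEE_TO_REPLACEMENT[m.group(0)], form)
```

Applied to an arbitrary illustrative string (not an attested form) such as `"tahk"`, the function returns `"tæk"`.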





L refers to the section on loans to be included in part 2.
V is sometimes used for a reconstructed vowel of indeterminate quality.
Vowel diacritics are ignored in alphabetizing.
Square brackets enclose two possibilities for protophonemes where the evidence is ambiguous or contradictory (e.g., [sS]); the first is used for purposes of alphabetization.
“(sic)” ‘so, thus’ may follow a form that differs from the expected reflex in some way in order to assert that a copying mistake has not occurred.
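Two of the conventions above bear on alphabetization: vowel diacritics are ignored, and a bracketed pair of protophonemes is filed under its first member. A sort key implementing just these two rules might look like the sketch below. The name `sort_key` and the sample forms are hypothetical, and the relative order of the base letters themselves (for instance, where ŋ or ñ is filed) is not stated here, so the sketch simply falls back on Unicode code-point order for everything else.

```python
import re
import unicodedata

VOWELS = set("aeiou")


def sort_key(form):
    """Build an alphabetization key for a reconstructed form."""
    # "[sS]" is filed under its first member, "s".
    form = re.sub(r"\[(.)[^\]]*\]", r"\1", form)
    kept = []
    prev_base = ""
    # Decompose so that diacritics become separate combining marks.
    for ch in unicodedata.normalize("NFD", form):
        if unicodedata.combining(ch):
            # Drop marks on vowels only; a mark on a consonant (as in ñ) stays.
            if prev_base.lower() in VOWELS:
                continue
        else:
            prev_base = ch
        kept.append(ch)
    return unicodedata.normalize("NFC", "".join(kept))


# Hypothetical forms, for illustration only.
forms = ["[sS]awa", "sáka", "saki"]
ordered = sorted(forms, key=sort_key)
```

With these sample forms, `ordered` is `["sáka", "saki", "[sS]awa"]`, since the keys compared are `saka`, `saki`, and `sawa`.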