Te Reo Māori

Austronesian / Malayo-Polynesian / Eastern Malayo-Polynesian / Oceanic / Central Pacific linkage / Tokalau Fijian / Polynesian / Nuclear Polynesian / Northern Outlier Polynesian-East Polynesian / Solomons Northern Outlier Polynesian-East Polynesian / East Polynesian / East Polynesian Proximal / Southern East Polynesian Proximal / Maoric / Maori /

have only done a brief overview of surface level research. corrections and comments more than welcome, direct to atmatmmachine.nz.

macrons are known in Māori as ‘tohutō’. macron is used here for consistency. dialects are called ‘mita’.

Te Taura Whiri i te Reo Māori

Māori Language Commission

this orthography is defined on a governmental level and is considered the standard for everyday use. the Māori Language Act 1987 led to this standard being established but has also led to the decline of regional dialects.

te reo Māori letters

vowels

minuscule

a

e

i

o

u

majuscule

A

E

I

O

U

minuscule

majuscule

consonants

minuscule

h

k

m

n

ng

majuscule

H

K

M

N

Ng

minuscule

p

r

t

w

wh

majuscule

P

R

T

W

Wh

variations

minuscule

k

b

l

n

s

majuscule

K

B

L

N

S

minuscule

w'

◌̈

majuscule

W'

◌̈

macrons are used to show vowel length (including in loan words), except in cases of reduplication, where two letters are written in series, or passivisation, where either form is valid though the macron is preferred. reduplication typically drops any macron that already existed, meaning ‘-aā-’ is less likely than ‘-aa-’, though either is acceptable. if the lengthening indicates emphasis, a macron is never used.

regional variations are not permitted to be used by anyone subscribing to this orthography.

the following must always be written with a macron:

plural possessive particles ‘ā’, ‘a’, ‘ō’, and ‘o’ are long before the object and short if after the object.

suffixes and prefixes are written as part of the word; hyphens are no longer widely used. when used, they are generically represented by ‘-’ [u+002d], which is available on a keyboard, but the dedicated unicode codepoint is ‘‐’ [u+2010].

compound words with four or fewer vowels are written as one word; those with five or more vowels are written as two or more words. the following prefixes have exceptions: ‘koki’ is written alone and ‘ine’ uses a hyphen. the base words ‘manawa’ and ‘ngākau’ form one word when combined with a mono- or bi-syllabic word but two words in all other cases. flora and fauna names do not form one word. many expressions of time are written as a single word despite the aforementioned rule, and many others are written as several words. where ‘tonu’ or ‘tata’ is present within a time (not after it), the time is broken into its constituent parts, bar the leading ‘i’, which remains connected.

the prefix ‘’ is connected with a hyphen when meaning ‘in the manner of’. ‘mā’ is not connected, but when used to refer to compass points it is connected on either side with a hyphen. ‘whaka te’ and the following noun are collapsed into one (retaining any hyphens) as such: ‘whakate-’. ‘ko ia’ refers to a person, ancestor, or personification, whereas ‘koia’ is used in all other instances.
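the vowel-count rule for compound words can be sketched in Python as below. this is only an illustration: the names `count_vowels` and `join_compound` are hypothetical, a long vowel is assumed to count as one vowel (the source does not say), and none of the exceptions listed above (prefixes, ‘manawa’/‘ngākau’, flora and fauna, expressions of time) are handled.

```python
import unicodedata

VOWELS = set("aeiou")


def count_vowels(word: str) -> int:
    """Count vowel letters; NFD splits 'ā' into 'a' + macron, so a long vowel counts once (assumption)."""
    decomposed = unicodedata.normalize("NFD", word.lower())
    return sum(1 for ch in decomposed if ch in VOWELS)


def join_compound(parts: list[str]) -> str:
    """Join the parts as one word if they hold four or fewer vowels, else keep them apart."""
    total = sum(count_vowels(part) for part in parts)
    separator = "" if total <= 4 else " "
    return separator.join(parts)
```

for example, `join_compound(["whare", "kai"])` gives `wharekai` (four vowels), while `join_compound(["whare", "karakia"])` gives `whare karakia` (six vowels).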

names are capitalised including the preceding ‘te’, which stands alone, unlike ‘Ngā’ or ‘Kā’, which are combined. names with six or fewer syllables (not counting ‘Te’) are not hyphenated whereas those with seven or more are hyphenated. ‘Te’ takes a hyphen before and after when part of a name embedded within another name. Particles are also hyphenated when embedded. The first common adjective occurring after a noun is not hyphenated but joined to it instead. Hyphens are placed where two of the same vowel would meet. Like ‘Te’, ‘Tai’ is not hyphenated and is capitalised, unless followed by ‘o’ or ‘a’, in which case it is hyphenated. When referring to iwi, hapū, etc., ‘Ngā’, ‘Ngāi’, ‘Ngāti’, ‘Te’, ‘Te Āti’ all stand alone and do not count as syllables for the purposes set out earlier.

‘hia-’ when followed by a word of three letters or fewer is joined but otherwise stands apart. the apostrophe (‘'’ [u+0027]) is used to mark glottal stops present in some Māori dialects and other Polynesian languages. Place names beginning with ‘Ō’, ‘O’ are spelt with a macron. The use of apostrophes for possession or contraction does not exist and should be avoided when writing Māori words in other languages.

Sources Te Whanake Te Aka Māori–English, English–Māori Dictionary and Index / John C. Moorfield
Te Taura Whiri i te Reo Mäori Guidelines for Mäori Language Orthography

Eastern North Island Māori

Kāi-Tahu

of the South Island, also known as Ngāi Tahu

most notably, ‘k’ replaces ‘ng’ in this dialect. some historical writers, though not many, were taught to write ‘ng’ but spoke ‘k’, which makes for some confusion in the sources. fortunately, modern Kāi-Tahu retain their dialect.

there are also many idioms and words unique to Kāi-Tahu, many of which are conversational. see the sources for lists.

sub-dialects within Kāi-Tahu are also present and require further investigation.

Pākehā corrupted some records, confusing ‘b’ with ‘p’ and ‘l’ with ‘r’.

Sources Te Rūnanga o Ōtākou / Language History
Te Rūnanga o Ōtākou / Te Reo Māori

Ngāti Kahungunu

south-east north island

this dialect is claimed to be extinct. most distinctions are oral; orthography is indistinct. there are several idioms unique to this dialect; see sources.

Sources Ngāti Kahungunu / Ketuketu Kīwaha

Ngāti Porou

Tūrangi-nui-a-Kiwa

most of the distinction in this dialect lies in body language, oral features, etc., with some unique terms. ‘kei te aha’ and ‘taputapu’ are phrases unique to Ngāti Porou. ‘-au’ is sometimes preferred to ‘-ou’.

Sources Iwi Dialects: Because Te Reo Isn’t the Same Everywhere / Te Pararē / Te Mana Ākonga

Te Arawa Tūhoe

of both Te Arawa and Tūhoe

‘ng’ becomes ‘n’: this is especially the case in writing. ‘-ou’ changes to ‘-au’, ‘-ei’ changes to ‘-ai’.

Sources Iwi Dialects: Because Te Reo Isn’t the Same Everywhere / Te Pararē / Te Mana Ākonga

Western North Island Māori

Ngāti Tūwharetoa

central North Island plateau

there are no clearly documented distinctions with this dialect.

Sources

North Auckland Māori

Ngāpuhi

far North, also known as Ngāti Hine

this dialect has extensive transliterated and borrowed words. the letter ‘s’ is also introduced in this dialect.

‘wh’ can become ‘h’. this is most often not written.

Sources Iwi Dialects: Because Te Reo Isn’t the Same Everywhere / Te Pararē / Te Mana Ākonga
NZ Herald

Te Aupōuri

far North

there are no clearly documented distinctions with this dialect.

Sources

Taranaki

Ngaa Rauru, Ngāti Ruanui

‘mounga’ favoured over ‘maunga’.

‘h’ in ‘wh’ generally dropped in pronunciation.

Ngaa Rauru

uses double vowels rather than macrons.

Ngāti Ruanui

‘wh’ becomes ‘w'’ with a glottal stop. also spoken by Te Ātiawa.

Sources NZ Herald
The Spinoff

Waikato

easy to distinguish

double vowels instead of macrons. ‘tuupu(na)’ instead of ‘tiipu(na)’. ‘ng’ can be used as a prefix.

Sources Iwi Dialects: Because Te Reo Isn’t the Same Everywhere / Te Pararē / Te Mana Ākonga

Whanganui

also known as Wanganui

no difference in orthography for glottal stops. ‘wh’ pronounced with glottal stop like ‘w'’.

ending letter ‘a’ dropped in informal language

‘-au’ used over ‘-ou’.

Sources

⠞⠑ ⠗⠑⠕ ⠠⠍⠸⠁⠕⠗⠊

braille

⠞⠑ ⠗⠑⠕ ⠠⠍⠸⠁⠕⠗⠊ letters

vowels

⠸⠁

⠸⠑

⠸⠊

⠸⠕

⠸⠥

consonants

⠝⠛

capitalisation mark

derived from 6-dot English Braille and maintaining compatibility with Unified English Braille.

the prefix equivalent to a macron is ⠸ [u+2838]. the prefix marking capitalisation is ⠠ [u+2820].
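a minimal Python sketch of the two prefixes in use follows. the names `DOTS`, `cell`, and `to_braille` are hypothetical; the dot patterns for the base letters are the standard English Braille ones, and placing the capitalisation prefix before the macron prefix on a capital long vowel is an assumption not confirmed by the notes above.

```python
import unicodedata

MACRON_PREFIX = "\u2838"   # dots 4-5-6, marks a long vowel
CAPITAL_PREFIX = "\u2820"  # dot 6, marks a capital letter
COMBINING_MACRON = "\u0304"

# Standard English Braille dot numbers for the letters used in te reo Māori.
DOTS = {
    "a": (1,), "e": (1, 5), "g": (1, 2, 4, 5), "h": (1, 2, 5), "i": (2, 4),
    "k": (1, 3), "m": (1, 3, 4), "n": (1, 3, 4, 5), "o": (1, 3, 5),
    "p": (1, 2, 3, 4), "r": (1, 2, 3, 5), "s": (2, 3, 4), "t": (2, 3, 4, 5),
    "u": (1, 3, 6), "w": (2, 4, 5, 6),
}


def cell(dots: tuple[int, ...]) -> str:
    """Build a Unicode braille cell: dot n sets bit n-1 above U+2800."""
    return chr(0x2800 + sum(1 << (d - 1) for d in dots))


def to_braille(text: str) -> str:
    out = []
    for ch in unicodedata.normalize("NFC", text):
        decomposed = unicodedata.normalize("NFD", ch)
        base = decomposed[0]
        lower = base.lower()
        if lower not in DOTS:
            out.append(ch)  # spaces, punctuation, etc. pass through unchanged
            continue
        prefix = ""
        if base != lower:
            prefix += CAPITAL_PREFIX  # assumption: capital mark comes first
        if COMBINING_MACRON in decomposed[1:]:
            prefix += MACRON_PREFIX
        out.append(prefix + cell(DOTS[lower]))
    return "".join(out)
```

with this sketch, `to_braille("te reo Māori")` reproduces the heading of this section, and ‘ng’ is simply the two cells for ‘n’ and ‘g’.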

can’t find any abbreviations specific to this braille.

other notes

before the inclusion of the macron in unicode in 1993, macrons were encoded as a diaeresis; depending on the font used, these may or may not appear visually as a macron.
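text that still carries the diaeresis encoding can be repaired with a simple character mapping. the sketch below is illustrative only: `fix_legacy_macrons` is a hypothetical name, and the mapping assumes every diaeresis in the input stands for a macron, which would be wrong for words from other languages that use a real diaeresis.

```python
import unicodedata

# Each diaeresis vowel is mapped to the macron vowel it stood in for.
LEGACY_TO_MACRON = str.maketrans("äÄëËïÏöÖüÜ", "āĀēĒīĪōŌūŪ")


def fix_legacy_macrons(text: str) -> str:
    """Replace diaeresis vowels with macron vowels (assumes Māori-only text)."""
    # NFC first, so 'a' + combining diaeresis is seen as the single character 'ä'.
    return unicodedata.normalize("NFC", text).translate(LEGACY_TO_MACRON)
```

for example, `fix_legacy_macrons("Te Taura Whiri i te Reo Mäori")` returns `Te Taura Whiri i te Reo Māori`.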

historically, double vowel (two letters) was frequently used.
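going from macrons to double vowels is mechanical and can be sketched as below (`macrons_to_double_vowels` is a hypothetical name). the reverse direction is not safe to automate, since reduplication also produces doubled letters that are not long vowels. the doubled letter is always written in lowercase, which is an assumption and gives odd results for text in all capitals.

```python
import unicodedata

COMBINING_MACRON = "\u0304"


def macrons_to_double_vowels(text: str) -> str:
    """Rewrite each macron vowel as a doubled vowel, e.g. 'ā' -> 'aa'."""
    out = []
    for ch in unicodedata.normalize("NFD", text):
        if ch == COMBINING_MACRON and out:
            # Repeat the letter the macron was attached to (lowercase: assumption).
            out.append(out[-1].lower())
        else:
            out.append(ch)
    # Recompose any other accented characters that NFD split apart.
    return unicodedata.normalize("NFC", "".join(out))
```

for example, `macrons_to_double_vowels("Ngā")` returns `Ngaa` and `macrons_to_double_vowels("tūpuna")` returns `tuupuna`.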

far north uses ‘ngohi’ over ‘ika’, as ‘Ika’ is the name of an ancestor.

data

precomposed characters are used in this section where available.

alphabetised (as per Te Aka):
ā Ā a A ē Ē e E h H ī Ī i I k K m M n N ng Ng ō Ō o O p P r R s S t T ū Ū u U w W wh Wh
ignored: - ‐ '
remarks: tohutō only count when the letters in every column are otherwise the same: e.g., āta precedes ata but not aroha. minuscule before majuscule works like macrons, only counting when the columns are otherwise the same: e.g., Ata follows ata but not atea.
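one way to read the ordering above is as a three-level sort key: letters first (with ‘ng’ and ‘wh’ as single letters), then macrons, then case. the Python sketch below follows that reading. `sort_key` is a hypothetical name, the treatment of macrons and case as tie-breakers is an interpretation of the remarks, and characters outside the alphabet are simply sorted last.

```python
import unicodedata

ALPHABET = ["a", "e", "h", "i", "k", "m", "n", "ng", "o", "p", "r", "s", "t", "u", "w", "wh"]
RANK = {letter: index for index, letter in enumerate(ALPHABET)}
IGNORED = {"-", "\u2010", "'"}
COMBINING_MACRON = "\u0304"


def sort_key(word: str):
    """Return (letters, macrons, cases); later levels only break ties in earlier ones."""
    chars = [c for c in unicodedata.normalize("NFD", word) if c not in IGNORED]
    letters, macrons, cases = [], [], []
    i = 0
    while i < len(chars):
        first = chars[i]
        low = first.lower()
        pair = low + chars[i + 1].lower() if i + 1 < len(chars) else ""
        if pair in ("ng", "wh"):
            letter, step = pair, 2  # digraphs are one letter
        else:
            letter, step = low, 1
        i += step
        has_macron = i < len(chars) and chars[i] == COMBINING_MACRON
        if has_macron:
            i += 1
        letters.append(RANK.get(letter, len(ALPHABET)))  # unknown characters last
        macrons.append(0 if has_macron else 1)           # macron before plain
        cases.append(1 if first.isupper() else 0)        # minuscule before majuscule
    return (letters, macrons, cases)
```

with this key, `sorted(["ata", "aroha", "āta"], key=sort_key)` gives `['aroha', 'āta', 'ata']`, and `sorted(["atea", "Ata", "ata"], key=sort_key)` gives `['ata', 'Ata', 'atea']`, matching the two examples in the remarks.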

regex:
vowels: (?'mri_vowels'ā|Ā|a|A|ē|Ē|e|E|ī|Ī|i|I|ō|Ō|o|O|ū|Ū|u|U)
consonants: (?'mri_consonants'wh|Wh|ng|Ng|h|H|k|K|m|M|n|N|p|P|r|R|s|S|t|T|w|W)
valid in a single word: (?'mri_ignore'-|‐|')
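
the patterns above use the `(?'name'…)` named-group syntax, which Python's `re` module does not accept; Python writes named groups as `(?P<name>…)`. below is a rough Python rendering that covers the letters of the alphabetised list and splits a word into them. `split_letters` is a hypothetical helper, and returning `None` for words containing other characters is a design choice of this sketch.

```python
import re
import unicodedata

# Digraphs come first in the alternation so 'wh' and 'ng' are not split into two letters.
LETTER = (
    r"(?P<mri_consonants>[Ww]h|[Nn]g|[HhKkMmNnPpRrSsTtWw])"
    r"|(?P<mri_vowels>[āĀaAēĒeEīĪiIōŌoOūŪuU])"
    r"|(?P<mri_ignore>[-‐'])"
)
TOKEN = re.compile(LETTER)
WORD = re.compile("(?:" + LETTER + ")+")


def split_letters(word: str):
    """Split a word into Māori letters, or return None if it contains anything else."""
    word = unicodedata.normalize("NFC", word)  # use precomposed macron vowels
    if not WORD.fullmatch(word):
        return None
    return [match.group(0) for match in TOKEN.finditer(word)]
```

for example, `split_letters("whānau")` returns `['wh', 'ā', 'n', 'a', 'u']`, while `split_letters("blast")` returns `None` because ‘b’ and ‘l’ are not in the standard alphabet.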