English as an official language in England, Canada, Australia and New Zealand. Features of modern English. Basic stages in the development of English. Description of the functional classification of writing systems. Consideration of English orthography.

The zhuyin phonetic glossing script for Chinese divides syllables in two or three, but into onset, medial, and rime rather than consonant and vowel. Pahawh Hmong is similar, but can be considered to divide syllables into either onset-rime or consonant-vowel (all consonant clusters and diphthongs are written with single letters); as the latter, it is equivalent to an abugida but with the roles of consonant and vowel reversed. Other scripts are intermediate between the categories of alphabet, abjad and abugida, so there may be disagreement on how they should be classified.

Graphic classification of writing systems

Perhaps the primary graphic distinction made in classifications is that of linearity. Linear writing systems are those in which the characters are composed of lines, such as the Latin alphabet and Chinese characters. Chinese characters are considered linear whether they're written with a ball-point pen or a calligraphic brush, or cast in bronze. Similarly, Egyptian hieroglyphs and Maya glyphs were often painted in linear outline form, but in formal situations they were carved in bas-relief. Non-linear systems, on the other hand, such as braille, are not composed of lines, no matter which instrument is used to write them. The earliest examples of writing are linear: the Sumerian script of c. 3300 BCE was linear, though its cuneiform descendants were not.

Cuneiform was probably the earliest non-linear writing. Its glyphs were formed by pressing the end of a reed stylus into moist clay, not by tracing lines in the clay with the stylus as had been done previously. The result was a radical transformation of the appearance of the script.

Braille is a non-linear adaptation of the Latin alphabet that completely abandoned the Latin forms. The letters are composed of raised bumps on the writing substrate, which can be leather (Louis Braille's original material), stiff paper, plastic or metal.

There are also transient non-linear adaptations of the Latin alphabet, including Morse code, the manual alphabets of various sign languages, and semaphore, in which flags or bars are positioned at prescribed angles. However, if "writing" is defined as a potentially permanent means of recording information, then these systems do not qualify as writing at all, since the symbols disappear as soon as they are used.

If the Edo script is indeed a complete writing system, it may be the only natural example of a script in which the color of the graphemes is contrastive.


Scripts are also graphically characterized by the direction in which they are written. Egyptian hieroglyphs were written in either horizontal direction, with the animal and human glyphs turned to face the beginning of the line. The early alphabet could be written in multiple directions, horizontally (left-to-right or right-to-left) or vertically (up or down). It was commonly written boustrophedonically: starting in one (horizontal) direction, then turning at the end of the line and reversing direction.

The Greek alphabet and its successors settled on a left-to-right pattern, from the top to the bottom of the page. In Timed Text (TT) Authoring Format, this pattern is abbreviated LRTB. Other scripts, such as Arabic and Hebrew, came to be written right-to-left. Scripts that incorporate Chinese characters have traditionally been written vertically (top-to-bottom), from the right to the left of the page, but nowadays are frequently written left-to-right, top-to-bottom, due to Western influence, a growing need to accommodate terms in the Roman alphabet, and technical limitations in popular electronic document formats. The Uighur alphabet and its descendants are unique in being the only scripts written top-to-bottom, left-to-right; this direction originated from an ancestral Semitic direction by rotating the page 90° counter-clockwise to conform to the appearance of vertical Chinese writing. Several scripts used in the Philippines and Indonesia, such as Hanuno'o, are traditionally written with lines moving away from the writer, from bottom to top, but are read horizontally left to right.

Writing systems on computers

Various ISO/IEC standards have been defined to handle individual writing systems and implement them on computers. Today most of those standards have been redefined in a collective standard, ISO/IEC 10646 "Universal Character Set", and a parallel, closely related work, The Unicode Standard. Both are generally encompassed by the term Unicode. In Unicode, each character, in every language's writing system, is (simplifying slightly) given a unique identification number, known as its code point. The computer's software uses the code point to look up the appropriate character in the font file, so the characters can be displayed on the page or screen.
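As a small illustration of code points, the following Python sketch lists the code point and the official Unicode name of a few characters, using the standard library's unicodedata module. The helper name describe and the sample characters are chosen here purely for illustration.

```python
import unicodedata


def describe(text):
    """Return (character, code point, Unicode name) for each character in text."""
    rows = []
    for ch in text:
        code_point = f"U+{ord(ch):04X}"              # ord() gives the numeric code point
        name = unicodedata.name(ch, "<no name>")     # fallback for unnamed characters
        rows.append((ch, code_point, name))
    return rows


# One Latin letter, one Hebrew letter and one Chinese character.
for ch, code_point, name in describe("Aא中"):
    print(ch, code_point, name)
```

The reverse lookup is chr(), which turns a code point back into its character; the glyph actually drawn for it still depends on the font in use.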

A keyboard is the device most commonly used for writing via computer. Each key is associated with a standard code which the keyboard sends to the computer when it is pressed. By using a combination of alphabetic keys with modifier keys such as Ctrl, Alt, Shift and AltGr, various character codes are generated and sent to the CPU. The operating system intercepts and converts those signals to the appropriate characters based on the keyboard layout and input method, and then delivers those converted codes and characters to the running application software, which in turn looks up the appropriate glyph in the currently used font file, and requests the operating system to draw these on the screen.

In computers and telecommunication systems, graphemes and other grapheme-like units that are required for text processing are represented by "characters" that typically manifest in encoded form. For technical aspects of computer support for various writing systems, see Universal Character Set, CJK (Chinese, Japanese, Korean) and Bi-directional text.
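To give a rough idea of how software can tell left-to-right from right-to-left text, the sketch below guesses the base direction of a string from its first strongly directional character, using the bidirectional class that Unicode assigns to each character. This is only a simplified heuristic, not the full Unicode bidirectional algorithm, and the function name base_direction is an assumption made for this example.

```python
import unicodedata


def base_direction(text):
    """Guess the writing direction from the first strongly directional character."""
    for ch in text:
        bidi_class = unicodedata.bidirectional(ch)
        if bidi_class == "L":             # strong left-to-right (Latin, Greek, CJK, ...)
            return "left-to-right"
        if bidi_class in ("R", "AL"):     # strong right-to-left (Hebrew, Arabic, ...)
            return "right-to-left"
    return "neutral"                       # only digits, spaces, punctuation, etc.


print(base_direction("English"))
print(base_direction("שלום"))
print(base_direction("سلام"))
```

Vertical layouts, such as traditional Chinese or Mongolian text, are not covered by this property; they are normally handled by the layout engine rather than by the characters themselves.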

Historical significance of writing systems

Historians draw a distinction between prehistory and history, with history defined by the advent of writing. The cave paintings and petroglyphs of prehistoric peoples can be considered precursors of writing, but are not considered writing because they did not represent language directly.

Writing systems always develop and change based on the needs of the people who use them. Sometimes the shape, orientation and meaning of individual signs also change over time. By tracing the development of a script it is possible to learn about the needs of the people who used the script as well as how it changed over time.

Tools and materials

The many tools and writing materials used throughout history include stone tablets, clay tablets, wax tablets, vellum, parchment, paper, copperplate, styluses, quills, ink brushes, pencils, pens, and many styles of lithography. It is speculated that the Incas might have employed knotted threads known as quipu (or khipu) as a writing system.

The typewriter and various forms of word processors have subsequently become widespread writing tools, and various studies have compared the ways in which writers have framed the experience of writing with such tools as compared with the pen or pencil. For more information see writing implements.

History of early writing

By definition, the modern practice of history begins with written records; evidence of human culture without writing is the realm of prehistory.

The writing process evolved from economic necessity in the ancient Near East. Archaeologist Denise Schmandt-Besserat determined the link between previously uncategorized clay "tokens" and the first known writing, cuneiform. The clay tokens were used to represent commodities, and perhaps even units of time spent in labor, and their number and type became more complex as civilization advanced. A degree of complexity was reached when over a hundred different kinds of tokens had to be accounted for, and tokens were wrapped and fired in clay, with markings to indicate the kind of tokens inside. These markings soon replaced the tokens themselves, and the clay envelopes were demonstrably the prototype for clay writing tablets.

English writing system

Since around the ninth century, English has been written in the Latin alphabet, which replaced Anglo-Saxon runes. The spelling system, or orthography, is multilayered, with elements of French, Latin and Greek spelling on top of the native Germanic system; it has grown to vary significantly from the phonology of the language. The spelling of words often diverges considerably from how they are spoken.

Though letters and sounds may not correspond in isolation, spelling rules that take into account syllable structure, phonetics, and accents are 75% or more reliable. Some phonics spelling advocates claim that English is more than 80% phonetic. However, English has fewer consistent relationships between sounds and letters than many other languages; for example, the letter sequence ough can be pronounced in 10 different ways. The consequence of this complex orthographic history is that reading can be challenging. It takes longer for students to become completely fluent readers of English than of many other languages, including French, Greek, and Spanish.

Table - Basic sound-letter correspondence

Sound | Alphabetic representation | Dialect-specific
/t/ | t, th (rarely) thyme, Thames | th thing (African American, New York)
/d/ | d | th that (African American, New York)
/k/ | c (+ a, o, u, consonants), k, ck, ch, qu (rarely) conquer, kh (in foreign words) |
/ɡ/ | g, gh, gu (+ a, e, i), gue (final position) |
/ŋ/ | n (before g or k), ng |
/f/ | f, ph, gh (final, infrequent) laugh, rough | th thing (many forms of English in England)
/v/ | v | th with (Cockney, Estuary English)
/θ/ | th thick, think, through |
/ð/ | th that, this, the |
/s/ | s, c (+ e, i, y), sc (+ e, i, y), ç often c (façade/facade) |
/z/ | z, s (finally or occasionally medially), ss (rarely) possess, dessert, word-initial x xylophone |
/ʃ/ | sh, sch, ti (before vowel) portion, ci/ce (before vowel) suspicion, ocean; si/ssi (before vowel) tension, mission; ch (esp. in words of French origin); rarely s/ss before u sugar, issue; chsi in fuchsia only |
/ʒ/ | medial si (before vowel) division, medial s (before "ur") pleasure, zh (in foreign words), z before u azure, g (in words of French origin) (+ e, i, y) genre, j (in words of French origin) bijou |
/x/ | kh, ch, h (in foreign words) | occasionally ch loch (Scottish English, Welsh English)
/h/ | h (syllable-initially, otherwise silent), j (in words of Spanish origin) jai alai |
/tʃ/ | ch, tch, t before u future, culture | t (+ u, ue, eu) tune, Tuesday, Teutonic (several dialects - see Phonological history of English consonant clusters)
/dʒ/ | j, g (+ e, i, y), dg (+ e, i, consonant) badge, judg(e)ment | d (+ u, ue, ew) dune, due, dew (several dialects - another example of yod coalescence)
/r/ | r, wr (initial) wrangle |
/j/ | y (initially or surrounded by vowels), j hallelujah |
/ʍ/ | wh (pronounced hw) | Scottish and Irish English, as well as some varieties of American, New Zealand, and English English

Written accents

Unlike most other Germanic languages, English has almost no diacritics except in foreign loanwords (like the acute accent in café), and in the uncommon use of a diaeresis mark (often in formal writing) to indicate that two vowels are pronounced separately, rather than as one sound (e.g. naïve, Zoë). Words such as décor, café, résumé/resumé, entrée, fiancée and naïve are frequently spelt both with and without diacritics.

Some English words retain diacritics to distinguish them from others, such as animé, exposé, lamé, öre, øre, pâté, piqué, and rosé, though these are sometimes also dropped (for example, résumé/resumé is often spelt resume in the United States). To clarify pronunciation, a small number of loanwords may employ a diacritic that does not appear in the original word, such as maté, from Spanish yerba mate, or Malé, the capital of the Maldives, following the French usage.
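Since many such words circulate both with and without their diacritics, software that searches or compares English text may want to treat the two spellings as equivalent. A minimal Python sketch of one possible approach follows: decompose each accented letter into a base letter plus combining marks (Unicode normalization form NFD) and drop the marks. The function name strip_diacritics is an assumption for this example, and letters without a canonical decomposition, such as ø, are left unchanged.

```python
import unicodedata


def strip_diacritics(text):
    """Remove combining accent marks, keeping the base letters."""
    decomposed = unicodedata.normalize("NFD", text)   # e.g. é -> e + combining acute
    kept = [ch for ch in decomposed
            if unicodedata.category(ch) != "Mn"]      # Mn = non-spacing mark
    return unicodedata.normalize("NFC", "".join(kept))


for word in ["café", "naïve", "résumé", "pâté", "øre"]:
    print(word, "->", strip_diacritics(word))
```

Folding of this kind loses exactly the distinctions described above (pâté versus pate, rosé versus rose), so it suits matching and searching better than display.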

Formal written English

A version of the language almost universally agreed upon by educated English speakers around the world is called formal written English. It takes virtually the same form regardless of where it is written, in contrast to spoken English, which differs significantly between dialects, accents, and varieties of slang and of colloquial and regional expressions. Local variations in the formal written version of the language are quite limited, being restricted largely to the spelling differences between British and American English, along with a few minor differences in grammar and lexis.

Basic and simplified versions

To make English easier to read, there are some simplified versions of the language. One basic version is named Basic English, a constructed language with a small number of words created by Charles Kay Ogden and described in his book Basic English: A General Introduction with Rules and Grammar (1930). The language is based on a simplified version of English. Ogden said that it would take seven years to learn English, seven months for Esperanto, and seven weeks for Basic English. Thus, Basic English may be employed by companies which need to make complex books for international use, as well as by language schools that need to give people some knowledge of English in a short time.

Ogden did not put any words into Basic English that could be said with a few other words and he worked to make the words work for speakers of any other language. He put his set of words through a large number of tests and adjustments. He also made the grammar simpler, but tried to keep the grammar normal for English users.

The concept gained its greatest publicity just after the Second World War as a tool for world peace. Although it was not built into a program, similar simplifications were devised for various international uses.

Another version, Simplified English, is a controlled language originally developed for aerospace industry maintenance manuals. It offers a carefully limited and standardized subset of English. Simplified English has a lexicon of approved words, and those words can only be used in certain ways. For example, the word close can be used in the phrase "Close the door" but not in "Do not go close to the landing gear".

English orthography

English orthography is the alphabetic spelling system used by the English language. English orthography, like other alphabetic orthographies, uses a set of rules that generally governs how speech sounds are represented in writing.

English has relatively complicated spelling rules when compared to other languages with alphabetic orthographies. Because of the complex history of the English language, nearly every sound can be legitimately spelled in more than one way, and many spellings can be pronounced in more than one way.

Phonemic representation

Like most alphabetic systems, letters in English orthography may represent a particular sound. For example, the word cat (pronounced /ˈkæt/) consists of three letters ‹c›, ‹a›, and ‹t›, in which ‹c› represents the sound /k/, ‹a› the sound /æ/, and ‹t› the sound /t/.

Single letters or multiple sequences of letters may provide this function. Thus, the single letter ‹c› in the word cat represents the single sound /k/. In the word ship (pronounced /ˈʃɪp/), the digraph ‹sh› (two letters) represents the sound /ʃ/. In the word ditch, the three letters ‹tch› represent the sound /tʃ/.

Less commonly, a single letter can represent multiple sounds voiced in succession. The most common example is the letter ‹x›, which normally represents the consonant cluster /ks/ (for example, in the word ex-wife, pronounced /ˌɛksˈwaɪf/).
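The idea that one sound may be written with one, two or three letters, and that one letter may stand for two sounds, can be sketched in Python as a longest-match segmentation. The small table below is a toy assumption covering only the examples discussed here; it ignores position and context, which real English spelling depends on heavily.

```python
# Toy spelling-unit -> sound table (an illustrative assumption, not a full rule set).
GRAPHEME_TO_SOUND = {
    "tch": "tʃ",   # trigraph, as in ditch
    "sh": "ʃ",     # digraph, as in ship
    "x": "ks",     # one letter, two sounds
    "c": "k", "a": "æ", "t": "t",
    "i": "ɪ", "p": "p", "d": "d",
}
LONGEST = max(len(unit) for unit in GRAPHEME_TO_SOUND)


def segment(word):
    """Split word into (spelling unit, sound) pairs, preferring the longest unit."""
    pairs = []
    i = 0
    while i < len(word):
        for size in range(min(LONGEST, len(word) - i), 0, -1):
            unit = word[i:i + size]
            if unit in GRAPHEME_TO_SOUND:
                pairs.append((unit, GRAPHEME_TO_SOUND[unit]))
                i += size
                break
        else:
            # No unit of any length matched at this position.
            raise ValueError(f"no entry for {word[i]!r} in the toy table")
    return pairs


for word in ["cat", "ship", "ditch", "tax"]:
    print(word, segment(word))
```

Trying the longest unit first is what keeps ‹tch› from being read as ‹t› followed by ‹c› and ‹h›; a realistic system would also need rules that depend on where in the word a unit occurs.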

The same letter (or sequence of letters) may indicate different sounds when it occurs in different positions within a word. For instance, the digraph ‹gh› represents the sound /f/ at the end of some words, such as rough /ˈrʌf/. At the beginning of syllables (i.e. the syllable onset), the digraph ‹gh› represents the sound /ɡ/, such as in the word ghost (pronounced /ˈɡoʊst/). Conversely, the digraph ‹gh› never represents the sound /f/ in syllable onsets and never represents the sound /ɡ/ in syllable codas. (Incidentally, this shows that ghoti does not follow English spelling rules to sound like fish.)

Word origin

Another type of spelling characteristic is related to word origin. For example, when representing a vowel, the letter ‹y› in non-word-final positions represents the sound /ɪ/ in some words borrowed from Greek (reflecting an original upsilon), whereas the letter usually representing this sound in non-Greek words is the letter ‹i›. Thus, the word myth (pronounced /ˈmɪθ/) is of Greek origin, while pith (pronounced /ˈpɪθ/) is a Germanic word. Other examples include ‹th› representing /t/ (which is usually represented by ‹t›), ‹ph› representing /f/ (which is usually represented by ‹f›), and ‹ch› representing /k/ (which is usually represented by ‹c› or ‹k›) -- the use of these spellings for these sounds often marks words that have been borrowed from Greek.

Some, such as Brengelman (1970), have suggested that, in addition to this marking of word origin, these spellings indicate a more formal level of style or register in a given text, although Rollins (2004) finds this point to be exaggerated as there would be many exceptions where a word with one of these spellings, such as ‹ph› for /f/ (like telephone), could occur in an informal text.

Homophone differentiation

Spelling may also be used to distinguish between homophones (words with the same pronunciation but different meanings). For example, the words hour and our are pronounced identically in some dialects (as /ˈaʊ(ə)r/). However, they are distinguished from each other orthographically by the addition of the letter ‹h›. Another example is the pair of homophones plain and plane, where both are pronounced /ˈpleɪn/ but are marked with two different orthographic representations of the vowel /eɪ/.

In written language, this may help to resolve potential ambiguities that would arise otherwise (cf. He's breaking the car vs. He's braking the car). This is particularly advantageous in writing since, unlike in the spoken language, the reader often has no recourse to ask for clarification. Nevertheless, homophones that are unresolved by spelling still exist (for example, the word bay has at least five fundamentally different meanings).

Some proponents of spelling reform view homophones as undesirable and would prefer that they be eliminated. Doing so, however, would increase orthographic ambiguities that would need to be resolved via the linguistic context.

Marking sound changes in other letters

Another function of English letters is to provide information about other aspects of pronunciation or the word itself. Rollins (2004) uses the term "markers" for letters with this function. Letters may mark different types of information. One type of marking is that of a different pronunciation of another letter within the word. An example of this is the letter ‹e› in the word cottage (pronounced /ˈkɒtɪdʒ/). Here ‹e› indicates that the preceding ‹g› should represent the sound /dʒ/. This contrasts with the more common value of ‹g› in word-final position as the sound /ɡ/, such as in tag (pronounced /tæɡ/).

A particular letter may have more than one pronunciation-marking role. Besides the marking of word-final ‹g› as indicating /dʒ/ as in cottage, the letter ‹e› may also mark an altered pronunciation for other vowels. In the pair ban and bane, the ‹a› of ban has the value /æ/, whereas the ‹a› of bane is marked by the ‹e› as having the value /eɪ/.

A single letter may even fill multiple pronunciation-marking roles simultaneously. For example, in the word wage the ‹e› marks not only the change of the ‹a› from /æ/ to /eɪ/, but also of the ‹g› from /ɡ/ to /dʒ/.

Functionless letters

Some letters have no linguistic function. In Old and Middle English [v] was an allophone of /f/ occurring between vowels. The deletion of historical final schwas at the end of words such as give and have phonemicized /v/, but the now-silent ‹e› remained at the end of most /v/-final words. Words spelled with final ‹v› such as rev and Slav remain comparatively rare.

Multiple functionality

A given letter (or letters) may have dual functions. For example, the letter ‹i› in the word cinema has a sound-representing function (representing the sound /ɪ/) and a pronunciation-marking function (marking the ‹c› as having the value /s/ as opposed to the value /k/).


English includes some words that can be written with accent marks. These words have mostly been imported from other languages, usually French. As imported words become increasingly naturalised, there is an increasing tendency to omit the accent marks, even in formal writing. For example, words such as rôle and hôtel were first seen with accents when they were borrowed into English, but now the accent is almost never used. The words were originally considered French borrowings - even accused by some of being foreign phrases used where English alternatives would suffice - but today their French origin is largely forgotten. The strongest tendency to retain the accent is in words that are atypical of English morphology and therefore still perceived as slightly foreign. For example, café and pâté both have a pronounced final e, which would be "silent" by the normal English pronunciation rules. In a few cases, there are regional differences: for instance, the first accent on résumé has generally disappeared in the U.S., but is retained in the UK.

Further examples of words typically retaining diacritics when used in English are: appliqué, attaché, blasé, bric-à-brac, brötchen, cliché, crème, crêpe, façade, fiancé(e), flambé, naïve, naïveté, né(e), papier-mâché, passé, piñata, protégé, résumé, risqué, über-, voilà. Italics, with appropriate accents, are generally applied to foreign terms that are uncommonly used in or have not been assimilated into English: for example, adiós, coup d'état, crème brûlée, pièce de résistance, raison d'être, über (übermensch), vis-à-vis.

It was formerly common in English to use a diaeresis mark to indicate a hiatus: for example, coöperate, daïs, reëlect. The New Yorker and Technology Review magazines still use it for this purpose, even though it is increasingly rare in modern English. Nowadays the diaeresis is normally left out (cooperate), or a hyphen is used (co-operate). It is, however, still common in loanwords such as naïve and Noël.

Written accents are also used occasionally in poetry and scripts for dramatic performances to indicate that a certain normally unstressed syllable in a word should be stressed for dramatic effect, or to keep with the metre of the poetry. This use is frequently seen in archaic and pseudoarchaic writings with the -ed suffix, to indicate that the e should be fully pronounced, as with cursèd.


In certain older texts (typically British), the use of the ligatures æ and œ is common in words such as archæology, diarrhœa, and encyclopædia. Such words have Latin or Greek origin. Nowadays, the ligatures have been generally replaced in British English by the separated digraphs ae and oe (encyclopaedia, diarrhoea; but usually economy, ecology) and in American English by e (encyclopedia, diarrhea; but usually paean, amoeba, oedipal, Caesar). In some cases, usage may vary; for instance, both encyclopedia and encyclopaedia are current in the UK.


The English spelling system, compared to the systems used in many other languages, is quite irregular and complex. Although French presents a similar degree of difficulty when encoding (writing), English is more difficult when decoding (reading). English has never had any formal regulating authority, like the Spanish Real Academia Española, Italian Accademia della Crusca or the French Académie française. Attempts to regularize or reform the language, including spelling reform, have usually met with failure.

The only significant exceptions were the reforms of Noah Webster, which resulted in many of the differences between British and American spelling, such as center/centre and dialog/dialogue. (Other differences, such as -ize/-ise in realize/realise etc., came about separately; see American and British English spelling differences for details.)

Besides the quirks the English spelling system has inherited from its past, there are other idiosyncrasies in spelling that make it tricky to learn. English contains 24-27 (depending on dialect) separate consonant phonemes and, depending on dialect, anywhere from fourteen to twenty vowels. However, there are only 26 letters in the modern English alphabet, so there cannot be a one-to-one correspondence between letters and sounds. Many sounds are spelled using different letters or multiple letters, and for those words whose pronunciation is predictable from the spelling, the sounds denoted by the letters depend on the surrounding letters. For example, the digraph th represents two different sounds (the voiced interdental fricative and the voiceless interdental fricative) (see Pronunciation of English th), and the voiceless alveolar fricative can be represented by the letters s and c.

Furthermore, English makes no attempt to Anglicise the spellings of most recent loanwords, but preserves the foreign spellings, even when they employ exotic conventions like the Polish cz in Czech or the Old Norse fj in fjord (although New Zealand English exclusively spells it fiord). In fact, instead of loans being respelled to conform to English spelling standards, sometimes the pronunciation changes as a result of pressure from the spelling. One example of this is the word ski, which was adopted from Norwegian in the mid-18th century, although it did not become common until 1900. It used to be pronounced shee, which is similar to the Norwegian pronunciation, but the increasing popularity of the sport after the middle of the 20th century helped the sk pronunciation replace it.

Of course, such a philosophy can be taken too far. For instance, there was also a period when the spelling of words was altered in what is now regarded as a misguided attempt to make them conform to what were perceived to be the etymological origins of the words. For example, the letter b was added to debt in an attempt to link it to the Latin debitum, and the letter s in island is a misplaced attempt to link it to Latin insula instead of the Old English word igland, which is the true origin of the English word. The letter p in ptarmigan has no etymological justification whatsoever. Some are just randomly changed: for example, score used to be spelled skor.

The spelling of English continues to evolve. Many loanwords come from languages where the pronunciation of vowels corresponds to the way they were pronounced in Old English, which is similar to the Italian or Spanish pronunciation of the vowels, and is the value the vowel symbols [a], [e], [i], [o], and [u] have in the International Phonetic Alphabet. As a result, there is a somewhat regular system of pronouncing "foreign" words in English, and some borrowed words have had their spelling changed to conform to this system. For example, Hindu used to be spelled Hindoo, and the name Maria used to be pronounced like the name Mariah, but was changed to conform to this system. It has been argued that this influence probably started with the introduction of many Italian words into English during the Renaissance, in fields like music, from which come the words andante, viola, forte, etc.

Commercial advertisers have also had an effect on English spelling. In attempts to differentiate their products from others, they introduce new or simplified spellings like lite instead of light, thru instead of through, smokey instead of smoky (for "smokey bacon" flavour crisps), and rucsac instead of rucksack. The spellings of personal names have also been a source of spelling innovations: affectionate versions of women's names that sound the same as men's names have been spelled differently: Nikki and Nicky, Toni and Tony, Jo and Joe.

As examples of the idiosyncratic nature of English spelling, the combination ou can be pronounced in at least six different ways: /ə/ in famous, /ɜr/ in journey, /aʊ/ in loud, /ʊ/ in should, /uː/ in you, /ʊər/ in tour; and the vowel sound /iː/ in me can be spelt in at least ten different ways: paediatric, me, seat, seem, ceiling, people, chimney, machine, siege, phoenix. (These examples assume a more-or-less standard non-regional British English accent. Other accents will vary.)

Sometimes everyday speakers of English change a counterintuitive spelling simply because it is counterintuitive. Changes like this are not usually seen as "standard", but can become standard if used enough. An example is the word miniscule, which still competes with its original spelling of minuscule, though this might also be because of analogy with the word mini.


Inconsistencies and irregularities in English spelling have gradually increased in number throughout the history of the English language. There are a number of contributing factors. First, gradual changes in pronunciation, such as the Great Vowel Shift, account for a tremendous number of irregularities. Second, relatively recent loan words from other languages generally carry their original spellings, which are often not phonetic in English. The Romanization of languages (e.g., Chinese) using alphabets derived from the Latin alphabet has further complicated this problem, for example when pronouncing Chinese place names. Third, some prescriptivists have had partial success in their attempts to normalize the English language, forcing a change in spelling but not in pronunciation.

The regular spelling system of Old English was swept away by the Norman Conquest, and English itself was eclipsed by French for three centuries, eventually emerging with its spelling much influenced by French. English had also borrowed large numbers of words from French, which for reasons of prestige and familiarity kept their French spellings. The spelling of Middle English, such as in the writings of Geoffrey Chaucer, is very irregular and inconsistent, with the same word being spelled differently, sometimes even in the same sentence. However, these spellings were generally much better guides to pronunciation than modern English spelling is.

For example, the sound /ʌ/, normally written u, is spelled with an o in son, love, come, etc., due to Norman spelling conventions which prohibited writing u before v, m, n due to the graphical confusion that would result. (v, u, n were identically written with two minims in Norman handwriting; w was written as two u letters; m was written with three minims, hence mm looked like vun, nvu, uvu, etc.) Similarly, spelling conventions also prohibited final v. Hence the identical spellings of the three different vowel sounds in love, grove and prove are due to ambiguity in the Middle English spelling system, not sound change.

There was also a series of linguistic sound changes towards the end of this period, including the Great Vowel Shift, which resulted in the i in mine, for example, changing from a pure vowel to a diphthong. These changes for the most part did not detract from the rule-governed nature of the spelling system; but in some cases they introduced confusing inconsistencies, like the well-known example of the many pronunciations of ough (rough, through, though, trough, plough, etc.). Most of these changes happened before the arrival of printing in England. However, the arrival of the printing press merely froze the current system, rather than providing the impetus for a realignment of spelling with pronunciation. Furthermore, it introduced further inconsistencies, partly because of the use of typesetters trained abroad, particularly in the Low Countries. For example, the h in ghost was influenced by Dutch. The addition and deletion of a silent e at the ends of words was also sometimes used to make the right-hand margin line up more neatly.

By the time dictionaries were introduced in the mid-1600s, the spelling system of English had started to stabilize, and by the 1800s most words had set spellings.


Punctuation marks are symbols that indicate the structure and organization of written language, as well as the intonation and pauses to be observed when reading aloud (see also orthography).

In written English, punctuation is vital to disambiguate the meaning of sentences. For example, "woman, without her man, is nothing" and "woman: without her, man is nothing" have greatly different meanings, as do "eats shoots and leaves" and "eats, shoots and leaves". "King Charles walked and talked half an hour after his head was cut off" is alarming; "King Charles walked and talked; half an hour after, his head was cut off", less so.

The rules of punctuation vary with language, location, register and time and are constantly evolving. Certain aspects of punctuation are stylistic and are thus the author's (or editor's) choice. Tachygraphic language forms, such as those used in online chat and text messages, may have wildly different rules.


The earliest writing had no capitalization, no spaces and few punctuation marks. This worked as long as the subject matter was restricted to a limited range of topics (e.g., writing used for recording business transactions). Punctuation is historically an aid to reading aloud (cf. George Bernard Shaw).

The oldest known document using punctuation is the Mesha Stele (9th century BC). It employs points between the words and horizontal strokes between sense sections as punctuation.

The Greeks were using punctuation marks consisting of vertically arranged dots, usually two (cf. the modern colon) or three, around the 5th century BC. Greek playwrights such as Euripides and Aristophanes used symbols to distinguish the ends of phrases in written drama, which helped the play's cast to know when to pause. In particular, they used three different symbols to divide speeches, known as commas (indicated by a centred dot), colons (indicated by a dot on the base line), and periods or full stops (indicated by a raised dot).

The Romans (circa 1st century BC) also adopted symbols to indicate pauses.

Punctuation developed dramatically when large numbers of copies of the Christian Bible started to be produced. These were designed to be read aloud and the copyists began to introduce a range of marks to aid the reader, including indentation, various punctuation marks and an early version of initial capitals. St Jerome and his colleagues, who produced the Vulgate translation of the Bible into Latin, developed an early system (circa 400 AD); this was considerably improved on by Alcuin. The marks included the virgule (forward slash) and dots in different locations; the dots were centred in the line, raised or in groups.

The use of punctuation was not standardised until after the invention of printing. Credit for introducing a standard system is generally given to Aldus Manutius and his grandson. They popularized the practice of ending sentences with the colon or full stop, invented the semicolon, made occasional use of parentheses and created the modern comma by lowering the virgule.

The standards and limitations of evolving technologies have exercised further pragmatic influences. For example, minimisation of punctuation in typewritten matter became economically desirable in the 1960s and 1970s for the many users of carbon-film ribbons, since a period or comma consumed the same length of expensive non-reusable ribbon as did a capital letter.

Other languages

Other European languages use much the same punctuation as English. The similarity is so strong that the few variations may confuse a native English reader. Quotation marks are particularly variable across European languages. For example, in French and Russian, quotes would appear as: « Je suis fatigué. » (in French, each "double punctuation" mark, such as the guillemet, requires a non-breaking space; in Russian it does not).

In Greek, the question mark is written as the English semicolon, while the functions of the colon and semicolon are performed by a raised point (·), known as the ano teleia (άνω τελεία).

Arabic and Persian, which are written from right to left, use a reversed question mark (؟) and a reversed comma (،). This is a modern innovation; pre-modern Arabic did not use punctuation. Hebrew, which is also written from right to left, uses the same character as in English (?). Spanish uses an inverted question mark at the beginning of a question as well as the normal question mark at the end.
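The marks mentioned for Greek, Arabic, Persian and Spanish each have their own Unicode code point, which matters because several of them are easily confused with, or silently converted into, their English look-alikes. The following is a minimal sketch, assuming Python's standard `unicodedata` module; the dictionary `MARKS` and the helper `describe` are illustrative names, not part of any standard.

```python
import unicodedata

# Marks discussed above, written as explicit escapes so that they
# survive a wrongly guessed text encoding. The labels are illustrative.
MARKS = {
    "Greek question mark": "\u037e",
    "Greek ano teleia": "\u0387",
    "Arabic question mark": "\u061f",
    "Arabic comma": "\u060c",
    "Spanish inverted question mark": "\u00bf",
}


def describe(ch: str) -> str:
    """Return the code point and official Unicode name of one character."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)}"


if __name__ == "__main__":
    for label, ch in MARKS.items():
        print(f"{label}: {describe(ch)}")

    # The Greek question mark is canonically equivalent to the ASCII
    # semicolon, so Unicode normalization replaces it with ';'.
    print(unicodedata.normalize("NFC", "\u037e") == ";")
```

One consequence worth noting: because normalization folds the Greek question mark into the ordinary semicolon, software that normalizes text cannot tell the two apart afterwards, which mirrors the statement above that the Greek question mark is written as the English semicolon.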

Originally, Sanskrit had no punctuation. In the 1600s, Sanskrit and Marathi, both written in the Devanagari script, started using the vertical bar (|) to end a line of prose and double vertical bars (||) in verse.

Texts in Chinese, Japanese and Korean were left unpunctuated until the modern era. In unpunctuated texts, the grammatical structure of sentences in classical writing is inferred from context. Most punctuation marks in modern Chinese, Japanese and Korean have similar functions to their English counterparts; however, they often look different and have different customary rules.

Novel punctuation marks

An international patent application was filed, and published in 1992 under WO number WO9219458, for two new punctuation marks: the "question comma" and the "exclamation comma." The application entered the national phase only in Canada; it was advertised as lapsing in Australia on 27 January 1994 and in Canada on 6 November 1995.

Russian designer Artemy Lebedev suggested a double-comma sign, which he believes would communicate a pause better than the semicolon does. Lebedev, however, seems unaware of the widespread use of the semicolon in English (in Russian, independent clauses can be separated by commas; as a result, the semicolon is used infrequently and only for stylistic purposes).

Kinds of punctuation

· Back in ancient Greece and Rome, when a speech was prepared in writing, marks were used to indicate where, and for how long, a speaker should pause. These pauses (and eventually the marks themselves) were named after the sections they divided. The longest section was called a period, defined by Aristotle as "a portion of a speech that has in itself a beginning and an end." The shortest pause was a comma (literally, "that which is cut off"), and midway between the two was the colon (a "limb," "strophe," or "clause").

· Marking the Beat. The three marked pauses were sometimes graded in a geometric progression, with one "beat" for a comma, two for a colon, and four for a period. As W.F. Bolton observes in A Living Language (1988), "such marks in oratorical 'scripts' began as physical necessities but needed to coincide with the 'phrasing' of the piece, the demands of emphasis, and other nuances of elocution."

· Almost Pointless. Until the introduction of printing in the late 15th century, punctuation in English was decidedly unsystematic and at times virtually absent. Many of Chaucer's manuscripts, for instance, were punctuated with nothing more than periods at the end of verse lines, without regard for syntax or sense.

· Slash and Double Slash. The favorite mark of England's first printer, William Caxton (1420-1491), was the forward slash (also known as the solidus, virgule, oblique, diagonal, and virgula suspensiva), the forerunner of the modern comma. Some writers of that era also relied on a double slash (as found today in http://) to signal a longer pause or the start of a new section of text.

· Ben ("Two Pricks") Jonson. One of the first to codify the rules of punctuation in English was the playwright Ben Jonson (or rather, Ben:Jonson), who included the colon (he called it the "pause" or "two pricks") in his signature. In the final chapter of The English Grammar (1640), Jonson briefly discusses the primary functions of the comma, parenthesis, period, colon, question mark (the "interrogation"), and exclamation point (the "admiration").

· Talking Points. In keeping with the practice (if not always the precepts) of Ben Jonson, punctuation in the 17th and 18th centuries was increasingly determined by the rules of syntax rather than the breathing patterns of speakers. Nevertheless, this passage from Lindley Murray's best-selling English Grammar (over 20 million sold) shows that even at the end of the 18th century punctuation was still treated, in part, as an oratorical aid:

Punctuation is the art of dividing a written composition into sentences, or parts of sentences, by points or stops, for the purpose of marking the different pauses which the sense, and an accurate pronunciation require. The Comma represents the shortest pause; the Semicolon, a pause double that of the comma; the Colon, double that of the semicolon; and a period, double that of the colon. The precise quantity or duration of each pause, cannot be defined; for it varies with the time of the whole. The same composition may be rehearsed in a quicker or a slower time; but the proportion between the pauses should be ever invariable. (English Grammar, Adapted to the Different Classes of Learners, 1795)

Under Murray's scheme, it appears, a well-placed period might give readers enough time to pause for a snack.

· Writing Points. By the end of the industrious 19th century, grammarians had come to de-emphasize the elocutionary role of punctuation:

Punctuation is the art of dividing written discourse into sections by means of points, for the purpose of showing the grammatical connection and dependence, and of making the sense more obvious

It is sometimes stated in works on Rhetoric and Grammar, that the points are for the purpose of elocution, and directions are given to pupils to pause a certain time at each of the stops. It is true that a pause required for elocutionary purposes does sometimes coincide with a grammatical point, and so the one aids the other. Yet it should not be forgotten that the first and main end of the points is to mark grammatical divisions. Good elocution often requires a pause where there is no break whatever in the grammatical continuity, and where the insertion of a point would make nonsense. (John Seely Hart, A Manual of Composition and Rhetoric, 1892)

· Final Points. In our own time, the declamatory basis for punctuation has pretty much given way to the syntactic approach. Also, in keeping with a century-long trend toward shorter sentences, punctuation is now more lightly applied than it was in the days of Dickens and Emerson.

· Countless style guides spell out the conventions for using the various marks. Yet when it comes to the finer points (regarding serial commas, for instance), sometimes even the experts disagree.

· Meanwhile, fashions continue to change. In modern prose, dashes are in; semicolons are out. Apostrophes are either sadly neglected or tossed around like confetti, while quotation marks are seemingly dropped at random on unsuspecting words.

· And so it remains true, as G. V. Carey observed decades ago, that punctuation is governed "two-thirds by rule and one-third by personal taste."

