*spell.txt*     For Vim version 7.0aa.  Last change: 2005 Jun 07


                  VIM REFERENCE MANUAL    by Bram Moolenaar


Spell checking                                          *spell*

1. Quick start                  |spell-quickstart|
2. Generating a spell file      |spell-mkspell|
3. Spell file format            |spell-file-format|

{Vi does not have any of these commands}

Spell checking is not available when the |+syntax| feature has been disabled
at compile time.

==============================================================================
1. Quick start                                          *spell-quickstart*

This command switches on spell checking: >

        :setlocal spell spelllang=en_us

This switches on the 'spell' option and specifies to check for US English.
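
As a sketch of making this automatic, an autocommand in your vimrc could
switch spell checking on for matching buffers.  The "*.txt" pattern is only
an assumption, use whatever files you want checked: >

        " Assumed example: check US English in every buffer loaded from *.txt
        autocmd BufRead,BufNewFile *.txt setlocal spell spelllang=en_us
<
Spell checking can be switched off again with: >

        :setlocal nospell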

The words that are not recognized are highlighted with one of these:
        SpellBad        word not recognized
        SpellRare       rare word
        SpellLocal      wrong spelling for selected region

Vim only checks the spelling of words; there is no grammar check.

To search for the next misspelled word:

                                                        *]s* *E756*
]s                      Move to next misspelled word after the cursor.
                        A count before the command can be used to repeat.
                        This uses the @Spell and @NoSpell clusters from syntax
                        highlighting, see |spell-syntax|.

                                                        *[s*
[s                      Like "]s" but search backwards, find the misspelled
                        word before the cursor.

                                                        *]S*
]S                      Like "]s" but only stop at bad words, not at rare
                        words or words for another region.

                                                        *[S*
[S                      Like "]S" but search backwards.


To add words to your own word list:

                                                        *zg*
zg                      Add word under the cursor as a good word to
                        'spellfile'.  In Visual mode the selected characters
                        are added as a word (including white space!).

                                                        *zw*
zw                      Add word under the cursor as a wrong (bad) word to
                        'spellfile'.  In Visual mode the selected characters
                        are added as a word (including white space!).

                                                        *:spellg* *:spellgood*
:spellg[ood] {word}     Add {word} as a good word to 'spellfile'.

:spellw[rong] {word}    Add {word} as a wrong (bad) word to 'spellfile'.

After adding a word to 'spellfile' its associated ".spl" file will
automatically be updated.
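
A possible session, assuming a personal word list at the hypothetical path
"~/.vim/spell/en.latin1.add" and two made-up words: >

        " Assumed file name, adjust to your own setup
        :set spellfile=~/.vim/spell/en.latin1.add
        :spellgood frobnicate
        :spellwrong frobincate
<
The first command only names the file, the other two add one entry each.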


PERFORMANCE

Note that Vim does on-the-fly spell checking.  To make this work fast the
word list is loaded in memory.  Thus this uses a lot of memory (1 Mbyte or
more).  There might also be a noticeable delay when the word list is loaded,
which happens when 'spelllang' is set.  Each word list is only loaded once;
it is not deleted when 'spelllang' is made empty.  When 'encoding' is set
the word lists are reloaded, thus you may notice a delay then too.


REGIONS

A word may be spelled differently in various regions.  For example, English
comes in (at least) these variants:

        en              all regions
        en_au           Australia
        en_ca           Canada
        en_gb           Great Britain
        en_nz           New Zealand
        en_us           USA

Words that are not used in one region but are used in another region are
highlighted with SpellLocal.

Always use lowercase letters for the language and region names.
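
For example, using names from the list above, the first command checks for
the spelling used in Great Britain and the second one accepts the spelling
of all English regions: >

        :setlocal spelllang=en_gb
        :setlocal spelllang=en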


SPELL FILES

Vim searches for spell files in the "spell" subdirectory of the directories in
'runtimepath'.  The name is: LL-XXX.EEE.spl, where:
        LL      the language name
        -XXX    optional addition
        EEE     the value of 'encoding'

Exceptions:
- Vim uses "latin1" when 'encoding' is "iso-8859-15".  The euro sign doesn't
  matter for spelling.
- When no spell file for 'encoding' is found "ascii" is tried.  This only
  works for languages where nearly all words are ASCII, such as English.  It
  helps when 'encoding' is not "latin1", such as iso-8859-2, and English text
  is being edited.
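
To see what name this gives for the current 'encoding' and which files are
actually present, something like the following could be used.  This is only
a sketch: it assumes "en" as the language name, ignores the exceptions above
and only lists files without the optional -XXX addition: >

        " The name Vim would normally look for, e.g. "en.latin1.spl"
        :echo 'en.' . &encoding . '.spl'
        " All English spell files found in the 'runtimepath' directories
        :echo globpath(&runtimepath, 'spell/en.*.spl')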

Spelling for EBCDIC is currently not supported.

A spell file might not be available in the current 'encoding'.  See
|spell-mkspell| about how to create a spell file.  Converting a spell file
with "iconv" will NOT work!

                                                        *E758* *E759*
When loading a spell file Vim checks that it is properly formatted.  If you
get an error the file may be truncated, modified or intended for another Vim
version.


WORDS

Vim uses a fixed method to recognize a word.  This is independent of
'iskeyword', so that it also works in help files and for languages that
include characters like '-' in 'iskeyword'.  The word characters do depend on
'encoding'.

A word that starts with a digit is always ignored.


SYNTAX HIGHLIGHTING                                     *spell-syntax*

Files that use syntax highlighting can specify where spell checking should be
done:

        everywhere                         default
        in specific items                  use "contains=@Spell"
        everywhere but specific items      use "contains=@NoSpell"

Note that mixing @Spell and @NoSpell doesn't make sense.
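
A sketch of both styles for a syntax file.  The group names "myComment" and
"myUrl" and the patterns are made up for this example; a syntax file would
use only one of the two lines: >

        " Check spelling only inside C-style comments
        :syntax region myComment start="/\*" end="\*/" contains=@Spell

        " Or: check everywhere, except inside things that look like a URL
        :syntax match myUrl "http://\S\+" contains=@NoSpell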

==============================================================================
2. Generating a spell file                              *spell-mkspell*

Vim uses a binary file format for spelling.  This greatly speeds up loading
the word list and keeps it small.

You can create a Vim spell file from the .aff and .dic files that Myspell
uses.  Myspell is used by OpenOffice.org and Mozilla.  You should be able to
find them here:
        http://lingucomponent.openoffice.org/spell_dic.html
You can also use a plain word list.

:mksp[ell] [-ascii] {outname} {inname} ...              *:mksp* *:mkspell*
                        Generate a Vim spell file from word lists.  Example: >
                :mkspell nl nl_NL.words
<
                        When {outname} ends in ".spl" it is used as the output
                        file name.  Otherwise it should be a language name,
                        such as "en".  The file written will be
                        {outname}.{encoding}.spl.  {encoding} is the value of
                        the 'encoding' option.

                        When the [-ascii] argument is present, words with
                        non-ascii characters are skipped.  The resulting file
                        ends in "ascii.spl".

                        The input can be the Myspell format files {inname}.aff
                        and {inname}.dic.  If {inname}.aff does not exist then
                        {inname} is used as the file name of a plain word
                        list.

                        Multiple {inname} arguments can be given to combine
                        regions into one Vim spell file.  Example: >
                :mkspell ~/.vim/spell/en /tmp/en_US /tmp/en_CA /tmp/en_AU
<                       This combines the English word lists for US, CA and AU
                        into one en.spl file.
                        Up to eight regions can be combined. *E754* *E755*

                        After the spell file has been written all currently
                        used spell files will be reloaded.

:mksp[ell] [-ascii] {add-name}
                        Like ":mkspell" above, using {add-name} as the input
                        file and producing an output file that has ".spl"
                        appended.

Since you might want to change a Myspell word list for use with Vim the
following procedure is recommended:

1. Obtain the xx_YY.aff and xx_YY.dic files from Myspell.
2. Make a copy of these files to xx_YY.orig.aff and xx_YY.orig.dic.
3. Change the xx_YY.aff and xx_YY.dic files to remove bad words, add missing
   words, define word characters with FOL/LOW/UPP, etc.  The distributed
   "src/spell/*.diff" files can be used.
4. Set 'encoding' to the desired encoding and use |:mkspell| to generate the
   Vim spell file.
5. Try out the spell file with ":set spell spelllang=xx_YY".
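
From within Vim the steps could look like this.  This is a sketch only: it
assumes a Unix-like system with "cp", latin1 as the desired encoding, the
Myspell files in the current directory and an existing "~/.vim/spell"
directory.  The "xx" output name follows the {outname} rule of |:mkspell|: >

        " Step 2: keep the originals for a later merge
        :!cp xx_YY.aff xx_YY.orig.aff
        :!cp xx_YY.dic xx_YY.orig.dic
        " Step 3: edit xx_YY.aff and xx_YY.dic by hand, then continue
        " Step 4: generate the Vim spell file for the assumed encoding
        :set encoding=latin1
        :mkspell ~/.vim/spell/xx xx_YY
        " Step 5: try it out
        :set spell spelllang=xx_YY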

When the Myspell files are updated you can merge the differences:
1. Obtain the new Myspell files as xx_YY.new.aff and xx_YY.new.dic.
2. Use Vimdiff to see what changed: >
        vimdiff xx_YY.orig.dic xx_YY.new.dic
3. Take over the changes you like in xx_YY.dic.
   You may also need to change xx_YY.aff.
4. Rename xx_YY.new.dic to xx_YY.orig.dic and xx_YY.new.aff to xx_YY.orig.aff.

==============================================================================
3. Spell file format                                    *spell-file-format*

This is the format of the files that are used by the person who creates and
maintains a word list.

Note that we avoid the word "dictionary" here.  That is because the goal of
spell checking differs from writing a dictionary (as in the book).  For
spelling we need a list of words that are OK, thus need not be highlighted.
Names will not appear in a dictionary, but do appear in a word list.  And
some old words are rarely used and are common misspellings.  These do appear
in a dictionary but not in a word list.

There are two formats: one with affix compression and one without.  The files
with affix compression are used by Myspell (Mozilla and OpenOffice.org).  This
requires two files, one with .aff and one with .dic extension.  The second
format is a list of words.


FORMAT OF WORD LIST

The words must appear one per line.  That is all that is required.
Additionally the following items are recognized:
- Empty and blank lines are ignored.
- Lines starting with a # are ignored (comment lines).
- A line starting with "/encoding=", before any word, specifies the encoding
  of the file.  After the '=' comes an encoding name.  This tells Vim to set
  up conversion from the specified encoding to 'encoding'.
- A line starting with "/?" specifies a word that should be marked as rare.
- A line starting with "/!" specifies a word that should be marked as bad.
- A line starting with "/=" specifies a word where case must match exactly.
  A "?" or "!" may follow: "/=?" and "/=!".
- Other lines starting with '/' are reserved for future use.  The ones that
  are not recognized are ignored (but you do get a warning message).
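
A small word list using these items might look like this.  The words are
invented, and it is an assumption that the word follows the "/?", "/!" or
"/=" marker directly, without white space: >

        # words for my project
        /encoding=latin1
        example
        /?forsooth
        /!teh
        /=Vim
<
Here "forsooth" would be marked as rare, "teh" as bad, and "Vim" would only
be accepted with exactly this case.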


FORMAT WITH AFFIX COMPRESSION

There are two files: the basic word list and an affix file.  The affixes are
used to modify the basic words to get the full word list.  This significantly
reduces the number of words, especially for a language like Polish.  This is
called affix compression.

The format for the affix and word list files is mostly identical to what
Myspell uses (the spell checker of Mozilla and OpenOffice.org).  A description
can be found here:
        http://lingucomponent.openoffice.org/affix.readme ~
Note that affixes are case sensitive; this isn't obvious from the description.

Vim supports a few extras.  Hopefully Myspell will support these too some day.
See |spell-affix-vim|.

The basic word list and the affix file are combined and turned into a binary
spell file.  All the preprocessing has been done, thus this file loads fast.
The binary spell file format is described in the source code (src/spell.c).
But only developers need to know about it.

The preprocessing also allows us to take the Myspell language files and modify
them before the Vim word list is made.  The tools for this can be found in the
"src/spell" directory.


WORD LIST FORMAT                                        *spell-wordlist-format*

A very short example, with line numbers:

        1       1234
        2       aan
        3       Als
        4       Etten-Leur
        5       et al.
        6       's-Gravenhage
        7       's-Gravenhaags
        8       bedel/P
        9       kado/1
        10      cadeau/2

The first line contains the number of words.  Vim ignores it, but you do get
an error message if it's not there.  *E760*

What follows is one word per line.  There should be no white space before or
after the word.

When the word only has lower-case letters it will also match with the word
starting with an upper-case letter.

When the word includes an upper-case letter, this means the upper-case letter
is required at this position.  The same word with a lower-case letter at this
position will not match.  When some of the other letters are upper-case it will
not match either.

The same word with all upper-case characters will always be OK.

        word list       matches                 does not match ~
        als             als Als ALS             ALs AlS aLs aLS
        Als             Als ALS                 als ALs AlS aLs aLS
        ALS             ALS                     als Als ALs AlS aLs aLS
        AlS             AlS ALS                 als Als ALs aLs aLS
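
The case rules above can be written down as a small Vim script function.
This only illustrates the table, it is not how Vim implements the check.
The function name is made up and only ASCII words are handled: >

        function! WordListMatches(entry, text)
          " The word in all upper-case is always OK.
          if a:text ==# toupper(a:entry)
            return 1
          endif
          " An all lower-case entry also matches with a capital first letter.
          if a:entry ==# tolower(a:entry)
            let cap = toupper(strpart(a:entry, 0, 1)) . strpart(a:entry, 1)
            return a:text ==# a:entry || a:text ==# cap
          endif
          " Otherwise the case must be exactly as in the entry.
          return a:text ==# a:entry
        endfunction
<
With this, WordListMatches('als', 'Als') returns 1 and
WordListMatches('Als', 'als') returns 0, as in the first two rows of the
table.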

The KEP affix ID can be used to specifically match a word with identical case
only, see below.

Note that lines 5 to 7 of the example above contain non-word characters.  You
can include any character in a word.  When checking the text a word still only
matches when it appears with a non-word character before and after it.  For
Myspell a word starting with a non-word character probably won't work.

After the word there is an optional slash and flags.  Most of these flags are
letters that indicate the affixes that can be used with this word.

                                                        *spell-affix-vim*
A flag that Vim adds and is not in Myspell is the flag defined with KEP in the
affix file.  This has the meaning that case matters.  This can be used if the
word does not have the first letter in upper case at the start of a sentence.
Example (assuming that = was used for KEP):

        word list       matches                 does not match ~
        's morgens/=    's morgens              'S morgens 's Morgens
        's Morgens      's Morgens              'S morgens 's morgens

                                                        *spell-affix-mbyte*
The basic word list is normally in an 8-bit encoding, which is mentioned in
the affix file.  The affix file must always be in the same encoding as the
word list.  This is compatible with Myspell.  For Vim the encoding may also be
something else, any encoding that "iconv" supports.  The "SET" line must
specify the name of the encoding.  When using a multi-byte encoding it's
possible to use a larger number of different affixes.

                                                        *spell-affix-chars*
When using an 8-bit encoding the affix file should define what characters are
word characters (as specified with ENC).  This is because the system where
":mkspell" is used may not support a locale with this encoding and isalpha()
won't work.  This happens, for example, when using "cp1250" on Unix.

                                                        *E761* *E762*
Three lines in the affix file are needed.  Simplistic example:

        FOL  áëñáëñ  ~
        LOW  áëñáëñ  ~
        UPP  áëñÁËÑ  ~

All three lines must have exactly the same number of characters.

The "FOL" line specifies the case-folded characters.  These are used to
compare words while ignoring case.  For most encodings this is identical to
the lower case line.

The "LOW" line specifies the characters in lower-case.  Mostly it's equal to
the "FOL" line.

The "UPP" line specifies the characters with upper-case.  That is, a character
is upper-case where it's different from the character at the same position in
"FOL".

ASCII characters should be omitted; Vim always handles these in the same way.
When the encoding is UTF-8 no word characters need to be specified.

                                                        *E763*
All spell files for the same encoding must use the same word characters,
otherwise they can't be combined without errors.  The XX.ascii.spl spell file
generated with the "-ascii" argument will not contain the table with
characters, so that it can be combined with spell files for any encoding.

                                                        *spell-affix-KEP*
In the affix file a KEP line can be used to define the affix name used for
keep-case words.  Example:

        KEP =  ~

See above for an example |spell-affix-vim|.

                                                        *spell-affix-RAR*
In the affix file a RAR line can be used to define the affix name used for
rare words.  Example:

        RAR ?  ~

Rare words are highlighted differently from bad words.  This is to be used for
words that are correct for the language, but are hardly ever used and could be
a typing mistake anyway.


 vim:tw=78:sw=4:ts=8:ft=help:norl: