*spell.txt*     For Vim version 7.0aa. Last change: 2005 Apr 23


                  VIM REFERENCE MANUAL    by Bram Moolenaar


Spell checking                                          *spell*

1. Quick start                  |spell-quickstart|
2. Generating a spell file      |spell-mkspell|
9. Spell file format            |spell-file-format|

{Vi does not have any of these commands}

Spell checking is not available when the |+syntax| feature has been disabled
at compile time.

==============================================================================
1. Quick start                                          *spell-quickstart*

This command switches on spell checking: >

        :setlocal spell spelllang=en_us

This switches on the 'spell' option and specifies to check for US English.

The words that are not recognized are highlighted with one of these:
        SpellBad        word not recognized
        SpellRare       rare word
        SpellLocal      wrong spelling for selected region

Vim only checks the spelling of words; there is no grammar check.

To search for the next misspelled word:

                                                        *]s* *E756*
]s                      Move to next misspelled word after the cursor.
                        A count before the command can be used to repeat.
                        This uses the @Spell and @NoSpell clusters from syntax
                        highlighting, see |spell-syntax|.

                                                        *[s*
[s                      Like "]s" but search backwards, find the misspelled
                        word before the cursor.

                                                        *]S*
]S                      Like "]s" but only stop at bad words, not at rare
                        words or words for another region.

                                                        *[S*
[S                      Like "]S" but search backwards.
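
For example, typing "3]s" moves to the third misspelled word after the cursor.
The commands can also be put under keys. A minimal sketch; the choice of <F7>
and <F8> is only an assumption for the illustration, Vim does not define these
mappings: >

        " hypothetical mappings, e.g. in your vimrc file
        :nnoremap <F7> ]s
        :nnoremap <F8> [s
<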


PERFORMANCE

Note that Vim does on-the-fly spell checking. To make this work fast the
word list is loaded in memory. Thus this uses a lot of memory (1 Mbyte or
more). There might also be a noticeable delay when the word list is loaded,
which happens when 'spelllang' is set. Each word list is only loaded once;
it is not deleted when 'spelllang' is made empty. When 'encoding' is set
the word lists are reloaded, thus you may notice a delay then too.


REGIONS

A word may be spelled differently in various regions. For example, English
comes in (at least) these variants:

        en              all regions
        en_au           Australia
        en_ca           Canada
        en_gb           Great Britain
        en_nz           New Zealand
        en_us           USA

Words that are not used in one region but are used in another region are
highlighted with SpellLocal.

Always use lowercase letters for the language and region names.


SPELL FILES

Vim searches for spell files in the "spell" subdirectory of the directories in
'runtimepath'. The name is: LL-XXX.EEE.spl, where:
        LL      the language name
        -XXX    optional addition
        EEE     the value of 'encoding'

Exceptions:
- Vim uses "latin1" when 'encoding' is "iso-8859-15". The euro sign doesn't
  matter for spelling.
- When no spell file for 'encoding' is found "ascii" is tried. This only
  works for languages where nearly all words are ASCII, such as English. It
  helps when 'encoding' is not "latin1", such as iso-8859-2, and English text
  is being edited.
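
To see which spell files Vim can find, the "spell" subdirectories of
'runtimepath' can be listed. This is only a sketch. The file names in the
comment are made-up examples of the naming scheme, not files that are
guaranteed to exist on your system: >

        :echo globpath(&runtimepath, "spell/*.spl")
        " the list might contain names such as:
        "    ~/.vim/spell/en.latin1.spl
        "    ~/.vim/spell/en.ascii.spl
<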

Spelling for EBCDIC is currently not supported.

A spell file might not be available in the current 'encoding'. See
|spell-mkspell| about how to create a spell file. Converting a spell file
with "iconv" will NOT work!

                                                        *E758* *E759*
When loading a spell file Vim checks that it is properly formatted. If you
get an error the file may be truncated, modified or intended for another Vim
version.


WORDS

Vim uses a fixed method to recognize a word. This is independent of
'iskeyword', so that it also works in help files and for languages that
include characters like '-' in 'iskeyword'. The word characters do depend on
'encoding'.

A word that starts with a digit is always ignored.


SYNTAX HIGHLIGHTING                                     *spell-syntax*

Files that use syntax highlighting can specify where spell checking should be
done:

        everywhere                         default
        in specific items                  use "contains=@Spell"
        everywhere but specific items      use "contains=@NoSpell"

Note that mixing @Spell and @NoSpell doesn't make sense.
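
As an illustration, a syntax file that only wants its comments checked could
be sketched like this. The group names myComment and myNumber are invented
for the example. Because one item contains @Spell, text outside such items
(here the numbers) is expected not to be checked: >

        " hypothetical syntax items for a made-up file type
        :syntax region myComment start="/\*" end="\*/" contains=@Spell
        :syntax match  myNumber  "\<\d\+\>"
<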

==============================================================================
2. Generating a spell file                              *spell-mkspell*

Vim uses a binary file format for spelling. This greatly speeds up loading
the word list and keeps it small.

You can create a Vim spell file from the .aff and .dic files that Myspell
uses. Myspell is used by OpenOffice.org and Mozilla. You should be able to
find them here:
        http://lingucomponent.openoffice.org/spell_dic.html

:mksp[ell] [-ascii] {outname} {inname} ...              *:mksp* *:mkspell*
                        Generate spell file {outname}.spl from Myspell files
                        {inname}.aff and {inname}.dic.
                        When the [-ascii] argument is present, words with
                        non-ASCII characters are skipped. The resulting file
                        ends in "ascii.spl". Otherwise the resulting file
                        ends in "ENC.spl", where ENC is the value of
                        'encoding'.
                        Multiple {inname} arguments can be given to combine
                        regions into one Vim spell file. Example: >
          :mkspell ~/.vim/spell/en /tmp/en_US /tmp/en_CA /tmp/en_AU
<                       This combines the English word lists for US, CA and AU
                        into one en.spl file.
                        Up to eight regions can be combined. *E754* *E755*

Since you might want to change the word list for use with Vim the following
procedure is recommended:

1. Obtain the xx_YY.aff and xx_YY.dic files from Myspell.
2. Make a copy of these files to xx_YY.orig.aff and xx_YY.orig.dic.
3. Change the xx_YY.aff and xx_YY.dic files to remove bad words, add missing
   words, define word characters with FOL/LOW/UPP, etc. The distributed
   "src/spell/*.diff" files can be used.
4. Set 'encoding' to the desired encoding and use |:mkspell| to generate the
   Vim spell file.
5. Try out the spell file with ":set spell spelllang=xx_YY".
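
Steps 4 and 5 might look like this for Dutch. This is a sketch only: it
assumes the edited Myspell files are /tmp/nl_NL.aff and /tmp/nl_NL.dic, that
"latin1" is the wanted encoding and that ~/.vim/spell exists and is found
through 'runtimepath': >

        " assumed paths and encoding, adjust to your own situation
        :set encoding=latin1
        :mkspell ~/.vim/spell/nl /tmp/nl_NL
        :setlocal spell spelllang=nl
<
With these assumptions the generated file would be expected to end in
"latin1.spl", following the naming described for |:mkspell|.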

When the Myspell files are updated you can merge the differences:
1. Obtain the new Myspell files as xx_YY.new.aff and xx_YY.new.dic.
2. Use Vimdiff to see what changed: >
        vimdiff xx_YY.orig.dic xx_YY.new.dic
3. Take over the changes you like in xx_YY.dic.
   You may also need to change xx_YY.aff.
4. Rename xx_YY.new.dic to xx_YY.orig.dic and xx_YY.new.aff to xx_YY.orig.aff.

==============================================================================
9. Spell file format                                    *spell-file-format*

This is the format of the files that are used by the person who creates and
maintains a word list.

Note that we avoid the word "dictionary" here. That is because the goal of
spell checking differs from writing a dictionary (as in the book). For
spelling we need a list of words that are OK, thus need not be highlighted.
Names will not appear in a dictionary, but do appear in a word list. And
some old words are rarely used and are common misspellings. These do appear
in a dictionary but not in a word list.

There are two files: the basic word list and an affix file. The affixes are
used to modify the basic words to get the full word list. This significantly
reduces the number of words, especially for a language like Polish. This is
called affix compression.

The format for the affix and word list files is mostly identical to what
Myspell uses (the spell checker of Mozilla and OpenOffice.org). A description
can be found here:
        http://lingucomponent.openoffice.org/affix.readme ~
Note that affixes are case sensitive; this isn't obvious from the description.
Vim supports a few extras. Hopefully Myspell will support these too some day.
See |spell-affix-vim|.

The basic word list and the affix file are combined and turned into a binary
spell file. All the preprocessing has been done, thus this file loads fast.
The binary spell file format is described in the source code (src/spell.c).
But only developers need to know about it.

The preprocessing also allows us to take the Myspell language files and modify
them before the Vim word list is made. The tools for this can be found in the
"src/spell" directory.


WORD LIST FORMAT                                *spell-wordlist-format*

A very short example, with line numbers:

        1       1234
        2       aan
        3       Als
        4       Etten-Leur
        5       et al.
        6       's-Gravenhage
        7       's-Gravenhaags
        8       bedel/P
        9       kado/1
        10      cadeau/2

The first line contains the number of words. Vim ignores it. *E760*

What follows is one word per line. There should be no white space after the
word.

When the word only has lower-case letters it will also match with the word
starting with an upper-case letter.

When the word includes an upper-case letter, this means the upper-case letter
is required at this position. The same word with a lower-case letter at this
position will not match. When some of the other letters are upper-case it
will not match either.

The same word with all upper-case characters will always be OK.

        word list       matches                 does not match ~
        als             als Als ALS             ALs AlS aLs aLS
        Als             Als ALS                 als ALs AlS aLs aLS
        ALS             ALS                     als Als ALs AlS aLs aLS
        AlS             AlS ALS                 als Als ALs aLs aLS
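
The case rules in the table can be written down as a small function. This is
only a sketch to make the rules concrete. CaseMatches() is a made-up name and
not part of Vim, it ignores affixes and flags, and it says nothing about how
Vim implements the check internally: >

        " Hypothetical helper, not part of Vim. Returns 1 when {text} is
        " accepted for word list entry {entry}, ignoring affixes and flags.
        function! CaseMatches(entry, text)
          " The entry as written and its all upper-case form are always OK.
          if a:text ==# a:entry || a:text ==# toupper(a:entry)
            return 1
          endif
          " An all lower-case entry also matches with a capital first letter.
          if a:entry ==# tolower(a:entry)
            return a:text ==# substitute(a:entry, '^.', '\u&', '')
          endif
          return 0
        endfunction
<
With this sketch ":echo CaseMatches("als", "Als")" gives 1 and
":echo CaseMatches("Als", "als")" gives 0, as in the first two table rows.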

Note that lines 5 to 7 contain non-word characters. You can include
any character in a word. When checking the text a word still only matches
when it appears with a non-word character before and after it. For Myspell a
word starting with a non-word character probably won't work.

After the word there is an optional slash and flags. Most of these flags are
letters that indicate the affixes that can be used with this word.

                                                        *spell-affix-vim*
A flag that Vim adds and is not in Myspell is the "=" flag. This has the
meaning that case matters. This can be used if the word does not have the
first letter in upper case at the start of a sentence. Example:

        word list       matches         does not match ~
        's morgens/=    's morgens      'S morgens 's Morgens
        's Morgens      's Morgens      'S morgens 's morgens

                                                        *spell-affix-mbyte*
The basic word list is normally in an 8-bit encoding, which is mentioned in
the affix file. The affix file must always be in the same encoding as the
word list. This is compatible with Myspell. For Vim the encoding may also be
something else, any encoding that "iconv" supports. The "SET" line must
specify the name of the encoding. When using a multi-byte encoding it's
possible to use a larger number of different affixes.

Performance hint: Although using affixes reduces the number of words, it
reduces the speed. It's a good idea to put all the often used words in the
word list with the affixes prepended/appended.

                                                        *spell-affix-chars*
The affix file should define the word characters when using an 8-bit encoding
(as specified with ENC). This is because the system where ":mkspell" is used
may not support a locale with this encoding and isalpha() won't work, for
example when using "cp1250" on Unix.

                                                        *E761* *E762*
Three lines in the affix file are needed. Simplistic example:

        FOL  áëñáëñ
        LOW  áëñáëñ
        UPP  áëñÁËÑ

All three lines must have exactly the same number of characters.

The "FOL" line specifies the case-folded characters. These are used to
compare words while ignoring case. For most encodings this is identical to
the lower case line.

The "LOW" line specifies the characters in lower-case. Mostly it's equal to
the "FOL" line.

The "UPP" line specifies the characters with upper-case. That is, a character
is upper-case where it's different from the character at the same position in
"FOL".

ASCII characters should be omitted; Vim always handles these in the same way.
When the encoding is UTF-8 no word characters need to be specified.

                                                        *E763*
All spell files for the same encoding must use the same word characters,
otherwise they can't be combined without errors.


 vim:tw=78:sw=4:ts=8:ft=help:norl: