*spell.txt*     For Vim version 7.0aa.  Last change: 2005 Jun 22


                  VIM REFERENCE MANUAL    by Bram Moolenaar


Spell checking                                          *spell*

1. Quick start                  |spell-quickstart|
2. Generating a spell file      |spell-mkspell|
3. Spell file format            |spell-file-format|

{Vi does not have any of these commands}

Spell checking is not available when the |+syntax| feature has been disabled
at compile time.

==============================================================================
1. Quick start                                          *spell-quickstart*

This command switches on spell checking: >

        :setlocal spell spelllang=en_us

This switches on the 'spell' option and specifies to check for US English.
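
If you switch spell checking on and off often, a mapping may be handy.  The
line below is only a sketch: the key <F7> is an arbitrary choice made for this
example, not something Vim defines.

```vim
" Toggle spell checking in the current buffer and show the new state.
" <F7> is an arbitrary key picked for this example.
nnoremap <F7> :setlocal spell! spell?<CR>
```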

The words that are not recognized are highlighted with one of these:
        SpellBad        word not recognized                     |hl-SpellBad|
        SpellRare       rare word                               |hl-SpellRare|
        SpellLocal      wrong spelling for selected region      |hl-SpellLocal|

Vim only checks words for spelling; there is no grammar check.

To search for the next misspelled word:

                                                        *]s* *E756*
]s                      Move to next misspelled word after the cursor.
                        A count before the command can be used to repeat.
                        This uses the @Spell and @NoSpell clusters from syntax
                        highlighting, see |spell-syntax|.

                                                        *[s*
[s                      Like "]s" but search backwards, find the misspelled
                        word before the cursor.  Doesn't recognize words
                        split over two lines, thus may stop at words that are
                        not highlighted as bad.

                                                        *]S*
]S                      Like "]s" but only stop at bad words, not at rare
                        words or words for another region.

                                                        *[S*
[S                      Like "]S" but search backwards.


To add words to your own word list:                     *E764*

                                                        *zg*
zg                      Add word under the cursor as a good word to
                        'spellfile'.  In Visual mode the selected characters
                        are added as a word (including white space!).

                                                        *zw*
zw                      Add word under the cursor as a wrong (bad) word to
                        'spellfile'.  In Visual mode the selected characters
                        are added as a word (including white space!).

                                                        *:spe* *:spellgood*
:spe[llgood] {word}     Add {word} as a good word to 'spellfile'.

                                                        *:spellw* *:spellwrong*
:spellw[rong] {word}    Add {word} as a wrong (bad) word to 'spellfile'.

After adding a word to 'spellfile' its associated ".spl" file will
automatically be updated.  More details about the 'spellfile' format below
|spell-wordlist-format|.
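
A minimal sketch of adding words from the command line.  The file location
and the two sample words are assumptions made for this example.  Released Vim
versions require the 'spellfile' name to end in ".{encoding}.add"; that may
differ in the development version described here.

```vim
" Assumed location of a personal word list; the directory must exist.
" The name is built from 'encoding' so that it ends in ".{encoding}.add".
let &l:spellfile = expand('~/.vim/spell/en.') . &encoding . '.add'

" Record one good word and one bad word in that file.
spellgood Moolenaar
spellwrong teh
```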


Finding suggestions for bad words:

                                                        *z?*
z?                      For the word under/after the cursor suggest correctly
                        spelled words.  This also works to find alternatives
                        for words that are not highlighted as bad words.
                        The results are sorted on similarity to the word
                        under/after the cursor.
                        This may take a long time.  Hit CTRL-C when you are
                        bored.
                        You can enter the number of your choice or press
                        <Enter> if you don't want to replace.
                        If 'verbose' is non-zero a score will be displayed to
                        indicate the similarity to the badly spelled word (the
                        higher the score the more different).
                        When a word was replaced the redo command "." will
                        repeat the word replacement.  This works like "ciw",
                        the good word and <Esc>.

The 'spellsuggest' option influences how the list of suggestions is generated
and sorted.  See |'spellsuggest'|.


PERFORMANCE

Note that Vim does on-the-fly spell checking.  To make this work fast the
word list is loaded in memory.  Thus this uses a lot of memory (1 Mbyte or
more).  There might also be a noticeable delay when the word list is loaded,
which happens when 'spelllang' or 'spell' is set.  Each word list is only
loaded once, they are not deleted when 'spelllang' is made empty or 'spell' is
reset.  When 'encoding' is set the word lists are reloaded, thus you may
notice a delay then too.


REGIONS

A word may be spelled differently in various regions.  For example, English
comes in (at least) these variants:

        en              all regions
        en_au           Australia
        en_ca           Canada
        en_gb           Great Britain
        en_nz           New Zealand
        en_us           USA

Words that are not used in one region but are used in another region are
highlighted with SpellLocal |hl-SpellLocal|.

Always use lowercase letters for the language and region names.

When adding a word with |zg| or another command it's always added for all
regions.  You can change that by manually editing the 'spellfile'.  See
|spell-wordlist-format|.
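
As an illustration of how the region is selected, the commands below switch
between one region and all regions.  They are a sketch that assumes an
English spell file with these regions is installed.

```vim
" Check British English only; words that are valid only in another
" region are highlighted with SpellLocal.
setlocal spell spelllang=en_gb

" Accept the spelling of every English region.
setlocal spelllang=en
```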


SPELL FILES

Vim searches for spell files in the "spell" subdirectory of the directories in
'runtimepath'.  The name is: LL.EEE.spl, where:
        LL      the language name
        EEE     the value of 'encoding'

Only the first file is loaded, the one that is first in 'runtimepath'.  If
this succeeds then additionally files with the name LL.EEE.add.spl are loaded.
All the ones that are found are used.

Exceptions:
- Vim uses "latin1" when 'encoding' is "iso-8859-15".  The euro sign doesn't
  matter for spelling.
- When no spell file for 'encoding' is found "ascii" is tried.  This only
  works for languages where nearly all words are ASCII, such as English.  It
  helps when 'encoding' is not "latin1", such as iso-8859-2, and English text
  is being edited.  For the ".add" files the same name as the found main
  spell file is used.

For example, with these values:
        'runtimepath' is "~/.vim,/usr/share/vim70,~/.vim/after"
        'encoding'    is "iso-8859-2"
        'spelllang'   is "pl"

Vim will look for:
1. ~/.vim/spell/pl.iso-8859-2.spl
2. /usr/share/vim70/spell/pl.iso-8859-2.spl
3. ~/.vim/spell/pl.iso-8859-2.add.spl
4. /usr/share/vim70/spell/pl.iso-8859-2.add.spl
5. ~/.vim/after/spell/pl.iso-8859-2.add.spl

This assumes 1. is not found and 2. is found.

If 'encoding' is "latin1" Vim will look for:
1. ~/.vim/spell/pl.latin1.spl
2. /usr/share/vim70/spell/pl.latin1.spl
3. ~/.vim/after/spell/pl.latin1.spl
4. ~/.vim/spell/pl.ascii.spl
5. /usr/share/vim70/spell/pl.ascii.spl
6. ~/.vim/after/spell/pl.ascii.spl

This assumes none of them are found (Polish doesn't make sense when leaving
out the non-ASCII characters).
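
To see which of these files actually exist on your system, something like the
following can be used.  It is a rough sketch: the function name is made up
for this example, and it only lists existing files in 'runtimepath' order.
It does not reproduce the "latin1" and "ascii" fallbacks described above.

```vim
" List the main and addition spell files that exist for one language.
" ListSpellFiles() is a hypothetical helper, not part of Vim.
function! ListSpellFiles(lang)
  let main = 'spell/' . a:lang . '.' . &encoding . '.spl'
  let add = 'spell/' . a:lang . '.' . &encoding . '.add.spl'
  echo "main spell files (only the first one is loaded):"
  echo globpath(&runtimepath, main)
  echo "addition files (all are loaded):"
  echo globpath(&runtimepath, add)
endfunction

call ListSpellFiles('pl')
```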

Spelling for EBCDIC is currently not supported.

A spell file might not be available in the current 'encoding'.  See
|spell-mkspell| about how to create a spell file.  Converting a spell file
with "iconv" will NOT work!

                                                        *E758* *E759*
When loading a spell file Vim checks that it is properly formatted.  If you
get an error the file may be truncated, modified or intended for another Vim
version.


WORDS

Vim uses a fixed method to recognize a word.  This is independent of
'iskeyword', so that it also works in help files and for languages that
include characters like '-' in 'iskeyword'.  The word characters do depend on
'encoding'.

The table with word characters is stored in the main .spl file.  Therefore it
matters what the current locale is when generating it!  A .add.spl file does
not contain a word table.

A word that starts with a digit is always ignored.  That includes hex numbers
in the form 0xff and 0XFF.


WORD COMBINATIONS

It is possible to spell-check words that include a space.  This is used to
recognize words that are invalid when used by themselves, e.g. for "et al.".
It can also be used to recognize "the the" and highlight it.

The number of spaces is irrelevant.  In most cases a line break may also
appear.  However, this makes it difficult to find out where to start checking
for spelling mistakes.  When you make a change to one line and only that line
is redrawn Vim won't look in the previous line, thus when "et" is at the end
of the previous line "al." will be flagged as an error.  And when you type
"the<CR>the" the highlighting doesn't appear until the first line is redrawn.
Use |CTRL-L| to redraw right away.  "[s" will also stop at a word combination
with a line break.

When encountering a line break Vim skips characters such as '*', '>' and '"',
so that comments in C, shell and Vim code can be spell checked.


SYNTAX HIGHLIGHTING                                     *spell-syntax*

Files that use syntax highlighting can specify where spell checking should be
done:

1.  everywhere                     default
2.  in specific items              use "contains=@Spell"
3.  everywhere but specific items  use "contains=@NoSpell"

For the second method adding the @NoSpell cluster will disable spell checking
again.  This can be used, for example, to add @Spell to the comments of a
program, and add @NoSpell for items that shouldn't be checked.
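
A sketch of the second method for a made-up language with C-style comments.
The group names "myComment" and "myCommentURL" and the URL pattern are
assumptions chosen for this example only.

```vim
" Spell checking is enabled inside comments...
syntax region myComment start="/\*" end="\*/" contains=@Spell,myCommentURL

" ...and switched off again for URLs that appear inside those comments.
syntax match myCommentURL "https\=://\S\+" contained contains=@NoSpell
```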


VIM SCRIPTS

If you want to write a Vim script that does something with spelling, you may
find these functions useful:

    spellbadword()      find badly spelled word at the cursor
    spellsuggest()      get list of spelling suggestions
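
The two functions can be combined as in the sketch below.  It assumes that
'spell' is set and that spellbadword() returns a List with the word and its
type, as it does in released Vim versions; the development version described
here may differ.  The function name is made up for this example.  Note that
spellbadword() moves the cursor to the bad word.

```vim
" Return up to a:max suggestions for the bad word at or after the cursor.
" SuggestAtCursor() is a hypothetical helper, not part of Vim.
function! SuggestAtCursor(max)
  let [word, kind] = spellbadword()
  if word == ''
    " No badly spelled word was found.
    return []
  endif
  return spellsuggest(word, a:max)
endfunction

echo SuggestAtCursor(5)
```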

==============================================================================
2. Generating a spell file                              *spell-mkspell*

Vim uses a binary file format for spelling.  This greatly speeds up loading
the word list and keeps it small.

You can create a Vim spell file from the .aff and .dic files that Myspell
uses.  Myspell is used by OpenOffice.org and Mozilla.  You should be able to
find them here:
        http://lingucomponent.openoffice.org/spell_dic.html
You can also use a plain word list.  The results are the same, the choice
depends on what you find.

Make sure your current locale is set properly, otherwise Vim doesn't know what
characters are upper/lower case letters.  If the locale isn't available (e.g.,
when using an MS-Windows codepage on Unix) add tables to the .aff file
|spell-affix-chars|.

:mksp[ell][!] [-ascii] {outname} {inname} ...           *:mksp* *:mkspell*
                        Generate a Vim spell file from word lists.  Example: >
                                :mkspell nl nl_NL.words
<
                        When {outname} ends in ".spl" it is used as the output
                        file name.  Otherwise it should be a language name,
                        such as "en".  The file written will be
                        {outname}.{encoding}.spl.  {encoding} is the value of
                        the 'encoding' option.

                        When the output file already exists [!] must be added
                        to overwrite it.

                        When the [-ascii] argument is present, words with
                        non-ASCII characters are skipped.  The resulting file
                        ends in "ascii.spl".

                        The input can be the Myspell format files {inname}.aff
                        and {inname}.dic.  If {inname}.aff does not exist then
                        {inname} is used as the file name of a plain word
                        list.

                        Multiple {inname} arguments can be given to combine
                        regions into one Vim spell file.  Example: >
        :mkspell ~/.vim/spell/en /tmp/en_US /tmp/en_CA /tmp/en_AU
<                       This combines the English word lists for US, CA and AU
                        into one en.spl file.
                        Up to eight regions can be combined. *E754* *E755*
                        The REP and SAL items of the first .aff file where
                        they appear are used. |spell-affix-REP|
                        |spell-affix-SAL|

                        This command uses a lot of memory, required to find
                        the optimal word tree (Polish requires a few hundred
                        Mbyte).  The final result will be much smaller.

                        When the spell file was written all currently used
                        spell files will be reloaded.

:mksp[ell] [-ascii] {add-name}
                        Like ":mkspell" above, using {add-name} as the input
                        file and producing an output file that has ".spl"
                        appended.
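
The second form suits a personal addition word list.  The example below is a
sketch only: the path is an assumption, the plain word list must already
exist, and the name assumes 'encoding' is "latin1".

```vim
" Assumed input: a plain word list at ~/.vim/spell/en.latin1.add
" Output: ~/.vim/spell/en.latin1.add.spl, which is loaded in addition
" to the main en.latin1.spl file.
mkspell ~/.vim/spell/en.latin1.add
```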

Since you might want to change a Myspell word list for use with Vim the
following procedure is recommended:

1. Obtain the xx_YY.aff and xx_YY.dic files from Myspell.
2. Make a copy of these files to xx_YY.orig.aff and xx_YY.orig.dic.
3. Change the xx_YY.aff and xx_YY.dic files to remove bad words, add missing
   words, define word characters with FOL/LOW/UPP, etc.  The distributed
   "src/spell/*.diff" files can be used.
4. Set 'encoding' to the desired encoding and use |:mkspell| to generate the
   Vim spell file.
5. Try out the spell file with ":set spell spelllang=xx_YY".

When the Myspell files are updated you can merge the differences:
1. Obtain the new Myspell files as xx_YY.new.aff and xx_YY.new.dic.
2. Use Vimdiff to see what changed: >
        vimdiff xx_YY.orig.dic xx_YY.new.dic
3. Take over the changes you like in xx_YY.dic.
   You may also need to change xx_YY.aff.
4. Rename xx_YY.new.dic to xx_YY.orig.dic and xx_YY.new.aff to
   xx_YY.orig.aff.

==============================================================================
3. Spell file format                                    *spell-file-format*

This is the format of the files that are used by the person who creates and
maintains a word list.

Note that we avoid the word "dictionary" here.  That is because the goal of
spell checking differs from writing a dictionary (as in the book).  For
spelling we need a list of words that are OK, thus need not be highlighted.
Names will not appear in a dictionary, but do appear in a word list.  And
some old words are rarely used and are common misspellings.  These do appear
in a dictionary but not in a word list.

There are two formats: one with affix compression and one without.  The files
with affix compression are used by Myspell (Mozilla and OpenOffice.org).  This
requires two files, one with .aff and one with .dic extension.  The second
format is a list of words.


FORMAT OF WORD LIST                             *spell-wordlist-format*

The words must appear one per line.  That is all that is required.
Additionally the following items are recognized:
- Empty and blank lines are ignored.
- Lines starting with a # are ignored (comment lines).
- A line starting with "/encoding=", before any word, specifies the encoding
  of the file.  After the '=' comes an encoding name.  This tells Vim to set
  up conversion from the specified encoding to 'encoding'.
- A line starting with "/regions=" specifies the region names that are
  supported.  Each region name must be two ASCII letters.  The first one is
  region 1.  Thus "/regions=usca" has region 1 "us" and region 2 "ca".
  In an addition word list the list should be equal to the main word list!
- A line starting with "/?" specifies a word that should be marked as rare.
- A line starting with "/!" specifies a word that should be marked as bad.
- A line starting with "/=" specifies a word where case must match exactly.
  A "?" or "!" may be following: "/=?" and "/=!".
- Digits after "/" indicate the regions in which the word is valid.  If no
  regions are specified the word is valid in all regions.
- Other lines starting with '/' are reserved for future use.  The ones that
  are not recognized are ignored (but you do get a warning message).

Example:

        # This is an example word list          comment
        /encoding=latin1                        encoding of the file
        /regions=uscagb                         regions "us", "ca" and "gb"
        example                                 word for all regions
        /1blah                                  word for region 1 "us"
        /!vim                                   bad word
        /?3Campbell                             rare word in region 3 "gb"
        /='s mornings                           keep-case word
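
The rules above can be sketched as a small Vim script function.  This is an
illustration of the format only, not how Vim itself reads the file; the
function name and the returned keys are made up for this example.

```vim
" Illustrative only: classify one line of a plain word list.
" DescribeWordListLine() is a hypothetical helper, not part of Vim.
function! DescribeWordListLine(line)
  if a:line =~# '^\s*$'
    return {'kind': 'blank'}
  elseif a:line =~# '^#'
    return {'kind': 'comment'}
  elseif a:line =~# '^/encoding='
    return {'kind': 'encoding', 'value': strpart(a:line, 10)}
  elseif a:line =~# '^/regions='
    return {'kind': 'regions', 'value': strpart(a:line, 9)}
  elseif a:line !~# '^/'
    " A plain word, valid in all regions.
    return {'kind': 'word', 'flags': '', 'word': a:line}
  endif
  " Flags are any mix of '=', '?', '!' and region digits after the slash.
  let m = matchlist(a:line, '^/\([=?!0-9]\+\)\(.*\)$')
  if empty(m)
    " Reserved for future use.
    return {'kind': 'unknown'}
  endif
  return {'kind': 'word', 'flags': m[1], 'word': m[2]}
endfunction

echo DescribeWordListLine('/?3Campbell')
```

For "/?3Campbell" the result has the flags "?3" and the word "Campbell", that
is, a rare word for region 3.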


FORMAT WITH AFFIX COMPRESSION

There are two files: the basic word list and an affix file.  The affixes are
used to modify the basic words to get the full word list.  This significantly
reduces the number of words, especially for a language like Polish.  This is
called affix compression.

The format for the affix and word list files is mostly identical to what
Myspell uses (the spell checker of Mozilla and OpenOffice.org).  A description
can be found here:
        http://lingucomponent.openoffice.org/affix.readme ~
Note that affixes are case sensitive, this isn't obvious from the description.

Vim supports a few extras.  Hopefully Myspell will support these too some day.
See |spell-affix-vim|.

The basic word list and the affix file are combined and turned into a binary
spell file.  All the preprocessing has been done, thus this file loads fast.
The binary spell file format is described in the source code (src/spell.c).
But only developers need to know about it.

The preprocessing also allows us to take the Myspell language files and modify
them before the Vim word list is made.  The tools for this can be found in the
"src/spell" directory.


WORD LIST FORMAT                                *spell-dic-format*

A very short example, with line numbers:

        1       1234
        2       aan
        3       Als
        4       Etten-Leur
        5       et al.
        6       's-Gravenhage
        7       's-Gravenhaags
        8       bedel/P
        9       kado/1
        10      cadeau/2

The first line contains the number of words.  Vim ignores it, but you do get
an error message if it's not there.  *E760*

What follows is one word per line.  There should be no white space before or
after the word.

When the word only has lower-case letters it will also match with the word
starting with an upper-case letter.

When the word includes an upper-case letter, this means the upper-case letter
is required at this position.  The same word with a lower-case letter at this
position will not match.  When some of the other letters are upper-case it
will not match either.

The same word with all upper-case characters will always be OK.

        word list       matches                 does not match ~
        als             als Als ALS             ALs AlS aLs aLS
        Als             Als ALS                 als ALs AlS aLs aLS
        ALS             ALS                     als Als ALs AlS aLs aLS
        AlS             AlS ALS                 als Als ALs aLs aLS
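
The table can be expressed as a small function.  This is a sketch of the
rules as stated above, not Vim's own code.  It only handles ASCII words, and
the function name is made up for this example.

```vim
" Illustrative only, ASCII words: does a:text match the word list entry
" a:entry?  CaseMatches() is a hypothetical helper, not part of Vim.
function! CaseMatches(entry, text)
  " The entry as written and its all upper-case form are always accepted.
  if a:text ==# a:entry || a:text ==# toupper(a:entry)
    return 1
  endif
  " An all lower-case entry also matches with a capitalised first letter.
  if a:entry ==# tolower(a:entry)
    let capitalised = toupper(strpart(a:entry, 0, 1)) . strpart(a:entry, 1)
    return a:text ==# capitalised
  endif
  return 0
endfunction

echo CaseMatches('als', 'Als')
echo CaseMatches('Als', 'als')
```

The first call echoes 1 and the second 0, which agrees with the first two
rows of the table.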

The KEP affix ID can be used to specifically match a word with identical case
only, see below |spell-affix-KEP|.

Note in lines 5 to 7 that non-word characters are used.  You can include
any character in a word.  When checking the text a word still only matches
when it appears with a non-word character before and after it.  For Myspell a
word starting with a non-word character probably won't work.

After the word there is an optional slash and flags.  Most of these flags are
letters that indicate the affixes that can be used with this word.

                                                        *spell-affix-vim*
A flag that Vim adds and is not in Myspell is the flag defined with KEP in the
affix file.  This has the meaning that case matters.  This can be used if the
word does not have the first letter in upper case at the start of a sentence.
Example (assuming that = was used for KEP):

        word list       matches                 does not match ~
        's morgens/=    's morgens              'S morgens 's Morgens
        's Morgens      's Morgens              'S morgens 's morgens

                                                        *spell-affix-mbyte*
The basic word list is normally in an 8-bit encoding, which is mentioned in
the affix file.  The affix file must always be in the same encoding as the
word list.  This is compatible with Myspell.  For Vim the encoding may also be
something else, any encoding that "iconv" supports.  The "SET" line must
specify the name of the encoding.  When using a multi-byte encoding it's
possible to use more different affixes.


CHARACTER TABLES
                                                        *spell-affix-chars*
When using an 8-bit encoding the affix file should define what characters are
word characters (as specified with ENC).  This is because the system where
":mkspell" is used may not support a locale with this encoding and isalpha()
won't work.  For example when using "cp1250" on Unix.

                                        *E761* *E762* *spell-affix-FOL*
                                        *spell-affix-LOW* *spell-affix-UPP*
Three lines in the affix file are needed.  Simplistic example:

        FOL  áëñ ~
        LOW  áëñ ~
        UPP  ÁËÑ ~

All three lines must have exactly the same number of characters.

The "FOL" line specifies the case-folded characters.  These are used to
compare words while ignoring case.  For most encodings this is identical to
the lower case line.

The "LOW" line specifies the characters in lower-case.  Mostly it's equal to
the "FOL" line.

The "UPP" line specifies the characters with upper-case.  That is, a character
is upper-case where it's different from the character at the same position in
"FOL".

ASCII characters should be omitted, Vim always handles these in the same way.
When the encoding is UTF-8 no word characters need to be specified.

                                                        *E763*
All spell files for the same encoding must use the same word characters,
otherwise they can't be combined without errors.  The XX.ascii.spl spell file
generated with the "-ascii" argument will not contain the table with
characters, so that it can be combined with spell files for any encoding.


AFFIXES
                                        *spell-affix-PFX* *spell-affix-SFX*
The usual PFX (prefix) and SFX (suffix) lines are supported (see the Myspell
documentation).  Note that Myspell ignores any extra text after the relevant
info.  Vim requires this text to start with a "#" so that mistakes don't go
unnoticed.  Example:

        SFX F 0 in   [^i]n      # Spion > Spionin  ~

                                                *spell-affix-PFXPOSTPONE*
When an affix file has very many prefixes that apply to many words it's not
possible to build the whole word list in memory.  This applies to Hebrew (a
list with all words is over a Gbyte).  In that case applying prefixes must be
postponed.  This makes spell checking slower.  It is indicated by this keyword
in the .aff file:

        PFXPOSTPONE ~

Only prefixes without a chop string can be postponed, prefixes with a chop
string will still be included in the word list.


KEEP-CASE WORDS
                                                        *spell-affix-KEP*
In the affix file a KEP line can be used to define the affix name used for
keep-case words.  Example:

        KEP = ~

See above for an example |spell-affix-vim|.


RARE WORDS
                                                        *spell-affix-RAR*
In the affix file a RAR line can be used to define the affix name used for
rare words.  Example:

        RAR ? ~

Rare words are highlighted differently from bad words.  This is to be used for
words that are correct for the language, but are hardly ever used and could be
a typing mistake anyway.  When the same word is found as good it won't be
highlighted as rare.


BAD WORDS
                                                        *spell-affix-BAD*
In the affix file a BAD line can be used to define the affix name used for
bad words.  Example:

        BAD ! ~

This can be used to exclude words that would otherwise be good.  For example
"the the".  Once a word has been marked as bad it won't be undone by
encountering the same word as good.


REPLACEMENTS                                            *spell-affix-REP*

In the affix file REP items can be used to define common mistakes.  This is
used to make spelling suggestions.  The items define the "from" text and the
"to" replacement.  Example:

        REP 4 ~
        REP f ph ~
        REP ph f ~
        REP k ch ~
        REP ch k ~

The first line specifies the number of REP lines following.  Vim ignores it.


SIMILAR CHARACTERS                                      *spell-affix-MAP*

In the affix file MAP items can be used to define letters that are very much
alike.  This is mostly used for a letter with different accents.  This is used
to prefer suggestions with these letters substituted.  Example:

        MAP 2 ~
        MAP eéëêè ~
        MAP uüùúû ~

The first line specifies the number of MAP lines following.  Vim ignores it.

A letter must only appear in one of the MAP items.  It's a bit more efficient
if the first letter is ASCII or at least one without accents.


SOUNDS-A-LIKE                                           *spell-affix-SAL*

In the affix file SAL items can be used to define the sounds-a-like mechanism
to be used.  The main items define the "from" text and the "to" replacement.
Example:

        SAL CIA                 X ~
        SAL CH                  X ~
        SAL C                   K ~
        SAL K                   K ~

TODO: explain how it works.

There are a few special items:

        SAL followup            true ~
        SAL collapse_result     true ~
        SAL remove_accents      true ~

"1" has the same meaning as "true".  Any other value means "false".

 vim:tw=78:sw=4:ts=8:ft=help:norl:
Bram Moolenaar217ad922005-03-20 22:37:15 +0000623 vim:tw=78:sw=4:ts=8:ft=help:norl: