*spell.txt*     For Vim version 7.0e.  Last change: 2006 Apr 20


                  VIM REFERENCE MANUAL    by Bram Moolenaar


Spell checking                                          *spell*

1. Quick start                  |spell-quickstart|
2. Remarks on spell checking    |spell-remarks|
3. Generating a spell file      |spell-mkspell|
4. Spell file format            |spell-file-format|

{Vi does not have any of these commands}

Spell checking is not available when the |+syntax| feature has been disabled
at compile time.

==============================================================================
1. Quick start                                          *spell-quickstart*

This command switches on spell checking: >

        :setlocal spell spelllang=en_us

This switches on the 'spell' option and specifies to check for US English.
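
As an illustration only, spell checking could also be switched on
automatically from a vimrc file.  The file types "mail" and "tex" below are
arbitrary example choices, not something this manual prescribes: >

        " Sketch: enable US English spell checking for selected file types.
        " "mail" and "tex" are example file types; adjust to taste.
        :autocmd FileType mail,tex setlocal spell spelllang=en_us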

The words that are not recognized are highlighted with one of these:
        SpellBad        word not recognized                     |hl-SpellBad|
        SpellCap        word not capitalised                    |hl-SpellCap|
        SpellRare       rare word                               |hl-SpellRare|
        SpellLocal      wrong spelling for selected region      |hl-SpellLocal|

Vim only checks the spelling of words; there is no grammar check.

If the 'mousemodel' option is set to "popup" and the cursor is on a badly
spelled word or it is "popup_setpos" and the mouse pointer is on a badly
spelled word, then the popup menu will contain a submenu to replace the bad
word.  Note: this slows down the appearance of the popup menu.

To search for the next misspelled word:

                                                        *]s* *E756*
]s                      Move to next misspelled word after the cursor.
                        A count before the command can be used to repeat.
                        'wrapscan' applies.

                                                        *[s*
[s                      Like "]s" but search backwards, find the misspelled
                        word before the cursor.  Doesn't recognize words
                        split over two lines, thus may stop at words that are
                        not highlighted as bad.  Does not stop at a word with
                        a missing capital at the start of a line.

                                                        *]S*
]S                      Like "]s" but only stop at bad words, not at rare
                        words or words for another region.

                                                        *[S*
[S                      Like "]S" but search backwards.


To add words to your own word list:

                                                        *zg*
zg                      Add word under the cursor as a good word to the first
                        name in 'spellfile'.  A count may precede the command
                        to indicate the entry in 'spellfile' to be used.  A
                        count of two uses the second entry.

                        In Visual mode the selected characters are added as a
                        word (including white space!).
                        When the cursor is on text that is marked as badly
                        spelled then the marked text is used.
                        Otherwise the word under the cursor, separated by
                        non-word characters, is used.

                        If the word is explicitly marked as a bad word in
                        another spell file the result is unpredictable.

                                                        *zG*
zG                      Like "zg" but add the word to the internal word list
                        |internal-wordlist|.

                                                        *zw*
zw                      Like "zg" but mark the word as a wrong (bad) word.
                        If the word already appears in 'spellfile' it is
                        turned into a comment line.  See |spellfile-cleanup|
                        for getting rid of those.

                                                        *zW*
zW                      Like "zw" but add the word to the internal word list
                        |internal-wordlist|.

zuw                                                     *zug* *zuw*
zug                     Undo |zw| and |zg|, remove the word from the entry in
                        'spellfile'.  Count used as with |zg|.

zuW                                                     *zuG* *zuW*
zuG                     Undo |zW| and |zG|, remove the word from the internal
                        word list.  Count used as with |zg|.

                                                        *:spe* *:spellgood*
:[count]spe[llgood] {word}
                        Add {word} as a good word to 'spellfile', like with
                        |zg|.  Without count the first name is used, with a
                        count of two the second entry, etc.

:spe[llgood]! {word}    Add {word} as a good word to the internal word list,
                        like with |zG|.

                                                        *:spellw* *:spellwrong*
:[count]spellw[rong] {word}
                        Add {word} as a wrong (bad) word to 'spellfile', as
                        with |zw|.  Without count the first name is used, with
                        a count of two the second entry, etc.

:spellw[rong]! {word}   Add {word} as a wrong (bad) word to the internal word
                        list, like with |zW|.

:[count]spellu[ndo] {word}                              *:spellu* *:spellundo*
                        Like |zuw|.  [count] used as with |:spellgood|.

:spellu[ndo]! {word}    Like |zuW|.  [count] used as with |:spellgood|.


After adding a word to 'spellfile' with the above commands its associated
".spl" file will automatically be updated and reloaded.  If you change
'spellfile' manually you need to use the |:mkspell| command.  This sequence of
commands mostly works well: >
        :edit <file in 'spellfile'>
<       (make changes to the spell file) >
        :mkspell! %

More details about the 'spellfile' format below |spell-wordlist-format|.
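
A minimal sketch of a 'spellfile' setting with two entries follows.  The
directory and file names are assumptions made for illustration; see the
description of the 'spellfile' option for the naming rules (each name is
expected to end in ".{encoding}.add"): >

        " First entry: personal words.  Second entry: project words.
        " Both file names are hypothetical.
        :setlocal spellfile=~/.vim/spell/en.utf-8.add,~/.vim/spell/project.utf-8.add
        " With this setting "zg" adds to the first file and "2zg" to the
        " second.  ":2spellgood {word}" does the same from the command line.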

                                                        *internal-wordlist*
The internal word list is used for all buffers where 'spell' is set.  It is
not stored, it is lost when you exit Vim.  It is also cleared when 'encoding'
is set.


Finding suggestions for bad words:
                                                        *z=*
z=                      For the word under/after the cursor suggest correctly
                        spelled words.  This also works to find alternatives
                        for a word that is not highlighted as a bad word,
                        e.g., when the word after it is bad.
                        In Visual mode the highlighted text is taken as the
                        word to be replaced.
                        The results are sorted on similarity to the word being
                        replaced.
                        This may take a long time.  Hit CTRL-C when you get
                        bored.

                        If the command is used without a count the
                        alternatives are listed and you can enter the number
                        of your choice or press <Enter> if you don't want to
                        replace.  You can also use the mouse to click on your
                        choice (only works if the mouse can be used in Normal
                        mode and when there are no line wraps).  Click on the
                        first line (the header) to cancel.

                        The suggestions listed normally replace a highlighted
                        bad word.  Sometimes they include other text, in that
                        case the replaced text is also listed after a "<".

                        If a count is used that suggestion is used, without
                        prompting.  For example, "1z=" always takes the first
                        suggestion.

                        If 'verbose' is non-zero a score will be displayed
                        with the suggestions to indicate the likeness to the
                        badly spelled word (the higher the score the more
                        different).
                        When a word was replaced the redo command "." will
                        repeat the word replacement.  This works like "ciw",
                        the good word and <Esc>.  This does NOT work for Thai
                        and other languages without spaces between words.

                                        *:spellr* *:spellrepall* *E752* *E753*
:spellr[epall]          Repeat the replacement done by |z=| for all matches
                        with the replaced word in the current window.

In Insert mode, when the cursor is after a badly spelled word, you can use
CTRL-X s to find suggestions.  This works like Insert mode completion.  Use
CTRL-N to use the next suggestion, CTRL-P to go back. |i_CTRL-X_s|

The 'spellsuggest' option influences how the list of suggestions is generated
and sorted.  See |'spellsuggest'|.
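
For example, to keep the "best" scoring method but limit the list to ten
entries, so that the |z=| prompt fits on a small screen (the number is an
arbitrary choice for illustration): >

        " Sketch: "best" scoring, at most 10 suggestions listed.
        :set spellsuggest=best,10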

The 'spellcapcheck' option is used to check that the first word of a sentence
starts with a capital.  This doesn't work for the first word in the file.
When there is a line break right after a sentence the highlighting of the next
line may be postponed.  Use |CTRL-L| when needed.  Also see |set-spc-auto| for
how it can be set automatically when 'spelllang' is set.

Vim counts the number of times a good word is encountered.  This is used to
sort the suggestions: words that have been seen before get a small bonus,
words that have been seen often get a bigger bonus.  The COMMON item in the
affix file can be used to define common words, so that this mechanism also
works in a new or short file |spell-COMMON|.

==============================================================================
2. Remarks on spell checking                            *spell-remarks*

PERFORMANCE

Vim does on-the-fly spell checking.  To make this work fast the word list is
loaded in memory.  Thus this uses a lot of memory (1 Mbyte or more).  There
might also be a noticeable delay when the word list is loaded, which happens
when 'spell' is set and when 'spelllang' is set while 'spell' was already set.
To minimize the delay each word list is only loaded once, it is not deleted
when 'spelllang' is made empty or 'spell' is reset.  When 'encoding' is set
all the word lists are reloaded, thus you may notice a delay then too.


REGIONS

A word may be spelled differently in various regions.  For example, English
comes in (at least) these variants:

        en              all regions
        en_au           Australia
        en_ca           Canada
        en_gb           Great Britain
        en_nz           New Zealand
        en_us           USA

Words that are not used in one region but are used in another region are
highlighted with SpellLocal |hl-SpellLocal|.

Always use lowercase letters for the language and region names.

When adding a word with |zg| or another command it's always added for all
regions.  You can change that by manually editing the 'spellfile'.  See
|spell-wordlist-format|.  Note that the regions as specified in the files in
'spellfile' are only used when all entries in 'spelllang' specify the same
region (not counting files specified by their .spl name).
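
As a short sketch of how the region part of 'spelllang' is used (any region
from the list above works the same way): >

        " Accept British spellings; words that are only valid in another
        " English region are highlighted with SpellLocal.
        :setlocal spell spelllang=en_gb
        " Accept the spellings of all English regions.
        :setlocal spell spelllang=en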

                                                        *spell-german*
Specific exception: For German these special regions are used:
        de              all German words accepted
        de_de           old and new spelling
        de_19           old spelling
        de_20           new spelling
        de_at           Austria
        de_ch           Switzerland

                                                        *spell-russian*
Specific exception: For Russian these special regions are used:
        ru              all Russian words accepted
        ru_ru           "IE" letter spelling
        ru_yo           "YO" letter spelling

                                                        *spell-yiddish*
Yiddish requires using "utf-8" encoding, because of the special characters
used.  If you are using latin1 Vim will use transliterated (romanized) Yiddish
instead.  If you want to use transliterated Yiddish with utf-8 use "yi-tr".
In a table:
        'encoding'      'spelllang'
        utf-8           yi              Yiddish
        latin1          yi              transliterated Yiddish
        utf-8           yi-tr           transliterated Yiddish


SPELL FILES                                             *spell-load*

Vim searches for spell files in the "spell" subdirectory of the directories in
'runtimepath'.  The name is: LL.EEE.spl, where:
        LL      the language name
        EEE     the value of 'encoding'

The value for "LL" comes from 'spelllang', but excludes the region name.
Examples:
        'spelllang'     LL ~
        en_us           en
        en-rare         en-rare
        medical_ca      medical

Only the first file is loaded, the one that is first in 'runtimepath'.  If
this succeeds then additionally files with the name LL.EEE.add.spl are loaded.
All the ones that are found are used.

If no spell file is found the |SpellFileMissing| autocommand event is
triggered.  This may trigger the |spellfile.vim| plugin to offer to download
the spell file for you.

Additionally, the files related to the names in 'spellfile' are loaded.  These
are the files that |zg| and |zw| add good and wrong words to.

Exceptions:
- Vim uses "latin1" when 'encoding' is "iso-8859-15".  The euro sign doesn't
  matter for spelling.
- When no spell file for 'encoding' is found "ascii" is tried.  This only
  works for languages where nearly all words are ASCII, such as English.  It
  helps when 'encoding' is not "latin1", such as iso-8859-2, and English text
  is being edited.  For the ".add" files the same name as the found main
  spell file is used.

For example, with these values:
        'runtimepath' is "~/.vim,/usr/share/vim70,~/.vim/after"
        'encoding'    is "iso-8859-2"
        'spelllang'   is "pl"

Vim will look for:
1. ~/.vim/spell/pl.iso-8859-2.spl
2. /usr/share/vim70/spell/pl.iso-8859-2.spl
3. ~/.vim/spell/pl.iso-8859-2.add.spl
4. /usr/share/vim70/spell/pl.iso-8859-2.add.spl
5. ~/.vim/after/spell/pl.iso-8859-2.add.spl

This assumes 1. is not found and 2. is found.

If 'encoding' is "latin1" Vim will look for:
1. ~/.vim/spell/pl.latin1.spl
2. /usr/share/vim70/spell/pl.latin1.spl
3. ~/.vim/after/spell/pl.latin1.spl
4. ~/.vim/spell/pl.ascii.spl
5. /usr/share/vim70/spell/pl.ascii.spl
6. ~/.vim/after/spell/pl.ascii.spl

This assumes none of them are found (Polish doesn't make sense when leaving
out the non-ASCII characters).
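
To see which spell files are present for a language, the 'runtimepath' search
can be approximated from the command line.  This is only a sketch: the
language "pl" is taken from the example above, and the pattern lists the files
for every encoding rather than reproducing the exact lookup order Vim uses: >

        " List all Polish .spl files (main and .add.spl) in 'runtimepath'.
        :echo globpath(&runtimepath, 'spell/pl.*.spl')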

Spelling for EBCDIC is currently not supported.

A spell file might not be available in the current 'encoding'.  See
|spell-mkspell| about how to create a spell file.  Converting a spell file
with "iconv" will NOT work!

                                                    *spell-sug-file* *E781*
If there is a file with exactly the same name as the ".spl" file but ending in
".sug", that file will be used for giving better suggestions.  It isn't loaded
before suggestions are made to reduce memory use.

                                    *E758* *E759* *E778* *E779* *E780* *E782*
When loading a spell file Vim checks that it is properly formatted.  If you
get an error the file may be truncated, modified or intended for another Vim
version.


SPELLFILE CLEANUP                                       *spellfile-cleanup*

The |zw| command turns existing entries in 'spellfile' into comment lines.
This avoids having to write a new file every time, but results in the file
only getting longer, never shorter.  To clean up the comment lines in all
".add" spell files do this: >
        :runtime spell/cleanadd.vim

This deletes all comment lines, except the ones that start with "##".  Use
"##" lines to add comments that you want to keep.

You can invoke this script as often as you like.  A variable is provided to
skip updating files that have been changed recently.  Set it to the number of
seconds that must have passed since a file was last changed before it will be
cleaned.  For example, to clean only files that were not changed in the last
hour: >
        let g:spell_clean_limit = 60 * 60
The default is one second.


WORDS

Vim uses a fixed method to recognize a word.  This is independent of
'iskeyword', so that it also works in help files and for languages that
include characters like '-' in 'iskeyword'.  The word characters do depend on
'encoding'.

The table with word characters is stored in the main .spl file.  Therefore it
matters what the current locale is when generating it!  A .add.spl file does
not contain a word table though.

A word that starts with a digit is always ignored.  That includes hex numbers
in the form 0xff and 0XFF.


WORD COMBINATIONS

It is possible to spell-check words that include a space.  This is used to
recognize words that are invalid when used by themselves, e.g. for "et al.".
It can also be used to recognize "the the" and highlight it.

The number of spaces is irrelevant.  In most cases a line break may also
appear.  However, this makes it difficult to find out where to start checking
for spelling mistakes.  When you make a change to one line and only that line
is redrawn Vim won't look in the previous line, thus when "et" is at the end
of the previous line "al." will be flagged as an error.  And when you type
"the<CR>the" the highlighting doesn't appear until the first line is redrawn.
Use |CTRL-L| to redraw right away.  "[s" will also stop at a word combination
with a line break.

When encountering a line break Vim skips characters such as '*', '>' and '"',
so that comments in C, shell and Vim code can be spell checked.


SYNTAX HIGHLIGHTING                                     *spell-syntax*

Files that use syntax highlighting can specify where spell checking should be
done:

1.  everywhere                     default
2.  in specific items              use "contains=@Spell"
3.  everywhere but specific items  use "contains=@NoSpell"

For the second method adding the @NoSpell cluster will disable spell checking
again.  This can be used, for example, to add @Spell to the comments of a
program, and add @NoSpell for items that shouldn't be checked.
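
A minimal sketch of the second method, as it might appear in a syntax file.
The group names "myComment" and "myUrl" and the patterns are made up for
illustration: >

        " Spell check only inside C-style comments...
        :syntax region myComment start="/\*" end="\*/" contains=@Spell,myUrl
        " ...but not inside URLs that appear in those comments.
        :syntax match myUrl "https\=://\S\+" contained contains=@NoSpell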


VIM SCRIPTS

If you want to write a Vim script that does something with spelling, you may
find these functions useful:

    spellbadword()      find badly spelled word at the cursor
    spellsuggest()      get list of spelling suggestions
    soundfold()         get the sound-a-like version of a word
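
As a sketch of how these functions can be combined: the function name
"ShowSuggestions" is hypothetical, 'spell' must be set in the current window,
and the code assumes a Vim in which spellbadword() returns a List with the
word and its type; check the description of the function in your version: >

        " Echo up to five suggestions for the bad word at/after the cursor.
        function! ShowSuggestions()
          " Assumption: the result is [badword, kind]; badword is an empty
          " string when no bad word is found in the cursor line.
          let [bad, kind] = spellbadword()
          if empty(bad)
            echo 'no bad word found'
            return
          endif
          echo bad . ' (' . kind . '): ' . join(spellsuggest(bad, 5), ', ')
        endfunction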


SETTING 'spellcapcheck' AUTOMATICALLY                   *set-spc-auto*

After the 'spelllang' option has been set successfully, Vim will source the
files "spell/LANG.vim" in 'runtimepath'.  "LANG" is the value of 'spelllang'
up to the first comma, dot or underscore.  This can be used to set options
specifically for the language, especially 'spellcapcheck'.

The distribution includes a few of these files.  Use this command to see what
they do: >
        :next $VIMRUNTIME/spell/*.vim

Note that the default scripts don't set 'spellcapcheck' if it was changed from
the default value.  This assumes the user prefers another value then.
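
A sketch of such a file for a hypothetical language "xx" in which sentences do
not start with a capital letter.  The file location and the language name are
assumptions: >

        " ~/.vim/spell/xx.vim
        " Switch off the check for a capital at the start of a sentence.
        " Unlike the distributed scripts, this sketch does not check whether
        " the user already changed the value.
        setlocal spellcapcheck=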


DOUBLE SCORING                                          *spell-double-scoring*

The 'spellsuggest' option can be used to select "double" scoring.  This
mechanism is based on the principle that there are two kinds of spelling
mistakes:

1. You know how to spell the word, but mistype something.  This results in a
   small editing distance (character swapped/omitted/inserted) and possibly a
   word that sounds completely different.

2. You don't know how to spell the word and type something that sounds right.
   The edit distance can be big but the word is similar after sound-folding.

Since scores for these two mistakes will be very different we use a list
for each and mix them.

The sound-folding is slow and people that know the language won't make the
second kind of mistake.  Therefore 'spellsuggest' can be set to select the
preferred method for scoring the suggestions.
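
For example (a sketch; which value suits you depends on the kind of mistakes
you tend to make): >

        " Use a separate list for each kind of mistake and mix them.
        :set spellsuggest=double
        " Or only look for simple typing mistakes, which is quicker.
        :set spellsuggest=fast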

==============================================================================
3. Generating a spell file                              *spell-mkspell*

Vim uses a binary file format for spelling.  This greatly speeds up loading
the word list and keeps it small.
                                                    *.aff* *.dic* *Myspell*
You can create a Vim spell file from the .aff and .dic files that Myspell
uses.  Myspell is used by OpenOffice.org and Mozilla.  You should be able to
find them here:
        http://lingucomponent.openoffice.org/spell_dic.html
You can also use a plain word list.  The results are the same, the choice
depends on what word lists you can find.

If you install Aap (from www.a-a-p.org) you can use the recipes in the
runtime/spell/??/ directories.  Aap will take care of downloading the files,
applying the patches needed for Vim and building the .spl file.

Make sure your current locale is set properly, otherwise Vim doesn't know what
characters are upper/lower case letters.  If the locale isn't available (e.g.,
when using an MS-Windows codepage on Unix) add tables to the .aff file
|spell-affix-chars|.  If the .aff file doesn't define a table then the word
table of the currently active spelling is used.  If spelling is not active
then Vim will try to guess.

                                                        *:mksp* *:mkspell*
:mksp[ell][!] [-ascii] {outname} {inname} ...
                        Generate a Vim spell file from word lists.  Example: >
                :mkspell /tmp/nl nl_NL.words
<                                                       *E751*
                        When {outname} ends in ".spl" it is used as the output
                        file name.  Otherwise it should be a language name,
                        such as "en", without the region name.  The file
                        written will be "{outname}.{encoding}.spl", where
                        {encoding} is the value of the 'encoding' option.

                        When the output file already exists [!] must be used
                        to overwrite it.

                        When the [-ascii] argument is present, words with
                        non-ascii characters are skipped.  The resulting file
                        ends in "ascii.spl".

                        The input can be the Myspell format files {inname}.aff
                        and {inname}.dic.  If {inname}.aff does not exist then
                        {inname} is used as the file name of a plain word
                        list.

                        Multiple {inname} arguments can be given to combine
                        regions into one Vim spell file.  Example: >
                :mkspell ~/.vim/spell/en /tmp/en_US /tmp/en_CA /tmp/en_AU
<                       This combines the English word lists for US, CA and AU
                        into one en.spl file.
                        Up to eight regions can be combined. *E754* *E755*
                        The REP and SAL items of the first .aff file where
                        they appear are used. |spell-REP| |spell-SAL|

                        This command uses a lot of memory, required to find
                        the optimal word tree (Polish, Italian and Hungarian
                        require several hundred Mbyte).  The final result will
                        be much smaller, because compression is used.  To
                        avoid running out of memory compression will be done
                        now and then.  This can be tuned with the 'mkspellmem'
                        option.

                        After the spell file was written and it was being used
                        in a buffer it will be reloaded automatically.

:mksp[ell] [-ascii] {name}.{enc}.add
                        Like ":mkspell" above, using {name}.{enc}.add as the
                        input file and producing an output file in the same
                        directory that has ".spl" appended.

:mksp[ell] [-ascii] {name}
                        Like ":mkspell" above, using {name} as the input file
                        and producing an output file in the same directory
                        that has ".{enc}.spl" appended.
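
A typical use of the ".add" form is rebuilding a personal word list after
editing it by hand.  The file name below is an assumption; use the name that
appears in your 'spellfile'.  The "!" is added to overwrite an existing .spl
file, as in the ":mkspell! %" example in the quick start section: >

        " Writes en.utf-8.add.spl in the same directory as the word list.
        :mkspell! ~/.vim/spell/en.utf-8.add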

Vim will report the number of duplicate words.  This might be a mistake in the
list of words.  But sometimes it is used to have different prefixes and
suffixes for the same basic word to avoid them combining (e.g. Czech uses
this).  If you want Vim to report all duplicate words set the 'verbose'
option.

Since you might want to change a Myspell word list for use with Vim the
following procedure is recommended:

1. Obtain the xx_YY.aff and xx_YY.dic files from Myspell.
2. Make a copy of these files to xx_YY.orig.aff and xx_YY.orig.dic.
3. Change the xx_YY.aff and xx_YY.dic files to remove bad words, add missing
   words, define word characters with FOL/LOW/UPP, etc.  The distributed
   "src/spell/*.diff" files can be used.
4. Start Vim with the right locale and use |:mkspell| to generate the Vim
   spell file.
5. Try out the spell file with ":set spell spelllang=xx" if you wrote it in
   a spell directory in 'runtimepath', or ":set spelllang=xx.enc.spl" if you
   wrote it somewhere else.
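
Steps 4 and 5 might look like this for the placeholder language "xx" and
region "YY", when Vim is started in the directory that holds the .aff and .dic
files.  The output directory is an assumption; it has to be a "spell"
directory in 'runtimepath' for the short 'spelllang' value to work: >

        " Step 4: reads xx_YY.aff and xx_YY.dic, writes xx.{encoding}.spl.
        :mkspell! ~/.vim/spell/xx xx_YY
        " Step 5: try out the result.
        :setlocal spell spelllang=xx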

When the Myspell files are updated you can merge the differences:
1. Obtain the new Myspell files as xx_YY.new.aff and xx_YY.new.dic.
2. Use Vimdiff to see what changed: >
        vimdiff xx_YY.orig.dic xx_YY.new.dic
3. Take over the changes you like in xx_YY.dic.
   You may also need to change xx_YY.aff.
4. Rename xx_YY.new.dic to xx_YY.orig.dic and xx_YY.new.aff to xx_YY.orig.aff.


SPELL FILE VERSIONS                                     *E770* *E771* *E772*

Spell checking is a relatively new feature in Vim, thus it's possible that the
.spl file format will be changed to support more languages. Vim will check
the validity of the spell file and report anything wrong.

        E771: Old spell file, needs to be updated ~
This spell file is older than your Vim. You need to update the .spl file.

        E772: Spell file is for newer version of Vim ~
This means the spell file was made for a later version of Vim. You need to
update Vim.

        E770: Unsupported section in spell file ~
This means the spell file was made for a later version of Vim and contains a
section that is required for the spell file to work. In this case it's
probably a good idea to upgrade your Vim.


SPELL FILE DUMP

If for some reason you want to check what words are supported by the currently
used spelling files, use this command:

                                                        *:spelldump* *:spelld*
:spelld[ump]            Open a new window and fill it with all currently valid
                        words. Compound words are not included.
                        Note: For some languages the result may be enormous,
                        causing Vim to run out of memory.

:spelld[ump]!           Like ":spelldump" and include the word count. This is
                        the number of times the word was found while
                        updating the screen. Words that are in COMMON items
                        get a starting count of 10.

The word list format is used for the dump |spell-wordlist-format|. You should
be able to read it with ":mkspell" to generate one .spl file that includes all
the words.

When all entries to 'spelllang' use the same regions or no regions at all then
the region information is included in the dumped words. Otherwise only words
for the current region are included and no "/regions" line is generated.

Comment lines with the name of the .spl file are used as a header above the
words that were generated from that .spl file.


SPELL FILE MISSING              *spell-SpellFileMissing* *spellfile.vim*

If the spell file for the language you are using is not available, you will
get an error message. But if the "spellfile.vim" plugin is active it will
offer to download the spell file. Just follow the instructions, it will
ask you where to write the file.

The plugin has a default place where to look for spell files, on the Vim ftp
server. If you want to use another location or another protocol, set the
g:spellfile_URL variable to the directory that holds the spell files. The
|netrw| plugin is used for getting the file, look there for the specific
syntax of the URL. Example: >
        let g:spellfile_URL = 'http://ftp.vim.org/vim/runtime/spell'
You may need to escape special characters.

The plugin will only ask about downloading a language once. If you want to
try again anyway restart Vim, or set g:spellfile_URL to another value (e.g.,
prepend a space).

To avoid using the "spellfile.vim" plugin do this in your vimrc file: >

        let loaded_spellfile_plugin = 1

Instead of using the plugin you can define a |SpellFileMissing| autocommand to
handle the missing file yourself. You can use it like this: >

        :au SpellFileMissing * call Download_spell_file(expand('<amatch>'))

Thus the <amatch> item contains the name of the language. Another important
value is 'encoding', since every encoding has its own spell file. With two
exceptions:
- For ISO-8859-15 (latin9) the name "latin1" is used (the encodings only
  differ in characters not used in dictionary words).
- The name "ascii" may also be used for some languages where the words use
  only ASCII letters for most of the words.

The default "spellfile.vim" plugin uses this autocommand, if you define your
autocommand afterwards you may want to use ":au! SpellFileMissing" to overrule
it. If you define your autocommand before the plugin is loaded it will notice
this and not do anything.
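
A handler for a missing spell file has to work out which file name to look
for. The following Python sketch (not part of Vim) derives candidate names
from the language and the value of 'encoding', using the two exceptions listed
above. The function name spell_file_candidates() and the order in which the
names are tried are assumptions made for this illustration.

```python
def spell_file_candidates(lang, encoding):
    """Return spell file names to look for, most specific first.

    Hypothetical helper: the name and the try-order are assumptions.
    """
    enc = encoding.lower()
    # Exception 1: ISO-8859-15 (latin9) shares the "latin1" spell file.
    if enc in ("iso-8859-15", "latin9"):
        enc = "latin1"
    names = ["{}.{}.spl".format(lang, enc)]
    # Exception 2: some languages only provide an "ascii" spell file.
    if enc != "ascii":
        names.append("{}.ascii.spl".format(lang))
    return names
```

With 'encoding' set to "iso-8859-15" and language "nl" this yields
"nl.latin1.spl" followed by "nl.ascii.spl".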
648
Bram Moolenaar13fcaaf2005-04-15 21:13:42 +0000649==============================================================================
Bram Moolenaard042c562005-06-30 22:04:15 +00006504. Spell file format *spell-file-format*
Bram Moolenaar217ad922005-03-20 22:37:15 +0000651
Bram Moolenaar13fcaaf2005-04-15 21:13:42 +0000652This is the format of the files that are used by the person who creates and
653maintains a word list.
Bram Moolenaar217ad922005-03-20 22:37:15 +0000654
Bram Moolenaar13fcaaf2005-04-15 21:13:42 +0000655Note that we avoid the word "dictionary" here. That is because the goal of
656spell checking differs from writing a dictionary (as in the book). For
Bram Moolenaar16d8f872005-11-26 23:46:11 +0000657spelling we need a list of words that are OK, thus should not be highlighted.
658Person and company names will not appear in a dictionary, but do appear in a
659word list. And some old words are rarely used while they are common
660misspellings. These do appear in a dictionary but not in a word list.
Bram Moolenaar217ad922005-03-20 22:37:15 +0000661
Bram Moolenaar7d1f5db2005-07-03 21:39:27 +0000662There are two formats: A straight list of words and a list using affix
Bram Moolenaard042c562005-06-30 22:04:15 +0000663compression. The files with affix compression are used by Myspell (Mozilla
664and OpenOffice.org). This requires two files, one with .aff and one with .dic
665extension.
Bram Moolenaar75c50c42005-06-04 22:06:24 +0000666
667
FORMAT OF STRAIGHT WORD LIST                            *spell-wordlist-format*

The words must appear one per line. That is all that is required.

Additionally the following items are recognized:

- Empty and blank lines are ignored.

        # comment ~
- Lines starting with a # are ignored (comment lines).

        /encoding=utf-8 ~
- A line starting with "/encoding=", before any word, specifies the encoding
  of the file. After the '=' comes an encoding name. This tells Vim
  to set up conversion from the specified encoding to 'encoding'. Thus you can
  use one word list for several target encodings.

        /regions=usca ~
- A line starting with "/regions=" specifies the region names that are
  supported. Each region name must be two ASCII letters. The first one is
  region 1. Thus "/regions=usca" has region 1 "us" and region 2 "ca".
  In an addition word list the region names should be equal to the main word
  list!

- Other lines starting with '/' are reserved for future use. The ones that
  are not recognized are ignored. You do get a warning message, so that you
  know something won't work.

- A "/" may follow the word with the following items:
    =           Case must match exactly.
    ?           Rare word.
    !           Bad (wrong) word.
    digit       A region in which the word is valid. If no regions are
                specified the word is valid in all regions.

Example:

        # This is an example word list          comment
        /encoding=latin1                        encoding of the file
        /regions=uscagb                         regions "us", "ca" and "gb"
        example                                 word for all regions
        blah/12                                 word for regions "us" and "ca"
        vim/!                                   bad word
        Campbell/?3                             rare word in region 3 "gb"
        's mornings/=                           keep-case word

Note that when "/=" is used the same word with all upper-case letters is not
accepted. This is different from a word with mixed case that is automatically
marked as keep-case, those words may appear in all upper-case letters.
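
To make the rules for a straight word list concrete, here is a minimal Python
sketch (not Vim's implementation) that reads such a list. The function name
parse_wordlist() and the layout of the returned dictionary are assumptions
made for this example. Stripping trailing white space is also an assumption,
and the explanatory column shown in the example above is not part of a real
file.

```python
def parse_wordlist(lines):
    """Parse the lines of a straight word list (illustrative sketch only)."""
    info = {"encoding": None, "regions": [], "words": []}
    for raw in lines:
        line = raw.rstrip()  # assumption: trailing white space is dropped
        if not line or line.startswith("#"):
            continue  # empty, blank and comment lines are ignored
        if line.startswith("/encoding="):
            info["encoding"] = line[len("/encoding="):]
            continue
        if line.startswith("/regions="):
            names = line[len("/regions="):]
            # Each region name is two ASCII letters; the first is region 1.
            info["regions"] = [names[i:i + 2] for i in range(0, len(names), 2)]
            continue
        if line.startswith("/"):
            continue  # reserved for future use; Vim warns and ignores it
        word, _, items = line.partition("/")
        info["words"].append({
            "word": word,
            "keepcase": "=" in items,   # case must match exactly
            "rare": "?" in items,       # rare word
            "bad": "!" in items,        # bad (wrong) word
            "regions": [int(c) for c in items if c in "0123456789"],
        })
    return info
```

An empty "regions" list for a word stands for "valid in all regions".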


FORMAT WITH .AFF AND .DIC FILES                         *aff-dic-format*

There are two files: the basic word list and an affix file. The affix file
specifies settings for the language and can contain affixes. The affixes are
used to modify the basic words to get the full word list. This significantly
reduces the number of words, especially for a language like Polish. This is
called affix compression.

The basic word list and the affix file are combined with the ":mkspell"
command, which results in a binary spell file. All the preprocessing has been
done, thus this file loads fast. The binary spell file format is described in
the source code (src/spell.c). But only developers need to know about it.

The preprocessing also allows us to take the Myspell language files and modify
them before the Vim word list is made. The tools for this can be found in the
"src/spell" directory.

The format for the affix and word list files is based on what Myspell uses
(the spell checker of Mozilla and OpenOffice.org). A description can be found
here:
        http://lingucomponent.openoffice.org/affix.readme ~
Note that affixes are case sensitive, this isn't obvious from the description.

Vim supports quite a few extras. They are described below |spell-affix-vim|.
Attempts have been made to keep this compatible with other spell checkers, so
that the same files can often be used. One other project that offers more
than Myspell is Hunspell ( http://hunspell.sf.net ).


WORD LIST FORMAT                                        *spell-dic-format*

A short example, with line numbers:

        1       1234 ~
        2       aan ~
        3       Als ~
        4       Etten-Leur ~
        5       et al. ~
        6       's-Gravenhage ~
        7       's-Gravenhaags ~
        8       # word that differs between regions ~
        9       kado/1 ~
        10      cadeau/2 ~
        11      TCP,IP ~
        12      /the S affix may add a 's' ~
        13      bedel/S ~

The first line contains the number of words. Vim ignores it, but you do get
an error message if it's not there. *E760*

What follows is one word per line. White space at the end of the line is
ignored, all other white space matters. The encoding is specified in the
affix file |spell-SET|.

Comment lines start with '#' or '/'. See the example lines 8 and 12. Note
that putting a comment after a word is NOT allowed:

        someword # comment that causes an error! ~

After the word there is an optional slash and flags. Most of these flags are
letters that indicate the affixes that can be used with this word. These are
specified with SFX and PFX lines in the .aff file, see |spell-SFX| and
|spell-PFX|. Vim allows using other flag types with the FLAG item in the
affix file |spell-FLAG|.

When the word only has lower-case letters it will also match with the word
starting with an upper-case letter.

When the word includes an upper-case letter, this means the upper-case letter
is required at this position. The same word with a lower-case letter at this
position will not match. When some of the other letters are upper-case it will
not match either.

The word with all upper-case characters will always be OK.

        word list       matches         does not match ~
        als             als Als ALS     ALs AlS aLs aLS
        Als             Als ALS         als ALs AlS aLs aLS
        ALS             ALS             als Als ALs AlS aLs aLS
        AlS             AlS ALS         als Als ALs aLs aLS

The KEEPCASE affix ID can be used to specifically match a word with identical
case only, see below |spell-KEEPCASE|.
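
The case rules in the table above can be summarized in a few lines. This is a
Python sketch of the rules as described here, not the code Vim uses. The
function name dic_word_matches() is made up for the example, and words marked
with KEEPCASE are deliberately not handled.

```python
def dic_word_matches(dic_word, text_word):
    """Return True when text_word is accepted for the entry dic_word."""
    if text_word == dic_word:
        return True   # identical case always matches
    if text_word == dic_word.upper():
        return True   # the all upper-case form is always OK
    if dic_word == dic_word.lower():
        # An all lower-case entry also matches with a leading capital.
        return text_word == dic_word[:1].upper() + dic_word[1:]
    return False
```

For the entry "als" this accepts "als", "Als" and "ALS" and rejects "ALs",
which is the first row of the table.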

Note: in lines 5 to 7 non-word characters are used. You can include any
character in a word. When checking the text a word still only matches when it
appears with a non-word character before and after it. For Myspell a word
starting with a non-word character probably won't work.

In line 11 the word "TCP/IP" is defined. Since the slash has a special
meaning the comma is used instead. This is defined with the SLASH item in the
affix file, see |spell-SLASH|. Note that without this SLASH item the word
will be "TCP,IP".


AFFIX FILE FORMAT                       *spell-aff-format* *spell-affix-vim*

                                                        *spell-affix-comment*
Comment lines in the .aff file start with a '#':

        # comment line ~

With some items it's also possible to put a comment after it, but this isn't
supported in general.


ENCODING                                                *spell-SET*

The affix file can be in any encoding that is supported by "iconv". However,
in some cases the current locale should also be set properly at the time
|:mkspell| is invoked. Adding FOL/LOW/UPP lines removes this requirement
|spell-FOL|.

The encoding should be specified before anything where the encoding matters.
The encoding applies both to the affix file and the dictionary file. It is
done with a SET line:

        SET utf-8 ~

The encoding can be different from the value of the 'encoding' option at the
time ":mkspell" is used. Vim will then convert everything to 'encoding' and
generate a spell file for 'encoding'. If some of the used characters do not
fit in 'encoding' you will get an error message.
                                                        *spell-affix-mbyte*
When using a multi-byte encoding it's possible to use more different affix
flags. But Myspell doesn't support that, thus you may not want to use it
anyway. For compatibility use an 8-bit encoding.


INFORMATION

These entries in the affix file can be used to add information to the spell
file. There are no restrictions on the format, but they should be in the
right encoding.

                                *spell-NAME* *spell-VERSION* *spell-HOME*
                                *spell-AUTHOR* *spell-EMAIL* *spell-COPYRIGHT*
        NAME            Name of the language
        VERSION         1.0.1 with fixes
        HOME            http://www.myhome.eu
        AUTHOR          John Doe
        EMAIL           john AT Doe DOT net
        COPYRIGHT       LGPL

These fields are put in the .spl file as-is. The |:spellinfo| command can be
used to view the info.

                                                        *:spellinfo* *:spelli*
:spelli[nfo]            Display the information for the spell file(s) used for
                        the current buffer.


CHARACTER TABLES
                                                        *spell-affix-chars*
When using an 8-bit encoding the affix file should define what characters are
word characters. This is because the system where ":mkspell" is used may not
support a locale with this encoding and isalpha() won't work. For example
when using "cp1250" on Unix.
                                                *E761* *E762* *spell-FOL*
                                                *spell-LOW* *spell-UPP*
Three lines in the affix file are needed. Simplistic example:

        FOL  áëñ ~
        LOW  áëñ ~
        UPP  ÁËÑ ~

All three lines must have exactly the same number of characters.

The "FOL" line specifies the case-folded characters. These are used to
compare words while ignoring case. For most encodings this is identical to
the lower case line.

The "LOW" line specifies the characters in lower-case. Mostly it's equal to
the "FOL" line.

The "UPP" line specifies the characters with upper-case. That is, a character
is upper-case where it's different from the character at the same position in
"FOL".

An exception is made for the German sharp s ß. The upper-case version is
"SS". In the FOL/LOW/UPP lines it should be included, so that it's recognized
as a word character, but use the ß character in all three.

ASCII characters should be omitted, Vim always handles these in the same way.
When the encoding is UTF-8 no word characters need to be specified.
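
How the three lines relate to each other can be shown with a short Python
sketch. It is only an illustration of the description above. The function
name build_case_tables() and the three returned tables are assumptions, and
Vim stores this information differently.

```python
def build_case_tables(fol, low, upp):
    """Build lookup tables from the FOL, LOW and UPP character strings."""
    if not (len(fol) == len(low) == len(upp)):
        raise ValueError("FOL, LOW and UPP must have the same length")
    fold = {}         # character -> case-folded character
    upper = {}        # lower-case character -> upper-case character
    is_upper = set()  # characters that count as upper-case
    for f, l, u in zip(fol, low, upp):
        fold[f] = fold[l] = fold[u] = f
        upper[l] = u
        if u != f:
            # Upper-case means: differs from the same position in FOL.
            is_upper.add(u)
    return fold, upper, is_upper
```

With the sharp s included as "ß" in all three lines it becomes a word
character without an upper-case form, as described above.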

                                                        *E763*
Vim allows you to use spell checking for several languages in the same file.
You can list them in the 'spelllang' option. As a consequence all spell files
for the same encoding must use the same word characters, otherwise they can't
be combined without errors. If you get a warning that the word tables differ
you may need to generate the .spl file again with |:mkspell|. Check the FOL,
LOW and UPP lines in the used .aff file.

The XX.ascii.spl spell file generated with the "-ascii" argument will not
contain the table with characters, so that it can be combined with spell files
for any encoding. The .add.spl files also do not contain the table.


MID-WORD CHARACTERS
                                                        *spell-midword*
Some characters are only to be considered word characters if they are used in
between two ordinary word characters. An example is the single quote: It is
often used to put text in quotes, thus it can't be recognized as a word
character, but when it appears in between word characters it must be part of
the word. This is needed to detect a spelling error such as they'are. That
should be they're, but since "they" and "are" are words themselves that would
go unnoticed.

These characters are defined with MIDWORD in the .aff file. Example:

        MIDWORD '- ~
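
The effect of MIDWORD on splitting text into words can be sketched in Python
as follows. This is an illustration only: the function name split_words() is
made up, and str.isalpha() stands in for the word character tables that Vim
really uses.

```python
def split_words(text, midword="'-"):
    """Split text into words, honouring mid-word characters."""
    words = []
    current = []
    for i, c in enumerate(text):
        if c.isalpha():
            current.append(c)
        elif (c in midword and current
              and i + 1 < len(text) and text[i + 1].isalpha()):
            # A mid-word character counts only between two word characters.
            current.append(c)
        else:
            if current:
                words.append("".join(current))
            current = []
    if current:
        words.append("".join(current))
    return words
```

The text "they'are" is then kept as one word, which can be flagged as bad,
while the quotes around 'quoted' are not part of the word.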


FLAG TYPES                                              *spell-FLAG*

Flags are used to specify the affixes that can be used with a word and for
other properties of the word. Normally single-character flags are used. This
limits the number of possible flags, especially for 8-bit encodings. The FLAG
item can be used if more affixes are to be used. Possible values:

        FLAG long       use two-character flags
        FLAG num        use numbers, from 1 up to 65000
        FLAG caplong    use one-character flags without A-Z and two-character
                        flags that start with A-Z

With "FLAG num" the numbers in a list of affixes need to be separated with a
comma: "234,2143,1435". This method is inefficient, but useful if the file is
generated with a program.

When using "caplong" the two-character flags all start with a capital: "Aa",
"B1", "BB", etc. This is useful to use one-character flags for the most
common items and two-character flags for uncommon items.

Note: When using utf-8 only characters up to 65000 may be used for flags.
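
The following Python sketch shows how a flag string after the slash could be
split for each flag type. It follows the description above and is not taken
from Vim. The function name split_flags() and the use of "default" for the
single-character type are assumptions made for the example.

```python
def split_flags(flags, flag_type="default"):
    """Split the flags of a word according to the FLAG type."""
    if flag_type == "default":
        return list(flags)  # one character per flag
    if flag_type == "long":
        if len(flags) % 2:
            raise ValueError("odd number of characters for FLAG long")
        return [flags[i:i + 2] for i in range(0, len(flags), 2)]
    if flag_type == "num":
        numbers = [int(n) for n in flags.split(",") if n]
        if any(n < 1 or n > 65000 for n in numbers):
            raise ValueError("flag number out of range")
        return numbers
    if flag_type == "caplong":
        result = []
        i = 0
        while i < len(flags):
            # A capital A-Z starts a two-character flag.
            size = 2 if "A" <= flags[i] <= "Z" else 1
            if i + size > len(flags):
                raise ValueError("incomplete two-character flag")
            result.append(flags[i:i + size])
            i += size
        return result
    raise ValueError("unknown flag type: " + flag_type)
```

For example, with "FLAG caplong" the string "aAab1" holds the four flags "a",
"Aa", "b" and "1".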


AFFIXES
                                                *spell-PFX* *spell-SFX*
The usual PFX (prefix) and SFX (suffix) lines are supported (see the Myspell
documentation or the Aspell manual:
http://aspell.net/man-html/Affix-Compression.html).

Summary:
        SFX L Y 2 ~
        SFX L 0 re [^x] ~
        SFX L 0 ro x ~

The first line is a header and has four fields:
        SFX {flag} {combine} {count}

{flag}          The name used for the suffix. Mostly it's a single letter,
                but other characters can be used, see |spell-FLAG|.

{combine}       Can be 'Y' or 'N'. When 'Y' then the word plus suffix can
                also have a prefix. When 'N' then a prefix is not allowed.

{count}         The number of lines following. If this is wrong you will get
                an error message.

For PFX the fields are exactly the same.

The basic format for the following lines is:
        SFX {flag} {strip} {add} {condition} {extra}

{flag}          Must be the same as the {flag} used in the first line.

{strip}         Characters removed from the basic word. There is no check if
                the characters are actually there, only the length is used (in
                bytes). This should match the {condition}, otherwise strange
                things may happen. If the {strip} length is equal to or
                longer than the basic word the suffix won't be used.
                When {strip} is 0 (zero) then nothing is stripped.

{add}           Characters added to the basic word, after removing {strip}.
                Optionally there is a '/' followed by flags. The flags apply
                to the word plus affix. See |spell-affix-flags|

{condition}     A simplistic pattern. Only when this matches with a basic
                word will the suffix be used for that word. This is normally
                for using one suffix letter with different {add} and {strip}
                fields for words with different endings.
                When {condition} is a . (dot) there is no condition.
                The pattern may contain:
                - Literal characters.
                - A set of characters in []. [abc] matches a, b and c.
                  A dash is allowed for a range [a-c], but this is
                  Vim-specific.
                - A set of characters that starts with a ^, meaning the
                  complement of the specified characters. [^abc] matches any
                  character but a, b and c.

{extra}         Optional extra text:
                    # comment           Comment is ignored
                    -                   Hunspell uses this, ignored

For PFX the fields are the same, but the {strip}, {add} and {condition} apply
to the start of the word.
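
To show how {strip}, {add} and {condition} work together, here is a minimal
Python sketch that applies one SFX line to a basic word. It is not how Vim
does it. The function names are made up, the {condition} is translated to a
Python regular expression for convenience, and the {strip} length is counted
in characters instead of bytes.

```python
import re

def condition_to_regex(condition):
    """Translate a {condition} into a Python regular expression."""
    if condition == ".":
        return ""  # a single dot means there is no condition
    parts = []
    i = 0
    while i < len(condition):
        if condition[i] == "[":
            end = condition.index("]", i + 1)
            parts.append(condition[i:end + 1])  # keep [abc], [^abc], [a-c]
            i = end + 1
        else:
            parts.append(re.escape(condition[i]))  # literal character
            i += 1
    return "".join(parts)

def apply_suffix(word, strip, add, condition):
    """Return word with the suffix applied, or None when it is not used."""
    if not re.search(condition_to_regex(condition) + r"\Z", word):
        return None  # the condition must match the end of the word
    count = 0 if strip == "0" else len(strip)
    if count >= len(word):
        return None  # {strip} as long as the word: suffix is not used
    # Only the length of {strip} is used, the characters are not checked.
    return word[:len(word) - count] + add
```

With the German example lines from this section, apply_suffix("Spion", "0",
"in", "[^i]n") gives "Spionin", while the same line is not used for
"Bauerin" because its condition does not match.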

Note: Myspell ignores any extra text after the relevant info. Vim requires
this text to start with a "#" so that mistakes don't go unnoticed. Example:

        SFX F 0 in   [^i]n      # Spion > Spionin ~
        SFX F 0 nen  in         # Bauerin > Bauerinnen ~

Apparently Myspell allows an affix name to appear more than once. Since this
might also be a mistake, Vim checks for an extra "S". The affix files for
Myspell that use this feature apparently have this flag. Example:

        SFX a Y 1 S ~
        SFX a 0 an . ~

        SFX a Y 2 S ~
        SFX a 0 en . ~
        SFX a 0 on . ~


AFFIX FLAGS                                             *spell-affix-flags*

This is a feature that comes from Hunspell: The affix may specify flags. This
works similarly to flags specified on a basic word. The flags apply to the
basic word plus the affix (but there are restrictions). Example:

        SFX S Y 1 ~
        SFX S 0 s . ~

        SFX A Y 1 ~
        SFX A 0 able/S . ~

When the dictionary file contains "drink/AS" then these words are possible:

        drink
        drinks          uses S suffix
        drinkable       uses A suffix
        drinkables      uses A suffix and then S suffix

Generally the flags of the suffix are added to the flags of the basic word,
both are used for the word plus suffix. But the flags of the basic word are
only used once for affixes, except that both one prefix and one suffix can be
used when both support combining.
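
The "drink/AS" example can be reproduced with a small Python sketch. It is a
simplification made for illustration: only suffixes are handled, every rule
strips nothing and has no condition, and the names expand() and SUFFIXES are
invented for the example.

```python
# Each flag maps to a list of (add, flags-on-the-affix) rules.
SUFFIXES = {
    "S": [("s", "")],
    "A": [("able", "S")],   # "able/S": the S suffix may follow
}

def expand(word, flags, suffixes):
    """Return all words that word/flags can produce, sorted."""
    words = {word}
    for flag in flags:
        for add, affix_flags in suffixes.get(flag, []):
            first = word + add
            words.add(first)
            # Flags on the affix allow one more suffix, and only one.
            for flag2 in affix_flags:
                for add2, _ in suffixes.get(flag2, []):
                    words.add(first + add2)
    return sorted(words)
```

Calling expand("drink", "AS", SUFFIXES) gives the four words listed above.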

Specifically, the affix flags can be used for:
- Suffixes on suffixes, as in the example above. This works once, thus you
  can have two suffixes on a word (plus one prefix).
- Making the word with the affix rare, by using the |spell-RARE| flag.
- Exclude the word with the affix from compounding, by using the
  |spell-COMPOUNDFORBIDFLAG| flag.
- Allow the word with the affix to be part of a compound word on the side of
  the affix with the |spell-COMPOUNDPERMITFLAG|.
- Use the NEEDCOMPOUND flag: word plus affix can only be used as part of a
  compound word. |spell-NEEDCOMPOUND|
- Compound flags: word plus affix can be part of a compound word at the end,
  middle, start, etc. The flags are combined with the flags of the basic
  word. |spell-compound|
- NEEDAFFIX: another affix is needed to make a valid word.
- CIRCUMFIX, as explained just below.


CIRCUMFIX                                               *spell-CIRCUMFIX*

The CIRCUMFIX flag means a prefix and suffix must be added at the same time.
If a prefix has the CIRCUMFIX flag then only suffixes with the CIRCUMFIX flag
can be added, and the other way around.
An alternative is to only specify the suffix, and give that suffix two
flags: The required prefix and the NEEDAFFIX flag. |spell-NEEDAFFIX|


PFXPOSTPONE                                             *spell-PFXPOSTPONE*

When an affix file has very many prefixes that apply to many words it's not
possible to build the whole word list in memory. This applies to Hebrew (a
list with all words is over a Gbyte). In that case applying prefixes must be
postponed. This makes spell checking slower. It is indicated by this keyword
in the .aff file:

        PFXPOSTPONE ~

Only prefixes without a chop string and without flags can be postponed.
Prefixes with a chop string or with flags will still be included in the word
list. An exception is made if the chop string is one character and equal to
the last character of the added string, but in lower case. That is the case
when the chop string is used to allow the following word to start with an
upper case letter.


WORDS WITH A SLASH                                      *spell-SLASH*

The slash is used in the .dic file to separate the basic word from the affix
letters and other flags. Unfortunately, this means you cannot use a slash in
a word. Thus "TCP/IP" is not a word but "TCP" with the flags "IP". To include
a slash in the word put a backslash before it: "TCP\/IP". In the rare case
you want to use a backslash inside a word you need to use two backslashes.
Any other use of the backslash is reserved for future expansion.
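
The backslash rules can be illustrated with a Python sketch that splits a .dic
line into the word and its flags. This is not Vim's code. The function name
split_dic_line() is made up, and keeping a backslash in any other position
unchanged is an assumption, since that use is reserved.

```python
def split_dic_line(line):
    """Split a .dic line into (word, flags) at the first unescaped slash."""
    word = []
    i = 0
    while i < len(line):
        c = line[i]
        if c == "\\" and i + 1 < len(line) and line[i + 1] in "/\\":
            word.append(line[i + 1])  # "\/" is a slash, "\\" a backslash
            i += 2
        elif c == "/":
            return "".join(word), line[i + 1:]  # the rest are the flags
        else:
            word.append(c)
            i += 1
    return "".join(word), ""
```

The line "TCP\/IP" then gives the word "TCP/IP" without flags, while "TCP/IP"
gives the word "TCP" with the flags "IP".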


KEEP-CASE WORDS                                         *spell-KEEPCASE*

In the affix file a KEEPCASE line can be used to define the affix name used
for keep-case words. Example:

        KEEPCASE = ~

This flag is not supported by Myspell. It has the meaning that case matters.
This can be used if the word does not have the first letter in upper case at
the start of a sentence. Example:

        word list       matches                 does not match ~
        's morgens/=    's morgens              'S morgens 's Morgens 'S MORGENS
        's Morgens      's Morgens 'S MORGENS   'S morgens 's morgens

The flag can also be used to prevent the word from matching when it is in all
upper-case letters.


RARE WORDS                                              *spell-RARE*

In the affix file a RARE line can be used to define the affix name used for
rare words. Example:

        RARE ? ~

Rare words are highlighted differently from bad words. This is to be used for
words that are correct for the language, but are hardly ever used and could be
a typing mistake anyway. When the same word is found as good it won't be
highlighted as rare.

This flag can also be used on an affix, so that a basic word is not rare but
the basic word plus affix is rare |spell-affix-flags|. However, if the word
also appears as a good word in another way (e.g., in another region) it won't
be marked as rare.


Bram Moolenaar6f16eb82005-08-23 21:02:42 +00001150BAD WORDS *spell-BAD*
Bram Moolenaarae5bce12005-08-15 21:41:48 +00001151
Bram Moolenaar30abd282005-06-22 22:35:10 +00001152In the affix file a BAD line can be used to define the affix name used for
1153bad words. Example:
1154
1155 BAD ! ~
1156
1157This can be used to exclude words that would otherwise be good. For example
Bram Moolenaar9a50b1b2005-06-27 22:48:21 +00001158"the the" in the .dic file:
1159
1160 the the/! ~
1161
1162Once a word has been marked as bad it won't be undone by encountering the same
1163word as good.
Bram Moolenaar45eeb132005-06-06 21:59:07 +00001164
Bram Moolenaar4770d092006-01-12 23:22:24 +00001165The flag also applies to the word with affixes, thus this can be used to mark
1166a whole bunch of related words as bad.
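
The precedence between good, rare and bad entries for the same word can be
summarized in a short Python sketch. The function name and the status strings
are made up for the example, and letting "bad" win over "rare" is an
assumption; this is not how Vim stores words.

```python
def effective_status(statuses):
    """Combine the statuses of every entry found for one word."""
    found = set(statuses)
    if "bad" in found:
        # A bad entry is not undone by encountering the word as good.
        return "bad"
    if "good" in found:
        # A good entry keeps the word from being highlighted as rare.
        return "good"
    if "rare" in found:
        return "rare"
    return "unknown"
```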

							*spell-NEEDAFFIX*
The NEEDAFFIX flag is used to require that a word is used with an affix. The
word itself is not a good word (unless there is an empty affix). Example:

	NEEDAFFIX + ~


COMPOUND WORDS						*spell-compound*

A compound word is a longer word made by concatenating words that appear in
the .dic file. To specify which words may be concatenated a character is
used. This character is put in the list of affixes after the word. We will
call this character a flag here. Obviously these flags must be different from
any affix IDs used.

							*spell-COMPOUNDFLAG*
The Myspell compatible method uses one flag, specified with COMPOUNDFLAG. All
words with this flag combine in any order. This means there is no control
over which word comes first. Example:
	COMPOUNDFLAG c ~

							*spell-COMPOUNDRULE*
A more advanced method to specify how compound words can be formed uses
multiple items with multiple flags. This is not compatible with Myspell 3.0.
Let's start with an example:
	COMPOUNDRULE c+ ~
	COMPOUNDRULE se ~

The first line defines that words with the "c" flag can be concatenated in any
order. The second line defines compound words that are made of one word with
the "s" flag and one word with the "e" flag. With this dictionary:
	bork/c ~
	onion/s ~
	soup/e ~

You can make these words:
	bork
	borkbork
	borkborkbork
	(etc.)
	onion
	soup
	onionsoup

The COMPOUNDRULE item may appear multiple times. The argument is made out of
one or more groups, where each group can be:
	one flag			e.g., c
	alternate flags inside []	e.g., [abc]
Optionally this may be followed by:
	*	the group appears zero or more times, e.g., sm*e
	+	the group appears one or more times, e.g., c+

This is similar to the regexp pattern syntax (but not the same!). A few
examples with the sequence of word flags they require:
    COMPOUNDRULE x+	    x xx xxx etc.
    COMPOUNDRULE yz	    yz
    COMPOUNDRULE x+z	    xz xxz xxxz etc.
    COMPOUNDRULE yx+	    yx yxx yxxx etc.

    COMPOUNDRULE [abc]z	    az bz cz
    COMPOUNDRULE [abc]+z    az aaz abaz bz baz bcbz cz caz cbaz etc.
    COMPOUNDRULE a[xyz]+    ax axx axyz ay ayx ayzz az azy azxy etc.
    COMPOUNDRULE sm*e	    se sme smme smmme etc.
    COMPOUNDRULE s[xyz]*e   se sxe sxye sxyxe sye syze sze szye szyxe etc.
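
To experiment with rules like these, the COMPOUNDRULE argument can be
translated into a regular expression over the sequence of word flags. The
Python sketch below does that; the names compile_rule and flags_allowed are
made up for the example, and it only covers the syntax listed above (single
flags, [] groups, "*" and "+"). It is an illustration, not Vim's matching
code.

```python
import re


def compile_rule(rule):
    """Translate a COMPOUNDRULE argument into a regex over flag sequences."""
    parts = []
    i = 0
    while i < len(rule):
        ch = rule[i]
        if ch == "[":
            # Alternate flags: copy them into a character class, escaped so
            # that flags such as "-" or "^" are taken literally.
            end = rule.index("]", i)
            flags = rule[i + 1:end]
            parts.append("[" + "".join(re.escape(f) for f in flags) + "]")
            i = end + 1
        elif ch in "*+":
            # Repeat counts mean the same as in a regex.
            parts.append(ch)
            i += 1
        else:
            # A single flag, taken literally.
            parts.append(re.escape(ch))
            i += 1
    return re.compile("".join(parts))


def flags_allowed(rule, flags):
    """Return True when the whole flag sequence fits the rule."""
    return compile_rule(rule).fullmatch(flags) is not None
```

For example, flags_allowed("x+z", "xxz") is true, while
flags_allowed("x+z", "z") is false because "x" must appear at least once.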

A specific example: Allow a compound to be made of two words and a dash:
	In the .aff file:
		COMPOUNDRULE sde ~
		NEEDAFFIX x ~
		COMPOUNDWORDMAX 3 ~
		COMPOUNDMIN 1 ~
	In the .dic file:
		start/s ~
		end/e ~
		-/xd ~

This allows for the word "start-end", but not "startend".

An additional implied rule is that, without further flags, a word with a
prefix cannot be compounded after another word, and a word with a suffix
cannot be compounded with a following word. Thus the affix cannot appear
on the inside of a compound word. This can be changed with the
|spell-COMPOUNDPERMITFLAG|.

							*spell-NEEDCOMPOUND*
The NEEDCOMPOUND flag is used to require that a word is used as part of a
compound word. The word itself is not a good word. Example:

	NEEDCOMPOUND & ~

							*spell-COMPOUNDMIN*
The minimal character length of a word used for compounding is specified with
COMPOUNDMIN. Example:
	COMPOUNDMIN 5 ~

When omitted there is no minimal length. Obviously you could just leave out
the compound flag from short words instead; this feature is present for
compatibility with Myspell.

							*spell-COMPOUNDWORDMAX*
The maximum number of words that can be concatenated into a compound word is
specified with COMPOUNDWORDMAX. Example:
	COMPOUNDWORDMAX 3 ~

When omitted there is no maximum. It applies to all compound words.

To set a limit for words with specific flags make sure the items in
COMPOUNDRULE where they appear don't allow too many words.

							*spell-COMPOUNDSYLMAX*
The maximum number of syllables that a compound word may contain is specified
with COMPOUNDSYLMAX. Example:
	COMPOUNDSYLMAX 6 ~

This has no effect if there is no SYLLABLE item. Without COMPOUNDSYLMAX there
is no limit on the number of syllables.

If both COMPOUNDWORDMAX and COMPOUNDSYLMAX are defined, a compound word is
accepted if it fits one of the criteria, thus is either made from up to
COMPOUNDWORDMAX words or contains up to COMPOUNDSYLMAX syllables.
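
How the two limits combine can be written down as a small Python sketch. The
function and parameter names are made up for the example; None stands for an
item that is not defined in the affix file. This only restates the rule above
and is not taken from Vim's source.

```python
def compound_size_ok(words, syllables, wordmax=None, sylmax=None):
    """Check a compound against COMPOUNDWORDMAX and COMPOUNDSYLMAX."""
    if wordmax is None and sylmax is None:
        # Neither limit is defined: any size is accepted.
        return True
    fits_words = wordmax is not None and words <= wordmax
    fits_syllables = sylmax is not None and syllables <= sylmax
    # With both limits defined, fitting either one is enough.
    return fits_words or fits_syllables
```

With COMPOUNDWORDMAX 3 and COMPOUNDSYLMAX 6 a compound of four words and five
syllables is accepted, a compound of four words and seven syllables is not.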

							*spell-COMPOUNDFORBIDFLAG*
The COMPOUNDFORBIDFLAG specifies a flag that can be used on an affix. It
means that the word plus affix cannot be used in a compound word. Example:
	affix file:
		COMPOUNDFLAG c ~
		COMPOUNDFORBIDFLAG x ~
		SFX a Y 2 ~
		SFX a 0 s . ~
		SFX a 0 ize/x . ~
	dictionary:
		word/c ~
		util/ac ~

This allows for "wordutil" and "wordutils" but not "wordutilize".
Note: this doesn't work for postponed prefixes yet.

							*spell-COMPOUNDPERMITFLAG*
The COMPOUNDPERMITFLAG specifies a flag that can be used on an affix. It
means that the word plus affix can also be used in a compound word in a way
where the affix ends up in the middle of the word. Without this flag that is
not allowed.
Note: this doesn't work for postponed prefixes yet.

							*spell-COMPOUNDROOT*
The COMPOUNDROOT flag is used for words in the dictionary that are already a
compound. This means it counts for two words when checking the compounding
rules. It can also be used on an affix, to count the affix as a compounding
word.

							*spell-SYLLABLE*
The SYLLABLE item defines characters or character sequences that are used to
count the number of syllables in a word. Example:
	SYLLABLE aáeéiíoóöõuúüûy/aa/au/ea/ee/ei/ie/oa/oe/oo/ou/uu/ui ~

Before the first slash is the set of characters that are counted for one
syllable, also when repeated and mixed, until the next character that is not
in this set. After the slash come sequences of characters that are counted
for one syllable. These are preferred over using characters from the set.
With the example "ideeen" has three syllables, counted by "i", "ee" and "e".

Only case-folded letters need to be included.
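
The counting rule can be tried out with the following Python sketch. The
function name count_syllables and the choice to use the longest matching
sequence at each position are assumptions made for the example; it follows
the description above and is not a copy of Vim's code.

```python
def count_syllables(word, item):
    """Count syllables in word using the argument of a SYLLABLE item."""
    # Before the first slash: the set of characters; after it: sequences.
    chars, *sequences = item.split("/")
    chars = set(chars)
    count = 0
    in_run = False      # True while inside a run of characters from the set
    i = 0
    while i < len(word):
        # Sequences are preferred over single characters from the set.
        match = max((s for s in sequences if s and word.startswith(s, i)),
                    key=len, default="")
        if match:
            count += 1
            in_run = False
            i += len(match)
            continue
        if word[i] in chars:
            if not in_run:
                # A run of set characters counts as one syllable.
                count += 1
                in_run = True
        else:
            in_run = False
        i += 1
    return count
```

With the SYLLABLE argument from the example above,
count_syllables("ideeen", ...) counts "i", "ee" and "e", giving three.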

Another way to restrict compounding was mentioned above: Adding the
|spell-COMPOUNDFORBIDFLAG| flag to an affix causes all words that are made
with that affix not to be used for compounding.


UNLIMITED COMPOUNDING					*spell-NOBREAK*

For some languages, such as Thai, there is no space in between words. This
looks like all words are compounded. To specify this use the NOBREAK item in
the affix file, without arguments:
	NOBREAK ~

Vim will try to figure out where one word ends and the next starts. When there
are spelling mistakes this may not be quite right.


							*spell-COMMON*
Common words can be specified with the COMMON item. This will give better
suggestions when editing a short file. Example:

	COMMON the of to and a in is it you that he was for on are ~

The words must be separated by white space, up to 25 per line.
When multiple regions are specified in a ":mkspell" command the common words
for all regions are combined and used for all regions.
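
When the list of common words comes from a script, it has to be split to stay
within 25 words per line. A minimal Python sketch, with the made-up function
name common_lines:

```python
def common_lines(words, per_line=25):
    """Format words as COMMON items with at most per_line words each."""
    return ["COMMON " + " ".join(words[i:i + per_line])
            for i in range(0, len(words), per_line)]
```

A list of 60 words results in three COMMON lines, holding 25, 25 and 10
words.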

							*spell-NOSPLITSUGS*
This item indicates that splitting a word to make suggestions is not a good
idea. Split-word suggestions will appear only when there are few similar
words.

	NOSPLITSUGS ~

							*spell-NOSUGGEST*
The flag specified with NOSUGGEST can be used for words that will not be
suggested. Can be used for obscene words.

	NOSUGGEST % ~


REPLACEMENTS						*spell-REP*

In the affix file REP items can be used to define common mistakes. This is
used to make spelling suggestions. The items define the "from" text and the
"to" replacement. Example:

	REP 4 ~
	REP f ph ~
	REP ph f ~
	REP k ch ~
	REP ch k ~

The first line specifies the number of REP lines following. Vim ignores the
number, but it must be there (for compatibility with Myspell).

Don't include simple one-character replacements or swaps. Vim will try these
anyway. You can include whole words if you want to, but you might want to use
the "file:" item in 'spellsuggest' instead.

You can include a space by using an underscore:

	REP the_the the ~
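
To get a feeling for what REP items contribute, the Python sketch below
applies each "from" to "to" pair at every position of a misspelled word. The
function name rep_candidates is made up, and translating the underscore in
both the "from" and the "to" text is an assumption. The candidates would still
have to be checked against the word list; Vim's actual suggestion mechanism
does much more than this.

```python
def rep_candidates(word, rep_items):
    """Return the words made by applying one REP item at one position."""
    results = []
    for src, dst in rep_items:
        # An underscore in a REP item stands for a space.
        src = src.replace("_", " ")
        dst = dst.replace("_", " ")
        start = word.find(src)
        while start != -1:
            candidate = word[:start] + dst + word[start + len(src):]
            if candidate not in results:
                results.append(candidate)
            start = word.find(src, start + 1)
    return results
```

With the four REP items from the example "fone" results in the single
candidate "phone".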


SIMILAR CHARACTERS				*spell-MAP* *E783*

In the affix file MAP items can be used to define letters that are very much
alike. This is mostly used for a letter with different accents. This is used
to prefer suggestions with these letters substituted. Example:

	MAP 2 ~
	MAP eéëêè ~
	MAP uüùúû ~

The first line specifies the number of MAP lines following. Vim ignores the
number, but the line must be there.

Each letter must appear in only one of the MAP items. It's a bit more
efficient if the first letter is ASCII or at least one without accents.
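
The rule that a letter appears in only one MAP item can be checked before
running ":mkspell", for example with this Python sketch. The function name
map_conflicts is made up; it only looks at the letter groups, not at a whole
affix file.

```python
def map_conflicts(map_items):
    """Return the letters that appear in more than one MAP item, sorted."""
    first_seen = {}     # letter -> index of the first item containing it
    conflicts = set()
    for index, item in enumerate(map_items):
        for letter in item:
            if first_seen.setdefault(letter, index) != index:
                conflicts.add(letter)
    return sorted(conflicts)
```

For the two MAP items of the example the result is an empty list.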


.SUG FILE						*spell-NOSUGFILE*

When soundfolding is specified in the affix file then ":mkspell" will normally
produce a .sug file next to the .spl file. This file is used to find
suggestions by their sound-a-like form quickly, at the cost of a lot of
memory. The amount depends on the number of words; |:mkspell| will display an
estimate when it's done.

To avoid producing a .sug file use this item in the affix file:

	NOSUGFILE ~

Users can simply omit the .sug file if they don't want to use it.


SOUND-A-LIKE						*spell-SAL*

In the affix file SAL items can be used to define the sounds-a-like mechanism
to be used. The main items define the "from" text and the "to" replacement.
Simplistic example:

	SAL CIA			X ~
	SAL CH			X ~
	SAL C			K ~
	SAL K			K ~

There are a few rules and this can become quite complicated. An explanation
of how it works can be found in the Aspell manual:
http://aspell.net/man-html/Phonetic-Code.html.

There are a few special items:

	SAL followup		true ~
	SAL collapse_result	true ~
	SAL remove_accents	true ~

"1" has the same meaning as "true". Any other value means "false".


SIMPLE SOUNDFOLDING				*spell-SOFOFROM* *spell-SOFOTO*

The SAL mechanism is complex and slow. A simpler mechanism is mapping all
characters to another character, mapping similar sounding characters to the
same character. At the same time this does case folding. You cannot have
both SAL items and simple soundfolding.

There are two items required: one to specify the characters that are mapped
and one that specifies the characters they are mapped to. They must have
exactly the same number of characters. Example:

    SOFOFROM abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ ~
    SOFOTO   ebctefghejklnnepkrstevvkesebctefghejklnnepkrstevvkes ~

In the example all vowels are mapped to the same character 'e'. Another
method would be to leave out all vowels. Some characters that sound nearly
the same and are often mixed up, such as 'm' and 'n', are mapped to the same
character. Don't do this too much, or all words will start looking alike.

Characters that do not appear in SOFOFROM will be left out, except that all
white space is replaced by one space. Sequences of the same character in
SOFOFROM are replaced by one.

You can use the |soundfold()| function to try out the results. Or set the
'verbose' option to see the score in the output of the |z=| command.
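
Outside of Vim the simple soundfolding can be approximated with a few lines of
Python. This is only a sketch: the name simple_soundfold is made up, and it
assumes that a repeated character is dropped after mapping, so "oo" and "ou"
both end up as a single 'e' with the example items. Use |soundfold()| to see
what Vim really does.

```python
SOFOFROM = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"
SOFOTO = "ebctefghejklnnepkrstevvkesebctefghejklnnepkrstevvkes"


def simple_soundfold(word, sofofrom=SOFOFROM, sofoto=SOFOTO):
    """Fold word with a SOFOFROM/SOFOTO style character mapping."""
    if len(sofofrom) != len(sofoto):
        raise ValueError("SOFOFROM and SOFOTO must have the same length")
    table = dict(zip(sofofrom, sofoto))
    result = []
    for ch in word:
        # White space becomes one space, unmapped characters are left out.
        mapped = " " if ch.isspace() else table.get(ch)
        if mapped is None:
            continue
        # Assumption: repeats are collapsed in the mapped result.
        if result and result[-1] == mapped:
            continue
        result.append(mapped)
    return "".join(result)
```

With the items from the example "Hello" folds to "hele" and "moon" folds to
"nen".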


UNSUPPORTED ITEMS				*spell-affix-not-supported*

These items appear in the affix file of other spell checkers. In Vim they are
ignored, not supported or defined in another way.

ACCENT		(Hunspell)				*spell-ACCENT*
		Use MAP instead. |spell-MAP|

CHECKCOMPOUNDCASE  (Hunspell)			*spell-CHECKCOMPOUNDCASE*
		Disallow uppercase letters at compound word boundaries.
		Not supported.

CHECKCOMPOUNDDUP  (Hunspell)			*spell-CHECKCOMPOUNDDUP*
		Disallow using the same word twice in a compound. Not
		supported.

CHECKCOMPOUNDREP  (Hunspell)			*spell-CHECKCOMPOUNDREP*
		Forbid a compound when it could also be a non-compound word
		with a REP mistake. Not supported.

CHECKCOMPOUNDTRIPLE  (Hunspell)			*spell-CHECKCOMPOUNDTRIPLE*
		Forbid three identical characters when compounding. Not
		supported.

CHECKCOMPOUNDPATTERN  (Hunspell)		*spell-CHECKCOMPOUNDPATTERN*
		Forbid compounding when patterns match. Not supported.

COMPLEXPREFIXES  (Hunspell)				*spell-COMPLEXPREFIXES*
		Enables using two prefixes. Not supported.

COMPOUND	(Hunspell)				*spell-COMPOUND*
		This is one line with the count of COMPOUND items, followed by
		that many COMPOUND lines with a pattern.
		Remove the first line with the count and rename the other
		items to COMPOUNDRULE |spell-COMPOUNDRULE|

COMPOUNDFIRST	(Hunspell)				*spell-COMPOUNDFIRST*
		Use COMPOUNDRULE instead. |spell-COMPOUNDRULE|

COMPOUNDBEGIN	(Hunspell)				*spell-COMPOUNDBEGIN*
		Use COMPOUNDRULE instead. |spell-COMPOUNDRULE|

COMPOUNDEND	(Hunspell)				*spell-COMPOUNDEND*
		Use COMPOUNDRULE instead. |spell-COMPOUNDRULE|

COMPOUNDMIDDLE	(Hunspell)				*spell-COMPOUNDMIDDLE*
		Use COMPOUNDRULE instead. |spell-COMPOUNDRULE|

COMPOUNDSYLLABLE  (Hunspell)			*spell-COMPOUNDSYLLABLE*
		Use SYLLABLE and COMPOUNDSYLMAX instead. |spell-SYLLABLE|
		|spell-COMPOUNDSYLMAX|

FORBIDDENWORD	(Hunspell)				*spell-FORBIDDENWORD*
		Use BAD instead. |spell-BAD|

LANG		(Hunspell)				*spell-LANG*
		This specifies language-specific behavior. This actually
		moves part of the language knowledge into the program,
		therefore Vim does not support it. Each language property
		must be specified separately.

LEMMA_PRESENT	(Hunspell)				*spell-LEMMA_PRESENT*
		Only needed for morphological analysis.

MAXNGRAMSUGS	(Hunspell)				*spell-MAXNGRAMSUGS*
		Not supported.

ONLYINCOMPOUND	(Hunspell)				*spell-ONLYINCOMPOUND*
		Use NEEDCOMPOUND instead. |spell-NEEDCOMPOUND|

PSEUDOROOT	(Hunspell)				*spell-PSEUDOROOT*
		Use NEEDAFFIX instead. |spell-NEEDAFFIX|

SUGSWITHDOTS	(Hunspell)				*spell-SUGSWITHDOTS*
		Adds dots to suggestions. Vim doesn't need this.

SYLLABLENUM	(Hunspell)				*spell-SYLLABLENUM*
		Not supported.

TRY		(Myspell, Hunspell, others)		*spell-TRY*
		Vim does not use the TRY item, it is ignored. For making
		suggestions the actual characters in the words are used, which
		is much more efficient.

WORDCHARS	(Hunspell)				*spell-WORDCHARS*
		Used to recognize words. Vim doesn't need it, because there
		is no need to separate words before checking them (using a
		trie instead of a hashtable).

 vim:tw=78:sw=4:ts=8:ft=help:norl:
Bram Moolenaar217ad922005-03-20 22:37:15 +00001567 vim:tw=78:sw=4:ts=8:ft=help:norl: