*spell.txt*     For Vim version 7.0aa. Last change: 2005 Aug 12


                  VIM REFERENCE MANUAL    by Bram Moolenaar


Spell checking                                          *spell*

1. Quick start                  |spell-quickstart|
2. Remarks on spell checking    |spell-remarks|
3. Generating a spell file      |spell-mkspell|
4. Spell file format            |spell-file-format|

{Vi does not have any of these commands}

Spell checking is not available when the |+syntax| feature has been disabled
at compile time.

==============================================================================
1. Quick start                                          *spell-quickstart*

This command switches on spell checking: >

        :setlocal spell spelllang=en_us

This switches on the 'spell' option and specifies to check for US English.
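
As a minimal sketch, the same command can be run automatically for some file
types from your vimrc. The file type names "text" and "mail" are only an
assumption for illustration, use the ones you actually edit: >

        :autocmd FileType text,mail setlocal spell spelllang=en_us

Because ":setlocal" is used, only the buffers of those file types are
affected.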

The words that are not recognized are highlighted with one of these:
        SpellBad        word not recognized                     |hl-SpellBad|
        SpellCap        word not capitalised                    |hl-SpellCap|
        SpellRare       rare word                               |hl-SpellRare|
        SpellLocal      wrong spelling for selected region      |hl-SpellLocal|

Vim only checks words for spelling, there is no grammar check.

If the 'mousemodel' option is set to "popup" and the cursor is on a badly
spelled word or it is "popup_setpos" and the mouse pointer is on a badly
spelled word, then the popup menu will contain a submenu to replace the bad
word. Note: this slows down the appearance of the popup menu.

To search for the next misspelled word:

                                                        *]s* *E756*
]s                      Move to next misspelled word after the cursor.
                        A count before the command can be used to repeat.

                                                        *[s*
[s                      Like "]s" but search backwards, find the misspelled
                        word before the cursor. Doesn't recognize words
                        split over two lines, thus may stop at words that are
                        not highlighted as bad. Does not stop at a word with
                        a missing capital at the start of a line.

                                                        *]S*
]S                      Like "]s" but only stop at bad words, not at rare
                        words or words for another region.

                                                        *[S*
[S                      Like "]S" but search backwards.


To add words to your own word list:                     *E764*

                                                        *zg*
zg                      Add word under the cursor as a good word to the first
                        name in 'spellfile'. In Visual mode the selected
                        characters are added as a word (including white
                        space!). If the word is explicitly marked as a bad
                        word in another spell file the result is
                        unpredictable.
                        A count may precede the command to indicate the entry
                        in 'spellfile' to be used. A count of two uses the
                        second entry.

                                                        *zG*
zG                      Like "zg" but add the word to the internal word list
                        |internal-wordlist|.

                                                        *zw*
zw                      Like "zg" but mark the word as a wrong (bad) word.

                                                        *zW*
zW                      Like "zw" but add the word to the internal word list
                        |internal-wordlist|.

                                                *:spe* *:spellgood*
:[count]spe[llgood] {word}
                        Add {word} as a good word to 'spellfile', like with
                        "zg". Without count the first name is used, with a
                        count of two the second entry, etc.

:spe[llgood]! {word}    Add {word} as a good word to the internal word list,
                        like with "zG".

                                                *:spellw* *:spellwrong*
:[count]spellw[rong] {word}
                        Add {word} as a wrong (bad) word to 'spellfile', as
                        with "zw". Without count the first name is used, with
                        a count of two the second entry, etc.

:spellw[rong]! {word}   Add {word} as a wrong (bad) word to the internal word
                        list.

After adding a word to 'spellfile' with the above commands its associated
".spl" file will automatically be updated and reloaded. If you change
'spellfile' manually you need to use the |:mkspell| command. This sequence of
commands mostly works well: >
        :edit <file in 'spellfile'>
<       (make changes to the spell file) >
        :mkspell! %

More details about the 'spellfile' format below |spell-wordlist-format|.
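
A sketch of working with more than one entry in 'spellfile'. The two file
names are hypothetical, and it is assumed here that each name ends in
".{encoding}.add" with {encoding} matching the value of 'encoding': >

        :set spellfile=~/.vim/spell/en.latin1.add,~/.vim/spell/work.latin1.add
        :spellgood foobar
        :2spellgood xyzzy

The first command sets two entries. ":spellgood" without a count adds "foobar"
to the first entry, ":2spellgood" adds "xyzzy" to the second one. In Normal
mode "zg" and "2zg" do the same for the word under the cursor.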

                                                        *internal-wordlist*
The internal word list is used for all buffers where 'spell' is set. It is
not stored, it is lost when you exit Vim. It is also cleared when 'encoding'
is set.


Finding suggestions for bad words:
                                                        *z?*
z?                      For the word under/after the cursor suggest correctly
                        spelled words. This also works to find alternatives
                        for a word that is not highlighted as a bad word,
                        e.g., when the word after it is bad.
                        The results are sorted on similarity to the word
                        under/after the cursor.
                        This may take a long time. Hit CTRL-C when you are
                        bored.
                        This does not work when there is a line break halfway
                        through a bad word (e.g., "the the").
                        You can enter the number of your choice or press
                        <Enter> if you don't want to replace. You can also
                        use the mouse to click on your choice (only works if
                        the mouse can be used in Normal mode and when there
                        are no line wraps). Click on the first (header) line
                        to cancel.
                        If 'verbose' is non-zero a score will be displayed to
                        indicate the similarity to the badly spelled word (the
                        higher the score the more different).
                        When a word was replaced the redo command "." will
                        repeat the word replacement. This works like "ciw",
                        the good word and <Esc>.

                                *:spellr* *:spellrepall* *E752* *E753*
:spellr[epall]          Repeat the replacement done by |z?| for all matches
                        with the replaced word in the current window.

In Insert mode, when the cursor is after a badly spelled word, you can use
CTRL-X s to find suggestions. This works like Insert mode completion. Use
CTRL-N to use the next suggestion, CTRL-P to go back. |i_CTRL-X_s|

The 'spellsuggest' option influences how the list of suggestions is generated
and sorted. See |'spellsuggest'|.

The 'spellcapcheck' option is used to check that the first word of a sentence
starts with a capital. This doesn't work for the first word in the file.
When there is a line break right after a sentence the highlighting of the next
line may be postponed. Use |CTRL-L| when needed. Also see |set-spc-auto| for
how it can be set automatically when 'spelllang' is set.

==============================================================================
2. Remarks on spell checking                            *spell-remarks*

PERFORMANCE

Vim does on-the-fly spell checking. To make this work fast the word list is
loaded in memory. Thus this uses a lot of memory (1 Mbyte or more). There
might also be a noticeable delay when the word list is loaded, which happens
when 'spell' is set and when 'spelllang' is set while 'spell' was already set.
To minimize the delay each word list is only loaded once, it is not deleted
when 'spelllang' is made empty or 'spell' is reset. When 'encoding' is set
all the word lists are reloaded, thus you may notice a delay then too.


REGIONS

A word may be spelled differently in various regions. For example, English
comes in (at least) these variants:

        en              all regions
        en_au           Australia
        en_ca           Canada
        en_gb           Great Britain
        en_nz           New Zealand
        en_us           USA

Words that are not used in one region but are used in another region are
highlighted with SpellLocal |hl-SpellLocal|.

Always use lowercase letters for the language and region names.

When adding a word with |zg| or another command it's always added for all
regions. You can change that by manually editing the 'spellfile'. See
|spell-wordlist-format|. Note that the regions as specified in the files in
'spellfile' are only used when all entries in 'spelllang' specify the same
region (not counting files specified by their .spl name).
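
For illustration, a sketch of how the region part of 'spelllang' selects what
is accepted. It assumes the English spell file is installed: >

        :setlocal spell spelllang=en_gb
        :setlocal spell spelllang=en_us
        :setlocal spell spelllang=en

With "en_gb" a word that is only valid in another region, such as the US
spelling "color", would be expected to show up as SpellLocal. With "en" words
from all regions are accepted.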


SPELL FILES                                             *spell-load*

Vim searches for spell files in the "spell" subdirectory of the directories in
'runtimepath'. The name is: LL.EEE.spl, where:
        LL      the language name
        EEE     the value of 'encoding'

The value for "LL" comes from 'spelllang', but excludes the region name.
Examples:
        'spelllang'     LL ~
        en_us           en
        en-rare         en-rare
        medical_ca      medical

Only the first file is loaded, the one that is first in 'runtimepath'. If
this succeeds then additionally files with the name LL.EEE.add.spl are loaded.
All the ones that are found are used.

Additionally, the files related to the names in 'spellfile' are loaded. These
are the files that |zg| and |zw| add good and wrong words to.

Exceptions:
- Vim uses "latin1" when 'encoding' is "iso-8859-15". The euro sign doesn't
  matter for spelling.
- When no spell file for 'encoding' is found "ascii" is tried. This only
  works for languages where nearly all words are ASCII, such as English. It
  helps when 'encoding' is not "latin1", such as iso-8859-2, and English text
  is being edited. For the ".add" files the same name as the found main
  spell file is used.

For example, with these values:
        'runtimepath' is "~/.vim,/usr/share/vim70,~/.vim/after"
        'encoding' is "iso-8859-2"
        'spelllang' is "pl"

Vim will look for:
1. ~/.vim/spell/pl.iso-8859-2.spl
2. /usr/share/vim70/spell/pl.iso-8859-2.spl
3. ~/.vim/spell/pl.iso-8859-2.add.spl
4. /usr/share/vim70/spell/pl.iso-8859-2.add.spl
5. ~/.vim/after/spell/pl.iso-8859-2.add.spl

This assumes 1. is not found and 2. is found.

If 'encoding' is "latin1" Vim will look for:
1. ~/.vim/spell/pl.latin1.spl
2. /usr/share/vim70/spell/pl.latin1.spl
3. ~/.vim/after/spell/pl.latin1.spl
4. ~/.vim/spell/pl.ascii.spl
5. /usr/share/vim70/spell/pl.ascii.spl
6. ~/.vim/after/spell/pl.ascii.spl

This assumes none of them are found (Polish doesn't make sense when leaving
out the non-ASCII characters).
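
To see which of these files exist on your system, a small sketch using
|globpath()| can help. It only lists the files for the current 'encoding' and
does not repeat the "latin1" and "ascii" fallbacks described above. "pl" is
just the language from the example: >

        :echo globpath(&runtimepath, 'spell/pl.' . &encoding . '.spl')
        :echo globpath(&runtimepath, 'spell/pl.' . &encoding . '.add.spl')

Each command prints the matching files, one per line, in 'runtimepath' order.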

Spelling for EBCDIC is currently not supported.

A spell file might not be available in the current 'encoding'. See
|spell-mkspell| about how to create a spell file. Converting a spell file
with "iconv" will NOT work!

                                                        *E758* *E759*
When loading a spell file Vim checks that it is properly formatted. If you
get an error the file may be truncated, modified or intended for another Vim
version.


WORDS

Vim uses a fixed method to recognize a word. This is independent of
'iskeyword', so that it also works in help files and for languages that
include characters like '-' in 'iskeyword'. The word characters do depend on
'encoding'.

The table with word characters is stored in the main .spl file. Therefore it
matters what the current locale is when generating it! A .add.spl file does
not contain a word table though.

A word that starts with a digit is always ignored. That includes hex numbers
in the form 0xff and 0XFF.


WORD COMBINATIONS

It is possible to spell-check words that include a space. This is used to
recognize words that are invalid when used by themselves, e.g. for "et al.".
It can also be used to recognize "the the" and highlight it.

The number of spaces is irrelevant. In most cases a line break may also
appear. However, this makes it difficult to find out where to start checking
for spelling mistakes. When you make a change to one line and only that line
is redrawn Vim won't look in the previous line, thus when "et" is at the end
of the previous line "al." will be flagged as an error. And when you type
"the<CR>the" the highlighting doesn't appear until the first line is redrawn.
Use |CTRL-L| to redraw right away. "[s" will also stop at a word combination
with a line break.

When encountering a line break Vim skips characters such as '*', '>' and '"',
so that comments in C, shell and Vim code can be spell checked.


SYNTAX HIGHLIGHTING                                     *spell-syntax*

Files that use syntax highlighting can specify where spell checking should be
done:

1. everywhere                           default
2. in specific items                    use "contains=@Spell"
3. everywhere but specific items        use "contains=@NoSpell"

For the second method adding the @NoSpell cluster will disable spell checking
again. This can be used, for example, to add @Spell to the comments of a
program, and add @NoSpell for items that shouldn't be checked.
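
A minimal sketch of the second method, for a hypothetical syntax file. The
group names "myURL" and "myComment" are made up for this example: >

        :syntax match  myURL     "https\=://\S\+" contained contains=@NoSpell
        :syntax region myComment start="/\*" end="\*/" contains=@Spell,myURL

Because "myComment" contains @Spell, spell checking should only be done inside
comments. URLs inside a comment are matched by "myURL", which contains
@NoSpell, so they should be skipped.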


VIM SCRIPTS

If you want to write a Vim script that does something with spelling, you may
find these functions useful:

    spellbadword()      find badly spelled word at the cursor
    spellsuggest()      get list of spelling suggestions
    soundfold()         get the sound-a-like version of a word
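
A sketch of how these functions might be combined. The function name
"ShowSuggestions" is made up. It is assumed that spellbadword() returns
either the word itself or a List with the word as its first item, and that
spellsuggest() accepts a maximum number of suggestions as its second argument.
Check |spellbadword()| and |spellsuggest()| in your Vim version: >

        :function! ShowSuggestions()
        :  " Assumption: the result is a String or a List [word, ...].
        :  let result = spellbadword()
        :  let word = type(result) == type([]) ? result[0] : result
        :  if word == ''
        :    echo 'No badly spelled word found'
        :  else
        :    " Show at most five suggestions, separated by commas.
        :    echo word . ': ' . join(spellsuggest(word, 5), ', ')
        :  endif
        :endfunction

Call it with ":call ShowSuggestions()" while 'spell' is set.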


SETTING 'spellcapcheck' AUTOMATICALLY                   *set-spc-auto*

After the 'spelllang' option has been set successfully, Vim will source the
files "spell/LANG.vim" in 'runtimepath'. "LANG" is the value of 'spelllang'
up to the first comma, dot or underscore. This can be used to set options
specifically for the language, especially 'spellcapcheck'.

The distribution includes a few of these files. Use this command to see what
they do: >
        :next $VIMRUNTIME/spell/*.vim

Note that the default scripts don't set 'spellcapcheck' if it was changed from
the default value. This assumes the user prefers another value then.
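
A sketch of what such a file could look like, for a hypothetical language "xx"
in which sentences do not start with a capital. The file name and the choice
to make the option empty are assumptions for illustration: >

        " ~/.vim/spell/xx.vim
        " Switch off the check for a capital at the start of a sentence.
        setlocal spellcapcheck=

Unlike the distributed scripts, this sketch also overrules a value that the
user has set.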

==============================================================================
3. Generating a spell file                              *spell-mkspell*

Vim uses a binary file format for spelling. This greatly speeds up loading
the word list and keeps it small.
                                                *.aff* *.dic* *Myspell*
You can create a Vim spell file from the .aff and .dic files that Myspell
uses. Myspell is used by OpenOffice.org and Mozilla. You should be able to
find them here:
        http://lingucomponent.openoffice.org/spell_dic.html
You can also use a plain word list. The results are the same, the choice
depends on what word lists you can find.

If you install Aap (from www.a-a-p.org) you can use the recipes in the
runtime/spell/??/ directories. Aap will take care of downloading the files,
applying the patches needed for Vim and building the .spl file.

Make sure your current locale is set properly, otherwise Vim doesn't know what
characters are upper/lower case letters. If the locale isn't available (e.g.,
when using an MS-Windows codepage on Unix) add tables to the .aff file
|spell-affix-chars|. If the .aff file doesn't define a table then the word
table of the currently active spelling is used. If spelling is not active
then Vim will try to guess.

                                                        *:mksp* *:mkspell*
:mksp[ell][!] [-ascii] {outname} {inname} ...
                        Generate a Vim spell file from word lists. Example: >
                                :mkspell /tmp/nl nl_NL.words
<                                                       *E751*
                        When {outname} ends in ".spl" it is used as the output
                        file name. Otherwise it should be a language name,
                        such as "en", without the region name. The file
                        written will be "{outname}.{encoding}.spl", where
                        {encoding} is the value of the 'encoding' option.

                        When the output file already exists [!] must be used
                        to overwrite it.

                        When the [-ascii] argument is present, words with
                        non-ascii characters are skipped. The resulting file
                        ends in "ascii.spl".

                        The input can be the Myspell format files {inname}.aff
                        and {inname}.dic. If {inname}.aff does not exist then
                        {inname} is used as the file name of a plain word
                        list.

                        Multiple {inname} arguments can be given to combine
                        regions into one Vim spell file. Example: >
                :mkspell ~/.vim/spell/en /tmp/en_US /tmp/en_CA /tmp/en_AU
<                       This combines the English word lists for US, CA and AU
                        into one en.spl file.
                        Up to eight regions can be combined. *E754* *E755*
                        The REP and SAL items of the first .aff file where
                        they appear are used. |spell-affix-REP|
                        |spell-affix-SAL|

                        This command uses a lot of memory, required to find
                        the optimal word tree (Polish requires a few hundred
                        Mbyte). The final result will be much smaller.

                        When the spell file that was written is in use in a
                        buffer, it is reloaded automatically.

:mksp[ell] [-ascii] {name}.{enc}.add
                        Like ":mkspell" above, using {name}.{enc}.add as the
                        input file and producing an output file in the same
                        directory that has ".spl" appended.

:mksp[ell] [-ascii] {name}
                        Like ":mkspell" above, using {name} as the input file
                        and producing an output file in the same directory
                        that has ".{enc}.spl" appended.
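
For example, assuming 'encoding' is "latin1" and the hypothetical word list
files below exist: >

        :mkspell! ~/.vim/spell/en.latin1.add
        :mkspell ~/words/project

Going by the descriptions above, the first command would write
"~/.vim/spell/en.latin1.add.spl" and the second "~/words/project.latin1.spl".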

Since you might want to change a Myspell word list for use with Vim the
following procedure is recommended:

1. Obtain the xx_YY.aff and xx_YY.dic files from Myspell.
2. Make a copy of these files to xx_YY.orig.aff and xx_YY.orig.dic.
3. Change the xx_YY.aff and xx_YY.dic files to remove bad words, add missing
   words, define word characters with FOL/LOW/UPP, etc. The distributed
   "src/spell/*.diff" files can be used.
4. Start Vim with the right locale and use |:mkspell| to generate the Vim
   spell file.
5. Try out the spell file with ":set spell spelllang=xx" if you wrote it in
   a spell directory in 'runtimepath', or ":set spelllang=xx.enc.spl" if you
   wrote it somewhere else.
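
Steps 4 and 5 could look like this. This is only a sketch: "xx" and "xx_YY"
stand for the real language and file names, and the directory ~/.vim/spell is
assumed to exist and to be found through 'runtimepath': >

        :mkspell! ~/.vim/spell/xx xx_YY
        :setlocal spell spelllang=xx

The first command reads xx_YY.aff and xx_YY.dic from the current directory and
writes "xx.{encoding}.spl" in ~/.vim/spell.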

When the Myspell files are updated you can merge the differences:
1. Obtain the new Myspell files as xx_YY.new.aff and xx_YY.new.dic.
2. Use Vimdiff to see what changed: >
        vimdiff xx_YY.orig.dic xx_YY.new.dic
3. Take over the changes you like in xx_YY.dic.
   You may also need to change xx_YY.aff.
4. Rename xx_YY.new.dic to xx_YY.orig.dic and xx_YY.new.aff to xx_YY.orig.aff.


SPELL FILE DUMP

If for some reason you want to check what words are supported by the currently
used spelling files, use this command:

                                                        *:spelldump* *:spelld*
:spelld[ump]            Open a new window and fill it with all currently valid
                        words.
                        Note: For some languages the result may be enormous,
                        causing Vim to run out of memory.

The dump uses the word list format |spell-wordlist-format|. You should be
able to read it with ":mkspell" to generate one .spl file that includes all
the words.

When all entries to 'spelllang' use the same regions or no regions at all then
the region information is included in the dumped words. Otherwise only words
for the current region are included and no "/regions" line is generated.

Comment lines with the name of the .spl file are used as a header above the
words that were generated from that .spl file.
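
A possible way to turn the dump into a single spell file, as a sketch. The
file names under /tmp are arbitrary, and it is assumed that the dump window
can be written like any other buffer: >

        :spelldump
        :write /tmp/all.words
        :mkspell! /tmp/all /tmp/all.words

The last command would write "/tmp/all.{encoding}.spl".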

==============================================================================
4. Spell file format                                    *spell-file-format*

This is the format of the files that are used by the person who creates and
maintains a word list.

Note that we avoid the word "dictionary" here. That is because the goal of
spell checking differs from writing a dictionary (as in the book). For
spelling we need a list of words that are OK, thus should not be
highlighted. Person and company names will not appear in a dictionary, but do
appear in a word list. And some old words are rarely used while they are
common misspellings. These do appear in a dictionary but not in a word list.

There are two formats: A straight list of words and a list using affix
compression. The files with affix compression are used by Myspell (Mozilla
and OpenOffice.org). This requires two files, one with .aff and one with .dic
extension.


FORMAT OF STRAIGHT WORD LIST                            *spell-wordlist-format*

The words must appear one per line. That is all that is required.

Additionally the following items are recognized:

- Empty and blank lines are ignored.

- Lines starting with a # are ignored (comment lines).

- A line starting with "/encoding=", before any word, specifies the encoding
  of the file. After the '=' comes an encoding name. This tells Vim
  to set up conversion from the specified encoding to 'encoding'. Thus you can
  use one word list for several target encodings.

- A line starting with "/regions=" specifies the region names that are
  supported. Each region name must be two ASCII letters. The first one is
  region 1. Thus "/regions=usca" has region 1 "us" and region 2 "ca".
  In an addition word list the region names should be equal to the main word
  list!

- Other lines starting with '/' are reserved for future use. The ones that
  are not recognized are ignored (but you do get a warning message).

- A "/" may follow the word with the following items:
    =           Case must match exactly.
    ?           Rare word.
    !           Bad (wrong) word.
    digit       A region in which the word is valid. If no regions are
                specified the word is valid in all regions.

Example:

        # This is an example word list          comment
        /encoding=latin1                        encoding of the file
        /regions=uscagb                         regions "us", "ca" and "gb"
        example                                 word for all regions
        blah/12                                 word for regions "us" and "ca"
        vim/!                                   bad word
        Campbell/?3                             rare word in region 3 "gb"
        's mornings/=                           keep-case word

Note that when "/=" is used the same word with all upper-case letters is not
accepted. This is different from a word with mixed case that is automatically
marked as keep-case, those words may appear in all upper-case letters.
521

FORMAT WITH AFFIX COMPRESSION

There are two files: the basic word list and an affix file. The affixes are
used to modify the basic words to get the full word list. This significantly
reduces the number of words, especially for a language like Polish. This is
called affix compression.

The format for the affix and word list files is mostly identical to what
Myspell uses (the spell checker of Mozilla and OpenOffice.org). A description
can be found here:
        http://lingucomponent.openoffice.org/affix.readme ~
Note that affixes are case sensitive; this isn't obvious from the description.

Vim supports a few extras. Hopefully Myspell will support these too some day.
See |spell-affix-vim|.

The basic word list and the affix file are combined and turned into a binary
spell file. All the preprocessing has been done, thus this file loads fast.
The binary spell file format is described in the source code (src/spell.c).
But only developers need to know about it.

The preprocessing also allows us to take the Myspell language files and modify
them before the Vim word list is made. The tools for this can be found in the
"src/spell" directory.


WORD LIST FORMAT                                *spell-dic-format*

A very short example, with line numbers:

        1       1234
        2       aan
        3       Als
        4       Etten-Leur
        5       et al.
        6       's-Gravenhage
        7       's-Gravenhaags
        8       bedel/P
        9       kado/1
        10      cadeau/2

The first line contains the number of words. Vim ignores it, but you do get
an error message if it's not there. *E760*

What follows is one word per line. There should be no white space before or
after the word.

When the word only has lower-case letters it will also match with the word
starting with an upper-case letter.

When the word includes an upper-case letter, this means the upper-case letter
is required at this position. The same word with a lower-case letter at this
position will not match. When some of the other letters are upper-case it will
not match either.

The word with all upper-case characters will always be OK.

        word list       matches         does not match ~
        als             als Als ALS     ALs AlS aLs aLS
        Als             Als ALS         als ALs AlS aLs aLS
        ALS             ALS             als Als ALs AlS aLs aLS
        AlS             AlS ALS         als Als ALs aLs aLS

The KEP affix ID can be used to specifically match a word with identical case
only, see below |spell-affix-KEP|.

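The case rules above can be summarized in a few lines of Python. This is only
a sketch to make the table concrete, not Vim's implementation. The function
name matches() and the keep_case argument (standing in for the KEP flag) are
made up for this example, and Python's upper() and lower() are assumed to be
good enough for the letters involved.

```python
def matches(listed, candidate, keep_case=False):
    """Return True when `candidate` in the text matches the listed word."""
    if candidate == listed:
        return True                 # identical case always matches
    if keep_case:
        return False                # keep-case word: nothing else matches
    if candidate == listed.upper():
        return True                 # all upper-case is always OK
    if listed == listed.lower():
        # All lower-case in the list: first letter may be upper-case.
        return candidate == listed[:1].upper() + listed[1:]
    return False                    # listed upper-case letters are required
```

For example, matches("als", "Als") is true, while matches("Als", "als") and
matches("als", "ALs") are false.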
Note that in lines 5 to 7 non-word characters are used. You can include
any character in a word. When checking the text a word still only matches
when it appears with a non-word character before and after it. For Myspell a
word starting with a non-word character probably won't work.

After the word there is an optional slash and flags. Most of these flags are
letters that indicate the affixes that can be used with this word. These are
specified with SFX and PFX lines in the .aff file. See the Myspell
documentation.

                                                        *spell-affix-vim*
A flag that Vim adds and is not in Myspell is the flag defined with KEP in the
affix file. This has the meaning that case matters. This can be used if the
word does not have the first letter in upper case at the start of a sentence.
Example (assuming that = was used for KEP):

        word list       matches                 does not match ~
        's morgens/=    's morgens              'S morgens 's Morgens 'S MORGENS
        's Morgens      's Morgens 'S MORGENS   'S morgens 's morgens

The flag can also be used to prevent the word from matching when it is in all
upper-case letters.

                                                        *spell-affix-mbyte*
The basic word list is normally in an 8-bit encoding, which is mentioned in
the affix file. The affix file must always be in the same encoding as the
word list. This is compatible with Myspell. For Vim the encoding may also be
something else, any encoding that "iconv" supports. The "SET" line must
specify the name of the encoding. When using a multi-byte encoding it's
possible to use more different affixes (but Myspell doesn't support that, thus
you may not want to use it anyway).


CHARACTER TABLES
                                                        *spell-affix-chars*
When using an 8-bit encoding (as specified with SET) the affix file should
define what characters are word characters. This is because the system where
":mkspell" is used may not support a locale with this encoding, in which case
isalpha() won't work, for example when using "cp1250" on Unix.

                                *E761* *E762* *spell-affix-FOL*
                                *spell-affix-LOW* *spell-affix-UPP*
Three lines in the affix file are needed. Simplistic example:

        FOL  áëñ ~
        LOW  áëñ ~
        UPP  ÁËÑ ~

All three lines must have exactly the same number of characters.

The "FOL" line specifies the case-folded characters. These are used to
compare words while ignoring case. For most encodings this is identical to
the lower case line.

The "LOW" line specifies the characters in lower-case. Mostly it's equal to
the "FOL" line.

The "UPP" line specifies the characters with upper-case. That is, a character
is upper-case where it's different from the character at the same position in
"FOL".

ASCII characters should be omitted; Vim always handles these in the same way.
When the encoding is UTF-8 no word characters need to be specified.

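To make the relation between the three lines concrete, here is a small Python
sketch. It is an illustration under assumptions, not what ":mkspell" does
internally: the names build_case_tables() and fold_word() are invented, and
ASCII letters are folded with Python's own lower().

```python
def build_case_tables(fol, low, upp):
    """Build lookup tables from the text of the FOL, LOW and UPP lines."""
    if not (len(fol) == len(low) == len(upp)):
        raise ValueError("FOL, LOW and UPP must have the same length")
    fold = {}        # any listed character -> case-folded character
    to_upper = {}    # lower-case character -> upper-case character
    for f, l, u in zip(fol, low, upp):
        fold[f] = f
        fold[l] = f
        fold[u] = f
        to_upper[l] = u
    return fold, to_upper


def fold_word(word, fold):
    """Case-fold a word: listed characters use the table, ASCII uses lower()."""
    out = []
    for ch in word:
        if ch in fold:
            out.append(fold[ch])
        elif ch.isascii():
            out.append(ch.lower())
        else:
            out.append(ch)      # not listed: left unchanged in this sketch
    return "".join(out)
```

With the example lines above, fold_word() turns "ÑAá" into "ñaá", and passing
lines of different lengths raises an error, similar in spirit to the length
requirement stated above.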
                                                        *E763*
Vim allows you to use spell checking for several languages in the same file.
You can list them in the 'spelllang' option. As a consequence all spell files
for the same encoding must use the same word characters, otherwise they can't
be combined without errors. If you get a warning that the word tables differ
you may need to generate the .spl file again with |:mkspell|. Check the FOL,
LOW and UPP lines in the used .aff file.

The XX.ascii.spl spell file generated with the "-ascii" argument will not
contain the table with characters, so that it can be combined with spell files
for any encoding. The .add.spl files also do not contain the table.


MID-WORD CHARACTERS
                                                        *spell-midword*
Some characters are only to be considered word characters if they are used in
between two ordinary word characters. An example is the single quote: It is
often used to put text in quotes, thus it can't be recognized as a word
character, but when it appears in between word characters it must be part of
the word. This is needed to detect a spelling error such as they'are. That
should be they're, but since "they" and "are" are words themselves that would
go unnoticed.

These characters are defined with MIDWORD in the .aff file:

        MIDWORD '- ~

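The effect of MIDWORD on splitting text into words can be sketched in Python
as follows. This only illustrates the rule described above. The function
split_words() is made up for this example and uses Python's isalpha() as a
stand-in for "ordinary word character", which is an assumption.

```python
def split_words(text, midword="'-"):
    """Split text into words, honouring mid-word characters."""
    words = []
    current = []
    for i, ch in enumerate(text):
        is_word = ch.isalpha()
        if not is_word and ch in midword:
            # Only a word character when between two ordinary word characters.
            before = i > 0 and text[i - 1].isalpha()
            after = i + 1 < len(text) and text[i + 1].isalpha()
            is_word = before and after
        if is_word:
            current.append(ch)
        elif current:
            words.append("".join(current))
            current = []
    if current:
        words.append("".join(current))
    return words
```

Here "they'are" stays one word, so it can be flagged as a mistake, while the
quotes around 'quoted' are not part of the word.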

AFFIXES
                                        *spell-affix-PFX* *spell-affix-SFX*
The usual PFX (prefix) and SFX (suffix) lines are supported (see the Myspell
documentation or the Aspell manual:
http://aspell.net/man-html/Affix-Compression.html).

Note that Myspell ignores any extra text after the relevant info. Vim
requires this text to start with a "#" so that mistakes don't go unnoticed.
Example:

        SFX F 0 in   [^i]n      # Spion > Spionin  ~
        SFX F 0 nen  in         # Bauerin > Bauerinnen ~

An extra item for Vim is the "rare" flag. It must come after the other
fields, before a comment. When used, all words that use the affix will be
marked as rare words. Example:

        PFX F 0 nene .     rare ~
        SFX F 0 oin  n     rare  # hardly ever used ~

However, if the word also appears as a good word in another way it won't be
marked as rare.

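How such a line is applied to a word can be sketched in Python as below. This
is a simplified illustration, not the code used by Vim or Myspell. The names
parse_affix_rule() and apply_rule() are invented, the fields are assumed to be
"kind flag chop add condition", and the condition is assumed to be simple
enough to be used directly as a Python regular expression.

```python
import re


def parse_affix_rule(line):
    """Parse one PFX/SFX rule line; the trailing help marker is not included."""
    body, _, _comment = line.partition("#")     # comment must start with "#"
    fields = body.split()
    if len(fields) < 5 or fields[0] not in ("PFX", "SFX"):
        raise ValueError("not an affix rule: %r" % line)
    kind, flag, chop, add, cond = fields[:5]
    extra = fields[5:]
    if extra not in ([], ["rare"]):
        raise ValueError("extra text must be 'rare' or a '#' comment")
    return {"kind": kind, "flag": flag,
            "chop": "" if chop == "0" else chop,    # "0" means nothing
            "add": "" if add == "0" else add,
            "cond": cond, "rare": extra == ["rare"]}


def apply_rule(rule, word):
    """Return the affixed word, or None when the rule does not apply."""
    chop, add, cond = rule["chop"], rule["add"], rule["cond"]
    if rule["kind"] == "SFX":
        if not re.search("(?:" + cond + ")$", word) or not word.endswith(chop):
            return None
        return word[:len(word) - len(chop)] + add
    if not re.match(cond, word) or not word.startswith(chop):
        return None
    return add + word[len(chop):]
```

Applied to the first example rule, "Spion" becomes "Spionin". The second rule
does not apply to "Spion", because that word does not end in "in".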
                                                *spell-affix-PFXPOSTPONE*
When an affix file has very many prefixes that apply to many words it's not
possible to build the whole word list in memory. This applies to Hebrew (a
list with all words is over a Gbyte). In that case applying prefixes must be
postponed. This makes spell checking slower. It is indicated by this keyword
in the .aff file:

        PFXPOSTPONE ~

Only prefixes without a chop string can be postponed; prefixes with a chop
string will still be included in the word list. An exception is when the chop
string is one character and equal to the last character of the added string,
but in lower case. This is the case when the chop string is used to allow the
following word to start with an upper case letter.


KEEP-CASE WORDS
                                                        *spell-affix-KEP*
In the affix file a KEP line can be used to define the affix name used for
keep-case words. Example:

        KEP = ~

See above for an example |spell-affix-vim|.


RARE WORDS
                                                        *spell-affix-RAR*
In the affix file a RAR line can be used to define the affix name used for
rare words. Example:

        RAR ? ~

Rare words are highlighted differently from bad words. This is to be used for
words that are correct for the language, but are hardly ever used and could be
a typing mistake anyway. When the same word is found as good it won't be
highlighted as rare.


BAD WORDS
                                                        *spell-affix-BAD*
In the affix file a BAD line can be used to define the affix name used for
bad words. Example:

        BAD ! ~

This can be used to exclude words that would otherwise be good. For example
"the the" in the .dic file:

        the the/! ~

Once a word has been marked as bad it won't be undone by encountering the same
word as good.


REPLACEMENTS                                            *spell-affix-REP*

In the affix file REP items can be used to define common mistakes. This is
used to make spelling suggestions. The items define the "from" text and the
"to" replacement. Example:

        REP 4 ~
        REP f ph ~
        REP ph f ~
        REP k ch ~
        REP ch k ~

The first line specifies the number of REP lines following. Vim ignores it.
Don't include simple one-character replacements or swaps. Vim will try these
anyway. You can include whole words if you want to, but you might want to use
the "file:" item in 'spellsuggest' instead.

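A rough idea of how REP items can produce suggestion candidates is given by
the Python sketch below. It only generates candidate words. How Vim scores and
filters suggestions is not described here and is not modelled. The names
parse_rep() and rep_candidates() are made up for this example.

```python
def parse_rep(lines):
    """Collect (from, to) pairs from REP lines; the count line is skipped."""
    pairs = []
    for line in lines:
        fields = line.split()
        if len(fields) == 3 and fields[0] == "REP":
            pairs.append((fields[1], fields[2]))
    return pairs


def rep_candidates(word, pairs):
    """Apply each replacement once, at every position where it fits."""
    found = []
    for src, dst in pairs:
        start = word.find(src)
        while start != -1:
            candidate = word[:start] + dst + word[start + len(src):]
            if candidate not in found:
                found.append(candidate)
            start = word.find(src, start + 1)
    return found
```

With the four REP items above, the misspelled "fone" yields the candidate
"phone". A candidate would still have to be checked against the word list
before it is offered.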

SIMILAR CHARACTERS                                      *spell-affix-MAP*

In the affix file MAP items can be used to define letters that are very much
alike. This is mostly used for a letter with different accents. This is used
to prefer suggestions with these letters substituted. Example:

        MAP 2 ~
        MAP eéëêè ~
        MAP uüùúû ~

The first line specifies the number of MAP lines following. Vim ignores it.

Each letter must appear in only one of the MAP items. It's a bit more
efficient if the first letter is ASCII or at least one without accents.

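The rule that each letter may appear in only one MAP item can be checked with
a few lines of Python. The sketch below also shows one possible way to compare
two words while treating mapped letters as equal. Both build_map() and
similar() are invented for this illustration, and mapping every letter to the
first letter of its item is an assumption about how the items could be used.

```python
def build_map(items):
    """Map every letter of a MAP item to the first letter of that item."""
    canon = {}
    for item in items:
        for ch in item:
            if ch in canon:
                raise ValueError("%r appears more than once" % ch)
            canon[ch] = item[0]
    return canon


def similar(a, b, canon):
    """True when the words differ only in letters from the same MAP item."""
    if len(a) != len(b):
        return False
    return all(canon.get(x, x) == canon.get(y, y) for x, y in zip(a, b))
```

With the two example items, similar("café", "cafe", canon) is true, while
"cafe" and "cafu" are not considered similar because "e" and "u" are in
different items.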

SOUND-A-LIKE                                            *spell-affix-SAL*

In the affix file SAL items can be used to define the sounds-a-like mechanism
to be used. The main items define the "from" text and the "to" replacement.
Simplistic example:

        SAL CIA                 X ~
        SAL CH                  X ~
        SAL C                   K ~
        SAL K                   K ~

There are a few rules and this can become quite complicated. An explanation
of how it works can be found in the Aspell manual:
http://aspell.net/man-html/Phonetic-Code.html.

There are a few special items:

        SAL followup            true ~
        SAL collapse_result     true ~
        SAL remove_accents      true ~

"1" has the same meaning as "true". Any other value means "false".


SIMPLE SOUNDFOLDING             *spell-affix-SOFOFROM* *spell-affix-SOFOTO*

The SAL mechanism is complex and slow. A simpler mechanism is mapping all
characters to another character, mapping similar sounding characters to the
same character. At the same time this does case folding. You cannot have
both SAL items and simple soundfolding.

There are two items required: one to specify the characters that are mapped
and one that specifies the characters they are mapped to. They must have
exactly the same number of characters. Example:

    SOFOFROM abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ ~
    SOFOTO   ebctefghejklnnepkrstevvkesebctefghejklnnepkrstevvkes ~

In the example all vowels are mapped to the same character 'e'. Another
method would be to leave out all vowels. Some characters that sound nearly
the same and are often mixed up, such as 'm' and 'n', are mapped to the same
character. Don't do this too much, or all words will start looking alike.

Characters that do not appear in SOFOFROM will be left out, except that all
white space is replaced by one space. Sequences of the same character in
SOFOFROM are replaced by one.

You can use the |soundfold()| function to try out the results. Or set the
'verbose' option to see the score in the output of the |z?| command.

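The simple soundfolding described above can be imitated in Python to get a
feeling for the result. This sketch is not the code behind |soundfold()| and
may differ from it in details. The name make_soundfold() is made up, and it is
an assumption that runs of the same character are collapsed after mapping.

```python
SOFOFROM = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"
SOFOTO = "ebctefghejklnnepkrstevvkesebctefghejklnnepkrstevvkes"


def make_soundfold(sofofrom, sofoto):
    """Return a function that folds text with the given character mapping."""
    if len(sofofrom) != len(sofoto):
        raise ValueError("SOFOFROM and SOFOTO must have the same length")
    table = dict(zip(sofofrom, sofoto))

    def soundfold(text):
        out = []
        for ch in text:
            if ch.isspace():
                folded = " "            # all white space becomes one space
            elif ch in table:
                folded = table[ch]
            else:
                continue                # not in SOFOFROM: left out
            # Assumption: repeated characters are collapsed after mapping.
            if not out or out[-1] != folded:
                out.append(folded)
        return "".join(out)

    return soundfold
```

With the example items, "Hello  World" folds to "hele verlt", and "man" and
"nun" fold to the same text, which shows how quickly words start looking
alike.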

 vim:tw=78:sw=4:ts=8:ft=help:norl: