patch 9.1.1276: inline word diff treats multibyte chars as word char

Problem:  inline word diff treats multibyte chars as word char
          (after 9.1.1243)
Solution: treat all non-alphanumeric characters as non-word characters
          (Yee Cheng Chin)

Previously, inline word diff simply used Vim's definition of a keyword
to determine what a word is. As a result, multi-byte character classes
such as emojis and CJK (Chinese/Japanese/Korean) characters were all
classified as word characters, so entire sentences were grouped into a
single word, which does not produce a meaningful diff highlight.

Fix this by treating all non-alphanumeric characters (those with a class
number above 2) as non-word characters, as there is usually no benefit
in using word diff on them. These include CJK characters, emojis, and
subscript/superscript numbers. Meanwhile, alphanumeric multi-byte
characters such as Cyrillic and Greek letters will continue to be
considered word characters.

Note that this is slightly inconsistent with how words are defined
elsewhere, as Vim usually considers any character with class >=2 to be
a "word".
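
To illustrate the difference between the two rules, here is a minimal,
self-contained sketch. It is not the code in diff.c: toy_class() and
count_units() are hypothetical names, and the classifier is a crude
stand-in for Vim's character classes that only mirrors the 0/1/2/>2
numbering. The sketch assumes every non-word character forms a unit of
its own, and compares "class >= 2" against "class == 2".

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for Vim's character classes (not the real
 * mb_get_class_buf()).  Works on Unicode code points, very roughly. */
enum { CLS_BLANK = 0, CLS_PUNCT = 1, CLS_ALNUM = 2, CLS_OTHER = 3 };

static int toy_class(unsigned cp)
{
    if (cp == 0 || cp == ' ' || cp == '\t')
        return CLS_BLANK;
    if ((cp >= '0' && cp <= '9') || (cp >= 'A' && cp <= 'Z')
            || (cp >= 'a' && cp <= 'z') || cp == '_')
        return CLS_ALNUM;
    if (cp >= 0x0370 && cp <= 0x04FF)   /* Greek and Cyrillic blocks */
        return CLS_ALNUM;
    if (cp < 0x80)
        return CLS_PUNCT;
    return CLS_OTHER;   /* toy rule: everything else, e.g. CJK, emoji */
}

/* Count diff units: each run of word characters is one unit, and every
 * non-word character is a unit by itself.
 * alnum_only == 0: word means class >= 2 (the usual definition).
 * alnum_only != 0: word means class == 2 (the rule described above). */
static size_t count_units(const unsigned *cps, size_t n, int alnum_only)
{
    size_t units = 0;
    int in_word = 0;

    assert(cps != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
    {
        int cls = toy_class(cps[i]);
        int is_word = alnum_only ? (cls == CLS_ALNUM) : (cls >= CLS_ALNUM);

        /* A non-word char always starts a unit; a word char only does
         * so when it begins a new run. */
        if (!is_word || !in_word)
            units++;
        in_word = is_word;
    }
    return units;
}
```

Under these assumptions, three consecutive CJK ideographs form a single
unit with the ">= 2" rule and three separate units with the "== 2" rule,
while a run of Greek letters stays a single unit under both.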

related: #16881 (diff inline highlight)
closes: #17050

Signed-off-by: Yee Cheng Chin <ychin.git@gmail.com>
Signed-off-by: Christian Brabandt <cb@256bit.org>
diff --git a/runtime/doc/options.txt b/runtime/doc/options.txt
index 84deeca..e2206f0 100644
--- a/runtime/doc/options.txt
+++ b/runtime/doc/options.txt
@@ -1,4 +1,4 @@
-*options.txt*	For Vim version 9.1.  Last change: 2025 Mar 28
+*options.txt*	For Vim version 9.1.  Last change: 2025 Apr 04
 
 
 		  VIM REFERENCE MANUAL	  by Bram Moolenaar
@@ -2989,7 +2989,10 @@
 					difference.
 				word    Use internal diff to perform a
 					|word|-wise diff and highlight the
-					difference.
+					difference.  Non-alphanumeric
+					multi-byte characters such as emoji
+					and CJK characters are considered
+					individual words.
 
 		internal	Use the internal diff library.  This is
 				ignored when 'diffexpr' is set.  *E960*
diff --git a/src/diff.c b/src/diff.c
index 3adcdb7..e694cf2 100644
--- a/src/diff.c
+++ b/src/diff.c
@@ -3309,10 +3309,17 @@
 	    char_u *s;
 	    for (s = curline; *s != NUL;)
 	    {
-		// Always use the first buffer's 'iskeyword' to have a consistent diff
 		int new_in_keyword = FALSE;
 		if (diff_flags & DIFF_INLINE_WORD)
-		    new_in_keyword = vim_iswordp_buf(s, curtab->tp_diffbuf[file1_idx]);
+		{
+		    // Always use the first buffer's 'iskeyword' to have a
+		    // consistent diff.
+		    // For multibyte chars, only treat alphanumeric chars
+		    // (class 2) as "word", as other classes such as emojis and
+		    // CJK ideographs do not usually benefit from word diff as
+		    // Vim doesn't have a good way to segment them.
+		    new_in_keyword = (mb_get_class_buf(s, curtab->tp_diffbuf[file1_idx]) == 2);
+		}
 		if (in_keyword && !new_in_keyword)
 		{
 		    ga_append(curstr, NL);
diff --git a/src/mbyte.c b/src/mbyte.c
index a38ab24..cc8d628 100644
--- a/src/mbyte.c
+++ b/src/mbyte.c
@@ -828,8 +828,8 @@
  * Get class of pointer:
  * 0 for blank or NUL
  * 1 for punctuation
- * 2 for an (ASCII) word character
- * >2 for other word characters
+ * 2 for an alphanumeric word character
+ * >2 for other word characters, including CJK and emoji
  */
     int
 mb_get_class(char_u *p)
diff --git a/src/testdir/dumps/Test_diff_inline_word_03.dump b/src/testdir/dumps/Test_diff_inline_word_03.dump
new file mode 100644
index 0000000..30efaed
--- /dev/null
+++ b/src/testdir/dumps/Test_diff_inline_word_03.dump
@@ -0,0 +1,20 @@
+| +0#0000e05#a8a8a8255@1|🚀*0#0000000#ffd7ff255|⛵️*2&#ff404010|一*0&#ffd7ff255|二|三*2&#ff404010|ひ*0&#ffd7ff255|ら|が*0&#4040ff13|な*0&#ffd7ff255|Δ+2&#ff404010|έ|λ|τ|α| +0&#ffd7ff255|Δ+2&#ff404010|e|l|t|a| +0&#ffd7ff255|f|o@1|b|a||+1&#ffffff0| +0#0000e05#a8a8a8255@1|🚀*0#0000000#ffd7ff255|🛸*2&#ff404010|一*0&#ffd7ff255|二|四*2&#ff404010|ひ*0&#ffd7ff255|ら|な|δ+2&#ff404010|έ|λ|τ|α| +0&#ffd7ff255|δ+2&#ff404010|e|l|t|a| +0&#ffd7ff255|f|o@1|b|a|r| 
+|~+0#4040ff13#ffffff0| @35||+1#0000000&|~+0#4040ff13&| @35
+|~| @35||+1#0000000&|~+0#4040ff13&| @35
+|~| @35||+1#0000000&|~+0#4040ff13&| @35
+|~| @35||+1#0000000&|~+0#4040ff13&| @35
+|~| @35||+1#0000000&|~+0#4040ff13&| @35
+|~| @35||+1#0000000&|~+0#4040ff13&| @35
+|~| @35||+1#0000000&|~+0#4040ff13&| @35
+|~| @35||+1#0000000&|~+0#4040ff13&| @35
+|~| @35||+1#0000000&|~+0#4040ff13&| @35
+|~| @35||+1#0000000&|~+0#4040ff13&| @35
+|~| @35||+1#0000000&|~+0#4040ff13&| @35
+|~| @35||+1#0000000&|~+0#4040ff13&| @35
+|~| @35||+1#0000000&|~+0#4040ff13&| @35
+|~| @35||+1#0000000&|~+0#4040ff13&| @35
+|~| @35||+1#0000000&|~+0#4040ff13&| @35
+|~| @35||+1#0000000&|~+0#4040ff13&| @35
+|~| @35||+1#0000000&|~+0#4040ff13&| @35
+|X+3#0000000&|d|i|f|i|l|e|1| @10|1|,|1| @11|A|l@1| |X+1&&|d|i|f|i|l|e|2| @10|1|,|1| @11|A|l@1
+|:+0&&> @73
diff --git a/src/testdir/test_diffmode.vim b/src/testdir/test_diffmode.vim
index 1b5e5c0..d0c2f18 100644
--- a/src/testdir/test_diffmode.vim
+++ b/src/testdir/test_diffmode.vim
Binary files differ
diff --git a/src/version.c b/src/version.c
index d1ba7ad..3e45e2f 100644
--- a/src/version.c
+++ b/src/version.c
@@ -705,6 +705,8 @@
 static int included_patches[] =
 {   /* Add new patch number below this line */
 /**/
+    1276,
+/**/
     1275,
 /**/
     1274,