patch 9.1.1243: diff mode is lacking for changes within lines

Problem:  Diff mode's inline highlighting is lackluster. It only
          performs a line-by-line comparison, and calculates a single
          shortest range within a line that could encompass all the
          changes. In lines with multiple changes, or those that span
          multiple lines, this approach tends to end up highlighting
          much more than necessary.

Solution: Implement new inline highlighting modes by doing per-character
          or per-word diff within the diff block, and highlight only the
          relevant parts, add "inline:simple" to the defaults (which is
          the old behaviour)

This change introduces a new diffopt option "inline:<type>". Setting to
"none" will disable all inline highlighting, "simple" (the default) will
use the old behavior, "char" / "word" will perform a character/word-wise
diff of the texts within each diff block and only highlight the
differences.

The new char/word inline diff only use the internal xdiff, and will
respect diff options such as algorithm choice, icase, and misc iwhite
options. indent-heuristics is always on to perform better sliding.

For character highlight, a post-process of the diff results is first
applied before we show the highlight. This is because a naive diff will
create a result with a lot of small diff chunks and gaps, due to the
repetitive nature of individual characters. The post-process is a
heuristic-based refinement that attempts to merge adjacent diff blocks
if they are separated by a short gap (1-3 characters), and can be
further tuned in the future for better results. This process results in
more characters than necessary being highlighted but overall less visual
noise.

For word highlight, always use first buffer's iskeyword definition.
Otherwise if each buffer has different iskeyword settings we would not
be able to group words properly.

The char/word diffing is always per-diff block, not per line, meaning
that changes that span multiple lines will show up correctly.
Added/removed newlines are not shown by default, but if the user has
'list' set (with "eol" listchar defined), the eol character will be be
highlighted correctly for the specific newline characters.

Also, add a new "DiffTextAdd" highlight group linked to "DiffText" by
default. It allows color schemes to use different colors for texts that
have been added within a line versus modified.

This doesn't interact with linematch perfectly currently. The linematch
feature splits up diff blocks into multiple smaller blocks for better
visual matching, which makes inline highlight less useful especially for
multi-line change (e.g. a line is broken into two lines). This could be
addressed in the future.

As a side change, this also removes the bounds checking introduced to
diff_read() as they were added to mask existing logic bugs that were
properly fixed in #16768.

closes: #16881

Signed-off-by: Yee Cheng Chin <ychin.git@gmail.com>
Signed-off-by: Christian Brabandt <cb@256bit.org>
diff --git a/src/structs.h b/src/structs.h
index 1a3abcb..33c26dc 100644
--- a/src/structs.h
+++ b/src/structs.h
@@ -3571,14 +3571,17 @@
  * and how many lines it occupies in that buffer.  When the lines are missing
  * in the buffer the df_count[] is zero.  This is all counted in
  * buffer lines.
- * There is always at least one unchanged line in between the diffs.
- * Otherwise it would have been included in the diff above or below it.
+ * There is always at least one unchanged line in between the diffs (unless
+ * linematch is used).  Otherwise it would have been included in the diff above
+ * or below it.
  * df_lnum[] + df_count[] is the lnum below the change.  When in one buffer
  * lines have been inserted, in the other buffer df_lnum[] is the line below
  * the insertion and df_count[] is zero.  When appending lines at the end of
  * the buffer, df_lnum[] is one beyond the end!
  * This is using a linked list, because the number of differences is expected
  * to be reasonable small.  The list is sorted on lnum.
+ * Each diffblock also contains a cached list of inline diff of changes within
+ * the block, used for highlighting.
  */
 typedef struct diffblock_S diff_T;
 struct diffblock_S
@@ -3588,6 +3591,37 @@
     linenr_T	df_count[DB_COUNT];	// nr of inserted/changed lines
     int is_linematched;  // has the linematch algorithm ran on this diff hunk to divide it into
 			  // smaller diff hunks?
+
+    int		has_changes;		// has cached list of inline changes
+    garray_T	df_changes;		// list of inline changes (diffline_change_T)
+};
+
+/*
+ * Each entry stores a single inline change within a diff block. Line numbers
+ * are recorded as relative offsets, and columns are byte offsets, not
+ * character counts.
+ * Ranges are [start,end), with the end being exclusive.
+ */
+typedef struct diffline_change_S diffline_change_T;
+struct diffline_change_S
+{
+    colnr_T	dc_start[DB_COUNT];	// byte offset of start of range in the line
+    colnr_T	dc_end[DB_COUNT];	// 1 paste byte offset of end of range in line
+    int		dc_start_lnum_off[DB_COUNT];	// starting line offset
+    int		dc_end_lnum_off[DB_COUNT];	// end line offset
+};
+
+/*
+ * Describes a single line's list of inline changes. Use diff_change_parse() to
+ * parse this.
+ */
+typedef struct diffline_S diffline_T;
+struct diffline_S
+{
+    diffline_change_T *changes;
+    int num_changes;
+    int bufidx;
+    int lineoff;
 };
 #endif