Getting the Differences Between Documents


This is easy with the Algorithm::Diff module.
use strict;
use warnings;
use Algorithm::Diff;

my @doc1 = qw(
    123
    345
    456
    abc
    ABC
    666
    777
    888
    999
);

my @doc2 = qw(
    123
    456
    aBc
    ABC
    555
    666
    777
    9999
    000
);

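print "==================== [sdiff]\n";

# sdiff returns one row per aligned position: [operation, doc1 element, doc2 element].
# The operation is "u" (unchanged), "c" (changed), "-" (only in doc1), or "+" (only in doc2);
# the missing side of a "-" or "+" row is an empty string.
my @sdiffs = Algorithm::Diff::sdiff(\@doc1, \@doc2);
foreach my $diff (@sdiffs) {
    my ($op, $dat1, $dat2) = @$diff;
    printf "%s %5s %5s\n", $op, $dat1, $dat2;
}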

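print "==================== [diff]\n";

# diff returns a list of hunks; each hunk is a list of [operation, index, element].
# "-" entries carry a 0-based index into doc1, "+" entries a 0-based index into doc2.
my @diffs = Algorithm::Diff::diff(\@doc1, \@doc2);
foreach my $chunk (@diffs) {
    foreach my $diff (@$chunk) {
        my ($op, $num, $dat) = @$diff;
        printf "%s %d %s\n", $op, $num, $dat;
    }
    print "----------\n";    # separator between hunks
}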
The output is as follows.
==================== [sdiff]
u   123   123
-   345      
u   456   456
c   abc   aBc
u   ABC   ABC
+         555
u   666   666
u   777   777
c   888  9999
c   999   000
==================== [diff]
- 1 345
----------
- 3 abc
+ 2 aBc
----------
+ 4 555
----------
- 7 888
+ 7 9999
- 8 999
+ 8 000
----------
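
As one way to read the sdiff rows above, the following sketch tallies how many rows fall under each operation code. To stay self-contained it hard-codes the rows from the [sdiff] output shown above rather than calling Algorithm::Diff; the names @rows and %count are illustrative and not part of the module.

```perl
use strict;
use warnings;

# Rows copied from the [sdiff] output above: [operation, doc1 element, doc2 element].
# In a real script these would come from Algorithm::Diff::sdiff(\@doc1, \@doc2).
my @rows = (
    [ 'u', '123', '123'  ],
    [ '-', '345', ''     ],
    [ 'u', '456', '456'  ],
    [ 'c', 'abc', 'aBc'  ],
    [ 'u', 'ABC', 'ABC'  ],
    [ '+', '',    '555'  ],
    [ 'u', '666', '666'  ],
    [ 'u', '777', '777'  ],
    [ 'c', '888', '9999' ],
    [ 'c', '999', '000'  ],
);

# Tally rows per operation code.
my %count;
$count{ $_->[0] }++ for @rows;

# Print the tallies in a fixed order; codes that never occur count as 0.
printf "%s %d\n", $_, $count{$_} // 0 for qw(u c - +);
```

With the sample data this reports five unchanged rows, three changed rows, one removal, and one addition.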

Related topics

Getting the similarity between documents