There are several ways to compare files: a GUI tool, the command line, or other existing software. With huge files (measured in GB), however, GUI tools tend to hang or become unresponsive, which leaves the command line as the practical option.
The well-known 'diff' command is available on any Linux or Unix-based operating system. On Windows, it can be installed through one of its ports. Another way is to install GitHub Desktop, after which 'diff' is immediately available in the command window, just as on Linux or Unix.
Here, I'd like to share two Windows-native commands for comparing huge files: 'fc' and 'comp'. Their usage can be checked with the '/?' switch, as shown below:
C:\Users\Daniel>comp /?
Compares the contents of two files or sets of files.

COMP [data1] [data2] [/D] [/A] [/L] [/N=number] [/C] [/OFF[LINE]] [/M]

  data1      Specifies location and name(s) of first file(s) to compare.
  data2      Specifies location and name(s) of second files to compare.
  /D         Displays differences in decimal format.
  /A         Displays differences in ASCII characters.
  /L         Displays line numbers for differences.
  /N=number  Compares only the first specified number of lines in each file.
  /C         Disregards case of ASCII letters when comparing files.
  /OFF[LINE] Do not skip files with offline attribute set.
  /M         Do not prompt for compare more files.

To compare sets of files, use wildcards in data1 and data2 parameters.
C:\Users\Daniel>fc /?
Compares two files or sets of files and displays the differences between them

FC [/A] [/C] [/L] [/LBn] [/N] [/OFF[LINE]] [/T] [/U] [/W] [/nnnn]
   [drive1:][path1]filename1 [drive2:][path2]filename2
FC /B [drive1:][path1]filename1 [drive2:][path2]filename2

  /A         Displays only first and last lines for each set of differences.
  /B         Performs a binary comparison.
  /C         Disregards the case of letters.
  /L         Compares files as ASCII text.
  /LBn       Sets the maximum consecutive mismatches to the specified
             number of lines.
  /N         Displays the line numbers on an ASCII comparison.
  /OFF[LINE] Do not skip files with offline attribute set.
  /T         Does not expand tabs to spaces.
  /U         Compare files as UNICODE text files.
  /W         Compresses white space (tabs and spaces) for comparison.
  /nnnn      Specifies the number of consecutive lines that must match
             after a mismatch.
  [drive1:][path1]filename1
             Specifies the first file or set of files to compare.
  [drive2:][path2]filename2
             Specifies the second file or set of files to compare.
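For readers who want to see what a byte-by-byte comparison such as 'comp' or 'fc /B' amounts to, here is a minimal Python sketch. It is not how either command is implemented; the function name `binary_diff` and the `chunk_size` parameter are my own assumptions for illustration. The point is only that reading both files in fixed-size chunks keeps memory use flat, however large the files are.

```python
from typing import Iterator, Tuple


def binary_diff(path_a: str, path_b: str,
                chunk_size: int = 1 << 20) -> Iterator[Tuple[int, int, int]]:
    """Yield (offset, byte_a, byte_b) for every differing byte.

    Hypothetical helper: only the length common to both files is compared,
    and the files are read chunk by chunk so memory use stays constant.
    """
    with open(path_a, "rb") as fa, open(path_b, "rb") as fb:
        offset = 0
        while True:
            a = fa.read(chunk_size)
            b = fb.read(chunk_size)
            if not a or not b:
                break  # at least one file is exhausted
            if a != b:
                # Only scan byte by byte when the chunks actually differ.
                for i, (x, y) in enumerate(zip(a, b)):
                    if x != y:
                        yield offset + i, x, y
            if len(a) != len(b):
                break  # one file ended inside this chunk
            offset += len(a)
```

Iterating over `binary_diff("a.bin", "b.bin")` yields one tuple per mismatching byte; the file names here are placeholders.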
According to my tests, 'fc' works better than 'comp' for my purpose, which is comparing huge text files. The '/L' option has to be included for this. With it, 'fc' returned meaningful results, including the line numbers and the contents of the differing lines. I have to admit the speed surprised me: comparing two 1.17 GB files finished in about one second.
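If 'fc' is not at hand, a similar line-oriented report can be approximated in a few lines of Python. This is only a rough sketch under my own assumptions: the name `line_diff` is hypothetical, the files are assumed to be UTF-8 text, and, unlike 'fc', it does not resynchronize after an inserted or deleted line, so it suits files that differ in place rather than files with shifted content. Both files are streamed line by line, so memory use does not grow with file size.

```python
from itertools import zip_longest
from typing import Iterator, Optional, Tuple


def _strip(line: Optional[str]) -> Optional[str]:
    """Drop the trailing line ending; None marks a file that has ended."""
    return None if line is None else line.rstrip("\r\n")


def line_diff(path_a: str, path_b: str,
              encoding: str = "utf-8"
              ) -> Iterator[Tuple[int, Optional[str], Optional[str]]]:
    """Yield (line_number, line_a, line_b) wherever the two files differ.

    Hypothetical helper: lines are compared position by position, with no
    attempt to realign the files after an insertion or deletion.
    """
    with open(path_a, encoding=encoding, errors="replace") as fa, \
         open(path_b, encoding=encoding, errors="replace") as fb:
        for lineno, (la, lb) in enumerate(zip_longest(fa, fb), start=1):
            a, b = _strip(la), _strip(lb)
            if a != b:
                yield lineno, a, b
```

A loop such as `for n, a, b in line_diff("old.txt", "new.txt"): print(n, a, b)` prints each differing line number with both versions; the file names are placeholders.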