For a full tutorial on the Node.js crypto module, which lets us generate a hash from a string or a file, see How to generate hash from a file or string in Node.js.
Open walker.js, which we created in Creating pause and resume-able file walker, and add crypto = require('crypto') to the require declarations, as in the following code, to include the crypto module:
//walker.js
const fs = require('fs'),
  crypto = require('crypto'),
  path = require('path'),
  {EventEmitter} = require('events');

constructor (debug)... ...
Next, edit the start method and add this.generateHash(entry,stat); after the this.emit('file',entry,stat); line to invoke the generateHash method, as shown below:
if (stat.isFile()){
  if (this.filter_file(entry,stat)){
    this.debug&&console.log('filterFile: '+entry);
    return this.next();
  }
  this.debug&&console.log('File: '+entry);
  this.emit('file',entry,stat);
  this.generateHash(entry,stat);
  this.next();
}
Now, we’ll create the generateHash method, as shown below:
generateHash(file,stat){
  // Read at most defaultLength bytes, starting at the beginning of the file
  const defaultLength = 4200,
    len = stat.size < defaultLength ? stat.size : defaultLength,
    pos = 0,
    offset = 0;
  fs.open(file, 'r', (err, fd) => {
    if (err) {
      this.emit('error',err,file,stat);
      this.debug&&console.log(err);
      return;
    }
    const buffer = Buffer.alloc(len);
    fs.read(fd, buffer, offset, len, pos, (err, bytesRead, buffer) => {
      if (err){
        // Release the descriptor before giving up on this file
        fs.close(fd, () => {});
        this.emit('error',err,file,stat);
        this.debug&&console.log(err);
        return;
      }
      // A close error is reported, but the hash is still emitted
      fs.close(fd, (err) => {
        if (err){
          this.emit('error',err,file,stat);
          this.debug&&console.log(err);
          return;
        }
      });
      const hash = crypto
        .createHash('whirlpool')
        .update(buffer)
        .digest('hex');
      this.emit('hash',file,stat,buffer,hash);
      this.debug&&console.log('hash emitted');
      return;
    });
  });
}
The defaultLength = 4200 setting means we read only the first 4200 bytes of a file, or the entire file if it is smaller than that; see len = stat.size < defaultLength ? stat.size : defaultLength. The rest of the code is already described in the earlier tutorials.
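The partial-read idea can also be tried in isolation. The following is a minimal synchronous sketch, not the walker's own code: hashFileHead and pickAlgorithm are hypothetical helpers, and the sha256 fallback is an assumption, added because whirlpool may not be available in every Node.js build.

```javascript
const fs = require('fs');
const crypto = require('crypto');

// Hypothetical helper: return `preferred` if this Node.js build supports it,
// otherwise `fallback`. createHash throws for unsupported algorithms.
function pickAlgorithm(preferred, fallback) {
  try {
    crypto.createHash(preferred);
    return preferred;
  } catch (err) {
    return fallback;
  }
}

// Hypothetical helper: hash at most `maxLength` bytes from the start of a file.
function hashFileHead(file, maxLength = 4200,
                      algorithm = pickAlgorithm('whirlpool', 'sha256')) {
  const len = Math.min(fs.statSync(file).size, maxLength);
  const buffer = Buffer.alloc(len);
  const fd = fs.openSync(file, 'r');
  let bytesRead = 0;
  try {
    // Skip the read entirely for empty files
    if (len > 0) bytesRead = fs.readSync(fd, buffer, 0, len, 0);
  } finally {
    fs.closeSync(fd);
  }
  // Hash only the bytes that were actually read
  return crypto
    .createHash(algorithm)
    .update(buffer.subarray(0, bytesRead))
    .digest('hex');
}
```

Because only the first bytes are hashed, two files that share a prefix longer than maxLength produce the same hash, so equal hashes mark duplicate candidates rather than confirmed duplicates.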
Next, we’ll modify the constructor by moving all of its properties, except the super() call and this.debug = debug ? true : false, into a new method. The current constructor looks like this:
constructor (entry,debug){
  super();
  this.isPaused = false;
  this.queue = [];
  this.debug = debug ? true : false;
  this.filter_dir = () => false;
  this.filter_file = () => false;
  this.start(entry);
}
The modified constructor takes only one parameter, debug, and no longer accepts an entry:
constructor (debug){
  super();
  this.debug = debug ? true : false;
  this.reset(); // reset() takes no arguments
}
We’ll recreate the removed properties inside the reset() method, which is also useful when we need to reset a FileWalker instance from outside:
reset(){
  this.isPaused = false;
  this.queue = [];
  this.filter_dir = () => false;
  this.filter_file = () => false;
}
Next, we create a new method to accept an entry to scan a directory:
addToQueue(entry){
  Array.isArray(entry)
    ? Array.prototype.push.apply(this.queue, entry)
    : this.queue.push(entry);
}
The addToQueue method accepts a string or an array and adds the received entries to the queue.
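To see how reset() and addToQueue() work together, here is a stripped-down stand-in class. QueueOnlyWalker is a hypothetical name; it keeps only the queue-related members and none of the file-system logic.

```javascript
const { EventEmitter } = require('events');

// Hypothetical stand-in for FileWalker: only the queue handling is kept.
class QueueOnlyWalker extends EventEmitter {
  constructor(debug) {
    super();
    this.debug = Boolean(debug);
    this.reset();
  }

  // Restore the initial state so the same instance can be reused.
  reset() {
    this.isPaused = false;
    this.queue = [];
  }

  // Accept a single path or an array of paths.
  addToQueue(entry) {
    if (Array.isArray(entry)) {
      this.queue.push(...entry);
    } else {
      this.queue.push(entry);
    }
  }
}

const demoWalker = new QueueOnlyWalker();
demoWalker.addToQueue('/a/path');
demoWalker.addToQueue(['/b/path', '/c/path']);
console.log(demoWalker.queue); // [ '/a/path', '/b/path', '/c/path' ]

demoWalker.reset();
console.log(demoWalker.queue.length); // 0
```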
This is how we’ll use the FileWalker class:
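//walkerHelper.js
// Assumption: walker.js sits next to this file and exports the FileWalker class
const walkerClass = require('./walker');
const walker = new walkerClass();
walker.addToQueue(['/a/path','/b/path']);
walker.next();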
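Before calling next(), a consumer would normally subscribe to the events the walker emits. The sketch below uses a plain EventEmitter as a stand-in for the walker (an assumption, made so the snippet runs on its own) and emits the hash and error events with the same argument order as generateHash. The file name, size and hash value are made up.

```javascript
const { EventEmitter } = require('events');

// Stand-in for a FileWalker instance; the real class emits these events itself.
const fakeWalker = new EventEmitter();
const seenHashes = [];

// Argument order mirrors this.emit('hash', file, stat, buffer, hash).
fakeWalker.on('hash', (file, stat, buffer, hash) => {
  seenHashes.push({ file, size: stat.size, hash });
});

// Argument order mirrors this.emit('error', err, file, stat).
fakeWalker.on('error', (err, file) => {
  console.error(`Could not hash ${file}: ${err.message}`);
});

// Simulate what the walker would emit for one file (illustrative values only).
fakeWalker.emit(
  'hash',
  '/a/path/file.txt',
  { size: 12 },
  Buffer.from('hello world!'),
  'abc123'
);

console.log(seenHashes.length); // 1
```

Registering an error listener matters here: an EventEmitter throws if an error event is emitted while no listener is attached.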
Our FileWalker class is now almost complete.
Next, we’ll create a walkerHelper.js file that runs as a child process, to separate the GUI thread (the Electron renderer process) from the fs-intensive processing. Then we’ll create a Storage class to store the files, hashes and stats received from the FileWalker class and use this information to find duplicate files.
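As a rough preview of the duplicate-finding idea, file paths can be grouped by their hash. The class below is only an illustrative sketch: HashIndex and its methods are hypothetical names, not the Storage class this series goes on to build.

```javascript
// Hypothetical sketch: group file paths by hash and report groups
// that contain more than one file.
class HashIndex {
  constructor() {
    this.byHash = new Map(); // hash -> array of { file, size }
  }

  // Intended to be called with the arguments of a 'hash' event.
  add(file, stat, hash) {
    const group = this.byHash.get(hash) || [];
    group.push({ file, size: stat.size });
    this.byHash.set(hash, group);
  }

  // Return arrays of file paths that share the same hash.
  duplicateCandidates() {
    return [...this.byHash.values()]
      .filter((group) => group.length > 1)
      .map((group) => group.map((item) => item.file));
  }
}

const index = new HashIndex();
index.add('/a/path/one.txt', { size: 10 }, 'h1');
index.add('/b/path/two.txt', { size: 10 }, 'h1');
index.add('/b/path/three.txt', { size: 20 }, 'h2');
console.log(index.duplicateCandidates());
// [ [ '/a/path/one.txt', '/b/path/two.txt' ] ]
```

Since the walker hashes only the beginning of each file, such groups would still need a size or full-content comparison before files are treated as duplicates.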