How to Merge Multiple HTML Files Using Notepad++


So the other day I was referring to an offline document that was actually just a compiled set of many HTML pages. Opening each page from the index every time proved quite taxing, not to mention time-consuming. For that reason I set out to combine all these pages into one HTML file so that I could search through a single page instead of the then 22 pages.

At first I thought I could find free software online that would do just that, and luckily enough I found HTML Merge on SourceForge. However, my luck did not last long: the software was not up to the task. It refused these particular HTML pages because they were in an unsupported encoding, and it never mentioned which encodings it did support. So I tried saving a few of them in what I thought was the standard (UTF-8), but that gave me the exact same error.
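
If you want to check which encoding a page is really in before re-saving it, here is a minimal Python sketch. It is only an illustration: the list of candidate encodings and the function names are my own assumptions, and nothing here is guaranteed to satisfy HTML Merge, since I never found out what it supports.

```python
from pathlib import Path

# Encodings to try, in order. This list is an assumption; adjust it for your files.
# "utf-8-sig" also accepts plain UTF-8 and drops a leading byte order mark.
# "latin-1" accepts any byte sequence, so it works as a last-resort fallback.
CANDIDATE_ENCODINGS = ("utf-8-sig", "cp1252", "latin-1")


def read_text_guessing(path, encodings=CANDIDATE_ENCODINGS):
    """Return (text, encoding) for the first encoding that decodes cleanly."""
    data = Path(path).read_bytes()
    for enc in encodings:
        try:
            return data.decode(enc), enc
        except UnicodeDecodeError:
            continue
    raise ValueError(f"none of {encodings} could decode {path}")


def resave_as_utf8(src, dst):
    """Write a UTF-8 copy of src to dst and report the encoding that was detected."""
    text, enc = read_text_guessing(src)
    # Write bytes rather than text so line endings are left exactly as they were.
    Path(dst).write_bytes(text.encode("utf-8"))
    return enc
```

Keep in mind that a page may still declare its old encoding in a `<meta charset=...>` tag after being re-saved; this sketch does not touch that.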



Left with no option, I decided to go the manual route - open each HTML file individually, then use the godsend that is copy and paste. I was a few pages in when I recalled that Notepad++ had helped me not long before to combine multiple plain text files. So what about HTML? Turned out it could handle those too.

Note:
When I say combine or merge, I mean just that - appending one file after the other with no HTML tag editing whatsoever. A less ambiguous, though less popular, term for this I believe is concatenating. The merged file will therefore contain several sets of html, head and body tags, so I wouldn't advise using this method for content you plan on publishing on a website. However, for offline HTML documents from the same source (like a book) I don't see the harm.

This is what I did:


STEPS

1. Open Notepad++ first. You can get the portable or installable version from the Notepad++ download page.

2. Now we need to install a plugin called Combine (NPP Combine) for this to work. You can do that in either of two ways:

a. While connected to the Internet, go to Plugins in the menu and under Plugin Manager select Show Plugin Manager (newer Notepad++ releases have replaced Plugin Manager with Plugins Admin in the same menu). The plugin manager will automatically fetch all available plugins and list them there. Look for and select Combine, then hit the Install button.

b. Get the plugin manually from the developer's page and install it. To install, just copy the downloaded file (combine.dll) into the plugins subfolder located inside the Notepad++ installation folder. Restart the program to load the plugin.

3. Open all the HTML files you need to merge using Notepad++. To do this the easy way, just select all of them in your file manager, then drag and drop them into the Notepad++ window.

4. Now go to the Plugins menu and select Start under the Combine plugin. That will launch the plugin's window with a few settings. Since this is HTML I don't think it's wise to add anything, so just hit the OK button.

5. Doing that will combine all the opened files into one large file in the order they've been opened (i.e. from the first to the last tab). To finish, save this new file and you're done.
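
If you'd rather script it than install a plugin, the same kind of plain concatenation can be sketched in a few lines of Python. This is not what the Combine plugin does internally, just a rough equivalent under my own assumptions: `merge_html` is a made-up name, and the files are appended in whatever order you pass them.

```python
from pathlib import Path


def merge_html(sources, destination):
    """Append each source file to destination, in the order given.

    No HTML is parsed or edited: this is plain concatenation, so the result
    keeps every file's own html, head and body tags.
    """
    with open(destination, "wb") as out:
        for src in sources:
            out.write(Path(src).read_bytes())
            # Keep the last line of one file apart from the first line of the next.
            out.write(b"\n")
    return destination
```

For example, `merge_html(sorted(Path("book").glob("*.htm*")), "merged.html")` would merge every page in a hypothetical `book` folder in file name order. Working with raw bytes means the pages' encoding is left untouched.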

You can now go ahead and open the merged HTML page in your browser to see the output. If you need to remove any repeated element from the pages (like images, or navigation linking to other pages that no longer exist), just open the merged file in Notepad++ and use the Replace function (Ctrl + H) to remove the elements in one go.
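
The same clean-up can also be scripted. Below is a minimal Python sketch of the idea; `remove_repeated` is a hypothetical helper, and the `<div class="nav">` markup in the example that follows is made up, so you'd substitute whatever your pages actually repeat.

```python
import re


def remove_repeated(html, snippets=(), patterns=()):
    """Remove every occurrence of each literal snippet and each regex pattern."""
    for snippet in snippets:
        # Exact text match, like a normal Replace All with an empty replacement.
        html = html.replace(snippet, "")
    for pattern in patterns:
        # DOTALL lets ".*?" span line breaks; IGNORECASE tolerates <DIV> vs <div>.
        html = re.sub(pattern, "", html, flags=re.DOTALL | re.IGNORECASE)
    return html
```

Calling `remove_repeated(page, patterns=[r'<div class="nav">.*?</div>'])` would drop each such navigation block. A non-greedy pattern like this stops at the first closing tag, so it is only safe when the repeated element has no nested element of the same type.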

After that, if it's a book like in my case, I presume you'd like to convert the merged HTML into a more portable format like PDF, or into Word if you wish to edit the content.



PDF:

For PDF, I would recommend opening the HTML file in the Chrome browser and using its superb export to PDF feature (Print, then Save as PDF), which also offers some neat customizations. If you're on Windows 10, you can also use the built-in Microsoft Print to PDF printer to export to PDF from any browser. There are also plenty of free programs and online services that can help you with that.

WORD:

For Word, the good news is that pretty much any MS Word version handles HTML files by default. MS Word actually renders the HTML rather than displaying the raw markup. So just open the HTML file in MS Word, then save the document in an editable Word format (*.docx, *.doc).


4 comments

Very useful article, save my life, thank you!


Thank you. Your post is a God-send. Wish you more strength to make life easy for us computer geeks.


Glad to help. More strength to you too brother.


Why not just old-fashioned copy?:

copy *.h* big

Then rename big to whatever.htm.

Nicholas Kormanik

