Author Topic: Sorting Google Chrome MHTML files  (Read 1244 times)

Offline OS923

  • Platinum Member (500+ Posts)
  • *****
  • Posts: 785
  • Liked:
  • Likes Given: 8
Sorting Google Chrome MHTML files
« on: October 02, 2020, 08:57:30 AM »
With Google Chrome you can save a page as an MHTML file. If you've downloaded many pages and they all end up in the same folder, sorting them out is a dull job, especially if you have to do it every day. MHTML sorter can do this for you. There's also a version for Windows, tested on XP and Windows 10. This program is a companion to my next program, MHTML converter, which aims to convert the files to HTML that you can view on OS 9.

Offline OS923

Re: Sorting Google Chrome MHTML files
« Reply #1 on: October 28, 2020, 08:15:01 AM »
I abandoned the MHTML converter project. It had to be configured with so many regular expressions that I'm sure that nobody will ever use this program. It's easier to make a list of the URLs of your MHTML files and download them with a different browser.
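For anyone who wants to build that list of URLs: Chrome writes the original address into the Snapshot-Content-Location header at the top of each file, so a rough Python sketch could look like the one below. The function names snapshot_url and list_urls are made up for illustration, and the sketch assumes the header value is not folded over several lines.

```python
from email.parser import BytesHeaderParser
from pathlib import Path
from typing import List, Optional


def snapshot_url(mhtml_bytes: bytes) -> Optional[str]:
    """Return the Snapshot-Content-Location header of an MHTML file, if any."""
    # Only the top-level headers are parsed; the parts are left untouched.
    headers = BytesHeaderParser().parsebytes(mhtml_bytes)
    return headers.get("Snapshot-Content-Location")


def list_urls(folder: str) -> List[str]:
    """Collect the saved-page URLs of all .mhtml/.mht files in one folder."""
    urls = []
    for path in sorted(Path(folder).iterdir()):
        if path.suffix.lower() in {".mhtml", ".mht"}:
            url = snapshot_url(path.read_bytes())
            if url:
                urls.append(url)
    return urls
```

Writing the result of list_urls to a text file, one URL per line, gives a list that can be fed to another browser or downloader.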

Offline OS923

Re: Sorting Google Chrome MHTML files
« Reply #2 on: July 13, 2021, 07:31:10 AM »
I rewrote this program in C++.
Instead of complex configuration I follow a simple scheme to rename the files.
I replace &amp; with & in the URL.
The new filename is (CRC of the URL)-(length of the URL).(extension that matches the MIME type)
The links are file names instead of paths, so you can merge folders and the links will still work.
I tried this with a complex page with frames, fonts, SVG in XML and so on.
When it's opened in Chrome it's just like the original.

Offline OS923

Re: Sorting Google Chrome MHTML files
« Reply #3 on: August 18, 2021, 12:12:48 PM »
I fixed 3 problems with this program:
- I found a URL of 1713 bytes, but URLs were limited to 1020 bytes. I raised the limit to 4096.
- It now works with titles in UTF-8.
- It now works with parts like:
Content-Type: text/css
Content-Transfer-Encoding: binary
Content-Location: cid:css-b11088bb-b961-4fd8-bbc9-ee6a6da027db@mhtml.blink
Here's an example of all the things that it understands:
Code:
From: <Saved by Blink>
Snapshot-Content-Location: https://www.androidplanet.nl/apps/videoder-youtube-videos-downloaden/
Subject: Videoder: de beste app om YouTube-video's te downloaden
Date: Wed, 3 Jun 2020 15:45:28 -0000
MIME-Version: 1.0
Content-Type: multipart/related;
type="text/html";
boundary="----MultipartBoundary--z2KZBdRlbm0FPAMVKnvWJxa8EMsG50uNOoNN0BSK01----"


From: <Saved by Blink>
Snapshot-Content-Location: http://macos9lives.com/smforum/index.php/topic,5189.0.htm
Subject: =?utf-8?Q?Timsort=E2=80=8A=E2=80=94=E2=80=8Athe=20fastest=20sorting=20alg?=
 =?utf-8?Q?orithm=20you=E2=80=99ve=20never=20heard=20of?=
Date: Tue, 3 Aug 2021 17:07:03 -0000
MIME-Version: 1.0
Content-Type: multipart/related;
type="text/html";
boundary="----MultipartBoundary--ApEEiBc8n2kauavw76i4GkY0cDzMhYwsVGJ7X6fY0z----"


------MultipartBoundary--z2KZBdRlbm0FPAMVKnvWJxa8EMsG50uNOoNN0BSK01----
Content-Type: text/html
Content-ID: <frame-D7C7D37F1A287023648F4FA81A7CB70F@mhtml.blink>
Content-Transfer-Encoding: binary
Content-Location: https://www.androidplanet.nl/apps/videoder-youtube-videos-downloaden/

binary
------MultipartBoundary--z2KZBdRlbm0FPAMVKnvWJxa8EMsG50uNOoNN0BSK01----
Content-Type: text/css
Content-Transfer-Encoding: binary
Content-Location: https://www.androidplanet.nl/builds/style.884be201.css?ver=5.4

binary
------MultipartBoundary--z2KZBdRlbm0FPAMVKnvWJxa8EMsG50uNOoNN0BSK01----
Content-Type: text/html
Content-ID: <frame-DA7201720B5F9297963CAFBE54DFAAFF@mhtml.blink>
Content-Transfer-Encoding: binary

binary
------MultipartBoundary--z2KZBdRlbm0FPAMVKnvWJxa8EMsG50uNOoNN0BSK01----
Content-Type: text/html
Content-ID: <frame-D0E563903854FDAC2FC7FE8CAE5CCF09@mhtml.blink>
Content-Transfer-Encoding: binary
Content-Location: https://d3186xq5v1iosf.cloudfront.net/index.html

binary
------MultipartBoundary--z2KZBdRlbm0FPAMVKnvWJxa8EMsG50uNOoNN0BSK01----
Content-Type: text/css
Content-Transfer-Encoding: binary
Content-Location: cid:css-39a783e6-e99b-411a-acf5-50b0588e1e09@mhtml.blink

binary
------MultipartBoundary--z2KZBdRlbm0FPAMVKnvWJxa8EMsG50uNOoNN0BSK01------


src="cid:frame-9ED95DF7A909C93E2F07F401A124A6CA@mhtml.blink"
href="cid:css-39a783e6-e99b-411a-acf5-50b0588e1e09@mhtml.blink"
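To show how the pieces in this example fit together, here is a small Python sketch using the standard email package. It is only an illustration, not how the C++ program does it. decode_title turns an encoded Subject like the second one above into readable text, and index_parts builds a lookup table so that both a Content-Location and a "cid:" reference made from a Content-ID lead to the right part. Both function names are made up for this sketch.

```python
from email import message_from_bytes
from email.header import decode_header, make_header
from email.message import Message
from typing import Dict


def decode_title(raw_subject: str) -> str:
    """Decode a Subject header that may contain =?utf-8?Q?...?= words."""
    return str(make_header(decode_header(raw_subject)))


def index_parts(mhtml_bytes: bytes) -> Dict[str, Message]:
    """Map every Content-Location and cid: reference to its MIME part."""
    message = message_from_bytes(mhtml_bytes)
    index: Dict[str, Message] = {}
    for part in message.walk():
        if part.is_multipart():
            continue  # the container itself holds no content
        location = part.get("Content-Location")
        if location:
            index[location.strip()] = part
        content_id = part.get("Content-ID")
        if content_id:
            # <frame-...@mhtml.blink> is referenced as cid:frame-...@mhtml.blink
            index["cid:" + content_id.strip().strip("<>")] = part
    return index
```

With such a table, a src="cid:frame-..." or href="cid:css-..." attribute can be looked up directly, and a part without Content-Location (like the second frame above) is still reachable through its Content-ID.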

Offline OS923

Re: Sorting Google Chrome MHTML files
« Reply #4 on: October 16, 2021, 03:36:34 AM »
I improved Convert MHTML. I used it for several months without issues.

Offline OS923

Re: Sorting Google Chrome MHTML files
« Reply #5 on: November 03, 2021, 11:03:22 AM »
It can now convert MHTML files saved by Google Chrome for Android as well as for Windows, because it now decodes Base64 and quoted-printable. It also accepts the ".mht" extension in addition to ".mhtml". I improved the error messages.
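As an illustration of what decoding the parts involves, here is a minimal Python sketch that picks the decoder from the Content-Transfer-Encoding header of a part. The names decode_body and has_mhtml_extension are hypothetical, and the case-insensitive extension check is an assumption, not something stated above.

```python
import base64
import quopri


def decode_body(raw: bytes, transfer_encoding: str) -> bytes:
    """Decode one part body according to its Content-Transfer-Encoding."""
    encoding = transfer_encoding.strip().lower()
    if encoding == "base64":
        # Line breaks inside the Base64 text are ignored by default.
        return base64.b64decode(raw)
    if encoding == "quoted-printable":
        return quopri.decodestring(raw)
    # "binary", "7bit" and "8bit" bodies are stored as they are.
    return raw


def has_mhtml_extension(filename: str) -> bool:
    """Accept both ".mhtml" and ".mht" (case-insensitive by assumption)."""
    return filename.lower().endswith((".mhtml", ".mht"))
```

Files with "Content-Transfer-Encoding: binary", like the parts in the example from August, pass through decode_body unchanged.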

 

