View Page Source Obfuscated? Topic is solved

Support and discussions for the x86/x64 Linux version of Pale Moon.

cosmo666
Apollo supporter
Posts: 35
Joined: 2018-10-05, 20:06

View Page Source Obfuscated?

Unread post by cosmo666 » 2019-10-26, 16:44

I'm finding more and more pages (typically, those most heavily laden with active commercial content) where it appears I can't do a 'view page source' and actually see anything. I'm wondering whether I have configured Pale Moon incorrectly, or whether there are now standard HTML/CSS features which enable the originators of the content to prevent viewing it.

If Pale Moon provided a 'raw-dump' option, which did nothing but dump the original content in 'view-source' format to a text-page, that would be ideal. If anyone has a quickie Perl/Python script able to do that, much appreciated.
(V 28.4.1/Ubuntu 16.04)
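
A minimal sketch of the 'raw-dump' script asked for above, assuming Python 3 and only the standard library. It writes the bytes the server sends to a file without rendering them or running any script. The function names, the User-Agent string, and the output file name are illustrative assumptions, not anything Pale Moon provides, and a server may send different markup to a script than it sends to a browser.

```python
import urllib.request


def fetch_raw(url, user_agent="Mozilla/5.0 (raw-dump sketch)", timeout=30):
    """Return the response body exactly as received, as bytes.

    Nothing is parsed, rendered, or executed here.
    The user_agent default is a made-up placeholder.
    """
    request = urllib.request.Request(url, headers={"User-Agent": user_agent})
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read()


def dump_to_file(url, out_path):
    """Fetch url and write the unmodified bytes to out_path.

    Returns the number of bytes written.
    """
    data = fetch_raw(url)
    with open(out_path, "wb") as handle:
        handle.write(data)
    return len(data)
```

Usage would be something like `dump_to_file("https://example.com/", "page_source.txt")`, after which the file can be opened in any text editor. The sketch avoids f-strings, so it should also run on the Python 3.5 that ships with Ubuntu 16.04.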

SpockFan02
Astronaut
Posts: 505
Joined: 2017-09-24, 16:35
Location: Mak pupulusšum, California

Re: View Page Source Obfuscated?

Unread post by SpockFan02 » 2019-10-26, 17:24

Post an example; we can only guess at the problem if we can't see it ourselves. If I right-click on a Web page and select View Page Source in the context menu, it opens a new window with the source of the page in it. You can also save pages and open up the files locally. Does View Page Source bring up an empty window for you? Or is the source just more sparse than you'd expect?

cosmo666
Apollo supporter
Posts: 35
Joined: 2018-10-05, 20:06

Re: View Page Source Obfuscated?

Unread post by cosmo666 » 2019-10-26, 19:21

Thanks very much for the prompt reply. I have only one example which is repeatable, but I'm going to have to review my archives to find it. In one case I got a blank View Source page, but I'll have to find the URL. The other failure mode involves 'save-to-archives', at which point the download task starts, then aborts with a 'failed' message. That occurs more often with MAFF. In maybe 50% of those cases, MHT does work.

FYI, I doubt this is a bug in Pale Moon. My general concern is how to deal with websites which a) take over the browser, often using up significant processing resources, and b) effectively disable View Source, and/or make it impossible to even unload the page without killing the entire browser task externally. That might be Pale Moon's fault, but I suspect it's more likely due to the resource hogging of the page itself. My main reason for posting this was to find out if there was some new feature of HTML5 which enables the content provider to prevent SaveAs or View Source.

I will put together some examples and post the links, but it might take a couple of days to pull it all together.
Very much appreciate your willingness to help track this down.

SpockFan02
Astronaut
Posts: 505
Joined: 2017-09-24, 16:35
Location: Mak pupulusšum, California

Re: View Page Source Obfuscated?

Unread post by SpockFan02 » 2019-10-26, 21:10

AFAIK saving as MAFF or MHTML is a feature provided by extensions? When you've found the problem pages I'd be happy to try to reproduce your issue(s) and we can go from there. As for a
cosmo666 wrote:
2019-10-26, 19:21
new feature of HTML5 which enables the content provider to prevent SaveAs, or View Source.
, it would be (shocking) news to me.

cosmo666
Apollo supporter
Posts: 35
Joined: 2018-10-05, 20:06

Re: View Page Source Obfuscated?

Unread post by cosmo666 » 2019-10-26, 22:52

Okay, now I remember where I encountered the extreme example. I was trying to save a copy of a DuckDuckGo search. I'm going to attach copies of the exemplars, but I'm pretty sure they just ginned up some HTML which was obfuscated by way of <script> tags, in which case this is essentially a non-issue, because I was able to save the page using MAFF.

Attachments:
obfuscated.v3.txt is just a raw text file I created by doing a view-source, select-all, then save as text.
obfuscated.v4.zip is actually obfuscated.v4.maff (but the board attachment option wouldn't take it as .maff)

I didn't do a detailed exam of the obfuscated text itself, but found it consisted mostly of <script> blocks (looks like 4), and it's probably not worth noodling over.

Sorry I wasted your time on this. Thanks again for picking it up.
Attachments
obfuscated.v4.zip
(118.81 KiB) Downloaded 4 times

Moonchild
Pale Moon guru
Posts: 24838
Joined: 2011-08-28, 17:27
Location: 58°2'16"N 14°58'31"E

Re: View Page Source Obfuscated?

Unread post by Moonchild » 2019-10-26, 22:58

It's not obfuscated. I think what you may have a problem with is the results being sent on extremely long lines without line breaks. Doesn't mean the content is hidden, just means it's not the most convenient output to parse in editors ;-)

In the "View Source" window, go to the View menu and select "Wrap long lines" and tada.wav
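
For a source file that has already been saved to disk, the same idea can be sketched in Python: put a line break between adjacent tags, then wrap whatever is still too long. This is only a reading aid under stated assumptions, not what the "Wrap long lines" menu item does internally. The function name and the default width are made up, and because it also splits inside <script> content the output is for viewing, not for loading back into a browser.

```python
import re
import textwrap


def break_long_lines(source, width=100):
    """Return source with line breaks added so it is readable in an editor.

    Step 1: start a new line wherever one tag directly follows another.
    Step 2: wrap any line that is still longer than width characters.
    """
    split = re.sub(r">\s*<", ">\n<", source)
    lines = []
    for line in split.splitlines():
        if len(line) <= width:
            lines.append(line)
        else:
            # textwrap breaks at whitespace where it can and splits
            # over-long "words" (e.g. minified script) when it cannot.
            lines.extend(textwrap.wrap(line, width=width))
    return "\n".join(lines)
```

Applied to the text of a saved view-source dump, this turns a handful of very long lines into many short ones without hiding or removing any markup.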
"If you want to build a better world for yourself, you have to be willing to build one for everybody." -- Coyote Osborne

cosmo666
Apollo supporter
Posts: 35
Joined: 2018-10-05, 20:06

Re: View Page Source Obfuscated?

Unread post by cosmo666 » 2019-10-27, 06:00

Thanks for that tip, very helpful. Just wanted to pass along some testing results.

My original goal was to save the first full page of search results from a DuckDuckGo search.

If I do a view source, I don't get any of the actual search page, instead I get some sort of stub page.
(attached as: ddgo.view_text.txt)
Same result if I do a Save As HTML Only
(attached as: ddgo.html_only.txt)

Seems to me the DDGO search page must have established some sort of event trap in order to do that. (?)
I haven't done any Javascript coding in about 10 years, so not in a position to speculate off-hand as to why this might be happening.
However, if you can suggest a way of sleuthing it out, happy to wade into it. I would very much like to know how it's done.
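
One way to start sleuthing, sketched here in Python under assumptions: View Source shows what the server sent, not what scripts add to the page afterwards, so a "stub" is what one would expect if the visible results are filled in by JavaScript after load. Whether that is what DuckDuckGo does is an assumption, not something confirmed in this thread. The sketch profiles a saved source file by counting script blocks and comparing script characters with ordinary text characters; the class and function names are illustrative.

```python
from html.parser import HTMLParser


class SourceProfile(HTMLParser):
    """Count how much of a page's source is script versus ordinary text."""

    def __init__(self):
        super().__init__()
        self.script_blocks = 0      # number of <script> elements
        self.external_scripts = []  # src values of <script src=...>
        self.script_chars = 0       # characters of inline script code
        self.text_chars = 0         # characters of text outside script/style
        self._inside = None         # "script" or "style" while inside one

    def handle_starttag(self, tag, attrs):
        if tag == "script":
            self.script_blocks += 1
            src = dict(attrs).get("src")
            if src:
                self.external_scripts.append(src)
        if tag in ("script", "style"):
            self._inside = tag

    def handle_endtag(self, tag):
        if tag == self._inside:
            self._inside = None

    def handle_data(self, data):
        if self._inside == "script":
            self.script_chars += len(data)
        elif self._inside is None:
            self.text_chars += len(data.strip())


def profile_source(html_text):
    """Parse html_text and return the filled-in SourceProfile."""
    profile = SourceProfile()
    profile.feed(html_text)
    profile.close()
    return profile
```

A source with a large script_chars count and a text_chars count near zero would suggest that the content seen in the browser is generated after load rather than hidden from View Source.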

Re: save to archive maff/mht
Both seem to work fine, even on very complex pages (such as this one).
(attached as: pale_moon_forum.html.zip)

Thanks again for the assist, and for these outstanding well-managed projects.
Attachments
pale_moon_forum.html.zip
(261.57 KiB) Downloaded 2 times

mmouse
Moongazer
Posts: 9
Joined: 2019-02-13, 06:47

Re: View Page Source Obfuscated?

Unread post by mmouse » 2019-10-28, 15:01

cosmo666 wrote:
2019-10-26, 22:52
I was trying to save a copy of a DuckDuckGo search.
Does it change anything if you use the HTML-only version of DuckDuckGo (https://start.duckduckgo.com/html)?

cosmo666
Apollo supporter
Posts: 35
Joined: 2018-10-05, 20:06

Re: View Page Source Obfuscated?

Unread post by cosmo666 » 2019-10-28, 16:16

mmouse wrote:
2019-10-28, 15:01
does it change anything if you use DuckDuckGo html only (https://start.duckduckgo.com/html) ?
Indeed it does. I was unaware of that start option for DDGO (thank you!). After using the URL you suggested, I got slightly less colorful but apparently identical content. However, when I did a view-source, nothing happened. When I did a SaveAs Html Only under Pale Moon, though, I got what I suppose you could call 'raw-text-html', which was still markup, but generic.
See attached:
ddgo_htmo_start.txt, which is the complete .html [only] saved to disk by Pale-Moon.
sshot.pale_moon_raw.v1.png shows how it displayed under Pale-Moon when loaded as a file.

When I tried to do a view-source from Pale-Moon of the generic result page, it still didn't work. So I loaded that same generic page (the Pale-Moon generated html-only) into Firefox (70.0), and it displayed the same as in Pale-Moon. Under Firefox, however, I was able to do a view-source (sshot.pale_moon_raw.v2.png).
(It's possible I have a Pale Moon add-on loaded which is causing the view-source to fail.)

My main concern: It should be possible, at a minimum, to do a view-source on any instantiated page. And, IMHO (as a suggestion), it ought to be possible to load any page in what I'll call 'view-source-first' mode, meaning that before the browser executes any part of the downloaded markup, it displays a text window containing what it's about to execute, thereby giving us the ability to look for possible exploits before they happen.
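
Outside the browser, a rough approximation of that 'view-source-first' idea can be sketched in Python: parse the markup without running it and list the parts a browser would act on. This is an assumption-laden illustration, not a Pale Moon feature and not a security scanner. The class and function names are made up, the categories are a small hand-picked set, and it does not follow external script URLs to see what they contain.

```python
from html.parser import HTMLParser


class ActiveContentLister(HTMLParser):
    """Collect (kind, detail) pairs for markup a browser would execute or load."""

    def __init__(self, preview=80):
        super().__init__()
        self.findings = []
        self.preview = preview   # max characters of inline script to keep
        self._in_script = False
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        attr_map = {name: (value or "") for name, value in attrs}
        if tag == "script":
            if attr_map.get("src"):
                self.findings.append(("external-script", attr_map["src"]))
            else:
                self._in_script = True
                self._buffer = []
        elif tag in ("iframe", "embed", "object"):
            target = attr_map.get("src") or attr_map.get("data", "")
            self.findings.append((tag, target))
        for name, value in attr_map.items():
            if name.startswith("on"):
                # Inline handlers such as onload= or onclick=
                detail = "{} {}={!r}".format(tag, name, value)
                self.findings.append(("event-handler", detail))
            elif value.strip().lower().startswith("javascript:"):
                detail = "{} {}={!r}".format(tag, name, value)
                self.findings.append(("javascript-url", detail))

    def handle_data(self, data):
        if self._in_script:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_script:
            code = "".join(self._buffer).strip()
            self.findings.append(("inline-script", code[:self.preview]))
            self._in_script = False


def list_active_content(html_text):
    """Return the findings for html_text in document order."""
    lister = ActiveContentLister()
    lister.feed(html_text)
    lister.close()
    return lister.findings
```

Run against a page that was fetched or saved as text, this gives a short list to read through before deciding whether to open the page in the browser at all.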

Thanks 'mmouse' for cluing me in on the alternate ddgo load option. All of the above has been very helpful in my ongoing quest to retain control over web-pages which seek to take over my desktop.

Attachments:
ddgo_htmo_start.txt -- the result of the SaveAs HTML Only from pale-moon of the html-only DDGO URL returned content.
sshot.pale_moon_raw.v1.png -- the 'raw-text-html' visible content loaded from the .html file created by pale-moon.
sshot.pale_moon_raw.v2.png -- the result of the 'view-source' after loading the .html file into Firefox.
Attachments
sshot.pale_moon_raw.v2.png
sshot.pale_moon_raw.v1.png

mmouse
Moongazer
Posts: 9
Joined: 2019-02-13, 06:47

Re: View Page Source Obfuscated?

Unread post by mmouse » 2019-10-28, 18:23

cosmo666 wrote:
2019-10-28, 16:16

(...)
After using the URL you suggested,
I got slightly less colorful but apparently identical content. However, when I did a view-source, nothing happened.
This definitely does not happen for me (Palemoon 28.7.1 Linux 64 bits running from the tar.gz)
As this page has no JavaScript, I think JavaScript can be ruled out as the cause. That leaves your config and add-ons as first suspects.
I'd try to do a clean install in case you did not already try it.
For the record, I have seen this very behaviour with Palemoon, with a page I had saved on disk and slightly changed by hand. Unfortunately I had made a mistake and deleted a closing tag, so the resulting HTML was invalid. It still loaded as an HTML page but no longer opened in view source, so it could be a plugin messing with the pages you access.
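
To test a saved page for that kind of damage, a crude tag-balance check can be sketched in Python. It is a heuristic under stated assumptions, not a validator: HTML allows some end tags (such as </p> or </li>) to be omitted, so valid pages can still be flagged, and the names used here are illustrative.

```python
from html.parser import HTMLParser

# Elements that never take a closing tag.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "param", "source", "track", "wbr"}


class TagBalanceChecker(HTMLParser):
    """Track open tags on a stack and record tags that are never closed."""

    def __init__(self):
        super().__init__()
        self.open_tags = []  # stack of (tag, line number)
        self.problems = []

    def handle_starttag(self, tag, attrs):
        if tag not in VOID_TAGS:
            self.open_tags.append((tag, self.getpos()[0]))

    def handle_startendtag(self, tag, attrs):
        pass  # self-closing form such as <br/>: nothing to balance

    def handle_endtag(self, tag):
        if tag in VOID_TAGS:
            return
        # Look for the nearest matching open tag, innermost first.
        for index in range(len(self.open_tags) - 1, -1, -1):
            if self.open_tags[index][0] == tag:
                # Anything opened after it was left unclosed.
                for unclosed, line in self.open_tags[index + 1:]:
                    self.problems.append(
                        "<{}> opened on line {} is never closed".format(unclosed, line))
                del self.open_tags[index:]
                return
        self.problems.append(
            "stray </{}> on line {}".format(tag, self.getpos()[0]))


def check_balance(html_text):
    """Return a list of human-readable problems found in html_text."""
    checker = TagBalanceChecker()
    checker.feed(html_text)
    checker.close()
    for tag, line in checker.open_tags:
        checker.problems.append(
            "<{}> opened on line {} is never closed".format(tag, line))
    return checker.problems
```

An empty list means every non-void tag was closed in order. A reported <div> or similar would be the kind of hand-editing mistake described above.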

BTW you announced 3 attachments but I have seen only two. No great loss since I don't think it could have done much good anyway.

cosmo666
Apollo supporter
Posts: 35
Joined: 2018-10-05, 20:06

Re: View Page Source Obfuscated?

Unread post by cosmo666 » 2019-10-28, 20:08

Thanks, I'll upgrade Pale Moon, see if that helps.
