Detect IPv6 not encapsulated in [] and correct them

Talk about code development, features, specific bugs, enhancements, patches, and similar things.
Forum rules
Please keep everything here strictly on-topic.
This board is meant for Pale Moon source code development related subjects only like code snippets, patches, specific bugs, git, the repositories, etc.

This is not for tech support! Please do not post tech support questions in the "Development" board!
Please make sure not to use this board for support questions. Please post issues with specific websites, extensions, etc. in the relevant boards for those topics.

Please keep things on-topic as this forum will be used for reference for Pale Moon development. Expect topics that aren't relevant as such to be moved or deleted.
Exagone313

Detect IPv6 not encapsulated in [] and correct them

Unread post by Exagone313 » 2016-02-12, 16:08

Hi,

If you type an IPv6 address you the address bar and forget to encapsulate it with [ and ] around, i.e. ::1 instead of [::1], it makes a search engine search directly.
My idea is too detect these addresses and encapsulate them to avoid this search.

Toa-Nuva
Fanatic
Fanatic
Posts: 200
Joined: 2015-06-04, 18:12

Re: Detect IPv6 not encapsulated in [] and correct them

Unread post by Toa-Nuva » 2016-02-12, 16:27

Those brackets are necessary to keep the URL from becoming ambiguous, as the colon is also used to separate the address from the port. For example, ::1:1 could mean [::1]:1 (i.e. address ::1 at port 1) or [::1:1] (i.e. address ::1:1 at the default port).

Exagone313

Re: Detect IPv6 not encapsulated in [] and correct them

Unread post by Exagone313 » 2016-02-13, 13:13

At least an error detection should be better than a non private way like search.

Like :

URL Malformed
Try [k::a]:b or [k::a:b]

Note that most of users would know how to use IPv6 addresses if they don't use standard web ports.

User avatar
Moonchild
Pale Moon guru
Pale Moon guru
Posts: 35650
Joined: 2011-08-28, 17:27
Location: Motala, SE

Re: Detect IPv6 not encapsulated in [] and correct them

Unread post by Moonchild » 2016-02-13, 14:23

This might get pretty complex and/or easy to have false positives on. How are you going to "detect" this properly? Easy for a human brain to do, not so much for a URL parsing computer program ;)

Patches are welcome, though.
"Sometimes, the best way to get what you want is to be a good person." -- Louis Rossmann
"Seek wisdom, not knowledge. Knowledge is of the past; wisdom is of the future." -- Native American proverb
"Linux makes everything difficult." -- Lyceus Anubite

Exagone313

Re: Detect IPv6 not encapsulated in [] and correct them

Unread post by Exagone313 » 2016-02-13, 14:27

Oh yeah, it has to check many possibilities. Need to think about it.
Also, I haven't read anything in PM source code, so I don't know how to find the part about URI parsing.

User avatar
Sob__
Lunatic
Lunatic
Posts: 251
Joined: 2014-02-17, 01:12
Location: CZ

Re: Detect IPv6 not encapsulated in [] and correct them

Unread post by Sob__ » 2016-02-14, 19:48

Detection part should be easy, I did a quick test (relatively speaking) with following code and it seems to work fine. When it gets there, protocol and authentication is already parsed and it knows where possible hostname and port ends, so even something ugly like http://2001:db8::1@2001:db8::1@2001:db8::1/some/path is no problem and it finds the last "2001:db8::1" correctly. Using net_IsValidIPv6Addr() makes sure that invalid addresses (e.g. 2001:db8:::1) don't pass.

The real problem is what to do next. It would need to return some new code intead of NS_ERROR_MALFORMED_URI, something else would need to check for it, wrap hostname in brackets (i.e. rewrite the URL) and do something with it. The question is what exactly and it depends on what we want to solve.

Simpler case would be to assume just a simple user error, e.g. you copy IPv6 address from other program like PuTTY, paste it in browser and forget to add brackets. That's an understandable mistake. More complicated case is if you also want to add port and the ambiguity does not hit you. To solve the simpler case, silent rewrite would be enough. But to account for port, there would have to be some error message or question what user wants, because it's impossible to tell (well, yes in some cases, but not always) if the last part is still address or it's supposed to be port. And that means more code and messing with more parts of browser. If the port case was ignored and someone happened to enter port this way, then it would either be invalid address if it was too long (1:2:3:4:5:6:7:8:8080) or for shorter ones (2001:db8::1:8080) it would connect to wrong address. But that address would be unlikely to exist and even if it did, it would be still part of same /64, so it doesn't seem like a security problem. And last, it should probably be used only for URLs entered manually by user, because doing it for all would be wrong (it should be a convenience for user, not a way to accept invalid URLs).

But all this together needs someone familiar with PM code. I myself, knowing close to nothing about it, would spend a week finding and understanding all required code parts and I still wouldn't be sure about the result.

Code: Select all

diff --git a/netwerk/base/src/nsURLParsers.cpp b/netwerk/base/src/nsURLParsers.cpp
index 3e6d8db..2edd9b2 100644
--- a/netwerk/base/src/nsURLParsers.cpp
+++ b/netwerk/base/src/nsURLParsers.cpp
@@ -553,12 +553,14 @@ nsAuthURLParser::ParseServerInfo(const char *serverinfo, int32_t serverinfoLen,
     // delimiter).  check for illegal characters in the hostname.
     const char *p = serverinfo + serverinfoLen - 1;
     const char *colon = nullptr, *bracket = nullptr;
+    int colons = 0;
     for (; p > serverinfo; --p) {
         switch (*p) {
             case ']':
                 bracket = p;
                 break;
             case ':':
+                colons++;
                 if (bracket == nullptr)
                     colon = p;
                 break;
@@ -569,6 +571,11 @@ nsAuthURLParser::ParseServerInfo(const char *serverinfo, int32_t serverinfoLen,
         }
     }
 
+    if (colons > 1 && net_IsValidIPv6Addr(serverinfo, serverinfoLen)) {
+       //OutputDebugStringA("IPv6 address without brackets");
+       return NS_ERROR_MALFORMED_URI;
+    }
+
     if (colon) {
         // serverinfo = <hostname:port>
         SET_RESULT(hostname, 0, colon - serverinfo);

User avatar
Moonchild
Pale Moon guru
Pale Moon guru
Posts: 35650
Joined: 2011-08-28, 17:27
Location: Motala, SE

Re: Detect IPv6 not encapsulated in [] and correct them

Unread post by Moonchild » 2016-02-14, 20:26

I guess if you just want to catch presumed-valid IPv6 addresses, then this patch should indeed do just fine. The check for a valid IPv6 address isn't exactly meant to be used in this kind of situation with an otherwise unsanitized string, I think, so I'll have to have a look at the code safety there before allowing this.

As an aside: rewriting based on guesswork is a definite no-no. I won't have it. :)
"Sometimes, the best way to get what you want is to be a good person." -- Louis Rossmann
"Seek wisdom, not knowledge. Knowledge is of the past; wisdom is of the future." -- Native American proverb
"Linux makes everything difficult." -- Lyceus Anubite

User avatar
Sob__
Lunatic
Lunatic
Posts: 251
Joined: 2014-02-17, 01:12
Location: CZ

Re: Detect IPv6 not encapsulated in [] and correct them

Unread post by Sob__ » 2016-02-14, 23:37

IMHO the simple solution (if it looks like valid IPv6 address, then assume it's one and don't care about port) would be enough. I see it as feature similar to prepending protocol if you don't type it. That can produce unexpected results too (e.g. enter ftp.example.net, while for some reason you want http://ftp.example.net, but end up with ftp://ftp.example.net instead), but it works as expected in most cases.
Moonchild wrote:..., so I'll have to have a look at the code safety there before allowing this.
That's one of reasons why I think it should only work with URLs typed by user. If it also worked with <a href="http://2001:db8::1">, it would be actually harmful. Such URL is clearly invalid, so it might be an attempt to do something bad to browser. And if someone created a page with such URL, they'd think it's correct if it worked in PM, but it would not work anywhere else.

The other approach would be to detect these mistakes and try to educate users by showing some warning page with explanation ("You entered http://2001:db8::1:8080, but it's invalid. The correct address is either http://[2001:db8::1:8080] or http://[2001:db8::1]:8080."), but I seriously doubt that many people would appreciate it.

User avatar
Moonchild
Pale Moon guru
Pale Moon guru
Posts: 35650
Joined: 2011-08-28, 17:27
Location: Motala, SE

Re: Detect IPv6 not encapsulated in [] and correct them

Unread post by Moonchild » 2016-02-15, 01:43

I don't think assuming scheme based on common host name portions is actually the same as guessing addresses, which is quite a bit more harmful.
Scheme guessing won't send people to somewhere else. Address guessing would.
"Sometimes, the best way to get what you want is to be a good person." -- Louis Rossmann
"Seek wisdom, not knowledge. Knowledge is of the past; wisdom is of the future." -- Native American proverb
"Linux makes everything difficult." -- Lyceus Anubite

User avatar
Sob__
Lunatic
Lunatic
Posts: 251
Joined: 2014-02-17, 01:12
Location: CZ

Re: Detect IPv6 not encapsulated in [] and correct them

Unread post by Sob__ » 2016-02-15, 02:37

I wouldn't fear it a bit. For start, it's unlikely that anyone will type invalid address with port. If you need to specifically type port number, the ambiguity really is obvious. Even if it happens, chance that the wrong address would exist is very low. And if even that happened, it's not like the address can easily be owned by evil hacker located somewhere else, it would be in same subnet as correct one. I think my chance to win a lottery is higher. :)

BUT you're of course right, it would not be 100%.

I'd be all for a warning page instead of automatic rewrite, but it seems more complicated to me. Although it's just my wild guess, because I'm not familiar with PM code. It's possible that it could be actually easier. At least it would be more proper solution, that's for sure.

User avatar
Moonchild
Pale Moon guru
Pale Moon guru
Posts: 35650
Joined: 2011-08-28, 17:27
Location: Motala, SE

Re: Detect IPv6 not encapsulated in [] and correct them

Unread post by Moonchild » 2016-02-15, 15:19

Sob__ : there's a little kink in your logic in the patch, I think.

You're increasing the colons count whenever a colon is encountered, ignoring if a bracket was already found. That would mean it would throw "malformed" on e.g. [::1]:80 since it checks backwards, and in that case finds 3 colons -- but this is still a valid IPv6 address. I think you got lucky because net_IsValidIPv6Addr() expects only a raw address (without brackets or port) but this isn't guaranteed to remain this way, so let's try to avoid future pitfalls here. ;)

So, what's probably needed is checking for colons only if a bracket has not yet been found (scanning backwards), moving the colons++; inside the if (bracket == nullptr) conditional.
Let me know what you think.
"Sometimes, the best way to get what you want is to be a good person." -- Louis Rossmann
"Seek wisdom, not knowledge. Knowledge is of the past; wisdom is of the future." -- Native American proverb
"Linux makes everything difficult." -- Lyceus Anubite

User avatar
Sob__
Lunatic
Lunatic
Posts: 251
Joined: 2014-02-17, 01:12
Location: CZ

Re: Detect IPv6 not encapsulated in [] and correct them

Unread post by Sob__ » 2016-02-15, 18:51

Sure, there's IPv6 address in string "[::1]:80", but the same string as whole is not IPv6 address and never will. If you wanted to change what exactly net_IsValidIPv6Addr() does, you would need to check all places where it is called anyway, and you would include this one among the others. I still think it's perfectly safe.

Correctness and effectivity is a little different story. You're right again, looking for colons, even though it's already clear that input can't be the type of mistyped IPv6 address we're looking for, is wrong. And later unnecessary net_IsValidIPv6Addr() call even more. Your suggested fix is the way to go. Perhaps one extra sanity check can be added to skip things like "http://[::1]:1:2" (original code would catch it in if(colon) branch and would not let it go to net_IsValidIPv6Addr()):

Code: Select all

if (colons > 1 && !bracket && net_IsValidIPv6Addr(serverinfo, serverinfoLen)) {
So it looks like the easy part might be done, we can detect such addresses. Now what we can do with them and where. I tried to look, but I'm completely lost.

User avatar
Moonchild
Pale Moon guru
Pale Moon guru
Posts: 35650
Joined: 2011-08-28, 17:27
Location: Motala, SE

Re: Detect IPv6 not encapsulated in [] and correct them

Unread post by Moonchild » 2016-02-15, 19:55

You're right about the need to check all occurrences if the method of a function changes, of course, but as you said as well there's no real need to call the IPv6 checker when it's obviously not going to be a raw IPv6 address (it's a lot more complex than the code touched here, so no need to pass into it more than needed).

What to do with it? well, print a warning and return malformed URL is the right approach.

Your additional optimization makes sense, so putting it all together we have: Commit 94ee196
"Sometimes, the best way to get what you want is to be a good person." -- Louis Rossmann
"Seek wisdom, not knowledge. Knowledge is of the past; wisdom is of the future." -- Native American proverb
"Linux makes everything difficult." -- Lyceus Anubite

User avatar
Sob__
Lunatic
Lunatic
Posts: 251
Joined: 2014-02-17, 01:12
Location: CZ

Re: Detect IPv6 not encapsulated in [] and correct them

Unread post by Sob__ » 2016-02-15, 21:22

There's one little problem, this alone doesn't really change anything from user perspective, it still redirects to search for this kind of wrong addresses.

Previously the function also returned NS_ERROR_MALFORMED_URI, just a few lines later, when it tried to parse string containing supposed port (which contained everything from the first colon). That's what I meant by requiring some additional code to make it do something useful.

User avatar
Moonchild
Pale Moon guru
Pale Moon guru
Posts: 35650
Joined: 2011-08-28, 17:27
Location: Motala, SE

Re: Detect IPv6 not encapsulated in [] and correct them

Unread post by Moonchild » 2016-02-15, 22:52

Well, this is part 1 then ;)

EDIT: hints for where to go next for added code would be nsDocShell.cpp (for actually handling the code in ::EndPageLoad) and possibly changing the error code from malformed URI to either something existing that matches the type of error, or something that is defined in xpcom/base/ErrorList module NS_ERROR_MODULE_NETWORK but unused (there are a few) and reuse that with a custom error in the front-end (langpack work)

EDIT2: I juggled the repo for this, rolled back to before the change and created a new branch for this as IPv6-fixup-work with this commit as part1.
Trying to fix up the host/address might be simpler because it can pretty much remain contained within the parser (or at most the caller), but I'm not sure if just slapping it within brackets is a valid solution.
Also: what if someone wants to actually do a web search for a specific IPv6 address, and not bracketing it on purpose?

(all of this wouldn't be an issue if people actually didn't insist on this searching from the address bar nonsense and would use the search bar)
"Sometimes, the best way to get what you want is to be a good person." -- Louis Rossmann
"Seek wisdom, not knowledge. Knowledge is of the past; wisdom is of the future." -- Native American proverb
"Linux makes everything difficult." -- Lyceus Anubite

User avatar
Sob__
Lunatic
Lunatic
Posts: 251
Joined: 2014-02-17, 01:12
Location: CZ

Re: Detect IPv6 not encapsulated in [] and correct them

Unread post by Sob__ » 2016-02-16, 01:34

Moonchild wrote:..., but I'm not sure if just slapping it within brackets is a valid solution.
It is, if you are sure that user wanted to enter just address without port. Unfortunately, the answer is just "most likely yes", but you can't be sure (unless the last part contains some obviously hexadecimal number with a-f). If you want no guessing, it's not an option.
Moonchild wrote:Also: what if someone wants to actually do a web search for a specific IPv6 address, and not bracketing it on purpose?
Too bad. ;) I'm not a fan of address bar searching either. I mean, what if I want to search for "www.example.net" instead of opening it? That's why we have dedicated search bar, to make it clear to browser what it's expected to do.

I'll take another look at the code, but I'm starting to wonder if there even is good solution for this.

User avatar
Moonchild
Pale Moon guru
Pale Moon guru
Posts: 35650
Joined: 2011-08-28, 17:27
Location: Motala, SE

Re: Detect IPv6 not encapsulated in [] and correct them

Unread post by Moonchild » 2016-02-16, 01:57

A simple solution actually would be to set keyword.enabled to false if you want malformed URIs to actually throw an error instead of using search. But that obviously also disables all searching from the address bar. Unfortunately, that is what people want, ambiguous or not.

There is a complication with "slapping brackets on it" too: The way the data is passed around is in a data structure that is a raw pointer object (I think because the code is so old), they are not just strings or arrays you can manipulate. I'm also not entirely familiar with how these macros are used, and that will require research into this particular section of code.

Maybe it's better to do this at a higher level, closer to the front-end, if you want to "fix up" the URL; similar to fixups done for DNS errors for raw names (sticking .com and www. on them by default). That would be in the docshell.
"Sometimes, the best way to get what you want is to be a good person." -- Louis Rossmann
"Seek wisdom, not knowledge. Knowledge is of the past; wisdom is of the future." -- Native American proverb
"Linux makes everything difficult." -- Lyceus Anubite