About the use of system libraries

by Moonchild » 2020-01-29, 17:15

Since I've had to (try and) explain this several times to various people already, and it tends to lead to a lot of rhetoric about GNU policy and recommendations that might be decent guidelines for the common case (but can't be blindly applied to every software project), I'd like to spend some time here putting down, in statements that are as clear as possible, why we have always discouraged the use of critical system libraries as part of building Pale Moon, and why we have outright prevented it for officially branded builds (as opposed to private builds).

Primarily, the issue is that people who want to use this "prefer system libs" approach overlook one critical thing about our code base: It is extremely large and extremely complex.
While I wholly agree with the practice of building on system libraries if you supply small, simple or rarely-updated software that offloads critical functionality to third-party libraries, so that it automatically picks up improvements in those libraries as a convenience, it simply does not work for a project of our size.

Some math

One of the issues is the potentially different combinations of components that result from using system libraries. It's simply impossible, even for a large organization, to build and test software using all possible configurations that can result from building a code base as large as ours with system libraries that get called from many thousands of different locations in the code.
Let's take the following example for UNIX-like operating systems where this primarily applies:
There's the option to use system libraries for no fewer than 14 libraries, due to the size and complexity of the software and historical choices to allow it.
Assume there are on average 4 commonly-used versions available for each library.
Assume there are 20 different operating system flavors to check (to cover all major distributions of Linux and BSD; that's conservative).
That leads to (4^14)*20 = 5,368,709,120 possible configurations of the final software.
Oh, but it doesn't even end there. Especially on operating systems that build everything from source as opposed to binary package distribution, there's another factor involved: build configuration. Libraries can be built with various different build-time options, some of which would exclude components that would be necessary for our use of the library. This multiplies the number of available library versions by the number of different build configurations each library can have. Let's assume a conservative figure of 3 on/off build options for each library:
We now have ((4*(2^3))^14)*20 = 23,611,832,414,348,226,068,480 (23.6 sextillion, or about 2.36 x 10^22) possible configurations of the final software...
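
For anyone who wants to check the arithmetic or plug in their own estimates, here is a minimal sketch in Python. The constants are just the assumed figures from this post (14 libraries, 4 versions each, 20 OS flavors, 3 on/off build options); the function name is made up for illustration.

```python
# Assumed inputs: the rough estimates used in the post, not measured data.
LIBRARIES = 14         # libraries that can be swapped for system copies
VERSIONS_PER_LIB = 4   # commonly-used versions per library
OS_FLAVORS = 20        # Linux/BSD flavors to cover
BUILD_OPTIONS = 3      # on/off build-time options per library


def configurations(libraries, versions, os_flavors, build_options=0):
    """Count distinct configurations of the final software.

    Each library contributes versions * 2**build_options variants,
    and the choices are independent across libraries.
    """
    variants_per_lib = versions * 2 ** build_options
    return variants_per_lib ** libraries * os_flavors


versions_only = configurations(LIBRARIES, VERSIONS_PER_LIB, OS_FLAVORS)
with_options = configurations(LIBRARIES, VERSIONS_PER_LIB, OS_FLAVORS,
                              BUILD_OPTIONS)

print(f"{versions_only:,}")  # 5,368,709,120
print(f"{with_options:,}")   # 23,611,832,414,348,226,068,480
```

Changing any of the assumptions shifts the result, but because the number of libraries is an exponent, even much smaller per-library figures still give a count far beyond anything that can be tested.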

Even if that number is quite a bit off the mark, it's still a stupidly-large number, and I think you can see how this is an absolute nightmare to consider supporting. Any of these combinations that aren't tested (which is practically all of them) could have issues that are not the result of errors in our source code or even the code of the libraries, but e.g. merely an incompatibility issue. Bug reports would follow that have no clear cause or resolution, and the entire thing as a result becomes unmaintainable.

Some patching required

In addition to the math above, there is the fact that we (must) patch libraries for the functionality we need (a prime example is adding animated PNG support to libpng, but that is far from the only one) or to mitigate specific security issues a browser would face that local software does not (a browser has to process foreign content that is completely out of our control, and each bit of it can be malicious). Not to mention that some operating systems are equally forced to patch system libraries to make them usable for the O.S. and its locally-used tools, with patches that we do not have, know about or cater to.
All of this adds even more unknown factors to whether a library taken from the system will be, in every single way and in every single function, 100% compatible with what our software expects the library to do or return, which is absolutely essential.

"But it's a security nightmare"

(if you have jumped to this section before reading the previous part of the post, go back now and start from the top; you need that background)
This is the main argument proponents of system libraries always bring up. Unfortunately, the real security nightmare is using system libraries instead of in-tree libraries of known compatibility, configuration and version.

While it is absolutely true that a known vulnerability in a library would also be present in in-tree libraries, there is no immediate certainty of that vulnerability also manifesting in the final software. We may not even use the feature of the library that's vulnerable, for example. Or it may not be exposed to content. Or it's simply not exploitable at all in the way we use the library. And we are aware of security issues and update the software regularly. The only advantage system libraries would give here would be the potential (automatic) mitigation of a lib vulnerability (if it is even applicable) a little sooner because it won't require an update of the software that builds its own libraries as part of its source tree.

The other side of this argument is however much more severe. If there is a version conflict, build configuration problem or above all compatibility issue, it not only can, but absolutely will cause exploitable vulnerabilities in the final software product.
Let's take one of the easiest to exploit vulnerabilities as an example: type confusion. Our software makes a library call and expects a return value of a certain type. Library maintainers decided for whatever reason (maybe a security consideration of their own) to change the return type to something else, or maybe the o.s. maintainers decided to patch it or build it in a different configuration that is more secure in the scope of their distro. Your operating system dutifully updates the system library to the "more secure version", along with any system-supplied tools that need updating for this change. So far so good!
But here's the problem: Our software, without being in any way changed, has now become vulnerable by having a different type of data returned that may reliably crash it in an exploitable way. Especially if it's operating system specific, it would require our software to be patched specifically for that new system library to avoid these problems. Since it wouldn't be an issue anywhere else, the o.s. maintainers might not even be aware of the issue, and we would also not be aware of it because we use known-good versions. In fact we might not even be able to adopt a needed patch because it would break every other setup out there that doesn't use the exact same set of system libraries as the vulnerable configuration...
Then take that, and multiply it by the number of libraries and operating systems available.
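
To make the mismatch above concrete, here is a small sketch in Python. Everything in it is hypothetical: the two "library versions", their return layouts and the caller are invented for illustration and do not correspond to any real library. In Python the caller merely ends up with wrong values; in C or C++ code, reading data through the wrong type or layout like this is where memory-safety problems can come from.

```python
import struct

# Hypothetical library, version 1: returns a status code followed by a
# length, both as 32-bit unsigned integers (8 bytes in total).
def lib_v1_get_info():
    return struct.pack("<II", 0, 16)   # status=0 (ok), length=16

# Hypothetical library, version 2: the maintainers widened the length to
# 64 bits and moved it to the front. The function name did not change.
def lib_v2_get_info():
    return struct.pack("<QI", 16, 0)   # length=16, status=0

# The caller was written and tested against version 1 only.
def caller(get_info):
    raw = get_info()
    # Interprets the first 8 bytes using the version 1 layout.
    status, length = struct.unpack_from("<II", raw)
    return status, length

print(caller(lib_v1_get_info))  # (0, 16): what the caller expects
print(caller(lib_v2_get_info))  # (16, 0): same call, silently wrong data
```

Note that nothing in the caller changed between the two calls; only the library underneath it did.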

Now ask yourself: which of these two scenarios is the real security nightmare?

An analogy, and what this means for branding

To understand why we're refusing to allow official branding on configurations that make excessive use of system libraries for critical functions of the browser, I'm borrowing an analogy from Tobin here that, I think, perfectly illustrated the point:
New Tobin Paradigm wrote:Pale Moon as a product is the Codebase.. the entire codebase and the branding as determined by the creators of the product. Point 8 (sic: of the redistribution license) allows for code changes that allow a positive build but do not result in a materially different product. Substituting in-tree libs for system libs, beyond the potential mismatch and subtle and hard to reconcile conflicts with the glue code, materially changes the nature of the end result. It's like baking a cake and using margarine, powdered eggs, and powdered milk instead of real butter, real eggs, and real milk.

You get something kinda close but it isn't the same thing.. It is something.. OTHER..
Then slapping the Pale Moon sticker on it and passing it off as something it kinda isn't is what the issue is. People are gonna consume that modified cake and it JUST isn't gonna be what they were expecting and then they will come to us about it (sic: since it has our sticker on the packaging). It may even taste HORRIBLE instead of off and guess who is blamed for that.. "This cake tastes horrible! Don't eat any of it".

That is what it boils down to. They (sic: builders using system libraries) want to distribute our cake but then want the ability to change that cake how they see fit but still use our sticker.
The point is, by building the software using system libraries you are changing the software by including untested, different code supplied by third parties, often in unpredictable, detrimental and possibly unsafe and insecure ways.
To add what the GNU Project thinks of this kind of situation: it's clear that what we are asking of people who insist on using system libraries at their own risk (i.e. to not use our official branding and name) is very much in line with the philosophy behind free software:
https://www.gnu.org/philosophy/free-sw.html wrote:Rules about how to package a modified version are acceptable, if they don't substantively limit your freedom to release modified versions, or your freedom to make and use modified versions privately. Thus, it is acceptable for the license to require that you change the name of the modified version, remove a logo, or identify your modifications as yours. As long as these requirements are not so burdensome that they effectively hamper you from releasing your changes, they are acceptable; you're already making other changes to the program, so you won't have trouble making a few more.

In closing

I hope this clarifies the situation (at least somewhat). I'll be more than happy to answer any relevant, concrete and direct questions about this if anything isn't yet clear, but before you ask, please do read this post in its entirety and with attention.
