Outage post-mortem, and apologies


Post by Moonchild » 2022-03-21, 15:28

This post is meant to inform everyone, as clearly and unambiguously as possible, of what happened last weekend (March 19-20, 2022), what the consequences and current situation are, and how Pale Moon will move on. I'm also taking this time to apologize to everyone for various issues, the inconvenience caused, and my lack of judgement when dealing with people very unlike myself.

I'll try to summarize in a TL;DR at the end, but if you do have some time please read it in its entirety.

Tobin's exit, and resulting outages

On March 19th (all times UTC+1):
16:20-16:47 There was an escalation between myself and Tobin on IRC, primarily about issues with addons and language packs. After being told in no uncertain terms that I was "on my own" to solve all the issues, including being left with an incomplete addons site for Pale Moon 30, I left IRC.
17:05 Tobin shut down the add-ons server VPS without warning. This VPS was under his control under an agreement between the two of us that he would run one of the project's servers and pay for it, as "his donation to the project".
17:45 In an attempt to recover essential development data before it too was either removed or made inaccessible, I logged into the other server related to Pale Moon operations that Tobin ran, and attempted to create a backup of data. This process was terminated and my access revoked immediately.
18:01 Tobin posted his "exit rant" on the forum. It included a public posting of what was a private conversation in a non-public IRC channel, followed by an unverifiable monologue giving me "deadlines to respond". That monologue was supposedly directed at me, but I obviously never saw it because I was not logged in wherever he posted it.
In the exit rant post, he accused me of "trying to steal data" from "the BinOC server", which according to him had no essential data related to Pale Moon. However, it did. If it really hadn't been relevant to Pale Moon in any way, I obviously wouldn't have had access to it, nor would I have used such access.
18:32-18:48 I became aware of the addons server being down and of Tobin's ragequit post, and had an additional exchange on the forum.
18:50 Tobin was banned from the forum.
~19:00 I started revoking his access on all servers and services, a task that took the next hour, starting with the most pertinent areas where the most project damage could be done (the repository server).
19:11 While I was in the process of revoking Tobin's access (which he was obviously aware of, as I assume he was notified by e-mail or some other notification of the repo account status change), Tobin logged into one of the project's servers he still had access to. He was "just in time" by about 25 minutes, as I had not gotten to that server yet, but he was clearly exceeding his authorization at that time. This server was used as the DNS server for palemoon.org as well as a number of other, private domains (including one in use by the University of Mexico study groups). He proceeded to elevate to root privileges, delete all zones and configuration on it, and shut the DNS service down. Once DNS stopped resolving, this caused the overall outage of the domains previously served (within a 1-2 hour time frame for things to stop resolving). I was unaware at the time that he had done this. With him having killed resolution at his DNS server, and now at Pale Moon's end also, the domains would no longer resolve. The relevant log excerpts follow; their timestamps are one hour behind the UTC+1 times used in this timeline.
Mar 19 18:11:16 de2 sshd[20651]: Accepted password for mattatobin from 209.222.17.70 port 38482 ssh2
Mar 19 18:11:16 de2 sshd[20651]: pam_unix(sshd:session): session opened for user mattatobin by (uid=0)
Mar 19 18:11:23 de2 sudo: mattatobin : TTY=pts/0 ; PWD=/home/mattatobin ; USER=root ; COMMAND=/bin/bash -c su -p
Mar 19 18:11:23 de2 sudo: pam_unix(sudo-i:session): session opened for user root by mattatobin(uid=0)
Mar 19 18:11:24 de2 su: pam_unix(su:session): session opened for user root by mattatobin(uid=0)
Mar 19 18:12:02 de2 polkitd[507]: Registered Authentication Agent for unix-process:20760:438299705 (system bus name :1.150178 [/usr/bin/pkttyagent --notify-fd 5 --fallback], object path /org/freedesktop/PolicyKit1/AuthenticationAgent, locale en_GB.UTF-8)
Mar 19 18:12:05 de2 polkitd[507]: Unregistered Authentication Agent for unix-process:20760:438299705 (system bus name :1.150178, object path /org/freedesktop/PolicyKit1/AuthenticationAgent, locale en_GB.UTF-8) (disconnected from bus)
Mar 19 18:12:06 de2 su: pam_unix(su:session): session closed for user root
Mar 19 18:12:06 de2 sudo: pam_unix(sudo-i:session): session closed for user root
Mar 19 18:12:08 de2 sshd[20654]: Received disconnect from 209.222.17.70 port 38482:11: disconnected by user
Mar 19 18:12:08 de2 sshd[20654]: Disconnected from 209.222.17.70 port 38482
Mar 19 18:12:08 de2 sshd[20651]: pam_unix(sshd:session): session closed for user mattatobin

Mar 19 18:11:16 de2 systemd-logind: New session 75073 of user mattatobin.
Mar 19 18:11:24 de2 su: (to root) mattatobin on pts/0
Mar 19 18:12:02 de2 systemd: Stopping Berkeley Internet Name Domain (DNS)...
Mar 19 18:12:03 de2 named[815]: received control channel command 'stop'
Mar 19 18:12:03 de2 named[815]: shutting down: flushing changes
Mar 19 18:12:03 de2 named[815]: stopping command channel on 127.0.0.1#953
Mar 19 18:12:03 de2 named[815]: stopping command channel on ::1#953
Mar 19 18:12:03 de2 named[815]: no longer listening on ::#53
Mar 19 18:12:03 de2 named[815]: no longer listening on 127.0.0.1#53
Mar 19 18:12:03 de2 named[815]: no longer listening on 80.255.7.132#53
Mar 19 18:12:03 de2 named[815]: no longer listening on ::1#53
Mar 19 18:12:05 de2 named[815]: exiting
Mar 19 18:12:05 de2 systemd: Stopped Berkeley Internet Name Domain (DNS).
Mar 19 18:12:08 de2 systemd-logind: Removed session 75073.
~19:45 Barely after finishing the account control tasks, I lost access to literally everything because all the domains and services I controlled were no longer resolving. I had to figure out what he had done, regain access to servers by raw IP, reinstate mail delivery, restore DNS records from a backup I happened to have, and transition resolvers to a 3rd party as soon as possible, just in case he had also renamed his NS records to break the link with my registrar's data.
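
For anyone who wants to check the timing in the log excerpts above for themselves, here is a minimal sketch in Python of how such lines can be parsed. It is an illustration only, not a tool that was used during the incident. The function names `parse_line` and `session_window` and the `year` parameter are assumptions made for this example (classic syslog lines carry no year), and month names are assumed to be parsed under an English locale.

```python
import re
from datetime import datetime

# Classic syslog layout: "Mar 19 18:11:16 host process[pid]: message"
# (the "[pid]" part is optional, e.g. for sudo and su lines).
LINE_RE = re.compile(
    r"^(?P<ts>[A-Z][a-z]{2}\s+\d{1,2} \d{2}:\d{2}:\d{2}) "
    r"(?P<host>\S+) "
    r"(?P<proc>[^\s:\[]+)(?:\[(?P<pid>\d+)\])?: "
    r"(?P<msg>.*)$"
)


def parse_line(line, year=2022):
    """Split one syslog line into its fields; return None if it does not match."""
    m = LINE_RE.match(line.strip())
    if m is None:
        return None
    # The year is supplied by the caller because the log line does not contain one.
    ts = datetime.strptime(f"{year} {m['ts']}", "%Y %b %d %H:%M:%S")
    return {"time": ts, "host": m["host"], "proc": m["proc"],
            "pid": m["pid"], "msg": m["msg"]}


def session_window(lines, user):
    """Return (opened, closed) times of the first sshd session for `user`."""
    opened = closed = None
    for line in lines:
        ev = parse_line(line)
        if ev is None or ev["proc"] != "sshd":
            continue
        if opened is None and f"session opened for user {user}" in ev["msg"]:
            opened = ev["time"]
        elif opened is not None and f"session closed for user {user}" in ev["msg"]:
            closed = ev["time"]
            break
    return opened, closed


# A few lines copied verbatim from the excerpts above.
SAMPLE = [
    "Mar 19 18:11:16 de2 sshd[20651]: Accepted password for mattatobin from 209.222.17.70 port 38482 ssh2",
    "Mar 19 18:11:16 de2 sshd[20651]: pam_unix(sshd:session): session opened for user mattatobin by (uid=0)",
    "Mar 19 18:12:03 de2 named[815]: received control channel command 'stop'",
    "Mar 19 18:12:08 de2 sshd[20651]: pam_unix(sshd:session): session closed for user mattatobin",
]

opened, closed = session_window(SAMPLE, "mattatobin")
print(f"sshd session lasted {(closed - opened).total_seconds():.0f} seconds")
```

Run against the sample lines, this reports a session of 52 seconds (18:11:16 to 18:12:08 in the server's log time), within which the `named` stop command was received.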

Who is this Tobin guy anyway?

Tobin (Matt A. Tobin) has been an integral part of the Pale Moon project for a good number of years and has been invaluable for a number of transitions that would otherwise have taken way longer or have had Pale Moon take a different development path with less independence. This cooperation was never very smooth. Tobin had a history with SeaMonkey, and his past experiences/traumas from there (and the resistance to change in the SM council) made him very forceful on regular occasions. Curbing that required a lot of extra effort, which I simply did not always have the will or energy for. The resulting development of Pale Moon was as much a compromise to his wishes as it was what I saw Pale Moon becoming. As a result, a number of less favourable decisions were made that were received by users with mixed results.
In general, Tobin has been at the forefront of user interactions a lot. I very regularly had to temper those interactions but was apparently unable to do so properly (one of my apologies here is for that) because, honestly, I'm just one person and can't really do everything, especially if part of it is having to direct other members of my small team into behaving better.

I also should have seen the signs of his deteriorating state of mind, no doubt aggravated by the recent death of his father, a stressful move to a different state, and obviously "putting way too much hay on his fork" in terms of development for Pale Moon. Unfortunately I did not read the signs very well, leaving us with this escalation of events.

Consequences for Pale Moon

The consequences for Pale Moon with his damaging exit and yanking of the plug of the addons server are pretty major, because of the changes made in the new milestone as well as his unilateral decision to completely rewrite the addons site at the same time.

As a result, the current situation is:
  • There is no site with current-version addons, and the software written for it is wholly incomplete
  • There is no site with previous version addons because a new server needs to be set up
  • We don't have streamlined updates for extensions, language packs or themes
  • The development decisions made around add-ons were, as self-admitted by Tobin, an attempt to manipulate the users into having to make extensions Pale Moon exclusive regardless of the change to the Firefox GUID. I will be working on undoing that mistake and restoring the previous flexibility, with both Pale Moon's own GUID and the Firefox GUID accepted for installation.
  • Since this escalation happened shortly after a new milestone release, which is always a very critical time in need of many updates and fixes, it was a major blow for operations. (A milestone release is never perfect, which I was blamed for personally, even though I needed to release to address a critical security flaw that was being used in the wild.)
I'm now left with difficult decisions to make, weighing short term fixes against big picture development.
This will mean a rollback of the milestone and security update to 29.4 as I'm not confident this can be solved immediately on v30 in a satisfactory way.

Apologies

At this point I want to sincerely apologize for Tobin's behaviour in the past and my failure to recognize how much of a danger he has been to Pale Moon in terms of project health, progression and direction towards becoming a truly independent browser. I equally apologize for not listening more closely to the users who did come forth and express genuine concern about his participation in the project. You were all correct, and I should have been more resolute in dealing with team members regardless of their technical value to the project (which in his case was immense).
I also apologize to all extension developers for the massive confusion around compatibility and any additional work this has caused and is causing. It really was not intentional from my end. I thought I was doing right with my course change but apparently I got knocked off-course while I was correcting it.

Thank you

A big thanks to everyone who has spoken up in support of Pale Moon in this troubling time, and to those who have come forth with donations to get things back up and running again (you know who you are!). Now more than ever, Pale Moon could use your support in that respect to get back on track.

TL;DR

To recap, we've had a severe outage due to foul play from one of the core developers abusing his trust and access to knock domain name resolution offline. We are also entirely without an add-ons site which means extensions, themes and language packs aren't available as normal. I'm doing my best to find a proper resolution for this and have halted the rollout for v30 for the moment.