
Last month, I discussed whether a problem that caused Office 365 users to be unable to authenticate provided any indication that Azure Active Directory (AAD) was proving to be an Achilles Heel for Microsoft’s cloud services. That outage affected users in Western Europe on December 3 and underlined the dependency that Office 365 has on other parts of Microsoft's cloud infrastructure.

On December 18, another issue surfaced that affected some Office 365 and Azure customers in Western Europe. The natural reaction of those who couldn’t connect to Office 365 was that the new problem was a rerun of the previous event, and that feeling was duly picked up by press commentary at the time. However, as it turns out, the root cause had absolutely nothing to do with AAD.

The December 18 incident was reasonably short-lived at 140 minutes (from 9:15AM UTC to 11:25AM UTC; the earlier incident lasted 316 minutes). All incidents are painful for those who are unable to work while a resolution is determined, but as I discuss later on, detecting and fixing a problem’s root cause within a complex infrastructure can take some time.

Both outages occurred during the morning peak in Western Europe. The biggest and most important difference between the two is how they compromised the ability of end users to work. The December 3 issue prevented many more end users from being able to access Office 365. On December 18, the effect was really only felt by those who wanted to log on to the Office 365 portal () to perform administrative tasks.

I asked Microsoft about the incident and received the Post-Incident Report (PIR reference MO36910 dated 30 December). If you’re a tenant that might have been affected by the incident, you can get a copy of the PIR through the 30-day history section of the Office 365 Service Health Dashboard (SHD).

“Affected users and administrators were unable to sign in to the Office 365 portal via or. End-user access to Outlook on the web, SharePoint Online, OneDrive for Business, and other Office 365 services was not affected by this event; however, affected users would have been unable to navigate to those services through the Office 365 portal.”

The point made here is that the majority of end users continued to work as normal because they use clients (like Outlook or a mobile device) that don’t go anywhere near the portal or use bookmarks (a direct URL such) to access web-based apps like Delve or the Office 365 Video Portal.

For example, if you type, Outlook Web App starts without any need to go near. I can’t think of why an end user would want to connect to the Office 365 portal en route to an application like those mentioned above, but I guess it’s possible and some do.

In any case, because administrators were affected, the problem was noticed quickly and reports flowed into Microsoft to ask what was happening. The PIR goes on to describe the root cause as:

“A code issue with a network interface driver caused intermittent packet loss to occur under certain conditions. As load increased, this resulted in latency and some loss of availability for infrastructure that hosts Office 365 portal services.”
