Assessing the Limits to Black Box Testing in Light of Meltdown & Spectre

If you’re like me, you’re trying to keep up with the news surrounding the Meltdown and Spectre security flaws. It’s fascinating. But instead of recapping what went wrong, I want to discuss a fresh idea about how companies can better test their systems to prevent anything like this from happening in the first place.

It isn’t that I’m against black box testing. (That’s a method that examines a system’s functionality without looking at its internal workings.) But I’ve never believed it’s a great way to catch the really critical types of errors. It’s better suited for the simple stuff: putting a predetermined value into the system and expecting a known value to come back out is fairly painless.
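
To make that concrete, here is a minimal sketch of what I mean by "value in, value out." The add() function and its test values are made up for illustration; the point is that the test knows only the contract, not the implementation.

    /*
     * A minimal black-box check, assuming a hypothetical function add()
     * whose contract is "return the sum of two ints." The test knows
     * inputs and expected outputs, nothing about how add() works inside.
     */
    #include <assert.h>

    /* System under test -- a stand-in implementation for this sketch. */
    static int add(int a, int b) { return a + b; }

    int main(void) {
        assert(add(2, 3) == 5);    /* predetermined value in, expected value out */
        assert(add(-1, 1) == 0);
        assert(add(0, 0) == 0);
        return 0;                  /* reaching here means the checks passed */
    }

Tests like this are cheap to write and easy to automate, which is exactly why they’re so common.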

I’ve been lucky enough to work with great developers over the years, and this kind of testing isn’t usually an issue for them. Some companies only do black box testing as a matter of principle. Others don’t include their quality assurance (QA) teams in architecture and development meetings, so by design QA really doesn’t understand how the systems work.

In light of that, here is a thought experiment.

What type of testing would have caught recent critical errors like Meltdown and Spectre? What type of skills would be necessary? What type of knowledge would have been required?

Linus Torvalds, the creator of Linux, has become heavily involved in this conversation--very publicly too, might I add. My point, though? If you’re working on these bugs, it’s a pretty strong indication that you are (or should be) an expert in how the system works. As the world scrambles to fix an Intel issue that has been around as long as modern-day computer chips, you have to wonder what other flaws are out there.

To find these types of bugs, you have to really know how the system works and actively test for a very specific issue. Otherwise, folks like Linus, Microsoft, and Google would have found them years ago, given that they have people specifically looking for issues like this. The attack vector has been known for a while, so this isn’t some unknown type of bug.
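
Here is a rough sketch of the kind of test I mean--not an exploit, just the cache-timing measurement that Meltdown- and Spectre-style checks build on. The x86/GCC intrinsics, the buffer size, and the names are my assumptions for illustration, not anything from the actual proof-of-concept code.

    /*
     * A minimal sketch: flush a cache line, time a read, then compare it
     * with a read that hits the cache. Assumes an x86 CPU and GCC/Clang
     * for the intrinsics; buffer size and names are illustrative.
     */
    #include <stdint.h>
    #include <stdio.h>
    #include <x86intrin.h>   /* _mm_clflush, __rdtscp */

    static uint8_t probe[4096];

    /* Time a single read of *addr in CPU cycles. */
    static uint64_t time_read(volatile uint8_t *addr) {
        unsigned aux;
        uint64_t start = __rdtscp(&aux);
        (void)*addr;                      /* the access being timed */
        uint64_t end = __rdtscp(&aux);
        return end - start;
    }

    int main(void) {
        _mm_clflush(probe);               /* evict the line: next read is slow (DRAM) */
        uint64_t cold = time_read(probe);
        uint64_t warm = time_read(probe); /* line is now cached: this read is fast */

        printf("cold read: %llu cycles, warm read: %llu cycles\n",
               (unsigned long long)cold, (unsigned long long)warm);
        /* The cold/warm gap is the side channel: by observing which lines
           are "warm," a tester -- or an attacker -- can infer what the CPU
           touched speculatively. */
        return 0;
    }

No black box test plan would ever tell you to write that measurement. You only get there by understanding caches, flush instructions, cycle counters, and speculative execution--exactly the internal knowledge black box testing sets aside.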

We’re starting 2018 a little slower: because the fix has to happen in software, at the operating system level, it will slow computers down by roughly 5-30%. If your company runs in the cloud, that could cost you 5-30% more (although that isn’t strictly true, since most companies don’t run their systems at capacity).

Because of this, I challenge the tech world to accept that knowing how to test a system really comes from understanding that system--and therefore understanding what could go wrong.