Know the limitations of Machine Learning
Representativeness of data
Carmakers are building fake cities, in Michigan for example, to test autonomous vehicles. What’s important, though, is whether the test data represents real-world driving conditions.
A highly autonomous vehicle is designed to operate only in a certain designated area such as “driving only in downtown Pittsburgh.” In DoT lingo, this concept is the “Operational Design Domain.” So the question to ask is: Does the data used for that autonomous driving truly represent that particular designated area?
“Over-fitting” is a well-known problem in Machine Learning. Over-fitting can occur when a model begins to “memorize” training data rather than “learning” to generalize from trends, Koopman explained. An over-trained machine can perfectly predict the training data, but it falters when attempting predictions about new or unseen data.
Regulators should be fully aware of this pitfall and at least raise questions about it.
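The gap between memorizing and generalizing can be illustrated with a small sketch. Everything in it is an assumption made for illustration, not something from Koopman's remarks: the data is synthetic (a made-up linear trend plus noise), the “memorizing” model simply returns the label of the nearest training point, and the comparison model is an ordinary least-squares line.

```python
import random

def make_data(rng, n):
    """Noisy samples of a hypothetical trend y = 2x + noise (illustrative only)."""
    xs = [rng.uniform(0.0, 10.0) for _ in range(n)]
    return [(x, 2.0 * x + rng.gauss(0.0, 1.0)) for x in xs]

def fit_line(data):
    """Ordinary least-squares fit of y = a*x + b; returns a predictor."""
    n = len(data)
    mean_x = sum(x for x, _ in data) / n
    mean_y = sum(y for _, y in data) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in data)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in data)
    a = sxy / sxx
    b = mean_y - a * mean_x
    return lambda x: a * x + b

def memorize(data):
    """'Model' that returns the label of the closest training point."""
    return lambda x: min(data, key=lambda p: abs(p[0] - x))[1]

def mse(model, data):
    """Mean squared error of a predictor on a data set."""
    return sum((model(x) - y) ** 2 for x, y in data) / len(data)

rng = random.Random(0)           # fixed seed so runs are repeatable
train = make_data(rng, 50)
test = make_data(rng, 50)        # unseen data drawn from the same trend

line = fit_line(train)
memo = memorize(train)

train_mse_memo = mse(memo, train)   # zero: every training point is recalled exactly
test_mse_memo = mse(memo, test)     # non-zero: recall does not transfer to new points

print("memorizer  train/test MSE:", train_mse_memo, test_mse_memo)
print("fitted line train/test MSE:", mse(line, train), mse(line, test))
```

The memorizer scores perfectly on the data it was trained on, and that score says nothing about how it behaves on unseen data. That is the reason to ask how a model was evaluated, not just how well it scored.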
Validation of testing environment
“You need to make sure simulation is mapped to the real world,” said Koopman. You need to examine whether the testing environment is representative, and whether the statistical analysis of validation via testing (including simulation) effectively measures the established safety goals.
Analysis of brittleness
Machine Learning is tough. When an ML-based system encounters something it has never seen before (known as “long tail” or “outlier”), “the ML-system can freak out,” said Koopman.
“When something really unusual happens, people – human drivers – would at least realize something unusual has happened,” said Koopman, and they might try to do something about it, successfully or unsuccessfully. In contrast, a machine might not register this extreme anomaly. It could just keep on going.
This is known as “brittleness” in ML terms. “When that happens, I want to see a plan,” said Koopman, as to how an autonomous vehicle should cope.
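One common way to give a system a chance of noticing that “something really unusual” has happened is to monitor how far an input lies from what was seen in training. The sketch below is a deliberately simple, hypothetical monitor over a single numeric feature. The class name `NoveltyMonitor`, the z-score rule, and the threshold of 4 standard deviations are all assumptions for illustration, and real perception inputs are far higher-dimensional.

```python
import statistics

class NoveltyMonitor:
    """Flags inputs that lie far from the training data (hypothetical sketch)."""

    def __init__(self, training_values, threshold=4.0):
        # Summarize what "normal" looked like during training.
        self.mean = statistics.mean(training_values)
        self.stdev = statistics.stdev(training_values)
        self.threshold = threshold  # assumed cutoff, in standard deviations

    def is_novel(self, value):
        """True when the value is more than `threshold` deviations from the mean."""
        z_score = abs(value - self.mean) / self.stdev
        return z_score > self.threshold

# Made-up training feature values clustered around 10.
monitor = NoveltyMonitor([9.0, 9.5, 10.0, 10.5, 11.0])

for reading in (10.2, 50.0):
    status = "unusual, trigger the plan" if monitor.is_novel(reading) else "within experience"
    print(reading, "->", status)
```

A monitor like this does not make the underlying model any less brittle. It only gives the system a signal that it has left familiar territory, which is the precondition for having a plan at all.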
‘Safe,’ ‘unsafe,’ ‘not sure’
When an autonomous vehicle is operating in conditions outside the intended ODD (Operational Design Domain), it has to recognize it’s outside its comfort zone. Such an adventure must be deemed invalid.
Koopman said, “I’m not saying that the autonomous vehicle operating outside its envelope is NOT safe, but being ‘unsafe’ and ‘not sure’ aren’t the same thing.”
Noting that there are only three situations – ‘safe,’ ‘unsafe’ and ‘not sure’ – he said, “Don’t tell me your autonomous vehicle is safe, when you haven’t done the engineering legwork to determine whether it is safe, or if you aren’t sure.”
If a robo-car ventures beyond its ODD, “First, I want to make sure that your vehicle knows about it. Second, I want to know what the strategy is when that happens. Will it shut down the system in an orderly manner, or will it do something else?”
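The distinction between ‘unsafe’ and ‘not sure’ can be made concrete with a three-valued assessment rather than a yes/no flag. The following is a minimal sketch under stated assumptions: the `Odd` class, its bounding-box-plus-speed-cap definition, the coordinates, and the response names are hypothetical, and treating “inside the ODD” as ‘safe’ presumes the engineering legwork for that ODD has actually been done.

```python
from dataclasses import dataclass
from enum import Enum

class Assessment(Enum):
    SAFE = "safe"
    UNSAFE = "unsafe"
    NOT_SURE = "not sure"

@dataclass(frozen=True)
class Odd:
    """Hypothetical ODD: a latitude/longitude box and a speed cap."""
    min_lat: float
    max_lat: float
    min_lon: float
    max_lon: float
    max_speed_kph: float

    def contains(self, lat, lon, speed_kph):
        return (self.min_lat <= lat <= self.max_lat
                and self.min_lon <= lon <= self.max_lon
                and speed_kph <= self.max_speed_kph)

def assess(odd, lat, lon, speed_kph, hazard_detected):
    """Three-way assessment: leaving the ODD means 'not sure', not 'unsafe'."""
    if hazard_detected:
        return Assessment.UNSAFE
    if not odd.contains(lat, lon, speed_kph):
        return Assessment.NOT_SURE   # outside the validated envelope
    return Assessment.SAFE           # assumes the ODD itself was validated

def plan(assessment):
    """Illustrative responses; an actual strategy is a design decision."""
    return {
        Assessment.SAFE: "continue",
        Assessment.NOT_SURE: "orderly shutdown",
        Assessment.UNSAFE: "emergency stop",
    }[assessment]

# Illustrative numbers only, not a real ODD definition.
odd = Odd(min_lat=40.43, max_lat=40.45, min_lon=-80.01, max_lon=-79.99,
          max_speed_kph=50.0)

inside = assess(odd, 40.44, -80.00, 30.0, hazard_detected=False)
outside = assess(odd, 41.00, -80.00, 30.0, hazard_detected=False)
print(inside.value, "->", plan(inside))
print(outside.value, "->", plan(outside))
```

The point of the separate `NOT_SURE` state is that it forces both of Koopman's questions into the design: the vehicle has to detect that it has left its ODD, and someone has to decide in advance what it does next.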
ISO 26262 is essential
In summing up improvements in the Feds’ safety assessment requirement, especially related to Machine Learning, Koopman stressed that his comments are not meant to say that the ISO 26262 standard does not apply.
He said, “Rather, it is essential that ISO 26262 style safety engineering be performed. Within that context, ML data sets either need to be credibly mapped into the standard’s framework, or something additional must be done beyond ISO 26262 for ML validation.”