
Advances in computer vision and machine learning enable a wide range of technologies to perform sophisticated tasks with little or no human supervision. From autonomous drones and self-driving vehicles to medical imaging and product development, many computer applications and robots use visual information to make critical decisions. Cities increasingly rely on these automated technologies for public safety and infrastructure maintenance.
However, compared to humans, computers see with a kind of tunnel vision that leaves them vulnerable to attacks with potentially disastrous consequences. A human driver who sees graffiti covering a stop sign, for example, will still recognize it and stop the car at the intersection. The same graffiti, however, can cause a self-driving car to miss the stop sign and plow through the intersection. And while human minds can filter out all sorts of unusual or extraneous visual information when making a decision, computers get tripped up by small deviations from the data they expect.
This is because the brain is immensely complex and can process vast amounts of data and past experience simultaneously to arrive at nearly instantaneous decisions appropriate to the situation. Computers, by contrast, rely on mathematical algorithms trained on datasets. Their creativity and knowledge are constrained by the limits of the technology, the mathematics, and the foresight of the humans who design them.
Malicious actors can exploit this vulnerability by changing how a computer sees an object, either by altering the object itself or by tampering with some aspect of the software involved in the vision technology. Other attacks manipulate the decisions the computer makes about what it sees. Either approach can spell disaster for individuals, cities, or companies.
A team of researchers at the Bourns College of Engineering at UC Riverside is working on ways to thwart attacks on computer vision systems. To do that, Salman Asif, Srikanth Krishnamurthy, Amit Roy-Chowdhury, and Chengyu Song first had to determine which attacks would work.
“People want to do these attacks because there are so many places where machines are interpreting data to make decisions,” said Roy-Chowdhury, principal investigator on a recently concluded DARPA AI Explorations program called Techniques for Machine Vision Disruption. “It may be in an adversary’s interest to manipulate the data on which the machine is making a decision. How can an adversary attack a stream of data so that the decisions are wrong?”
An adversary might inject malware into the software of a self-driving vehicle, for example, so that the data coming from the camera is slightly perturbed. As a result, the models installed to detect pedestrians fail, and the system hallucinates an object that isn’t there or misses one that is. Understanding how to mount effective attacks helps researchers design better defense mechanisms.
“We look at how to distort an image so that, when it is analyzed by a machine learning system, it is misclassified,” Roy-Chowdhury said. “There are two main ways to do this: deepfakes, where a person’s face or facial expression in a video is altered to deceive a human viewer, and adversarial attacks, where an attacker manipulates how the machine makes a decision while a person usually notices nothing wrong. The idea is that you make a tiny change to an image that a human can’t perceive but that causes an automated system to make mistakes.”
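The basic mechanics of such an adversarial perturbation can be illustrated with a small sketch. The example below uses a generic gradient-based method (the fast gradient sign method), not the UC Riverside team’s own technique, and assumes a standard PyTorch image classifier and an illustrative perturbation budget:

```python
# Minimal sketch of an untargeted adversarial perturbation (FGSM-style).
# Illustrative only -- this is a generic textbook method, not the researchers'
# approach, and the model and epsilon are assumptions for demonstration.
import torch
import torch.nn.functional as F
from torchvision import models

model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT).eval()

def fgsm_perturb(image, true_label, epsilon=2 / 255):
    """Nudge each pixel slightly so the classifier is more likely to err.

    image:      tensor of shape (1, 3, H, W), values in [0, 1]
    true_label: tensor of shape (1,), the correct class index
    epsilon:    maximum per-pixel change, small enough to be imperceptible
    """
    image = image.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(image), true_label)
    loss.backward()
    # Step each pixel in the direction that increases the loss, then clamp.
    adversarial = image + epsilon * image.grad.sign()
    return adversarial.clamp(0, 1).detach()

# Example: perturb a random image "labeled" as class 0 (for illustration only).
x = torch.rand(1, 3, 224, 224)
x_adv = fgsm_perturb(x, torch.tensor([0]))
```

To a person, the perturbed image looks identical to the original; to the classifier, the carefully chosen pixel changes can push it toward the wrong answer.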
Roy-Chowdhury, his colleagues, and their students found that most existing attack mechanisms target the misclassification of specific objects and activities. However, most scenes contain many objects, and there are usually relationships among the objects in a scene, meaning that some things occur together more often than others.
People who study computer vision call this co-occurrence “context.” The group demonstrated how to design context-aware attacks that change the relationships between objects in a scene.
“For example, a table and a chair are often seen together, but a tiger and a chair are rarely seen together. We want to manipulate all of this,” Roy-Chowdhury said. “You could change the stop sign to a speed limit sign and remove the crosswalk. But if you replace the stop sign with a speed limit sign and leave the crosswalk, the computer in a self-driving car may still recognize the situation as one where it needs to stop.”
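To make the co-occurrence idea concrete, here is a hypothetical sketch of how detected labels in a scene could be scored against learned co-occurrence statistics, either by a defender checking a scene’s consistency or by an attacker deciding which objects must be altered together. The label pairs, scores, and threshold are made up for illustration and are not the group’s actual context graph:

```python
# Hypothetical co-occurrence ("context") check. The scores below are invented;
# in practice they would be estimated from how often labels appear together
# in training data.
from itertools import combinations

CO_OCCURRENCE = {
    frozenset({"table", "chair"}): 0.92,
    frozenset({"stop sign", "crosswalk"}): 0.85,
    frozenset({"tiger", "chair"}): 0.01,
}

def scene_is_consistent(detected_labels, threshold=0.05):
    """Return False if any pair of detected objects is an unlikely combination."""
    for a, b in combinations(set(detected_labels), 2):
        score = CO_OCCURRENCE.get(frozenset({a, b}), 0.5)  # unknown pairs: neutral
        if score < threshold:
            return False
    return True

print(scene_is_consistent(["table", "chair"]))   # True: a plausible scene
print(scene_is_consistent(["tiger", "chair"]))   # False: a suspicious scene
```

A context-aware attack, in this framing, is one that changes objects in a way that keeps such consistency checks satisfied.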
Earlier this year, at the Association for the Advancement of Artificial Intelligence conference, the researchers showed that manipulating just one object is not enough to trick a machine into making a wrong decision. The group developed a strategy for crafting adversarial attacks that change multiple objects in a scene simultaneously and consistently.
“Our main insight is that successful transfer attacks require holistic scene manipulation. We learned a context graph to guide our algorithm on which objects should be targeted to deceive the victim model, while maintaining the overall context of the scene,” Asif said.
In a paper presented this week at the Conference on Computer Vision and Pattern Recognition, the researchers, along with their collaborators at PARC, a research division of Xerox, built further on this concept and proposed an approach in which the attacker has no access to the victim’s computer system. This matters because each access attempt increases the attacker’s risk of being discovered by the victim, who can then mount a defense. The most successful attacks are therefore likely to be those that never probe the victim’s system at all, and it is important to anticipate and design defenses against these “zero-query” attacks.
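Conceptually, a zero-query transfer attack crafts the perturbation entirely on a surrogate model the attacker controls and only then presents the finished image to the victim. The sketch below, which uses two off-the-shelf torchvision classifiers standing in for surrogate and victim, illustrates that separation rather than the specific method in the CVPR paper:

```python
# Sketch of the zero-query transfer setting: all crafting uses the surrogate's
# gradients; the victim is never queried during crafting. Models chosen here
# are arbitrary stand-ins, not the systems studied in the paper.
import torch
import torch.nn.functional as F
from torchvision import models

surrogate = models.resnet18(weights=models.ResNet18_Weights.DEFAULT).eval()
victim = models.mobilenet_v3_small(
    weights=models.MobileNet_V3_Small_Weights.DEFAULT
).eval()

def craft_on_surrogate(image, true_label, epsilon=2 / 255):
    """Craft the adversarial image using only the surrogate model."""
    image = image.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(surrogate(image), true_label)
    loss.backward()
    return (image + epsilon * image.grad.sign()).clamp(0, 1).detach()

x = torch.rand(1, 3, 224, 224)
adv = craft_on_surrogate(x, torch.tensor([0]))
# Only now does the victim ever see the image -- zero queries were spent crafting it.
victim_prediction = victim(adv).argmax(dim=1)
```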
Last year, the same group of researchers exploited temporal context relationships to launch attacks against video sequences. They used geometric transformations to design highly query-efficient attacks on video classification systems. The algorithm produces successful perturbations with surprisingly few queries; for example, adversarial examples generated with this technique achieve better attack success rates while making 73% fewer queries than state-of-the-art methods for adversarial attacks on video. This allows faster attacks with far less scrutiny of the victim’s system. The paper was presented at the major machine learning conference Neural Information Processing Systems 2021.
The fact that context-aware adversarial attacks are more powerful on natural images with multiple objects than existing attacks, which mostly focus on images with a single dominant object, opens the way to more effective defenses. Such defenses could take into account the contextual relationships between objects in an image, or even between objects in a scene viewed by multiple cameras. This holds the potential for the development of far more secure systems in the future.
Crossposted from UC Riverside