When Xinchao Wang, assistant professor of computer science at Stevens Institute of Technology, started his Ph.D. studies at the École Polytechnique Fédérale de Lausanne in Switzerland nearly a decade ago, he wanted to translate his longtime passion for artificial intelligence (AI) into application-based computer vision research.
“My friends told me I’d never get a job,” Wang recalls.
Undeterred, he began tracking and analyzing moving people and objects over time using strategically placed cameras and complex calculations. And then a real-life application bounced into his court.
A Slam Dunk with the International Basketball Federation
“[Lausanne] is close to Geneva, home of the International Basketball Federation (FIBA),” Wang explains. “My Ph.D. supervisor got a project with FIBA, and we designed and prototyped a way to use four fixed cameras, one at each corner of the court, to track both the movement of the players and the trajectory of the ball, and then to almost instantaneously analyze their interactions. Before, people would sit on the court and manually record what was happening. Now, with AI and computer vision algorithms, it’s fully automatic and free of human error.”
Wang helped create the mathematical systems that map out the area on a grid, track movements through that grid, and quantify precise movements down to the angle and force used to shoot or pass the ball. The resulting images and statistics help coaches and players evaluate the strengths, weaknesses, and tactics of players and teams.
His advisor sold the prototype to a company in Los Angeles, and now it’s the foundation of the system used across the National Basketball Association (NBA).
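To make the grid idea concrete, here is a minimal sketch of how tracked positions might be mapped onto a court grid and counted per cell. It is an illustration only, not the FIBA prototype: the 1-meter cell size and the names `to_cell` and `occupancy` are assumptions, and the only outside fact used is the standard FIBA court size of 28 by 15 meters.

```python
from collections import Counter

# Standard FIBA court dimensions in meters. The 1 m cell size is an
# assumption for illustration; the article does not give the real resolution.
COURT_LENGTH_M = 28.0
COURT_WIDTH_M = 15.0
CELL_SIZE_M = 1.0


def to_cell(x_m, y_m, cell_size=CELL_SIZE_M):
    """Map a court position in meters to an integer (column, row) grid cell."""
    if not (0.0 <= x_m <= COURT_LENGTH_M and 0.0 <= y_m <= COURT_WIDTH_M):
        raise ValueError("position is outside the court")
    max_col = int(COURT_LENGTH_M / cell_size) - 1
    max_row = int(COURT_WIDTH_M / cell_size) - 1
    # Clamp so a point exactly on the far boundary lands in the last cell.
    col = min(int(x_m / cell_size), max_col)
    row = min(int(y_m / cell_size), max_row)
    return (col, row)


def occupancy(track):
    """Count how many frames a tracked player spends in each grid cell."""
    return Counter(to_cell(x, y) for x, y in track)
```

Given one `(x, y)` position per video frame, `occupancy` returns a per-cell frame count, the kind of raw tally from which a heat map of a player's movement could be drawn.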
Clarifying the Alphabet Soup of AI
Stevens researchers, including Wang, are on the leading edge of artificial intelligence and its subsets of machine learning, deep learning, and computer vision. Though the terms may sound vague to the rest of us, Wang is crystal-clear on what they mean and why they matter.
- Artificial intelligence: “Making a system that can behave and act in a smart way, like a human.”
- Computer vision: “Using AI and digital images to train computers to interpret and understand things that can be seen, such as basketball players and basketballs, just as human eyes do. It’s application-based.”
- Machine learning: “A subset of AI. Asking the machine to apply training and knowledge to perform a task based on patterns and inference rather than full instructions. Success depends on defining optimal theories, models, approaches, and methodologies to be used in applications such as computer vision. It’s more theoretical—you deal with probability and matrices.”
- Deep learning: “A subset of machine learning. Feeding data into the machine, like a human brain, to help it complete tasks like classifying images such as a cat or a dog or a person or a street, then inferring distances and other data, such as the distance of a pedestrian from another point. It’s the kind of theoretical modeling and analysis that allows AI to achieve goals such as autonomous cars.”
Picture This – A Future in AI
Wang’s expertise in this specialized corner of STEM grew out of his longtime passion for science fiction.
He was born in China, where he completed his education through high school and became fascinated by movies such as E.T. the Extra-Terrestrial. When he was just nine years old, he began studying programming using the Logo language. “We programmed to move a pointer up, down, left, and right, so as to make digital drawings,” he says.
Majoring in computer science was a natural progression, and he earned his bachelor’s degree, with honors, from The Hong Kong Polytechnic University. For his senior studies, he worked with Dacheng Tao, now a professor of computer science at the University of Sydney in Australia. “He led me to AI research,” Wang says. “We worked on facial recognition. Now your smartphone can instantly recognize your face, but back in 2008, the model took a lot of time to run, and the accuracy wasn’t very good.” While still an undergraduate, Wang published his first paper, on facial recognition, with Tao and decided to continue in the AI field.
“AI wasn’t popular because it couldn’t yet be commercialized and applied to real-world problems,” Wang notes. “People said I wouldn’t get a job after graduation, but I was very interested in it. I love that AI has two aspects. The first is like most engineering problems; you’re building something to solve a task, such as making a car or writing a code, that no one has ever solved before. But unlike many engineering problems, the second aspect is about understanding how the human brain works so we can build those smarter machines. For example, if we understand and apply how our brain uses our stereo vision for depth perception, then we can use two fixed cameras looking at the same thing to reconstruct the 3D coordinates of the object, mimicking human eyes.”
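The two-camera idea Wang describes can be illustrated with the textbook stereo triangulation formula. The sketch below is not his method; it assumes an idealized setup in which both cameras are identical pinhole cameras, mounted side by side and rectified so the same point appears on the same image row in both views. The function name `triangulate` and all parameter names are hypothetical.

```python
def triangulate(u_left, u_right, v, focal_px, baseline_m, cx, cy):
    """Recover the 3D position of a point seen by two rectified cameras.

    u_left, u_right: horizontal pixel coordinate of the point in each image
    v:               vertical pixel coordinate (the same in both images)
    focal_px:        focal length in pixels (assumed equal for both cameras)
    baseline_m:      distance between the two camera centers in meters
    cx, cy:          pixel coordinates of the image center (principal point)

    Returns (x, y, z) in meters, in the left camera's coordinate frame.
    """
    # Disparity: how far the point shifts between the two views.
    # Nearby objects shift more than distant ones, as with human eyes.
    disparity = u_left - u_right
    if disparity <= 0:
        raise ValueError("disparity must be positive for a point in front of the cameras")
    z = focal_px * baseline_m / disparity  # depth along the viewing direction
    x = (u_left - cx) * z / focal_px       # offset to the right of the left camera
    y = (v - cy) * z / focal_px            # offset below the left camera
    return (x, y, z)
```

With a focal length of 800 pixels and cameras half a meter apart, a point that shifts 40 pixels between the two images works out to a depth of 800 × 0.5 / 40 = 10 meters.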
After completing his doctorate in Lausanne in 2015, Wang became a postdoctoral fellow at the University of Illinois at Urbana-Champaign, where he began using machine learning to adjust the quality and size of images and videos. “With AI techniques, we can convert a small, low-resolution image to a larger, higher-quality image,” he explains. “We can also use AI to adjust a large image or video file to reduce the traffic load of a network.”
In 2019 alone, Wang has published 18 papers in peer-reviewed journals.
Strike a Pose for Safety
Since coming to Stevens in 2017, Wang has worked on a variety of projects, including leveraging computer vision for 3D human pose estimation.
“We can use a camera to estimate a person’s joints and construct a 3D skeleton to know whether he’s raising his hand or opening his arms, whether he’s lying down or jumping, and other details through time,” Wang notes.
While this is fascinating technology on its own, training the system to tell harmless behavior from criminal acts yields potentially life-saving safety applications. “If we install cameras in a parking lot, we can follow the trajectory of people and cars, and use deep learning to understand, analyze, and predict behavior in real time,” Wang explains. “The system could alert security to suspicious events, such as a person sitting inside a car or walking around cars without entering them. In an airport or train or bus station, it could detect when someone puts down baggage and walks away, allowing authorities to react more quickly to possible threats. Of course, it can also be used for other more mundane tasks, such as analyzing consumer shopping behavior for marketing purposes.”
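The unattended-baggage scenario can be sketched with a simple hand-written rule over tracked positions. This is a deliberate simplification: Wang describes deep learning systems, whereas the example below swaps in a fixed distance threshold to show the kind of decision such a system has to make. The function name `abandoned_at`, the 3-meter distance, and the frame count are all assumptions for illustration.

```python
import math


def abandoned_at(bag_track, owner_track, max_distance_m=3.0, min_frames=3):
    """Return the first frame index at which the bag counts as abandoned.

    bag_track and owner_track hold one (x, y) position in meters per frame.
    The bag counts as abandoned once its owner has stayed farther than
    max_distance_m away for min_frames consecutive frames. Returns None
    if that never happens.
    """
    frames_apart = 0
    for frame, (bag, owner) in enumerate(zip(bag_track, owner_track)):
        if math.dist(bag, owner) > max_distance_m:
            frames_apart += 1
            if frames_apart >= min_frames:
                return frame
        else:
            # The owner came back within range, so start counting again.
            frames_apart = 0
    return None
```

Requiring several consecutive frames keeps a brief step away from the bag from triggering an alert; a real deployment would tune both thresholds to the camera's frame rate and the site.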