“We were thrilled because we were up against more than 500 groups from industry and academia, including Baidu, Google, Huawei, Stanford, and MIT,” says team lead and HP Labs senior research scientist Dr. Lei Liu. Joining Liu in developing the winning solution were Dr. Pang-Ning Tan, associate professor of computer science and engineering at Michigan State University, and Xi Liu, a Ph.D. student at Michigan State and former intern who worked with Liu at HP Labs this summer.
The competition was run in association with the UK-based SPHERE project, which aims to improve healthcare through data fusion and machine learning by developing a common platform of home-based environmental sensors. The project places particular emphasis on enabling the elderly to live safely at home while maintaining their privacy and independence.
“While our focus here was on advancing activity recognition to help keep seniors safe, the technologies we invented can work with data from any other types of sensors, and thus have potential uses far beyond healthcare,” Liu suggests. “They’re particularly relevant to services built around home healthcare, wearable devices, and the Internet of Things, all three of which are areas of current interest at HP Labs.”
To win, the HP/Michigan State team first had to develop a system for matching the raw ‘training’ data with the activity labels they were given, and for extracting from that data a set of features that clearly represented each activity class. They then looked for spatial and temporal patterns in the data, building that information into their machine learning model. In addition, they developed a novel approach to the issue of “overfitting,” a major challenge in machine learning where the learning model overreacts to minor fluctuations in the training data and therefore offers poor predictions when presented with new data points.
“This problem bothered us for quite a while,” says Liu. “One of the challenges was that we could improve performance with the training data, but when we tried the solution on the unlabeled test data, its performance actually got worse. This performance inconsistency is a huge challenge because it doesn’t offer a good indication of how to get around it, so you have no idea if the effort you are making to address it each time will be effective.”
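The gap Liu describes between training and test performance can be reproduced on a toy problem. The sketch below is illustrative only and is not the team's method: it assumes a made-up one-dimensional "sensor reading" with a few deliberately mislabeled training points, and it uses a simple nearest-neighbor classifier (the names `knn_predict`, `accuracy`, and `NOISY_INDICES` are hypothetical). With `k=1` the model memorizes the noisy labels, so it looks perfect on the training data but does worse on new points; with `k=5` it smooths over the noise, scoring lower on the training data but higher on the test data.

```python
def true_label(x):
    """Ground-truth rule for the toy data: class 1 when the reading is 20 or more."""
    return 1 if x >= 20 else 0


def knn_predict(train_x, train_y, x, k):
    """Predict a 0/1 label by majority vote among the k nearest training points."""
    nearest = sorted(range(len(train_x)), key=lambda i: abs(train_x[i] - x))[:k]
    votes = sum(train_y[i] for i in nearest)
    return 1 if 2 * votes > k else 0


def accuracy(train_x, train_y, xs, ys, k):
    """Fraction of points in (xs, ys) that the k-NN model labels correctly."""
    hits = sum(knn_predict(train_x, train_y, x, k) == y for x, y in zip(xs, ys))
    return hits / len(xs)


# Hypothetical training set: 40 readings, four of them deliberately mislabeled.
NOISY_INDICES = {3, 11, 27, 35}
train_x = [float(i) for i in range(40)]
train_y = [
    1 - true_label(x) if i in NOISY_INDICES else true_label(x)
    for i, x in enumerate(train_x)
]

# Hypothetical test set: new readings near the training points, with clean labels.
test_x = [i + 0.4 for i in range(40)]
test_y = [true_label(x) for x in test_x]

# Compare a model that memorizes (k=1) with one that smooths over noise (k=5).
results = {}
for k in (1, 5):
    train_acc = accuracy(train_x, train_y, train_x, train_y, k)
    test_acc = accuracy(train_x, train_y, test_x, test_y, k)
    results[k] = (train_acc, test_acc)
    print(f"k={k}: train accuracy={train_acc:.2f}, test accuracy={test_acc:.2f}")
```

In this toy setup, the setting that scores best on the training data is the one that scores worst on the unseen data, which is why training accuracy alone gives little guidance on whether a change to the model is actually an improvement.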
The team’s solution to the problem was a novel boosting approach, coupled with a diverse set of local predictive models that also helped make the machine learning system more robust.
“Another challenge was that they were also missing some sensors, like cameras, in some areas of the house, like the bedroom and bathroom,” adds Liu. “We had to learn the relations between different types of sensor data so that we could spot an emergency anywhere in the house. We did that by modeling relationships between data from different sensors when that was available and then using the model to accurately identify what people were doing even when no one could actually see them.”