Machine learning with limited data
The process of teaching a neural network to recognize an object or pattern typically requires a great deal of data. What if there isn't much available?
Machine learning has been credited with a wide range of advancements over the past few years. It’s the backbone of image recognition technology, chatbots and driverless cars.
“Many people right now are building machine learning applications across numerous fields,” said James Sethian of Berkeley Lab's Center for Advanced Mathematics for Energy Research Applications.
The process of teaching a computer to recognize a stop sign or any other object typically requires a great deal of data. But what if there isn’t much relevant data available?
“Because we’re working in a laboratory, and because there’s a lot of scientific problems here, we often don’t have the vast amounts of curated or labeled data that are necessary to train such a network,” Sethian said.
So Sethian and his colleague Daniël Pelt began work on a neural network that requires fewer parameters and training images. The result is a Mixed-Scale Dense Convolution Neural Network.
A typical neural network is made up of layers. In the past, each layer performed a specific analysis. One layer informs the next layer, so relevant information must be copied and passed along.
“What that means is if there is some relevant information found at a certain layer, but it’s required deeper in the network, [then] it has to be copied throughout the network,” Pelt said.
Standard practice involves looking at fine-scale information (e.g., Is this an edge?) in the early layers and large-scale information (Where is X in relation to Y in the image?) in the later layers.
“The main differences with our approach compared to these traditional ones is that now we mix different scales within each layer,” Pelt told GCN.
This means large-scale information is analyzed earlier, alongside fine-scale information, allowing the algorithm to focus on only the relevant fine-grained details. The ability to have multiple scales on the same layer was one important change, Pelt said. The second change is what gives their network the “Dense” part of its name.
Pelt and Sethian wanted each layer to be able to talk with every other layer; that is, they wanted the layers to be densely connected. This means information doesn’t have to be copied repeatedly throughout the network, and earlier layers can communicate relevant information directly to layers later in the series.
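The two ideas can be illustrated with a toy sketch. It is not the researchers' code: it works on a one-dimensional signal rather than an image, it assumes that scales are varied by using dilated convolutions (filters whose taps are spread further apart), which the description above does not specify, and every name in it (dilated_conv1d, msd_forward, make_weights, count_parameters) is hypothetical. Each layer reads all earlier feature maps directly, so nothing has to be copied forward, and because those maps were produced at different dilations, each layer ends up combining several scales.

```python
import random


def dilated_conv1d(signal, kernel, dilation):
    """Apply a 3-tap filter whose taps sit `dilation` samples apart.

    Out-of-range taps are treated as zero. Like most deep learning code,
    this is technically a cross-correlation (the kernel is not flipped).
    """
    n = len(signal)
    out = []
    for i in range(n):
        total = 0.0
        for k, w in enumerate(kernel):
            j = i + (k - 1) * dilation  # taps at i - d, i, i + d
            if 0 <= j < n:
                total += w * signal[j]
        out.append(total)
    return out


def relu(values):
    """Standard rectifier nonlinearity."""
    return [max(0.0, v) for v in values]


def make_weights(depth, seed=0):
    """Random 3-tap kernels: layer i gets one kernel per earlier feature map."""
    rng = random.Random(seed)
    return [
        [[rng.uniform(-1.0, 1.0) for _ in range(3)] for _ in range(layer + 1)]
        for layer in range(depth)
    ]


def msd_forward(signal, weights, max_dilation=4):
    """Toy forward pass: dense connections plus a dilation that varies by layer.

    Assumption for illustration only: one feature map per layer, and the
    dilation cycles through 1..max_dilation.
    """
    features = [list(signal)]  # the input and every later map stay available
    for layer, kernels in enumerate(weights):
        dilation = layer % max_dilation + 1
        summed = [0.0] * len(signal)
        # Dense connectivity: this layer reads ALL earlier feature maps.
        for prev, kernel in zip(features, kernels):
            conv = dilated_conv1d(prev, kernel, dilation)
            summed = [a + b for a, b in zip(summed, conv)]
        features.append(relu(summed))
    return features


def count_parameters(weights):
    """Total number of trainable filter taps in the toy network."""
    return sum(len(kernel) for layer in weights for kernel in layer)


if __name__ == "__main__":
    weights = make_weights(depth=4)
    maps = msd_forward([0, 0, 0, 1, 0, 0, 0, 0], weights)
    print(len(maps), "feature maps,", count_parameters(weights), "parameters")
```

In this sketch a four-layer network has only 30 filter weights, because each layer adds a single feature map and reuses the earlier ones instead of recomputing or copying them. The real network operates on images and is trained on labeled examples; none of that is shown here.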
These changes in how the neural network operates mean it needs far fewer parameters and training images to correctly identify what is being observed.
One project Pelt and Sethian have been working on uses the network to extract biological structure from cell images. This is an arduous process when done by hand, taking weeks to accomplish, the researchers said.
But the new neural network, trained on data from seven cells, was able to accurately identify the biological structure of an eighth. (All eight had been analyzed by hand, so the results could be compared.)
Carolyn Larabell, director of the National Center for X-ray Tomography and a professor at the University of California San Francisco School of Medicine, explained the importance of this advance in a statement.
“This new approach has the potential to radically transform our ability to understand disease, and is a key tool in our new Chan-Zuckerberg-sponsored project to establish a Human Cell Atlas, a global collaboration to map and characterize all cells in a healthy human body,” Larabell said.
And while the Mixed-Scale Dense Convolution Neural Network is quite powerful, it does not require supercomputer-scale resources. In fact, it is live on a web portal that anyone can access and use for image labeling applications.
“We’re hoping a larger community can take advantage of these capabilities without having to understand all the details inside them,” Sethian said.