Google Brain's Artificial Intelligence Research

Here we list Google Brain's recent research in artificial intelligence and machine learning. The Google Brain team works to advance artificial intelligence through research and systems engineering.


Using Machine Learning to Explore Neural Network Architecture
Manually designing machine learning models is difficult: crafting a network architecture often takes a significant amount of time and experimentation, even for people with substantial machine learning expertise. Here they use machine learning itself to explore new neural network architectures, with the goal of letting non-experts create neural nets tailored to their particular needs and allowing machine learning to have a greater impact for everyone.
AutoML (Automatic Machine Learning) for large scale image classification and object detection
Google found that AutoML can design small neural networks that perform on a par with neural networks designed by human experts; however, those results were limited to small academic datasets. Here they experiment with AutoML on larger, more challenging datasets, such as ImageNet image classification.
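The core loop behind this kind of architecture search can be sketched in a few lines. The sketch below is a hypothetical toy: it uses random sampling where the real system trains an RNN controller with reinforcement learning, and the `evaluate` function is a stand-in for training each candidate on a proxy dataset and measuring accuracy. The search space and its values are invented for illustration.

```python
import random

# Toy search space over a few architecture hyperparameters
# (values are made up for illustration).
SEARCH_SPACE = {
    "num_layers": [2, 4, 6],
    "filters": [32, 64, 128],
    "kernel": [3, 5],
}

def sample_architecture():
    """Propose a candidate configuration (random here; an RNN
    controller trained with RL in the real system)."""
    return {k: random.choice(v) for k, v in SEARCH_SPACE.items()}

def evaluate(arch):
    """Stand-in for 'train on a proxy task, return accuracy':
    rewards configurations near an arbitrary sweet spot."""
    return 1.0 / (1 + abs(arch["num_layers"] - 4) + abs(arch["kernel"] - 3))

best, best_score = None, -1.0
for _ in range(20):
    arch = sample_architecture()
    score = evaluate(arch)
    if score > best_score:
        best, best_score = arch, score

print(best, best_score)
```

In the real setting the evaluation step dominates the cost, which is why the search is first run on small proxy datasets before transferring the discovered architectures to larger tasks like ImageNet.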
Neural Optimizer Search with Reinforcement Learning
Here they present an approach to automate the process of discovering optimization methods, with a focus on deep learning architectures. They train a recurrent neural network (RNN) controller to generate a string in a domain-specific language that describes a mathematical update equation, built from a list of primitive functions.
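To make the idea concrete, here is a minimal sketch of the kind of update equation such a search can produce, composed from simple primitives such as the gradient, a running average, and the sign function. The rule shown is an AddSign-style update (scale the gradient up when its sign agrees with the momentum's sign, down otherwise); the hyperparameter values and the toy objective are assumptions for illustration, not the paper's exact setup.

```python
import numpy as np

def addsign_update(w, g, m, lr=0.1, alpha=1.0, beta=0.9):
    """One AddSign-style step built from DSL-like primitives:
    g (gradient), m (running average of gradients), sign()."""
    m = beta * m + (1 - beta) * g                      # momentum-like average
    w = w - lr * (alpha + np.sign(g) * np.sign(m)) * g  # sign-agreement scaling
    return w, m

# Minimize the toy objective f(w) = w^2, whose gradient is 2w.
w, m = 5.0, 0.0
for _ in range(50):
    g = 2 * w
    w, m = addsign_update(w, g, m)

print(w)  # converges toward the minimum at 0
```

The controller's search space is over compositions like this one; each candidate equation is scored by how well a small network trains with it.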
Searching for Activation Functions
The choice of activation functions in deep networks has a significant effect on the training dynamics and task performance. Currently, the most successful and widely used activation function is the Rectified Linear Unit (ReLU). Although various hand-designed alternatives to ReLU have been proposed, none have managed to replace it due to inconsistent gains. In this work, researchers propose to leverage automatic search techniques to discover new activation functions. Using a combination of exhaustive and reinforcement learning-based search, researchers discover multiple novel activation functions. Their experiments show that the best discovered activation function, f(x) = x·sigmoid(βx), which they named "Swish", tends to work better than ReLU on deeper models across a number of challenging datasets. For example, simply replacing ReLUs with Swish units improves top-1 classification accuracy on ImageNet by 0.9% for Mobile NASNet-A and 0.6% for Inception-ResNet-v2. The simplicity of Swish and its similarity to ReLU make it easy for practitioners to replace ReLUs with Swish units in any neural network.
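The formula above is simple enough to implement directly. A minimal NumPy sketch, comparing Swish against ReLU on a few sample inputs (the inputs are arbitrary; β = 1 is the common default, under which Swish is also known as SiLU):

```python
import numpy as np

def swish(x, beta=1.0):
    """Swish activation: f(x) = x * sigmoid(beta * x).
    As beta grows large it approaches ReLU; at beta = 0 it is x / 2."""
    return x * (1.0 / (1.0 + np.exp(-beta * x)))

def relu(x):
    """Rectified Linear Unit: max(x, 0)."""
    return np.maximum(x, 0.0)

x = np.array([-2.0, -1.0, 0.0, 1.0, 2.0])
print(swish(x))  # smooth and non-monotonic: slightly negative for x < 0
print(relu(x))   # hard zero for all x < 0
```

Unlike ReLU, Swish is smooth everywhere and dips slightly below zero for negative inputs before flattening out, which is one property the authors point to when explaining its behavior on deeper models.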
Improving End-to-End Models For Speech Recognition
Traditional automatic speech recognition (ASR) systems, used for a variety of voice search applications at Google, consist of an acoustic model (AM), a pronunciation model (PM) and a language model (LM), all of which are independently trained, and often manually designed, on different datasets. AMs take acoustic features and predict a set of subword units, typically context-dependent or context-independent phonemes. Next, a hand-designed lexicon (the PM) maps a sequence of phonemes produced by the acoustic model to words. Finally, the LM assigns probabilities to word sequences. Training independent components adds complexity and is suboptimal compared to training all components jointly.
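The three-stage decomposition can be sketched as a toy pipeline. Everything below is a hypothetical stand-in: the acoustic model is stubbed as a fixed lookup, the lexicon holds a single entry, and the language model is a made-up unigram table; the point is only the AM → PM → LM data flow described above.

```python
# AM: acoustic features -> phoneme sequence (stubbed for illustration)
def acoustic_model(features):
    return ["m", "y", "uw", "z", "ih", "k"]  # phonemes for "music"

# PM: hand-designed lexicon mapping phoneme sequences to words
LEXICON = {("m", "y", "uw", "z", "ih", "k"): "music"}

def pronunciation_model(phonemes):
    return LEXICON.get(tuple(phonemes), "<unk>")

# LM: assigns probabilities to word sequences (toy unigram scores)
LM_SCORES = {"music": 0.02, "<unk>": 1e-6}

def language_model(word):
    return LM_SCORES.get(word, 1e-6)

phonemes = acoustic_model(features=None)
word = pronunciation_model(phonemes)
print(word, language_model(word))
```

Because each stage is trained and designed separately, errors compound across the hand-offs, which motivates the jointly trained end-to-end models discussed next.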
State-of-the-art Speech Recognition With Sequence-to-Sequence Models
Attention-based encoder-decoder architectures such as Listen, Attend and Spell (LAS) subsume the acoustic, pronunciation and language model components of a traditional automatic speech recognition (ASR) system into a single neural network. In their previous work, they showed that such architectures are comparable to state-of-the-art ASR systems on dictation tasks, but it was not clear whether such architectures would be practical for more challenging tasks such as voice search. In this work, Google Brain researchers explore a variety of structural and optimization improvements to their LAS model which significantly improve performance.
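The mechanism that lets a single network replace the separate components is attention: at each output step, the decoder forms a weighted summary of the encoded audio frames. A minimal sketch of one dot-product attention step, with toy dimensions and random values standing in for real encoder outputs and decoder state:

```python
import numpy as np

rng = np.random.default_rng(0)
T, d = 6, 4                          # number of encoded frames, feature dim
encoded = rng.normal(size=(T, d))    # "listener" (encoder) outputs
decoder_state = rng.normal(size=(d,))  # current "speller" (decoder) state

# Dot-product attention: score each frame against the decoder state,
# then normalize the scores into weights with a softmax.
scores = encoded @ decoder_state
weights = np.exp(scores - scores.max())
weights /= weights.sum()

# Context vector: a weighted summary of the audio for this output step.
context = weights @ encoded

print(weights.round(3))
print(context.round(3))
```

The decoder conditions on this context vector (plus its own history) to emit the next output unit, so acoustic, pronunciation and language modeling are all learned jointly inside one network.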
Partnering on machine learning in healthcare
Advanced machine learning can discover patterns in de-identified medical records (that is, records stripped of any personally identifiable information) to predict what is likely to happen next, and thus anticipate patients' needs before they arise. To that end, Google has partnered with world-class medical researchers and bioinformaticians at UC San Francisco, Stanford Medicine and University of Chicago Medicine to explore how machine learning combined with clinical expertise could improve patient outcomes, avoid costly incidents and save lives.
Teaching Robots to Understand Semantic Concepts
Machine learning can allow robots to acquire complex skills, such as grasping and opening doors. However, learning these skills requires manually programmed reward functions that the robots then attempt to optimize. In contrast, people can understand the goal of a task just from watching someone else do it, or simply by being told what the goal is. The research describes how robots can use their experience to understand the salient events in a human-provided demonstration, mimic human movements despite the differences between human and robot bodies, and understand semantic categories, like "toy" and "pen", to pick up objects based on user commands.
Avoiding Discrimination through Causal Reasoning
As machine learning progresses rapidly, its societal impact has come under scrutiny. An important concern is potential discrimination based on protected attributes such as gender, race, or religion. Since learned predictors and risk scores increasingly support or even replace human judgment, there is an opportunity to formalize what harmful discrimination means and to design algorithms that avoid it. However, researchers have found it difficult to agree on a single measure of discrimination. As of now, there are several competing approaches, representing different opinions and striking different trade-offs.
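The disagreement between measures is easy to demonstrate on toy data. The sketch below computes two well-known criteria from the fairness literature, demographic parity and equal opportunity, on invented predictions; the data, groups, and the choice of these two particular measures are assumptions for illustration, and the example shows that a classifier can satisfy one criterion while violating the other.

```python
import numpy as np

# Toy labels, predictions, and a binary protected attribute.
y_true = np.array([1, 0, 1, 1, 0, 1, 0, 0])
y_pred = np.array([1, 0, 1, 0, 0, 1, 1, 0])
group  = np.array([0, 0, 0, 0, 1, 1, 1, 1])

def positive_rate(pred, mask):
    """P(pred = 1) within a group."""
    return pred[mask].mean()

def true_positive_rate(true, pred, mask):
    """P(pred = 1 | y = 1) within a group."""
    m = mask & (true == 1)
    return pred[m].mean()

# Demographic parity: positive prediction rates equal across groups.
dp_gap = abs(positive_rate(y_pred, group == 0)
             - positive_rate(y_pred, group == 1))

# Equal opportunity: true positive rates equal across groups.
eo_gap = abs(true_positive_rate(y_true, y_pred, group == 0)
             - true_positive_rate(y_true, y_pred, group == 1))

print(dp_gap, eo_gap)  # parity holds (gap 0) while opportunity does not
```

Here both groups receive positive predictions at the same rate, yet qualified members of one group are recognized far more often than the other, illustrating why the field has not settled on a single definition.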
Exploring and Visualizing an Open Global Dataset
Machine learning systems are increasingly influencing many aspects of everyday life, and are used by both the hardware and software products that serve people globally. As such, researchers and designers seeking to create products that are useful and accessible for everyone often face the challenge of finding datasets that reflect the variety and backgrounds of users around the world. In order to train these machine learning systems, open, global, and growing datasets are needed.