Highlights from the 2018 Google PhD Fellowship Summit

Posted by Susie Kim, Program Manager, University Relations

Google created the PhD Fellowship Program to recognize and support outstanding graduate students doing exceptional research in Computer Science and related disciplines. This program provides a unique opportunity for students pursuing a graduate degree in Computer Science (or related field) who seek to influence the future of technology. Now in its tenth year, our Fellowships have helped support close to 400 graduate students globally in Australia, China and East Asia, India, North America, Europe, the Middle East and Africa, the most recent region to award Google Fellowships.

Every year, Google PhD Fellows are invited to our global PhD Fellowship Summit where they are exposed to state-of-the-art research pursued across Google, and are given the opportunity to network with Google’s research community as well as other PhD Fellows from around the world. Below we share some highlights from our most recent summit, and also announce the newest recipients.

Summit Highlights
At this year’s annual Global PhD Fellowship Summit, Fellows from around the world converged on our Mountain View campus for two days of talks, focused discussions, sharing research work, and networking. VP of Education and University Programs Maggie Johnson welcomed the Fellows and presented Google’s approach to research and its various outreach efforts that encourage collaboration with academia. The agenda also included talks on a range of topics, starting with an opening keynote from Principal Scientist Maya Gupta on controlling machine learning models with constraints and goals to make them do what you want, followed by researchers Andrew Tomkins, Rahul Sukthankar, Sai Teja Peddinti, Amin Vahdat, Martin Stumpe, Ed Chi and Ciera Jaspan giving talks from a variety of research perspectives. A closing presentation was given by Jeff Dean, Senior Fellow and SVP of Google AI, who spoke about using deep learning to solve a variety of challenging research problems at Google.

Starting clockwise from top left: Researchers Rahul Sukthankar and Ed Chi talking with Fellow attendees; Jeff Dean delivering the closing talk; Poster session in full swing.

Fellows had the chance to connect with each other and with Google researchers to discuss their work during a poster event, as well as receive feedback from leaders in their fields in smaller deep-dive sessions. A panel discussion composed of Fellow alumni, two from academia and two from Google, provided both perspectives on career paths.

Google Fellows attending the 2018 PhD Fellowship Summit.

The Complete List of 2018 Google PhD Fellows
We believe that the Google PhD Fellows represent some of the best and brightest young researchers around the globe in Computer Science, and it is our ongoing goal to support them as they make their mark on the world. As such, we would like to announce the latest recipients from China and East Asia, India, Australia and Africa, who join the North America, Europe and Middle East Fellows we announced last April. Congratulations to all of this year’s awardees! The complete list of recipients is:

Algorithms, Optimizations and Markets
Emmanouil Zampetakis, Massachusetts Institute of Technology
Manuela Fischer, ETH Zurich
Pranjal Dutta, Chennai Mathematical Institute
Thodoris Lykouris, Cornell University
Yuan Deng, Duke University

Computational Neuroscience
Ella Batty, Columbia University
Neha Spenta Wadia, University of California – Berkeley
Reuben Feinman, New York University

Human Computer Interaction
Gierad Laput, Carnegie Mellon University
Mike Schaekermann, University of Waterloo
Minsuk (Brian) Kahng, Georgia Institute of Technology
Niels van Berkel, The University of Melbourne
Siqi Wu, Australian National University
Xiang Zhang, The University of New South Wales

Machine Learning
Abhijeet Awasthi, Indian Institute of Technology – Bombay
Aditi Raghunathan, Stanford University
Futoshi Futami, University of Tokyo
Lin Chen, Yale University
Qian Yu, University of Southern California
Ravid Shwartz-Ziv, Hebrew University
Shuai Li, Chinese University of Hong Kong
Shuang Liu, University of California – San Diego
Stephen Tu, University of California – Berkeley
Steven James, University of the Witwatersrand
Xinchen Yan, University of Michigan
Zelda Mariet, Massachusetts Institute of Technology

Machine Perception, Speech Technology and Computer Vision
Antoine Miech, INRIA
Arsha Nagrani, University of Oxford
Arulkumar S, Indian Institute of Technology – Madras
Joseph Redmon, University of Washington
Raymond Yeh, University of Illinois – Urbana-Champaign
Shanmukha Ramakrishna Vedantam, Georgia Institute of Technology

Mobile Computing
Lili Wei, Hong Kong University of Science & Technology
Rizanne Elbakly, Egypt-Japan University of Science and Technology
Shilin Zhu, University of California – San Diego

Natural Language Processing
Anne Cocos, University of Pennsylvania
Hongwei Wang, Shanghai Jiao Tong University
Jonathan Herzig, Tel Aviv University
Rotem Dror, Technion – Israel Institute of Technology
Shikhar Vashishth, Indian Institute of Science – Bangalore
Yang Liu, University of Edinburgh
Yoon Kim, Harvard University
Zhehuai Chen, Shanghai Jiao Tong University
Imane Khaouja, Université Internationale de Rabat

Privacy and Security
Aayush Jain, University of California – Los Angeles

Programming Technology and Software Engineering
Gowtham Kaki, Purdue University
Joseph Benedict Nyansiro, University of Dar es Salaam
Reyhaneh Jabbarvand, University of California – Irvine
Victor Lanvin, Fondation Sciences Mathématiques de Paris

Quantum Computing
Erika Ye, California Institute of Technology

Structured Data and Database Management
Lingjiao Chen, University of Wisconsin – Madison

Systems and Networking
Andrea Lattuada, ETH Zurich
Chen Sun, Tsinghua University
Lana Josipovic, EPFL
Michael Schaarschmidt, University of Cambridge
Rachee Singh, University of Massachusetts – Amherst
Stephen Mallon, The University of Sydney


Learning to Predict Depth on the Pixel 3 Phones

Posted by Rahul Garg, Research Scientist and Neal Wadhwa, Software Engineer

Portrait Mode on the Pixel smartphones lets you take professional-looking images that draw attention to a subject by blurring the background behind it. Last year, we described, among other things, how we computed depth with a single camera by applying a traditional non-learned stereo algorithm to its Phase-Detection Autofocus (PDAF) pixels (also known as dual-pixel autofocus). This year, on the Pixel 3, we turn to machine learning to improve depth estimation and produce even better Portrait Mode results.

Left: The original HDR+ image. Right: A comparison of Portrait Mode results using depth from traditional stereo and depth from machine learning. The learned depth result has fewer errors. Notably, in the traditional stereo result, many of the horizontal lines behind the man are incorrectly estimated to be at the same depth as the man and are kept sharp.
(Mike Milne)

A Short Recap
As described in last year’s blog post, Portrait Mode uses a neural network to determine which pixels correspond to people versus the background, and augments this two-layer person segmentation mask with depth information derived from the PDAF pixels. This enables a depth-dependent blur, which is closer to what a professional camera does.

PDAF pixels work by capturing two slightly different views of a scene, shown below. Flipping between the two views, we see that the person is stationary, while the background moves horizontally, an effect referred to as parallax. Because parallax is a function of the point’s distance from the camera and the distance between the two viewpoints, we can estimate depth by matching each point in one view with its corresponding point in the other view.

The two PDAF images on the left and center look very similar, but in the crop on the right you can see the parallax between them. It is most noticeable on the circular structure in the middle of the crop.

However, finding these correspondences in PDAF images (a method called depth from stereo) is extremely challenging because scene points barely move between the views. Furthermore, all stereo techniques suffer from the aperture problem. That is, if you look at the scene through a small aperture, it is impossible to find correspondence for lines parallel to the stereo baseline, i.e., the line connecting the two cameras. In other words, when looking at the horizontal lines in the figure above (or vertical lines in portrait orientation shots), any proposed shift of these lines in one view with respect to the other view looks about the same. In last year’s Portrait Mode, all these factors could result in errors in depth estimation and cause unpleasant artifacts.
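The matching idea described above can be sketched with a toy one-dimensional block-matching example. Everything below is illustrative: the patch size, shift range, focal length and baseline are made-up numbers, not Pixel camera parameters, and real PDAF disparities are sub-pixel rather than the clean integer shift used here.

```python
import numpy as np

def disparity_1d(left, right, patch=5, max_shift=8):
    # Estimate the per-pixel horizontal shift (parallax) between two
    # views by matching small patches with sum-of-squared-differences.
    half = patch // 2
    n = len(left)
    disp = np.zeros(n, dtype=int)
    for i in range(half, n - half - max_shift):
        ref = left[i - half:i + half + 1]
        errs = [np.sum((ref - right[i + s - half:i + s + half + 1]) ** 2)
                for s in range(max_shift)]
        disp[i] = int(np.argmin(errs))
    return disp

# A synthetic scene: a non-repeating 1-D signal, shifted by 3 pixels
# between the two "views" to simulate background parallax.
signal = np.sin(np.linspace(0, 6 * np.pi, 200)) + 0.1 * np.arange(200)
left = signal
right = np.roll(signal, 3)

d = disparity_1d(left, right)

# Parallax is inversely proportional to depth: with baseline B and
# focal length f (hypothetical values), depth = f * B / disparity.
f, B = 500.0, 0.01
depth = f * B / d[d > 0]
```

The aperture problem shows up in this sketch too: if `signal` were constant (a line parallel to the baseline), every candidate shift would produce the same matching error and `argmin` would be meaningless.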

Improving Depth Estimation
With Portrait Mode on the Pixel 3, we fix these errors by utilizing the fact that the parallax used by depth from stereo algorithms is only one of many depth cues present in images. For example, points that are far away from the in-focus plane appear less sharp than ones that are closer, giving us a defocus depth cue. In addition, even when viewing an image on a flat screen, we can accurately tell how far things are because we know the rough size of everyday objects (e.g. one can use the number of pixels in a photograph of a person’s face to estimate how far away it is). This is called a semantic cue.

Designing a hand-crafted algorithm to combine these different cues is extremely difficult, but by using machine learning, we can do so while also better exploiting the PDAF parallax cue. Specifically, we train a convolutional neural network, written in TensorFlow, that takes as input the PDAF pixels and learns to predict depth. This new and improved ML-based method of depth estimation is what powers Portrait Mode on the Pixel 3.

Our convolutional neural network takes as input the PDAF images and outputs a depth map. The network uses an encoder-decoder style architecture with skip connections and residual blocks.
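As a rough illustration of that architecture style, a minimal encoder-decoder with one skip connection and one residual block can be written in Keras as follows. This is a stand-in sketch, not the actual Portrait Mode network: the layer counts, channel widths and input resolution here are invented for brevity.

```python
import tensorflow as tf
from tensorflow.keras import layers

def residual_block(x, ch):
    # Two convolutions plus an identity shortcut.
    y = layers.Conv2D(ch, 3, padding="same", activation="relu")(x)
    y = layers.Conv2D(ch, 3, padding="same")(y)
    return layers.ReLU()(layers.Add()([x, y]))

def build_depth_net(h=64, w=64):
    # Input: the two PDAF views stacked as channels.
    inp = tf.keras.Input((h, w, 2))
    # Encoder.
    e1 = layers.Conv2D(16, 3, padding="same", activation="relu")(inp)
    e2 = layers.Conv2D(32, 3, strides=2, padding="same",
                       activation="relu")(e1)
    # Bottleneck residual block.
    b = residual_block(e2, 32)
    # Decoder with a skip connection back to the encoder.
    d1 = layers.Conv2DTranspose(16, 3, strides=2, padding="same",
                                activation="relu")(b)
    d1 = layers.Concatenate()([d1, e1])  # skip connection
    # One-channel depth map at input resolution.
    out = layers.Conv2D(1, 3, padding="same")(d1)
    return tf.keras.Model(inp, out)

model = build_depth_net()
```

The skip connection lets fine spatial detail from the encoder bypass the downsampling bottleneck, which matters for a task like depth where output edges should align with image edges.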

Training the Neural Network
In order to train the network, we need lots of PDAF images and corresponding high-quality depth maps. And since we want our predicted depth to be useful for Portrait Mode, we also need the training data to be similar to pictures that users take with their smartphones.

To accomplish this, we built our own custom “Frankenphone” rig that contains five Pixel 3 phones, along with a Wi-Fi-based solution that allowed us to simultaneously capture pictures from all of the phones (within a tolerance of ~2 milliseconds). With this rig, we computed high-quality depth from photos by using structure from motion and multi-view stereo.

Left: Custom rig used to collect training data. Middle: An example capture flipping between the five images. Synchronization between the cameras ensures that we can calculate depth for dynamic scenes, such as this one. Right: Ground truth depth. Low confidence points, i.e., points where stereo matches are not reliable due to weak texture, are colored in black and are not used during training. (Sam Ansari and Mike Milne)

The data captured by this rig is ideal for training a network for the following main reasons:

  • Five viewpoints ensure that there is parallax in multiple directions, and hence no aperture problem.
  • The arrangement of the cameras ensures that a point in an image is usually visible in at least one other image, resulting in fewer points with no correspondences.
  • The baseline, i.e., the distance between the cameras, is much larger than our PDAF baseline, resulting in more accurate depth estimation.
  • Synchronization between the cameras ensures that we can calculate depth for dynamic scenes like the one above.
  • Portability of the rig ensures that we can capture photos in the wild, simulating the photos users take with their smartphones.

However, even though the data captured from this rig is ideal, it is still extremely challenging to predict the absolute depth of objects in a scene — a given PDAF pair can correspond to a range of different depth maps (depending on lens characteristics, focus distance, etc.). To account for this, we instead predict the relative depths of objects in the scene, which is sufficient for producing pleasing Portrait Mode results.
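One standard way to train for relative rather than absolute depth is the scale-invariant log-depth loss of Eigen et al.; the post does not state which loss Portrait Mode actually uses, so treat this purely as an illustration of the idea:

```python
import numpy as np

def scale_invariant_loss(pred_log, gt_log):
    # Penalize log-depth error only up to an unknown global offset:
    # adding a constant to pred_log (i.e. scaling every predicted depth
    # by the same factor) leaves the loss unchanged, so the network is
    # only asked to get *relative* depth right.
    d = pred_log - gt_log
    return np.mean(d ** 2) - np.mean(d) ** 2

rng = np.random.default_rng(0)
gt = np.log(rng.uniform(1.0, 10.0, 1000))     # synthetic ground truth
pred = gt + 0.05 * rng.standard_normal(1000)  # noisy prediction
base = scale_invariant_loss(pred, gt)

# Doubling every predicted depth shifts log-depth by log(2), and the
# loss is exactly unchanged:
shifted = scale_invariant_loss(pred + np.log(2.0), gt)
```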

Putting it All Together
This ML-based depth estimation needs to run fast on the Pixel 3 so that users don’t have to wait too long for their Portrait Mode shots. However, to get good depth estimates that make use of subtle defocus and parallax cues, we have to feed full-resolution, multi-megapixel PDAF images into the network. To ensure fast results, we use TensorFlow Lite, a cross-platform solution for running machine learning models on mobile and embedded devices, together with the Pixel 3’s powerful GPU to compute depth quickly despite our abnormally large inputs. We then combine the resulting depth estimates with masks from our person segmentation neural network to produce beautiful Portrait Mode results.
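The TensorFlow Lite workflow of converting a trained Keras model and running it with the interpreter looks roughly like this. The one-layer model below is a stand-in, not the Portrait Mode network, and on-device GPU delegation is omitted for brevity:

```python
import numpy as np
import tensorflow as tf

# A tiny stand-in model: one convolution mapping a 2-channel PDAF
# input to a 1-channel "depth" map.
inp = tf.keras.Input((32, 32, 2))
out = tf.keras.layers.Conv2D(1, 3, padding="same")(inp)
model = tf.keras.Model(inp, out)

# Convert the Keras model to the TensorFlow Lite flatbuffer format
# used for on-device inference.
tflite_model = tf.lite.TFLiteConverter.from_keras_model(model).convert()

# Run inference with the TFLite interpreter.
interp = tf.lite.Interpreter(model_content=tflite_model)
interp.allocate_tensors()
inp_detail = interp.get_input_details()[0]
out_detail = interp.get_output_details()[0]
interp.set_tensor(inp_detail["index"],
                  np.zeros((1, 32, 32, 2), dtype=np.float32))
interp.invoke()
depth = interp.get_tensor(out_detail["index"])
```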

Try it Yourself
In Google Camera App version 6.1 and later, our depth maps are embedded in Portrait Mode images. This means you can use the Google Photos depth editor to change the amount of blur and the focus point after capture. You can also use third-party depth extractors to extract the depth map from a JPEG and take a look at it yourself. Also, here is an album showing the relative depth maps and the corresponding Portrait Mode images for the traditional stereo and the learning-based approaches.

This work wouldn’t have been possible without Sam Ansari, Yael Pritch Knaan, David Jacobs, Jiawen Chen, Juhyun Lee and Andrei Kulik. Special thanks to Mike Milne and Andy Radin who captured data with the five-camera rig.


AC Play 1/6th scale Street Bruiser American Soldier Outfit (Green) & Head Sculpts Set

Pre-order Acplay ATX044 1/6 Scale Street Bruiser American Soldier Two Head & Costume Set from KGHobby (link HERE)

Guile (ガイル Gairu) is a character in Capcom’s Street Fighter series of fighting games. He debuted as one of the original eight characters in 1991’s Street Fighter II and appeared in the game’s subsequent updates. In the games he is portrayed as a major in the United States Air Force who is seeking to avenge the death of his Air Force buddy Charlie at the hands of the villainous dictator M. Bison.

AC Play 1/6th scale Street Bruiser American Soldier Outfit (Green) & Head Sculpts Set Features: Normal expression head, Angry expression head, Green vest, Green camouflage trousers, Leather belt, Necklace, Boots, Special effects accessories x2, Tattoo sticker. NOTE: Body not included (display only). Works with Phicen M35 super-flexible seamless male body and other similarly styled muscular bodies.

