Announcing the 6th Fine-Grained Visual Categorization Workshop

Posted by Christine Kaeser-Chen, Software Engineer and Serge Belongie, Visiting Faculty, Google AI

In recent years, fine-grained visual recognition competitions (FGVCs), such as the iNaturalist species classification challenge and the iMaterialist product attribute recognition challenge, have spurred progress in the development of image classification models focused on detection of fine-grained visual details in both natural and man-made objects. Whereas traditional image classification competitions focus on distinguishing generic categories (e.g., car vs. butterfly), the FGVCs go beyond entry level categories to focus on subtle differences in object parts and attributes. For example, rather than pursuing methods that can distinguish categories, such as “bird”, we are interested in identifying subcategories such as “indigo bunting” or “lazuli bunting.”

Previous challenges attracted a large number of talented participants who developed innovative new models for image recognition, with more than 500 teams competing at FGVC5 at CVPR 2018. FGVC challenges have also inspired new methods such as domain-specific transfer learning and estimating test-time priors, which have helped fine-grained recognition tasks reach state-of-the-art performance on several benchmarking datasets.
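
One of those ideas, estimating test-time priors, can be made concrete with a short sketch: when the training set is long-tailed but the test set is expected to be closer to uniform, a classifier's per-class probabilities can be reweighted by the ratio of the two priors. The NumPy snippet below is a minimal illustration of that general recipe, not the method from any particular competition entry; the toy probabilities and the uniform test prior are assumptions made for the example.

```python
import numpy as np

# Hypothetical inputs for illustration only:
#   probs        -- model softmax outputs, shape (num_examples, num_classes)
#   train_counts -- per-class image counts in a long-tailed training set
probs = np.array([[0.7, 0.2, 0.1],
                  [0.3, 0.4, 0.3]])
train_counts = np.array([1000, 100, 10])

train_prior = train_counts / train_counts.sum()
# Assume a uniform class prior at test time (a common simplification);
# in practice the test-time prior can itself be estimated from the predictions.
test_prior = np.full(len(train_prior), 1.0 / len(train_prior))

# Reweight the posteriors by the prior ratio and renormalize per example.
adjusted = probs * (test_prior / train_prior)
adjusted /= adjusted.sum(axis=1, keepdims=True)

print(adjusted.argmax(axis=1))  # predictions under the corrected prior
```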

In order to further spur progress in FGVC research, we are proud to sponsor and co-organize the 6th annual workshop on Fine-Grained Visual Categorization (FGVC6), to be held on June 17th in Long Beach, CA at CVPR 2019. This workshop brings together experts in computer vision with specialists focusing on biodiversity, botany, fashion, and the arts, to address the challenges of applying fine-grained visual categorization to real-life settings.

This Year’s Challenges
This year there will be a wide variety of competition topics, each highlighting unique challenges of fine-grained visual categorization, including an updated iNaturalist challenge, fashion & products, wildlife camera traps, food, butterflies & moths, fashion design, and cassava leaf disease. We are also delighted to introduce two new partnerships with world class institutions—The Metropolitan Museum of Art for the iMet Collection challenge and the New York Botanical Garden for the Herbarium challenge.

The FGVC workshop at CVPR focuses on subordinate categories, including (from left to right, top to bottom) animal species from wildlife camera traps, retail products, fashion attributes, cassava leaf disease, Melastomataceae species from herbarium sheets, animal species from citizen science photos, butterfly and moth species, cuisine of dishes, and fine-grained attributes for museum art objects.

In the iMet Collection challenge, participants compete to train models on artistic attributes including object presence, culture, content, theme, and geographic origin. The Metropolitan Museum of Art provided a large training dataset for this task based on subject matter experts’ descriptions of their museum collections. This dataset highlights the challenge of inferring fine-grained attributes that are only indirectly grounded in the visual context (e.g., period, culture, medium).

A diverse sample of images included in the iMet Collection challenge dataset. Images were taken from the Metropolitan Museum of Art’s public domain dataset.

The iMet Collection challenge is also noteworthy for its status as the first image-based Kernels-only competition, a recently introduced option on Kaggle that levels the playing field for data scientists who might not otherwise have access to adequate computational resources. Kernel competitions provide all participants with the same hardware allowances, giving rise to a more balanced competition. Moreover, the winning models tend to be simpler than their counterparts in other competitions, since the participants must work within the compute constraints imposed by the Kernels platform. At the time of writing, the iMet Collection challenge has over 250 participating teams.

In the Herbarium challenge, researchers are invited to tackle the problem of classifying species from the flowering plant family Melastomataceae. This challenge is distinguished from the iNaturalist competition in that the included images exclusively depict dried specimens preserved on herbarium sheets. Herbarium sheets are essential to plant science, as they not only preserve the key details of the plants for identification and DNA analysis, but also provide a rare perspective into plant ecology in a historical context. As the world’s second largest herbarium, NYBG’s Steere Herbarium contributed a dataset of over 46,000 specimens for this year’s challenge.

In the Herbarium challenge, participants will identify species from the flowering plant family Melastomataceae. The New York Botanical Garden (NYBG) provided a dataset of over 46,000 herbarium specimens including over 680 species. Images used with permission of the NYBG.

Every one of this year’s challenges requires deep engagement with subject matter experts, in addition to institutional coordination. By teeing up image recognition challenges in a standard format, the FGVC workshop paves the way for technology transfer from the top of the Kaggle leaderboards into the hands of everyday users via mobile apps such as Seek by iNaturalist and Merlin Bird ID. We anticipate the techniques developed by our competition participants will not only push the frontier of fine-grained recognition, but also be beneficial for applying machine vision to advance scientific exploration and curatorial studies.

Invitation to Participate
We invite teams to participate in these competitions to help advance the state-of-the-art in fine-grained image recognition. Deadlines for entry into the competitions range from May 26 to June 3, depending on the challenge. The results of these competitions will be presented at the FGVC6 workshop at CVPR 2019, and will provide broad exposure to the top performing teams. We are excited to encourage the community’s development of more accurate and broadly impactful algorithms in the field of fine-grained visual categorization!

Acknowledgements
We’d like to thank our colleagues and friends on the FGVC6 organizing committee for working together to advance this important area. At Google we would like to thank Hartwig Adam, Chenyang Zhang, Yulong Liu, Kiat Chuan Tan, Mikhail Sirotenko, Denis Brulé, Cédric Deltheil, Timnit Gebru, Ernest Mwebaze, Weijun Wang, Grace Chu, Jack Sim, Andrew Howard, R.V. Guha, Srikanth Belwadi, Tanya Birch, Katherine Chou, Maggie Demkin, Elizabeth Park, and Will Cukierski.


Hot Toys Avengers: Endgame 1/6th scale Chris Evans as Captain America 12-inch Figure

“Today we have a chance to take it all back. We will. Whatever it takes.”

Famous for his superhuman strength and an indestructible shield, Steve Rogers, who was born with the will to carry on, finds himself called into action to complete a mission with the universe’s entire existence on the line. The Super-Soldier will need to bring together his surviving Avengers companions and the World’s Mightiest Heroes.

As a war hero, national symbol, enemy of oppression and a key member of the Avengers, Steve Rogers has proved his great leadership and importance throughout all these years of Marvel Cinematic Universe productions, and continues to win the hearts of Marvel fans on this remarkable journey. Hot Toys proudly presents the 1/6th scale Captain America collectible figure, which takes direct inspiration from the last installment of The Infinity Saga.

The highly accurate collectible figure is expertly crafted based on the appearance of Chris Evans as Captain America/Steve Rogers from Avengers: Endgame. It features a newly painted helmeted head sculpt with three interchangeable lower faces capturing Chris Evans’s facial expressions and an un-helmeted head sculpt, a muscular body that naturally portrays Captain America’s toned physique, a meticulously tailored outfit with a star emblem on the chest, Cap’s star-spangled shield, a delicate compass, a signature helmet and a specially designed movie-themed figure stand.

Scroll down to see all the pictures.
Click on them for bigger and better views.

Hot Toys MMS536 1/6th scale Avengers: Endgame Chris Evans as Captain America Collectible Figure specially features: Authentic and detailed likeness of Chris Evans as Captain America in Avengers: Endgame | newly painted Captain America helmeted head sculpt with three (3) interchangeable lower parts of faces capturing Chris Evans’s facial expressions | newly painted un-helmeted head sculpt | Movie-accurate facial expression with detailed hair and skin texture | Approximately 31 cm tall | Newly developed body with over 30 points of articulation which naturally portrays Captain America’s muscular body in the film | Seven (7) pieces of interchangeable gloved hands including: pair of fists, pair of shield-holding hands, shield-throwing right hand, shield-catching left hand, finger-pointing right hand

Costume: navy blue-colored scale patterned Captain America suit with red trims and silver star emblem on chest, navy blue-colored embossed patterned pants with pouches, knee pads, fabric coated elbow pads, fabric coated knee pads, brown-colored leather-like back shield holder and body strap, brown-colored leather-like belt with pouches, dark brown-colored boots

Weapons: circular red and blue Captain America shield with silver star emblem and weathering effect

Accessories: Captain America helmet, compass, movie-themed figure stand with movie logo and character name *Additional accessories coming soon


Get Involved!

One of the most important approaches we have going right now is the NGA Advocacy and Technical Services department.  It is here where so much that can affect our day-to-day world is worked on, debated, pushed, delayed, etc.  This takes on a larger role when you think about the comments made by the Mayor of New York last week with regards to glass buildings.  (There are others in our industry, like Chuck Knickerbocker, who covered this much more eloquently than I ever could, so I’ll let their words stand for me.)

We are used to people taking shots at our industry and livelihood and we will continue to fight on all fronts, but we surely could use more people involved and following along.  Go to this page HERE to see what’s happening currently and reach out to add your help.  If we don’t keep working together as an industry, comments directed at using less glass will start to become more real than we want them to be!  We can’t sit still, so please get involved!!
Elsewhere…

–  It has been a while, so time for the latest Glass Magazine review… I’m looking at the April edition, which has a cover featuring an incredible entrance fabricated by AGNORA.  This issue was dedicated to the “Top Fabricators” and delivered as always.  The special section included a list of the top companies; a focus on women-owned operations; standout “partner fabricators” as named by their customers; and some excellent stats about the marketplace.  It’s truly an incredible section of reading for anyone who has an interest in the fabricated glass world.  Also, for the 3rd month in a row, the “Trendhunter” article delivered, thanks this time to Michael Spellman of IGE.  Awesome piece on automation on the fab floor that has me wishing I had a plant of my own to put some of this innovation in.  Oh, and the “take 5” with Andrew Haring was super.  I am getting to work closely with Andrew on GlassBuild promotion and the guy is absolutely brilliant.

–  Ad of the month was a tough choice yet again; it was a very thick issue with a lot of contenders.  My winner this month is Sika.  I usually don’t like text-heavy ads, but the Sika ad jumped out at me because of the awesome picture they chose and the color-matched Sika logo.  The picture was a very sophisticated structure that made me want to study it.  So when you stop on an ad like that, it’s a winner.  Kudos to whoever at Sika did this; I think the only person I know there is the great Kelly Townsend, so Kelly, you can take the credit…lol!

–  Congrats to McGrory Glass on their new website launch.  The site setup is unique with regards to layout and optimization, and it makes for a nice user experience… check it out here, and congrats to the team there on a job well done!
–  The Texas Glass Association Glass Conference II is coming up quickly…. If you are involved in this industry and in Texas, you need to consider getting there for it.  More info can be found HERE.  Personally I am looking forward to seeing old friends and meeting new people.  As I previously noted I am honored to be speaking at the session, and I’ll be sharing some interesting forecast news amongst other nuggets.

In addition the other speakers and topics are very strong.  Learn more here.
–  I have been hearing that black matte hardware, of all styles and applications, is getting extremely hard to find.  I’m even hearing that from the millwork side of the business.  That look is hot right now, and maybe too hot for everyone to keep up.

–  GlassBuild registration is now open.  Don’t procrastinate… register now and also grab your hotel room.  By the way, my Philadelphia friends, both the Eagles and Phillies will be in town during the GlassBuild run-up and show, so you can mix a little sports with your show of the year.
–  Last this week… one of the coolest things to see architecturally in Michigan is at Michigan State University and the Broad Art Museum.  The exterior is stunning thanks to a great design by Zaha Hadid and glass from Guardian Glass.  Now this summer on the inside of this amazing structure will be an incredible glass sculpture in the exhibition named Oscar Tuazon-Water School.  Instead of me screwing up the description I’ll just use this from Guardian Glass:

“Tuazon’s “water window” uses more than 200 square feet of monolithic, tempered lites provided by Guardian Glass. The four trapezoidal shapes, which weigh in excess of 800 pounds, are installed in a steel frame connected to a post and bearing, which allows the water window to also rotate, further transforming the window into a door. A digitally printed image – a reference to the original water window by Baer – was placed on the 3rd surface and fired into the glass.”

More info can be found here- but if you find yourself in the great state of Michigan this summer- this is worth seeing!
LINKS of the WEEK

There’s usually some dumb political move each week- here’s one for this post.  This can’t be true right?
The length that folks will go to for tickets to the Masters is amazing
This one is the ultimate “hmmmmm” article.  Interesting battle on parking tickets and enforcement.
VIDEO of the WEEK

So did everyone binge the 2nd season of Cobra Kai on YouTube yet?  I did; thought it was good, of course not as good as season 1, but solid nonetheless.  Regardless of whether you watched the show or not, if you saw the original Karate Kid movie, this “mock” 30 for 30 is a fun watch.  Check it out.  6 great minutes of content!


Hot Toys Avengers: Endgame 1/6th scale Scarlett Johansson as Black Widow Collectible Figure

“This is going to work, Steve.” – Black Widow

After the events of Avengers: Infinity War, half of the universe’s population disappeared. As one of the most compelling characters in the Marvel Cinematic Universe, Black Widow has unmatched deadly hand-to-hand combat skills. Teaming up with Captain America, this master spy can’t quit now, especially since the Avengers’ next mission appears to be an all but impossible one.

Marvel Studios’ Avengers: Endgame will be arriving in theatres worldwide! Given the unprecedented popularity of the character, Hot Toys is elated to present this talented assassin and founding member of the Avengers, Black Widow, as a 1/6th scale collectible figure!

Delicately crafted based on the appearance of Scarlett Johansson as Black Widow in the movie, the figure features a newly developed head sculpt with remarkable likeness and a beautifully braided hair sculpture in her distinctive color, a tactical battle suit styled with red markings, highly detailed weapons including articulated pistols and batons, and a movie-themed figure stand with an interchangeable graphic card.

Scroll down to see the rest of the pictures.
Click on them for bigger and better views.

Hot Toys MMS533 1/6th scale Avengers: Endgame Black Widow Collectible Figure specially features: Newly developed head sculpt with authentic and detailed likeness of Scarlett Johansson as Black Widow in Avengers: Endgame | Movie-accurate facial expression and make-up | Highly detailed reddish-brown hair sculpture of Black Widow with braided hair style | Approximately 28 cm tall | Body with over 28 points of articulation | Eight (8) pieces of interchangeable black-colored gloved hands including: pair of fists, pair of relaxed hands, pair of hands for holding pistols, pair of hands for holding batons

Costume: black-colored one-piece jumpsuit with shoulder armor, red and black-colored wrist guards with Widow’s Bite bracelets, black-colored belt with pistol holsters on thighs, black-colored platform boots

Weapons: long baton, Two (2) short batons, Two (2) articulated pistols

Accessories: non-detachable baton backpack with two (2) baton handles, specially designed graphic card with character icon, movie-themed figure stand with movie logo and character name

Release date: Approximately Q2 – Q3, 2020

Evaluating the Unsupervised Learning of Disentangled Representations

Posted by Olivier Bachem, Research Scientist, Google AI Zürich

The ability to understand high-dimensional data, and to distill that knowledge into useful representations in an unsupervised manner, remains a key challenge in deep learning. One approach to solving these challenges is through disentangled representations, models that capture the independent features of a given scene in such a way that if one feature changes, the others remain unaffected. If done successfully, a machine learning system that is designed to navigate the real world, such as a self-driving car or a robot, can disentangle the different factors and properties of objects and their surroundings, enabling the generalization of knowledge to previously unobserved situations. While unsupervised disentanglement methods have already been used for curiosity driven exploration, abstract reasoning, visual concept learning and domain adaptation for reinforcement learning, recent progress in the field makes it difficult to know how well different approaches work and the extent of their limitations.

In “Challenging Common Assumptions in the Unsupervised Learning of Disentangled Representations” (to appear at ICML 2019), we perform a large-scale evaluation on recent unsupervised disentanglement methods, challenging some common assumptions in order to suggest several improvements to future work on disentanglement learning. This evaluation is the result of training more than 12,000 models covering most prominent methods and evaluation metrics in a reproducible large-scale experimental study on seven different data sets. Importantly, we have also released both the code used in this study as well as more than 10,000 pretrained disentanglement models. The resulting library, disentanglement_lib, allows researchers to bootstrap their own research in this field and to easily replicate and verify our empirical results.

Understanding Disentanglement
To better understand the ground-truth properties of an image that can be encoded in a disentangled representation, first consider the ground-truth factors of the data set Shapes3D. In this toy model, shown in the figure below, each panel represents one factor that could be encoded into a vector representation of the image. The model shown is defined by the shape of the object in the middle of the image, its size, the rotation of the camera and the color of the floor, the wall and the object.

Visualization of the ground-truth factors of the Shapes3D data set: Floor color (upper left), wall color (upper middle), object color (upper right), object size (bottom left), object shape (bottom middle), and camera angle (bottom right).

The goal of disentangled representations is to build models that can capture these explanatory factors in a vector. The figure below presents a model with a 10-dimensional representation vector. Each of the 10 panels visualizes what information is captured in one of the 10 different coordinates of the representation. From the top right and the top middle panel we see that the model has successfully disentangled floor color, while the two bottom left panels indicate that object color and size are still entangled.

Visualization of the latent dimensions learned by a FactorVAE model (see below). The ground-truth factors wall and floor color as well as rotation of the camera are disentangled (see top right, top center and bottom center panels), while the ground-truth factors object shape, size and color are entangled (see top left and the two bottom left images).
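
Panels like these are typically produced with a latent traversal: encode an image, then vary one coordinate of the representation at a time while holding the others fixed, and decode each modified vector. The sketch below shows that generic procedure; the `encoder` and `decoder` callables are placeholders standing in for a trained model such as a FactorVAE, and the traversal range is an arbitrary choice for illustration.

```python
import numpy as np

def latent_traversal(encoder, decoder, image, values=np.linspace(-2.0, 2.0, 7)):
    """Decode images obtained by sweeping each latent coordinate in turn.

    encoder(image) -> latent vector (e.g. the mean of q(z|x)), shape (latent_dim,)
    decoder(z)     -> decoded image for latent vector z
    Returns one row of decoded images per latent dimension.
    """
    z = np.asarray(encoder(image))            # e.g. a 10-dimensional representation
    rows = []
    for dim in range(z.shape[0]):
        row = []
        for value in values:
            z_mod = z.copy()
            z_mod[dim] = value                # vary one coordinate, keep the rest fixed
            row.append(decoder(z_mod))
        rows.append(row)
    return rows
```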

Key Results of this Reproducible Large-scale Study
While the research community has proposed a variety of unsupervised approaches to learn disentangled representations based on variational autoencoders and has devised different metrics to quantify their level of disentanglement, to our knowledge no large-scale empirical study has evaluated these approaches in a unified manner. We propose a fair, reproducible experimental protocol to benchmark the state of unsupervised disentanglement learning by implementing six different state-of-the-art models (BetaVAE, AnnealedVAE, FactorVAE, DIP-VAE I/II and Beta-TCVAE) and six disentanglement metrics (BetaVAE score, FactorVAE score, MIG, SAP, Modularity and DCI Disentanglement). In total, we train and evaluate 12,800 such models on seven data sets. Key findings of our study include:

  • We do not find any empirical evidence that the considered models can be used to reliably learn disentangled representations in an unsupervised way, since random seeds and hyperparameters seem to matter more than the model choice. In other words, even if one trains a large number of models and some of them are disentangled, these disentangled representations seemingly cannot be identified without access to ground-truth labels (a rough sketch of such a label-based check follows this list). Furthermore, good hyperparameter values do not appear to consistently transfer across the data sets in our study. These results are consistent with the theorem we present in the paper, which states that the unsupervised learning of disentangled representations is impossible without inductive biases on both the data set and the models (i.e., one has to make assumptions about the data set and incorporate those assumptions into the model).
  • For the considered models and data sets, we cannot validate the assumption that disentanglement is useful for downstream tasks, e.g., that with disentangled representations it is possible to learn with fewer labeled observations.
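
As referenced in the first finding above, deciding whether a trained model is disentangled in practice means computing a supervised metric against the ground-truth factors. The snippet below is a rough NumPy/scikit-learn approximation of one such metric, the Mutual Information Gap (MIG); it is illustrative only and is not the implementation used in the paper or in disentanglement_lib.

```python
import numpy as np
from sklearn.metrics import mutual_info_score

def mig_score(latents, factors, n_bins=20):
    """Rough Mutual Information Gap: requires ground-truth factor labels.

    latents: array (num_samples, num_latents) of continuous codes from the encoder
    factors: array (num_samples, num_factors) of discrete ground-truth factors
    """
    num_latents, num_factors = latents.shape[1], factors.shape[1]
    # Discretize each latent dimension into equal-width bins.
    binned = np.stack(
        [np.digitize(latents[:, j],
                     np.histogram_bin_edges(latents[:, j], bins=n_bins)[1:-1])
         for j in range(num_latents)], axis=1)
    gaps = []
    for k in range(num_factors):
        mi = np.array([mutual_info_score(factors[:, k], binned[:, j])
                       for j in range(num_latents)])
        entropy = mutual_info_score(factors[:, k], factors[:, k])  # H(factor), in nats
        top_two = np.sort(mi)[-2:]
        # Gap between the most and second-most informative latent, normalized.
        gaps.append((top_two[1] - top_two[0]) / max(entropy, 1e-12))
    return float(np.mean(gaps))
```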

The figure below demonstrates some of these findings. The choice of random seed across different runs has a larger impact on disentanglement scores than the model choice and the strength of regularization (while naively one might expect that more regularization should always lead to more disentanglement). A good run with a bad hyperparameter can easily beat a bad run with a good hyperparameter.

The violin plots show the distribution of FactorVAE scores attained by different models on the Cars3D data set. The left plot shows how the distribution changes as different disentanglement models are considered while the right plot displays the different distributions as the regularization strength in a FactorVAE model is varied. The key observation is that the violin plots substantially overlap which indicates that all methods strongly depend on the random seed.

Based on these results, we make four observations relevant to future research:

  1. Given the theoretical result that the unsupervised learning of disentangled representations without inductive biases is impossible, future work should clearly describe the imposed inductive biases and the role of both implicit and explicit supervision.
  2. Finding good inductive biases for unsupervised model selection that work across multiple data sets remains a key open problem.
  3. The concrete practical benefits of enforcing a specific notion of disentanglement of the learned representations should be demonstrated. Promising directions include robotics, abstract reasoning and fairness.
  4. Experiments should be conducted in a reproducible experimental setup on a diverse selection of data sets.

Open Sourcing disentanglement_lib
In order for others to verify our results, we have released disentanglement_lib, the library we used to create the experimental study. It contains open-source implementations of the considered disentanglement methods and metrics, a standardized training and evaluation protocol, as well as visualization tools to better understand trained models.

The advantages of this library are three-fold. First, with less than four shell commands disentanglement_lib can be used to reproduce any of the models in our study. Second, researchers may easily modify our study to test additional hypotheses. Third, disentanglement_lib is easily extendible and can be used to bootstrap research into the learning of disentangled representations—it is easy to implement new models and compare them to our reference implementation using a fair, reproducible experimental setup.

Reproducing all the models in our study requires a computational effort of approximately 2.5 GPU years, which can be prohibitive. So, we have also released >10,000 pretrained disentanglement_lib models from our study that can be used together with disentanglement_lib.

We hope that this will accelerate research in this field by allowing other researchers to benchmark their new models against our pretrained models and to test new disentanglement metrics and visualization approaches on a diverse set of models.

Acknowledgments
This research was done in collaboration with Francesco Locatello, Mario Lucic, Stefan Bauer, Gunnar Rätsch, Sylvain Gelly and Bernhard Schölkopf at Google AI Zürich, ETH Zürich and the Max-Planck Institute for Intelligent Systems. We also wish to thank Josip Djolonga, Ilya Tolstikhin, Michael Tschannen, Sjoerd van Steenkiste, Joan Puigcerver, Marcin Michalski, Marvin Ritter, Irina Higgins and the rest of the Google Brain team for helpful discussions, comments, technical help and code contributions.


SpecAugment: A New Data Augmentation Method for Automatic Speech Recognition

Posted by Daniel S. Park, AI Resident and William Chan, Research Scientist

Automatic Speech Recognition (ASR), the process of taking an audio input and transcribing it to text, has benefited greatly from the ongoing development of deep neural networks. As a result, ASR has become ubiquitous in many modern devices and products, such as Google Assistant, Google Home and YouTube. Nevertheless, there remain many important challenges in developing deep learning-based ASR systems. One such challenge is that ASR models, which have many parameters, tend to overfit the training data and have a hard time generalizing to unseen data when the training set is not extensive enough.

In the absence of an adequate volume of training data, it is possible to increase the effective size of existing data through the process of data augmentation, which has contributed to significantly improving the performance of deep networks in the domain of image classification. In the case of speech recognition, augmentation traditionally involves deforming the audio waveform used for training in some fashion (e.g., by speeding it up or slowing it down), or adding background noise. This has the effect of making the dataset effectively larger, as multiple augmented versions of a single input are fed into the network over the course of training, and also helps the network become robust by forcing it to learn relevant features. However, existing conventional methods of augmenting audio input introduce additional computational cost and sometimes require additional data.

In our recent paper, “SpecAugment: A Simple Data Augmentation Method for Automatic Speech Recognition”, we take a new approach to augmenting audio data, treating it as a visual problem rather than an audio one. Instead of augmenting the input audio waveform as is traditionally done, SpecAugment applies an augmentation policy directly to the audio spectrogram (i.e., an image representation of the waveform). This method is simple, computationally cheap to apply, and does not require additional data. It is also surprisingly effective in improving the performance of ASR networks, demonstrating state-of-the-art performance on the ASR tasks LibriSpeech 960h and Switchboard 300h.

SpecAugment
In traditional ASR, the audio waveform is typically encoded as a visual representation, such as a spectrogram, before being input as training data for the network. Augmentation of training data is normally applied to the waveform audio before it is converted into the spectrogram, such that after every iteration, new spectrograms must be generated. In our approach, we instead investigate augmenting the spectrogram itself, rather than the waveform data. Since the augmentation is applied directly to the input features of the network, it can be run online during training without significantly impacting training speed.

A waveform is typically converted into a visual representation (in our case, a log mel spectrogram; steps 1 through 3 of this article) before being fed into a network.
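
For readers who want to reproduce that front end, the sketch below computes a log mel spectrogram with librosa. It is a minimal example under assumed parameters (sample rate, FFT size, hop length and 80 mel channels are illustrative defaults), not the exact feature pipeline used in the paper.

```python
import librosa
import numpy as np

def log_mel_spectrogram(path, sr=16000, n_fft=400, hop_length=160, n_mels=80):
    """Load an audio file and return a log mel spectrogram of shape (n_mels, frames)."""
    y, sr = librosa.load(path, sr=sr)              # load the raw waveform
    mel = librosa.feature.melspectrogram(          # STFT followed by a mel filterbank
        y=y, sr=sr, n_fft=n_fft, hop_length=hop_length, n_mels=n_mels)
    return librosa.power_to_db(mel, ref=np.max)    # log compression
```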

SpecAugment modifies the spectrogram by warping it in the time direction, masking blocks of consecutive frequency channels, and masking blocks of utterances in time. These augmentations have been chosen to help the network to be robust against deformations in the time direction, partial loss of frequency information and partial loss of small segments of speech of the input. An example of such an augmentation policy is displayed below.

The log mel spectrogram is augmented by warping in the time direction, and masking (multiple) blocks of consecutive time steps (vertical masks) and mel frequency channels (horizontal masks). The masked portion of the spectrogram is displayed in purple for emphasis.
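
A minimal NumPy sketch of the two masking operations follows (time warping, which needs an image-warping routine, is omitted here). The mask counts and widths are illustrative placeholders, not the augmentation policies reported in the paper.

```python
import numpy as np

def mask_spectrogram(log_mel, num_freq_masks=2, max_freq_width=27,
                     num_time_masks=2, max_time_width=40, rng=None):
    """Apply frequency masks (rows) and time masks (columns) to a log mel spectrogram.

    log_mel: array of shape (n_mels, n_frames). Masked regions are filled with the
    spectrogram mean, one common choice for the replacement value.
    """
    rng = np.random.default_rng() if rng is None else rng
    spec = log_mel.copy()
    n_mels, n_frames = spec.shape
    fill = spec.mean()

    for _ in range(num_freq_masks):                       # blocks of mel channels
        width = int(rng.integers(0, max_freq_width + 1))
        start = int(rng.integers(0, max(n_mels - width, 1)))
        spec[start:start + width, :] = fill

    for _ in range(num_time_masks):                       # blocks of time steps
        width = int(rng.integers(0, max_time_width + 1))
        start = int(rng.integers(0, max(n_frames - width, 1)))
        spec[:, start:start + width] = fill

    return spec
```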

To test SpecAugment, we performed some experiments with the LibriSpeech dataset, where we took three Listen Attend and Spell (LAS) networks, end-to-end networks commonly used for speech recognition, and compared the test performance between networks trained with and without augmentation. The performance of an ASR network is measured by the Word Error Rate (WER) of the transcript produced by the network against the target transcript. Here, all hyperparameters were kept the same, and only the data fed into the network was altered. We found that SpecAugment improves network performance without any additional adjustments to the network or training parameters.

Performance of networks on the test sets of LibriSpeech with and without augmentation. The LibriSpeech test set is divided into two portions, test-clean and test-other, the latter of which contains noisier audio data.
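
Since all of these comparisons are reported as WER, a compact reference implementation may be useful: WER is the word-level edit distance (substitutions, insertions and deletions) between the hypothesis and the reference, divided by the number of reference words. The sketch below is a straightforward dynamic-programming version for illustration.

```python
def word_error_rate(reference, hypothesis):
    """Word-level edit distance divided by the reference length."""
    ref, hyp = reference.split(), hypothesis.split()
    # dp[i][j] = edit distance between ref[:i] and hyp[:j]
    dp = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        dp[i][0] = i                                   # deletions only
    for j in range(len(hyp) + 1):
        dp[0][j] = j                                   # insertions only
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            dp[i][j] = min(dp[i - 1][j] + 1,           # deletion
                           dp[i][j - 1] + 1,           # insertion
                           dp[i - 1][j - 1] + cost)    # substitution or match
    return dp[len(ref)][len(hyp)] / max(len(ref), 1)

# Example: one substituted word out of four reference words -> WER = 0.25
print(word_error_rate("the cat sat down", "the cat sit down"))
```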

More importantly, SpecAugment prevents the network from over-fitting by giving it deliberately corrupted data. As an example of this, below we show how the WER for the training set and the development (or dev) set evolves through training with and without augmentation. We see that without augmentation, the network achieves near-perfect performance on the training set, while grossly under-performing on both the clean and noisy dev set. On the other hand, with augmentation, the network struggles to perform as well on the training set, but actually shows better performance on the clean dev set, and shows comparable performance on the noisy dev set. This suggests that the network is no longer over-fitting the training data, and that improving training performance would lead to better test performance.

Training, clean (dev-clean) and noisy (dev-other) development set performance with and without augmentation.

State-of-the-Art Results
We can now focus on improving training performance, which can be done by adding more capacity to the networks by making them larger. By doing this in conjunction with increasing training time, we were able to get state-of-the-art (SOTA) results on the tasks LibriSpeech 960h and Switchboard 300h.

Word error rates (%) for state-of-the-art results on the tasks LibriSpeech 960h and Switchboard 300h. The test sets for both tasks have a clean (clean/Switchboard) and a noisy (other/CallHome) subset. Previous SOTA results taken from Li et al. (2019), Yang et al. (2018) and Zeyer et al. (2018).

The simple augmentation scheme we have used is surprisingly powerful—we are able to improve the performance of the end-to-end LAS networks so much that it surpasses those of classical ASR models, which traditionally did much better on smaller academic datasets such as LibriSpeech or Switchboard.

Performance of various classes of networks on LibriSpeech and Switchboard tasks. The performance of LAS models is compared to classical (e.g., HMM) and other end-to-end models (e.g., CTC/ASG) over time.

Language Models
Language models (LMs), which are trained on a bigger corpus of text-only data, have played a significant role in improving the performance of an ASR network by leveraging information learned from text. However, LMs typically need to be trained separately from the ASR network, and can be very large in memory, making them hard to fit on a small device, such as a phone. An unexpected outcome of our research was that models trained with SpecAugment outperformed all prior methods even without the aid of a language model. While our networks still benefit from adding an LM, our results are encouraging in that they suggest the possibility of training networks that can be used for practical purposes without the aid of an LM.

Word error rates for LibriSpeech and Switchboard tasks with and without LMs. SpecAugment outperforms previous state-of-the-art even before the inclusion of a language model.
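
For context on how an LM is typically combined with an ASR network, one common recipe is to add a weighted LM log-probability to the network's own score when ranking candidate transcripts. The sketch below shows only that rescoring step; the `asr_log_prob` and `lm_log_prob` callables and the weight are placeholders, and this is not necessarily the fusion scheme used in the paper.

```python
def rescore_with_lm(hypotheses, asr_log_prob, lm_log_prob, lm_weight=0.3):
    """Pick the candidate transcript with the best combined ASR + weighted LM score.

    hypotheses:   list of candidate transcripts (e.g. from beam search)
    asr_log_prob: callable returning the ASR model's log-probability of a transcript
    lm_log_prob:  callable returning the language model's log-probability of a transcript
    lm_weight:    interpolation weight for the LM term, tuned on a dev set
    """
    scored = [(asr_log_prob(h) + lm_weight * lm_log_prob(h), h) for h in hypotheses]
    return max(scored, key=lambda pair: pair[0])[1]
```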

Most of the work on ASR in the past has been focused on looking for better networks to train. Our work demonstrates that looking for better ways to train networks is a promising alternative direction of research.

Acknowledgements
We would like to thank the co-authors of our paper Chung-Cheng Chiu, Ekin Dogus Cubuk, Quoc Le, Yu Zhang and Barret Zoph. We also thank Yuan Cao, Ciprian Chelba, Kazuki Irie, Ye Jia, Anjuli Kannan, Patrick Nguyen, Vijay Peddinti, Rohit Prabhavalkar, Yonghui Wu and Shuyuan Zhang for useful discussions.


Show Review

Busy industry week for those on the east coast of North America.  Both the Top Glass event in suburban Toronto and the Mid Atlantic Glass Expo in Maryland had full houses and strong events that attracted a great range of glass professionals.  I was lucky enough to be in Maryland and experience what the fine folks at the Mid Atlantic Glass Association put together.  Good large layout that made it easy to work the room and network.  Great representation of industry companies overall, and a pretty positive attitude towards the market, though the event happened before the latest ABI came out (more on that below).  Of course for me the networking possibilities are the driver, and I got to meet some new people and re-connect with others.  In the new category, it was great to visit with Joe Sennese of Vitro Architectural Glass.  Good guy for sure, and he was working their stand with my old friend Nathan McKenna, who gets congratulations on his latest promotion inside the walls at Vitro! Kudos Nathan!  Also new, but too quick and without enough time to talk, was Trevor Elliott of Kawneer.  I wanted to spend more time with Trevor, but he was the MAN in demand, being the outgoing President of the MGA and all.  Hopefully next show we can catch up more, but glad I put a face with the name.

Meanwhile it was an awesome trip down memory lane for me at the Trulite booth.  I got to see three people that I worked with closely at various times in my professional career but really had not seen any of them since I started Sole Source years ago.  Debbie Lamer and I go very far back, and it was just incredible to see she is the same awesome, focused force she was back in the day.  I worked with her when we were both young pups in the business, and I look and feel like I am 70 years old while she looks 24.  April Oakley and I worked together at Arch; she was one of my favorite customer service reps ever and she was also one of the first people to ever read this blog way back in 2005.  I loved catching up with her, and the fact that her skills and talents are being perfectly utilized by Trulite was a daymaker.  Last but not least, I worked with Ken Passmore when I was at Vitro and he has not changed a bit: still smart, sharp and just an all-around solid guy.  It was a great thrill for me to see and visit with these three…  So of course I am now very fired up about the next two events on my calendar… the Texas Glass Association Conference May 17 in Waco and of course GlassBuild America in September.  Can’t wait to see people and keep the networking and education going.  This week just helped keep those fires burning!

Elsewhere….

–  Speaking of GlassBuild… get ready REGISTRATION opens this week!  Keep an eye out for notification of the open and get registered.  Even bigger… go get your hotel rooms locked down, Atlanta will be busy so get in the hotel blocks sooner than later.  This show will be off the charts… you will not want to miss it.

–  Some sad news though this week on the people side.  John Lang passed away.  John was a fantastic sales professional in our world for years and I worked with him when he was at Arch in Kansas City.  Funny guy- great sense of humor and timing.  He retired from the industry a few years back and traveled the world and enjoyed himself.  I am so sorry to see him go, my thoughts, prayers and condolences to John’s family and friends.

–  Back to the industry world: I teased above that the Architectural Billings Index (ABI) hit a big roadblock this month… the ABI posted its lowest score in 7 years at 47.8**

So a few things to look at… first, we knew that the crazy January score of 55.3 was a bit of an outlier, so the lower scores are corrections for sure.  The labor shortage, as we all know, is already very real, and the analysts pointed to it as a reason the index was underwater.  On the good side, work and future activity are still pointing to a strong year, and economic experts are waving off any thought of a recession at this juncture.  We’ll keep monitoring it all and see if we get a bounce back.
**Note- ABI has changed formulas so the 47.1 posted in 2012 may not be an exact match to this month’s low score, but there’s no conversion chart to use otherwise.

–  On the flip side other metrics were up in March so, again we have a lot of data at play here so it all bears watching closely.  I think those scarred by the recession happen to be more keenly interested in every data point.  (I am one of those for sure)

–  Last this week…  you may have seen this… plans for a big floatable city that is big time sustainable…
LINKS of the WEEK

The college admission scandal continues to enthrall me.  I love the argument now that the celebs didn’t realize they were doing anything wrong…. No way… horrendous excuses abound.
Just click to read the headline.. then move on.  Wild.
Sorry this guy is not as good as El Pres when it comes to Pizza Reviews.  Get the one bite app…
VIDEO of the WEEK

With GlassBuild registration about to open… time to look back at last year’s event… check it out!


Hot Toys Avengers: Endgame 1/6th scale Hawkeye / Ronin 12-inch Collectible Figure (Deluxe Version)

“Under different circumstances, this would be a kind of awesome.”

The fourth installment in the Avengers saga will be the culmination of 22 interconnected films and will draw audiences to witness the turning points of this epic journey. Our beloved heroes will truly understand how fragile this reality is, and the sacrifices that must be made to uphold it. As one of the founding members of the Avengers, Hawkeye was glaringly absent from Avengers: Infinity War. Returning to the silver screen this time with his primal weapons, the master marksman has to fight this war for lives.

To celebrate the kick-off of the Avengers: Endgame exhibition powered by Hot Toys in Hong Kong tomorrow, we are very excited to present fans with the latest 1/6th scale Hawkeye collectible figure (Deluxe Version), which takes direct inspiration from the movie.

Meticulously crafted based on the image of Jeremy Renner as Hawkeye/Clint Barton in Avengers: Endgame, the screen-accurate collectible figure features a newly developed head sculpt with remarkable likeness, finely elaborated outfit, his signature bow and arrows, a dagger, a number of shurikens and a specially designed movie-themed figure stand.

The biggest highlight of this Deluxe Version is none other than the fact that Hawkeye’s Alternate Version outfit is skillfully translated onto a 1/6th scale collectible figure and exclusively includes an array of interchangeable parts such as a newly developed masked head sculpt, vest, arm guards, katana with sheath and a number of gloved hands for fans to display the figure in a variety of action poses!

Scroll down to see all the pictures.
Click on them for bigger and better views.

Hot Toys MMS532 Avengers: Endgame 1/6th scale Clint Barton/Hawkeye/Ronin Collectible Figure (Deluxe Version) specially features: newly developed head sculpt with authentic and detailed likeness of Jeremy Renner as Hawkeye from Avengers: Endgame | newly developed masked head sculpt with authentic and detailed likeness of Ronin*** | Movie-accurate facial features with detailed wrinkles and skin texture | Approximately 30 cm tall | Body with over 30 points of articulation | Six (6) pieces of interchangeable hands for Hawkeye including: pair of fists, pair of open hands, right hand for holding bow, left hand for holding arrow | Six (6) pieces of black-colored interchangeable gloved hands for Ronin including***: pair of fists***, pair of open hands***, pair of hands for holding dagger***

Costume: black colored interchangeable hooded vest with gold trims***, black colored interchangeable arm guards with gold trims***, black colored vest, perfectly tailored black colored long sleeves top, black colored pants with knee guards, black colored boots, black colored utility belt on waist, cross-body shoulder belt, arm guards

Weapons: katana***, black colored bow, Ten (10) individual arrows and twelve (12) interchangeable arrowheads of different styles, dagger, Two (2) shurikens in opened mode, Two (2) shurikens in closed mode

Accessories: katana sheath (attachable to cross-body belt)***, arrow quiver (attachable to cross-body belt), interchangeable character name tag***, Movie-theme figure stand with character name and interchangeable graphic card

*** Exclusive to Deluxe Version

Release date: Approximately Q2 – Q3, 2020