Google Details Tensor Chip Powers

In January’s special Top Tech 2017 issue, I wrote about various efforts to produce custom hardware tailored for deep-learning calculations. Prime among those is Google’s Tensor Processing Unit, or TPU, which Google has deployed in its data centers since early 2015.

In that article, I speculated that the TPU was likely designed for performing what are called “inference” calculations. That is, it’s built to quickly and efficiently compute whatever the neural network it’s running was created to do. But that neural network must also be “trained,” meaning that its many parameters are tuned to carry out the desired task. Training a neural network normally places different demands on the hardware: in particular, training often requires higher-precision arithmetic than inference does.

Yesterday, Google released a fairly detailed description of the TPU and its performance relative to CPUs and GPUs. I was happy to see that the surmise I had made in January was correct: The TPU is built for inference, with hardware that operates on 8-bit integers rather than higher-precision floating-point numbers.
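
To make the distinction concrete, here is a minimal sketch in Python (using NumPy) of the general idea behind 8-bit inference arithmetic: the trained float32 weights and activations are mapped onto 8-bit integers, the matrix multiply runs in integer arithmetic, and the result is rescaled afterward. This is only an illustration of low-precision inference in principle, not a description of Google’s actual hardware or software; the layer sizes, scale scheme, and variable names are made up for the example.

```python
import numpy as np

def quantize_int8(x):
    """Map a float32 array onto int8 using a single symmetric scale factor."""
    max_abs = np.max(np.abs(x))
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    q = np.clip(np.round(x / scale), -128, 127).astype(np.int8)
    return q, scale

# Hypothetical layer: float32 activations and already-trained weights.
rng = np.random.default_rng(0)
activations = rng.standard_normal((1, 256)).astype(np.float32)
weights = rng.standard_normal((256, 64)).astype(np.float32)

# Quantize both operands to 8-bit integers.
a_q, a_scale = quantize_int8(activations)
w_q, w_scale = quantize_int8(weights)

# Multiply in integer arithmetic, accumulating in int32 to avoid overflow,
# then rescale ("dequantize") the result back to floating point.
acc = a_q.astype(np.int32) @ w_q.astype(np.int32)
output_int8_path = acc.astype(np.float32) * (a_scale * w_scale)

# Compare against the full float32 result: close, but not identical,
# which is why training typically sticks with higher precision.
output_fp32_path = activations @ weights
print(np.max(np.abs(output_int8_path - output_fp32_path)))
```

The small discrepancy printed at the end captures the trade-off: for inference, that loss of precision is usually acceptable in exchange for cheaper, faster arithmetic, whereas the tiny gradient updates made during training are much less tolerant of it.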