relu: rectified linear unit
m: number of training examples
rectified: taking a max with 0
ads, housing: standard neural net applications (structured data)
autonomous driving: hybrid neural net application (structured and unstructured data)
unstructured data applications: natural language translation, photo tagging, and speech recognition
long short-term memory (LSTM) applications: grammar learning, handwriting recognition, music composition, and speech recognition
preferred net type for unlabeled data: autoencoder or restricted Boltzmann machine (RBM)
recurrent neural net applications: natural language translation and speech recognition (one-dimensional data, e.g. with a time component)
recursive neural tensor net (RNTN) applications: hierarchical data, image parsing, text processing, object recognition
convolutional neural net (CNN) and deep belief net (DBN) applications: image recognition, object recognition, machine vision
deep belief net (DBN) applications: classification
error metric: count of incorrect classifications / total classifications (see the metrics sketch after this list)
recall metric: true positives / total actual positives
precision metric: true positives / total positive predictions
parallel processing: hardware parallelism
parallel programming: software parallelism
power consumption: ASIC < FPGA < GPU
TPU: Google Tensor Processing Unit
cross entropy: function commonly used in the output layer for multiclass classification (see the cross-entropy sketch after this list)
growing: start with a small net, increase its size until cost is no longer impacted
pruning: start with a large net, decrease its size until cost is impacted
regularizer: start with a large net; used to prevent overfitting
gating units: GRU and LSTM
sigmoid range: 0.0 to 1.0; relu range: 0.0 to infinity (see the activation sketch after this list)
tanh range: -1.0 to 1.0
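The three classification metrics above reduce to ratios over a confusion matrix. A minimal Python sketch with illustrative counts (the variable names and numbers are assumptions for this example, not from the deck):

    # counts from a hypothetical binary classifier's confusion matrix
    tp, fp, tn, fn = 40, 10, 45, 5
    total = tp + fp + tn + fn

    error = (fp + fn) / total    # incorrect classifications / total classifications
    recall = tp / (tp + fn)      # true positives / total actual positives
    precision = tp / (tp + fp)   # true positives / total positive predictions

    print(f"error={error:.2f} recall={recall:.2f} precision={precision:.2f}")
    # error=0.15 recall=0.89 precision=0.80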
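For the cross-entropy entry, a sketch of how the function is typically applied after a softmax output layer in multiclass classification; the logits are made-up values for illustration:

    import numpy as np

    def softmax(z):
        # numerically stable softmax: shift by the max before exponentiating
        e = np.exp(z - np.max(z))
        return e / np.sum(e)

    def cross_entropy(probs, true_class):
        # multiclass cross-entropy: negative log-probability of the true class
        return -np.log(probs[true_class])

    logits = np.array([2.0, 1.0, 0.1])  # raw output-layer scores (illustrative)
    probs = softmax(logits)
    print(probs)                    # approx [0.659 0.242 0.099]
    print(cross_entropy(probs, 0))  # approx 0.417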
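And a sketch confirming the activation ranges listed at the end of the deck (relu is unbounded above; sigmoid and tanh are squashing functions):

    import numpy as np

    def relu(x):
        # rectified linear unit: max with 0, range [0, inf)
        return np.maximum(0.0, x)

    def sigmoid(x):
        # range (0.0, 1.0)
        return 1.0 / (1.0 + np.exp(-x))

    x = np.array([-2.0, 0.0, 2.0])
    print(relu(x))     # [0. 0. 2.]
    print(sigmoid(x))  # approx [0.119 0.5 0.881]
    print(np.tanh(x))  # range (-1.0, 1.0); approx [-0.964 0. 0.964]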