This Letter reports the benefits of decomposing the multilayer perceptron (MLP) for pattern recognition tasks. Suppose there are N classes: instead of employing one MLP with N outputs, N MLPs are used, each with a single output. In practice, this allows fewer hidden units than the single MLP would require. Furthermore, decomposing the problem in this way allows convergence in fewer iterations, and it becomes straightforward to distribute the training over as many workstations as there are pattern classes. The speedup is then linear in the number of pattern classes, assuming there are at least as many processors as classes; if there are more classes than processors, the speedup is linear in the number of processors. On a difficult handwritten OCR problem, the results obtained with the decomposed MLP are slightly superior to those of the conventional MLP, and are obtained in a fraction of the time.
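A minimal sketch of the one-network-per-class decomposition described above, assuming a plain one-hidden-layer network trained by backpropagation on a class-versus-rest target (the Letter's exact architecture, training rule, and data are not reproduced here; all names and hyperparameters below are illustrative): each class gets its own independently trained single-output MLP, and prediction takes the class whose network responds most strongly.

```python
import numpy as np

def train_single_output_mlp(X, y, hidden=4, epochs=500, lr=0.5, seed=0):
    """Train one small MLP with a single sigmoid output on a binary
    (class-vs-rest) target. Hypothetical helper, not the Letter's code."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    W1 = rng.normal(0.0, 0.5, (d, hidden)); b1 = np.zeros(hidden)
    W2 = rng.normal(0.0, 0.5, (hidden, 1)); b2 = np.zeros(1)
    sig = lambda z: 1.0 / (1.0 + np.exp(-z))
    y = y.reshape(-1, 1)
    for _ in range(epochs):
        h = sig(X @ W1 + b1)          # hidden activations
        out = sig(h @ W2 + b2)        # single sigmoid output
        d_out = out - y               # cross-entropy gradient at the output
        d_h = (d_out @ W2.T) * h * (1.0 - h)
        W2 -= lr * h.T @ d_out / n;  b2 -= lr * d_out.mean(0)
        W1 -= lr * X.T @ d_h / n;    b1 -= lr * d_h.mean(0)
    return W1, b1, W2, b2

def predict_decomposed(models, X):
    """Classify by the class whose one-vs-rest MLP scores highest."""
    sig = lambda z: 1.0 / (1.0 + np.exp(-z))
    scores = [sig(sig(X @ W1 + b1) @ W2 + b2).ravel()
              for (W1, b1, W2, b2) in models]
    return np.argmax(np.stack(scores, axis=1), axis=1)

# Toy 3-class problem: tight clusters at three well-separated centres.
rng = np.random.default_rng(1)
centers = np.array([[0.0, 0.0], [1.0, 0.0], [0.5, 1.0]])
X = np.vstack([c + 0.05 * rng.normal(size=(30, 2)) for c in centers])
labels = np.repeat(np.arange(3), 30)

# One independent single-output MLP per class; each call shares no state
# with the others, so the N trainings could run on N workstations.
models = [train_single_output_mlp(X, (labels == k).astype(float), seed=k)
          for k in range(3)]
acc = (predict_decomposed(models, X) == labels).mean()
```

Because the N trainings are fully independent, distributing them is trivial (e.g. one process per class), which is the source of the linear speedup the Letter reports.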