This paper leverages free-form textual responses from a key manufacturing survey to create sentiment indexes that mirror categorical measures from the same survey and also contain predictive content, both in-sample and out-of-sample, for manufacturing output. We use textual data from the Institute for Supply Management to compare sentiment metrics based on dictionary and deep learning natural language processing methods. The best-performing sentiment measures classify comments using fine-tuned deep learning models. To add interpretability, we apply Shapley decompositions to show that a relatively small number of words, associated with very positive and very negative sentiment, account for much of the variation in the aggregate sentiment index.