This paper examines the Random Utility Model (RUM) in repeated stochastic choice settings where decision‐makers lack full information about payoffs. We propose a gradient‐based learning algorithm that embeds RUM into an online decision‐making framework. Our analysis establishes Hannan consistency for a broad class of RUMs, meaning the average regret relative to the best fixed action in hindsight vanishes over time. We also show that our algorithm is equivalent to the Follow‐The‐Regularized‐Leader method, offering an economically grounded approach to online optimization. Applications include modeling recency bias and characterizing coarse correlated equilibria in normal‐form games.
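The connection claimed above can be illustrated with a minimal sketch (not the paper's actual algorithm): FTRL with an entropic regularizer selects actions via a softmax over cumulative payoffs, which coincides with the logit RUM (i.i.d. Gumbel utility shocks), and its average regret against the best fixed action shrinks as the horizon grows. All parameter choices below (5 actions, horizon 2000, the standard learning rate) are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
n_actions, T = 5, 2000                      # illustrative problem size
eta = np.sqrt(np.log(n_actions) / T)        # standard FTRL learning rate

losses = rng.uniform(size=(T, n_actions))   # a fixed sequence of loss vectors
cum_loss = np.zeros(n_actions)
alg_loss = 0.0

for t in range(T):
    # FTRL with entropic regularizer: softmax over cumulative payoffs.
    # This choice rule is exactly the logit RUM (Gumbel-perturbed utilities).
    w = np.exp(-eta * cum_loss)
    p = w / w.sum()
    alg_loss += p @ losses[t]               # expected loss of the mixed choice
    cum_loss += losses[t]

best_fixed = cum_loss.min()                 # best fixed action in hindsight
avg_regret = (alg_loss - best_fixed) / T    # should be small, -> 0 as T grows
```

Hannan consistency here means `avg_regret` tends to zero as `T` increases; the sketch only demonstrates that it is already small at a modest horizon.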