Integrating multimodal data and machine learning for entrepreneurship research
Yash Raj Shrestha & Vivianna Fang He
Abstract
Research Summary
Extant research in neuroscience suggests that human perception is multimodal in nature: we model the world by integrating diverse data sources such as sound, images, taste, and smell. Working in dynamic environments, entrepreneurs are likewise expected to draw on multimodal inputs in their decision making. However, extant research in entrepreneurship has largely focused on how entrepreneurs or investors develop insights from data in a single mode. The few studies that have used a multimodal approach either simplify the multimodal data (MMD) into a handful of constructs or analyze the data manually without fully utilizing their potential. Such oversimplification limits the insights that can be gained from MMD. In this paper, we offer a framework to guide researchers in analyzing and integrating MMD, capturing the various cues embedded in the entrepreneurial process. We illustrate how applying machine learning algorithms to MMD can yield a robust, reliable, and scalable approach for capturing elusive yet critical aspects of entrepreneurial phenomena. We also curate a set of data and algorithm resources for researchers interested in leveraging MMD in their studies.

Managerial Summary
Entrepreneurs operate in fast-paced and complex environments where success often relies on the ability to make sense of diverse and rich information, ranging from explicit observations (e.g., what they see and hear) to more subtle contextual cues. Yet most entrepreneurship research analyzes data in a single mode, such as only texts or only numbers. Our research highlights the importance of embracing multimodal data (MMD), which combines formats like audio, image, video, and text, to better understand and explain entrepreneurial decision making. We introduce a practical framework and a set of machine learning techniques that help managers and researchers alike harness the richness of MMD.

Rather than simplifying or manually analyzing multimodal data, our approach allows for scalable, systematic, and reliable insights into the entrepreneurial process. For practitioners, this means better tools for evaluating pitches, tracking team dynamics, or sensing market trends in real time. To support adoption, we also provide a curated set of MMD sources and algorithms that organizations can leverage to make more informed strategic decisions.