Building Bridges Between Computational Methods and Human Translation: An English to Brazilian Portuguese Application of Machine Translation in the Cross-Cultural Adaptation of Psychological and Health-Related Assessments
MAICON RODRIGUES ALBUQUERQUE et al.
Abstract
The present study evaluated the effectiveness of machine translation (MT) in both forward translation (English to Brazilian Portuguese) and back-translation (Brazilian Portuguese to English) of psychological and health-related assessments. Translation quality was assessed using the COMET (Crosslingual Optimized Metric for Evaluation of Translation) metric, and statistical modeling was performed using Generalized Estimating Equations (GEE). In forward translation, COMET scores from DeepL (β = 0.0020, p = 0.667), OpenAI (β = 0.0041, p = 0.256), and Widn.AI (β = 0.0027, p = 0.505) showed no statistically significant differences from human outputs, whereas Azure (β = −0.0143, p = 0.024) scored significantly lower. W-ADL and SCOFF showed lower scores, often below the 0.940 threshold, suggesting greater cultural adaptation demands. In back-translation, DeepL (β = −0.000075, p = 0.965), OpenAI (β = −0.0002, p = 0.883), and Widn.AI (β = −0.0047, p = 0.227) did not differ significantly from human performance, but Azure again underperformed (β = −0.0103, p = 0.013). Lower COMET scores were observed for SPAI, PSDQ, BIS-11, W-ADL, and SCOFF compared to DII (all p < 0.05). Despite this, the overall quality of back-translation remained high. Overall, COMET appears to be a robust metric for evaluating semantic fidelity in both directions, particularly in back-translation, where the original version serves as the reference. These results support the integration of MT into cross-cultural adaptation workflows, not as a replacement for human translators but as a supportive tool that improves translation efficiency while preserving the indispensable role of human expert judgment.