Association between Prediction Error Encoding in Social Reward Learning and Stress Responses: A Computational Modeling Perspective

doi:10.12677/ap.2026.164211


Association between Prediction Error Encoding in Social Reward Learning and Stress Responses: A Computational Modeling Perspective

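Author: Yiqing Ma (马意清): Faculty of Psychology, Southwest University, Chongqing; Key Laboratory of Cognition and Personality (Ministry of Education), Southwest University, Chongqing
Keywords: Social Reward Processing; Reinforcement Learning; Prediction Error; Stress Response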

Abstract: Psychosocial stress is often driven by social-evaluative threat and perceived uncontrollability, yet individuals exposed to similar stressors differ markedly in the intensity of their reactions and in their recovery trajectories. Reward-related processes have been proposed as a potential protective mechanism, but prior work has largely relied on global indices and has rarely localized individual differences to the updating step of learning. From a computational modeling perspective, the present study examined whether prediction-error (PE) processing during reward learning in a social interaction context is associated with stress responses. Eighty Chinese college students completed the Trier Social Stress Test (TSST) and a repeated Trust Game (rTG) in two separate sessions at least 3 days apart. During the TSST, subjective stress, uncertainty, and social-evaluative threat were assessed at multiple time points, and area-under-the-curve indices (AUCg and AUCi) were computed to characterize acute stress dynamics. Electrocardiography was recorded to compute heart-rate variability (HRV) indices. Trial-by-trial rTG choices were fitted with hierarchical Bayesian reinforcement-learning models, and models were compared using LOOIC and WAIC. The valence-specific Rescorla-Wagner model provided the best fit to rTG behavior, indicating asymmetric updating for positive versus negative social outcomes. At the individual-differences level, the negative PE weight (α⁻) was significantly and positively associated with cumulative subjective stress (AUCg) and with the increase in social-evaluative threat (AUCi), whereas the positive PE weight (α⁺) and the inverse temperature (τ) showed no robust associations with subjective stress indices. No significant associations were observed between model parameters and HRV indices. In a supplementary analysis, the social inverse temperature (τ_social) was positively related to the Perceived Stress Scale (PSS) total score and its uncontrollability dimension. These findings suggest that stronger weighting of negative social PEs is more likely to couple with the subjective dynamics of acute social stress, whereas more stable decisional certainty may relate to chronic stress burden over longer time scales, offering mechanistic clues at the level of computational components for understanding the link between social reward learning and stress.
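
The abstract refers to three computational ingredients: the area-under-the-curve indices (AUCg, AUCi), a valence-specific Rescorla-Wagner update with separate weights for positive and negative prediction errors, and an inverse temperature governing choice determinism. The following is a minimal Python sketch of these quantities, assuming the common trapezoidal definitions of AUCg and AUCi and a two-option softmax choice rule. The function names, the outcome coding (1 = reciprocation, 0 = betrayal), and all example numbers are illustrative assumptions and are not taken from the study; the hierarchical Bayesian fitting and the LOOIC/WAIC comparison are not reproduced here.

```python
import math


def auc_ground(values, times):
    """AUCg: trapezoidal area under the ratings curve, measured from zero."""
    return sum(
        (values[i] + values[i + 1]) * (times[i + 1] - times[i]) / 2.0
        for i in range(len(values) - 1)
    )


def auc_increase(values, times):
    """AUCi: area relative to the first (baseline) rating.

    Equals AUCg minus the rectangle spanned by the baseline value
    over the whole observation window.
    """
    return auc_ground(values, times) - values[0] * (times[-1] - times[0])


def rw_update(value, outcome, alpha_pos, alpha_neg):
    """Valence-specific Rescorla-Wagner update.

    The prediction error (outcome - value) is weighted by alpha_pos when
    it is non-negative and by alpha_neg when it is negative.
    """
    pe = outcome - value
    alpha = alpha_pos if pe >= 0 else alpha_neg
    return value + alpha * pe


def choice_prob(v_a, v_b, tau):
    """Softmax probability of choosing option A over option B.

    tau is the inverse temperature: larger values make choices
    follow the value difference more deterministically.
    """
    return 1.0 / (1.0 + math.exp(-tau * (v_a - v_b)))


if __name__ == "__main__":
    # Hypothetical stress ratings at 0, 10, 20 and 30 minutes.
    ratings = [2, 6, 5, 3]
    minutes = [0, 10, 20, 30]
    print(auc_ground(ratings, minutes))    # 135.0
    print(auc_increase(ratings, minutes))  # 75.0

    # Hypothetical learner who weights negative PEs more than positive ones.
    v = 0.5
    v_after_betrayal = rw_update(v, 0.0, alpha_pos=0.2, alpha_neg=0.6)
    v_after_reciprocation = rw_update(v, 1.0, alpha_pos=0.2, alpha_neg=0.6)
    print(v_after_betrayal, v_after_reciprocation)
```

In this sketch, a learner with alpha_neg larger than alpha_pos lowers its value estimate after a betrayal by more than it raises the estimate after an equally surprising reciprocation, which is the kind of asymmetry that the α⁻ and α⁺ parameters are meant to capture.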

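Cite this article: Ma, Y. (2026). Association between Prediction Error Encoding in Social Reward Learning and Stress Responses: A Computational Modeling Perspective. Advances in Psychology (心理学进展), 16(4), 379-391. https://doi.org/10.12677/ap.2026.164211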

