
[Paper Review] Doing Mathematics with AI - Mathematical research with GPT-5: a Malliavin-Stein experiment

by muha0-0 2025. 9. 7.

[Original]
Mathematical research with GPT-5: a Malliavin-Stein experiment (2025/09/03)
[Summary]

  • GPT-5 often misses or misunderstands the key point of a theory or formula.
  • GPT-5 makes many errors and misjudgments when applying several theories to concrete cases (numbers, calculations), and unless a human meticulously checks its accuracy, it does not catch these mistakes on its own. This can actually slow the human's work down.
  • GPT-5 immediately agrees whenever a human points out a mistake, so it has no capacity to point out a researcher's errors.
  • These frequent mistakes by GPT-5 can actually slow down one's thinking. Furthermore, as more researchers delegate their thinking to AI, the volume of low-quality research may grow, burying the papers that truly deserve attention.

[My Thoughts]

  • Relying solely on AI for important tasks produces results, and personal growth, on the level of Ctrl+C, Ctrl+V-ing Namuwiki into a report (that is, possibly no growth at all).
  • AI makes many errors in tasks where accuracy matters and in applying theory to practice.
  • Yet AI rarely criticizes the prompter (the human); it flatters instead ("since by alignment it usually agrees with us").
  • Over-relying on AI not only erodes my critical thinking but can also trap me in my own loop of thought, reinforcing cognitive distortions.
  • Instead of asking AI to summarize papers, I should probably read them myself... ^_^
  • Working with AI is probably like working with a docile new employee who does as told and flatters you to keep you in a good mood.

 
[Quotations]

To summarize, we can say that the role played by the AI was essentially that of an executor, responding to our successive prompts. Without us, it would have made a damaging error in the Gaussian case, and it would not have provided the most interesting result in the Poisson case, overlooking an essential property of covariance, which was in fact easily deducible from the results contained in the document we had provided.

 

Overall, the experience of doing mathematics with GPT-5 was mixed. It felt very similar to working with a junior assistant at the beginning of a new project: exploring directions, formulating hypotheses, searching for counterexamples, and progressively adjusting statements.

 

At first glance, this might appear useful for an exploratory phase, helping us save time. In practice, however, it was quite the opposite: we had to carefully verify everything produced by the AI and constantly guide it so that it could correct its mistakes.

 

Traditionally, when PhD students begin their dissertation, they are given a problem that is accessible but rich enough to help them become familiar with the tools, develop intuition, and learn to recognize what works and what does not. They typically read several papers, explore how a theory could be adapted, make mistakes, and eventually find their own path. This process, with all its difficulties, is part of what makes them independent researchers. If students rely too heavily on AI systems that can immediately generate technically correct but shallow arguments, they may lose essential opportunities to develop these fundamental skills. The danger is not only a loss of originality, but also a weakening of the very process of becoming a mathematician.

 

However, the formula was still incorrect, and the accompanying explanation was also wrong. We then pointed out the error more precisely: I think you are mistaken in claiming that (p+q)! ‖u ⊗̃ v‖² = p! q! ‖u‖² ‖v‖². Why should that be the case? It eventually admitted (which is not surprising, since by alignment it usually agrees with us) that the statement was false, but more importantly, it understood where the mistake came from.
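The claim GPT-5 made here is easy to refute numerically. The sketch below is my own, not from the paper: for the simplest case p = q = 1 with u = v, the symmetric tensor product reduces to u ⊗ u, and the two sides of the claimed identity differ by a factor of two.

```python
import numpy as np
from math import factorial

# p = q = 1; take u = v, a simple case where the claimed identity breaks
u = np.array([1.0, 2.0, 3.0])
v = u.copy()

# Symmetric tensor product of first-order kernels: u ⊗̃ v = (u⊗v + v⊗u) / 2
sym = (np.outer(u, v) + np.outer(v, u)) / 2

p, q = 1, 1
lhs = factorial(p + q) * np.sum(sym ** 2)              # (p+q)! ‖u ⊗̃ v‖²
rhs = factorial(p) * factorial(q) * (u @ u) * (v @ v)  # p! q! ‖u‖² ‖v‖²

print(lhs, rhs)  # 392.0 196.0 — LHS is twice the RHS, so the identity is false
```

For p = q = 1 the correct identity is 2 ‖u ⊗̃ v‖² = ‖u‖² ‖v‖² + ⟨u, v⟩², so equality with p! q! ‖u‖² ‖v‖² holds only when u and v are orthogonal.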