
Does ChatGPT Solve Math Problems Correctly? [We Tried It]

ChatGPT has been all the rage since late 2022 and has been put to many uses, such as creative writing, essays, resume reviews, and research. If you’re a student or just someone who applies mathematics daily, you’ve probably tried to get it to do some math as well.

Can you trust its solutions to math problems, though? We were curious too, and went to some lengths to find out how accurate ChatGPT’s math answers are.

ChatGPT solves basic math problems correctly, but you cannot fully rely on it for more complex ones. The bot’s knowledge of equations, formulas, and other math frameworks is impressive. However, you still need to fact-check the answers it provides.

Math comes in different levels of difficulty, so we go into more detail in the rest of this article. Read on to find out more.

Are Math Answers From ChatGPT Reliable?

The developers of ChatGPT announced that its mathematical capabilities had been improved in an update on 30 January 2023. Before then, users had mixed opinions about the tool’s math-solving abilities. While ChatGPT could do elementary math to an extent, it struggled to count as few as four items.

It would be easy to conclude that if ChatGPT cannot solve elementary math, then its math abilities are simply not there. It is then interesting that it can solve math problems that are more advanced than this.

In our tests, it correctly solved a diverse range of Grade 3 math problems, including basic arithmetic and decimals, where more than two operations are required. It does, however, appear to struggle with fractions.

Screenshot of Math Problem Result From ChatGPT

In the example shown, ChatGPT knew what needed to be done, namely dividing the amount of dye used the previous day by the amount of dye used for one pair of pants. It nonetheless got the actual calculation wrong and ended up with an inaccurate answer: 20 instead of 4.

A glimmer of hope may be found in the report that ChatGPT-4 performs 30% better than the free ChatGPT-3.5 version in the math section of the SAT.

Even so, neither version is 100% accurate. ChatGPT-4 scores better than 93-94% of SAT takers and ChatGPT-3.5 better than 75%. ChatGPT-4 scored 730 out of 800, performing poorly in the inequalities and conversion sections, while ChatGPT-3.5 scored 600.

So, while ChatGPT’s math prowess is impressive, its answers are not always accurate. Just as we did, you may find yourself giving it more than one prompt to clarify the question or to get it to re-evaluate the question.

Can ChatGPT Do Calculus and Geometry?

A simple way to think of calculus is to see it as the mathematical study of change. It studies and describes complex physical events like movement, change, and optimization in mathematical terms. 

We tested ChatGPT on some calculus questions, particularly differentiation formulas, and it did perfectly well, solving all four questions accurately.

Before you get too excited, other users have reported that ChatGPT performed woefully on a calculus exam and, in another instance, needed guidance to arrive at the correct solution. In both instances, the solving process was correct but the final answer was not. So its calculus ability is still sketchy.
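Since ChatGPT’s working can look right while its final answer is wrong, one way to fact-check a differentiation answer is to compare it against a numerical estimate. The sketch below is only an illustration of that idea. The function x^3 + 2x and both candidate answers are hypothetical examples, not the questions from our test, and the helper names are our own.

```python
import math


def numerical_derivative(f, x, h=1e-5):
    """Central-difference estimate of f'(x)."""
    return (f(x + h) - f(x - h)) / (2 * h)


def derivative_matches(f, claimed_derivative, points, tol=1e-4):
    """Return True if the claimed derivative agrees with the
    numerical estimate at every test point (within tol)."""
    return all(
        math.isclose(
            numerical_derivative(f, x),
            claimed_derivative(x),
            rel_tol=tol,
            abs_tol=tol,
        )
        for x in points
    )


# Hypothetical example: f(x) = x^3 + 2x, whose derivative is 3x^2 + 2.
def f(x):
    return x**3 + 2 * x


def correct_answer(x):
    return 3 * x**2 + 2


def wrong_answer(x):
    # A plausible slip: the constant 2 is mistakenly written as 2x.
    return 3 * x**2 + 2 * x


# Several test points, so a wrong answer cannot pass by coincidence.
points = [-2.0, -0.5, 0.5, 1.5, 3.0]

print(derivative_matches(f, correct_answer, points))  # True
print(derivative_matches(f, wrong_answer, points))    # False
```

A check like this does not prove an answer is right, but a mismatch at even one point is a strong sign that the final answer needs another look.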

Just as calculus studies change, geometry studies shapes and objects. It is the branch of math concerned with their properties and measurements.

In getting ChatGPT to solve geometry problems, you are faced with two challenges. Firstly, since it is text-based, the user cannot feed images to it. For this reason, geometry is sometimes excluded in testing ChatGPT’s math abilities.

You thus have to describe the object being studied well enough for ChatGPT to visualize it and answer questions based on that visualization. Another challenge then arises: ChatGPT simply cannot visualize geometrical objects. We presented some geometry problems to it, and it either got them wrong or asked for clarification of the question.

It is thus unsurprising that the free version scored zero in the geometry section of the SAT. The paid version, ChatGPT-4, however, scored 100 in the same section. It is thus likely that the developers have improved ChatGPT’s geometry capabilities in version 4.

Can ChatGPT Solve Word Problems?

If there is one area of math the average person would reasonably expect ChatGPT to excel in, it’s word problems. It is, after all, a chatbot at its most basic.

Surprisingly, however, it had an abysmal accuracy rate of around 13% when tested on 1000 math word problems in early January 2023. The accuracy remained the same after the January 30 update.

The game-changer was removing the part of the prompt that told ChatGPT not to show its workings. Another test that allowed it to show its workings resulted in 51% accuracy.

It is not clear why showing its workings made that much of a difference. Nonetheless, it is something to bear in mind when giving prompts to ChatGPT.

We gave ChatGPT one of those word problems again: “One whole number is three times a second. If 20 is added to the smaller number, the result is 6 more than the larger.” It was not altogether surprising that it got the problem wrong, since it had also failed it in February 2023. What was confusing was that its wrong solution differed from the one it generated in February.

ChatGPT Result to Word Math Problem

Despite getting a general idea of what needed to be done, ChatGPT mixed up the assigned variables and had to be corrected by the user twice before it figured out what it got wrong. Clearly, relying on ChatGPT for solutions to math word problems might not be such a good idea.
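For reference, the word problem quoted above has a short solution. Calling the smaller number x, the larger is 3x, and the second sentence gives x + 20 = 3x + 6, so x = 7 and the larger number is 21. The Python sketch below is our own illustration, with function names we made up. It solves the equation directly and then confirms by brute force that no other whole number up to an arbitrary search limit fits.

```python
def solve_word_problem():
    """Solve: one whole number is three times a second; adding 20 to
    the smaller gives 6 more than the larger.

    With x as the smaller number and 3x as the larger:
        x + 20 = 3x + 6   =>   14 = 2x   =>   x = 7
    """
    smaller = (20 - 6) // (3 - 1)
    larger = 3 * smaller
    return smaller, larger


def satisfies_conditions(smaller, larger):
    """Check both sentences of the word problem."""
    return larger == 3 * smaller and smaller + 20 == larger + 6


def brute_force(limit=1000):
    """Try every whole number below limit as the smaller number."""
    return [(s, 3 * s) for s in range(limit) if satisfies_conditions(s, 3 * s)]


print(solve_word_problem())  # (7, 21)
print(brute_force())         # [(7, 21)]
```

Checking a proposed answer against the original sentences, as `satisfies_conditions` does, is a quick way to catch the kind of mixed-up variables described above.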


In What Ways Is ChatGPT Beneficial in Learning Math?

It is clear that ChatGPT cannot be relied on for accuracy in solving math problems. It is thus a long way off from being an AI math tutor. Aside from its low accuracy, it tends to change its answers when corrected, even if the user is wrong.

This does not mean, however, that it is useless in learning math. ChatGPT’s weakness seems to lie not in a lack of knowledge of mathematical principles but in a failure to apply that knowledge accurately to the problem at hand. It is, in fact, a great knowledge base.

This means that ChatGPT can help with math knowledge like theorems and formulas. It is also a good tool to practice math with, as it can generate exercises from the data it was trained on.

ChatGPT Generates Math Problem For Grade 3 Student

It also explains its steps while solving problems, and its easy-to-read, step-by-step workings are an excellent resource for revising taught material. However, it is advisable that advanced learners rather than beginners use it this way, because its workings or its eventual answer can be inaccurate.

Where teachers do choose to use ChatGPT in this way, they should fact-check its output and use it alongside other existing educational tools.

What Are the Drawbacks of Using ChatGPT to Solve Math Problems?

ChatGPT’s math ability is certainly promising. Its propensity for error cannot, however, be overlooked especially in a field like math that prizes exactness. While it can still be used to solve math problems, this limit should be borne in mind. 

For instance, it still struggles with problems that involve multiple operations. Another area where it performs poorly is making multistep logical inferences. That skill is essential for solving word problems, as you first have to translate the words into a math equation.

Exactness and consistency are required in solving mathematical problems and these are sadly not strong suits of ChatGPT.

Using ChatGPT to Solve Math Problems Is Hit or Miss

While using several prompts and guiding ChatGPT through a math problem might help, its answers are not always accurate. There is certainly a lot to learn from its math knowledge base and its easy-to-follow workings. However, sometimes, the bot will simply bluff its way into giving inaccurate answers.

Matt Davidson

