Can ChatGPT 4o read Construction Docs better than me?

Can ChatGPT 4o read Construction Docs better than me?

I put it to the test with my house CDs without any training or extra data.

Asked it 50 questions about the CDs and overall was underwhelmed with the results.

Before I get into the results some important disclaimers. Do not upload your professional construction documents to ChatGPT.

This is just a personal experiment with construction documents of my own house I drew with the sole purpose of AI personal studies. Uploading directly to ChatGPT is giving OpenAI access to your documents, instead of the secure way of using the APIs.

Ok now to a quick summary of the results:

Open AI’s latest LLM ChatGPT 4o is a remarkable model and its ability to understand images is a game-changer, but I was surprised and disappointed with the amount of incorrect answers it gave. But ChatGPT 4o does deserve admiration. Here’s 3 things that impressed me:

1. The new 4o model can understand construction documents without any specific training, fine-tuning, RAG or any special functions. This test is just on vanilla ChatGPT.

2. The speed of answers even when incorrect is revolutionary. The set is only 32 sheets, but still a ton of information to read in seconds.

3. It was able to find information from one sheet and use it to find the answer on another sheet.

Honestly I expected ChatGPT 4o to do a better job on this experiment. At first, I was really impressed with its ability to understand the drawings, but when I uploaded an entire set the model really struggled to give correct answers.

I tried a few different prompts but that only exaggerated the problems. Here’s 3 things that disappointed me:

1. Hallucinations or when the model generated text that is plausible-sounding but factually incorrect. Earlier models like gpt-3.5-turbo-instuct are much better at this task.

2. Inconsistent answers. Sometimes it would get one question right, then get the same question wrong later. (This surprised me the most of this experiment)

3. Many easy questions stumped it. There were no question type it could always get correct.

Overall ChatGPT 4o cannot read construction documents as well as I can. But there are better ways to use AI to read CDs. It requires more data, RAG and training.

I’ve tested those techniques as well and they are more promising.

I’ll be sharing quick summaries of those results too and a more detailed comparison of how these different strategies compare against each other.