Understanding construction documents can be challenging, even for the most advanced AI models like ChatGPT 4o.
When pasting an entire construction document set into ChatGPT 4o, the model can understand and answer questions about the drawings, but the results are not always great. Too often they are hallucinations or confidently incorrect answers. I expected it to perform better, but maybe that was more my fault than the AI's.
While ChatGPT 4o is a powerful tool for interpreting images, it isn’t specifically designed to understand construction documents. Without any specific training or custom databases to query, the prompt becomes the most important factor for good responses.
To address this, I experimented with 8 different prompt techniques to see which enhanced ChatGPT’s ability to understand construction documents accurately. This article explores my methods and findings.
Importance of Prompt Engineering
Prompts are an essential component of working with Large Language Models, and prompt engineering involves crafting specific instructions to guide the LLM toward clearer, more accurate responses. After all, the AI will only do what you ask it to do, so it's vital to craft a detailed prompt to get the results you want.
There are many prompt frameworks and best practices, so I started to learn more. I saw the same image of prompt frameworks posted on LinkedIn by a few different people. If multiple people are posting the same image, it must be a good place to start, right?
Experiment Setup
I ran experiments using 8 prompt frameworks to determine which provided the best responses to questions about a construction document set. I asked the same 30 questions with each prompt.
I divided these 30 questions into 3 difficulty levels: 10 easy, 10 medium, and 10 hard. Each question had a predefined correct answer, and responses were categorized as correct, mostly right, not answered, or wrong. The final results are simplified, with mostly right responses tallied as correct and not answered tallied as wrong (see the sketch below).
For each prompt I created a new thread, attached the full set of architectural drawings, pasted one of the prompts, then asked 10 questions. I repeated this process for each prompt and difficulty level.
To get better results, I asked each thread only 10 questions and did so immediately after creating the thread. I noticed that asking too many questions, or going back to old threads, would result in worse responses.
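The scoring simplification is easy to misapply when tallying by hand, so here is a minimal sketch of the reduction rule expressed as code. This is not a tool I used during the experiment; the labels and data structure are purely illustrative.

```python
from collections import Counter

# One raw grade per question for a single prompt framework.
# Allowed labels: "correct", "mostly right", "not answered", "wrong".
grades = ["correct", "mostly right", "wrong", "not answered", "correct"]

def simplify(grade: str) -> str:
    """Collapse the four raw labels into the two reported ones."""
    if grade in ("correct", "mostly right"):
        return "correct"
    return "wrong"  # "wrong" and "not answered" both count against the prompt

tally = Counter(simplify(g) for g in grades)
print(tally)  # Counter({'correct': 3, 'wrong': 2})
```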
Disclaimers:
- I do not recommend pasting your professional construction documents into ChatGPT.
- I conducted this experiment with CDs of my own house that I drew solely for personal AI experiments.
- This is a personal study unrelated to my professional job. Everything posted in this article is my personal opinion and does not necessarily represent the views of any other person or party.
Prompts Tested
R-F-T (act as a Role, create a Task, show as Format)
- Act as a contractor. Answer questions about the attached construction documents. Only provide answers to questions you can reference the answer from the drawing. Answer the question concisely, then state what sheet and drawing number the information is referenced from.
T-A-G (define Task, state the Action, clarify the Goal)
- The task is to answer questions about the attached construction document drawings. Act as an expert contractor and find the answers to the user’s questions in the drawings. Goal is to concisely answer the question, and provide a reference to where that information was found in the documents. If the answer cannot be found, do not answer the question and tell the user the response cannot be found.
B-A-B (explain problem Before, state outcome After, ask for the Bridge)
- It takes too long to find information in construction documents. I want to have my questions answered about the attached construction documents with references to where that information is located in the drawings. Find the information in the drawings that would answer my question and explain how you navigated to that information.
C-A-R-E (give the Context, describe the Action, clarify the Result, give the Example)
- I need AI assistance for reading the attached construction documents. Can you assist us by locating the information asked and explaining why you found that information there? Our desired outcome is a concise answer to the question and a concise explanation of how that information was found. An example is, Question: “How big is the master bedroom door?” Response: “The master bedroom door is 36”x84”. The master bedroom was located on the second floor plan and the door tag is numbered 202. Then I looked at the door schedule and found the row for door number 202 then found the size in the Door Size column.”
A-P-E (state the Action, create a Purpose, describe the Exception)
- Search the attached drawings for information to answer my question. Respond with a concise answer and reference where the information was found in the drawings. The goal is to quickly and concisely answer questions about construction documents. I expect concise answers and references for each question when the information can be found in drawings. If information is not found in drawings, respond with, ‘The answer to your question cannot be found in the drawings.’
E-R-A (describe the Exception, act as a Role, state the Action)
- I expect concise answers and references for each question when the information can be found in drawings. If information is not found in drawings, respond with, ‘The answer to your question cannot be found in the drawings.’ You are to act as an experienced contractor specializing in reading construction documents. I want you to find information in the drawings attached that answer my question. Clearly and concisely answer the question and provide a reference to where that information was found.
R-A-C-E (specify the Role, state the Action, give the Context, give an Example)
- You are to act as an experienced contractor specializing in reading construction documents. I want you to find information in the drawings attached that answer my question. Then clearly and concisely answer the question and provide a reference to where that information was found. The construction drawings are of a single family house and only contain the architectural sheets. Information can be found in the drawings to answer my questions. An example is, Question: “How big is the master bedroom door?” Response: “The master bedroom door is 36”x84”. The master bedroom was located on the second floor plan and the door tag is numbered 202. Then I looked at the door schedule and found the row for door number 202 then found the size in the Door Size column.”
R-I-S-E (specify the Role, describe the Input, ask for Steps, describe the Expectation)
- You are to act as an experienced contractor specializing in reading construction documents. I’ve attached a full architectural construction document set of a single family house. Search the drawings for information that can answer my questions. Answer the question clearly and concisely. Provide a Step by Step guide to how that information was found. I expect concise answers and references for each question when the information can be found in drawings. If information is not found in drawings, respond with, ‘The answer to your question cannot be found in the drawings.’
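I pasted each of these prompts into ChatGPT by hand, one fresh thread per prompt. For anyone who wants to script the same comparison, each framework prompt maps naturally onto a system message. The sketch below is an assumption on my part, not what I actually ran: it uses the OpenAI Python SDK and gpt-4o, and it shows only the text side of the exchange, since attaching the drawing set (for example, one image content part per sheet) is a separate step.

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Any of the eight framework prompts, e.g. the C-A-R-E prompt, would go here.
FRAMEWORK_PROMPT = "I need AI assistance for reading the attached construction documents. ..."

def ask(question: str) -> str:
    """Send the framework prompt as the system message and the question as the user message."""
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[
            {"role": "system", "content": FRAMEWORK_PROMPT},
            {"role": "user", "content": question},
            # The drawing sheets would be added to the user message as image
            # content parts; that step is omitted here.
        ],
    )
    return response.choices[0].message.content

print(ask("How big is the master bedroom door?"))
```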
Questions
Easy – Single-Sheet Queries
- These questions can be answered by referencing a single drawing or sheet within the construction documents. They typically involve straightforward dimensions, locations, or specifications that are clearly marked.
- Examples: What is the dimension of the basement bedroom? What sheet and view is the window sill in the brick wall section detail?
Medium – Cross-Sheet Analysis
- These questions require the contractor to reference information from multiple sheets or drawings to determine the correct answer. They involve combining data points from different locations within the documents.
- Examples: What is the size of the master bedroom door? What is the height of the kitchen upper cabinets?
Hard – Multi-Sheet Synthesis
- These questions involve synthesizing information from multiple sheets and sections within the construction documents. They require a comprehensive understanding of the project to aggregate various data points into a single answer.
- Examples: How many windows are on the house? How many square feet of carpet does the house have?
Results
Evaluating the results
- The C-A-R-E prompt framework performed the best and was the only prompt to produce more correct answers than incorrect ones
- The different prompts produced responses of varying length, because each prompt instructs the LLM to think differently and approach the problem in a different way
- All prompt frameworks struggled on the hard questions
- Three of the eight prompt frameworks did not have a drop in performance between easy and medium difficulty questions
- Performance was fairly consistent across prompts, with an average of 14.5 correct answers out of 30
- ChatGPT 4o is good at answering easy and medium level questions, but needs special training to answer hard questions
Takeaways
This experiment showed how different prompts can improve model performance. At a high level, it seems:
- Different prompts can significantly influence model performance.
- ChatGPT 4o provided more incorrect answers than correct, even with different prompting techniques.
- Prompt engineering is crucial for improving response accuracy in construction document reading.
- There is potential to use varied responses from different prompts in a loop for better accuracy (a sketch of this idea follows below).
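To make that last point concrete, here is a hypothetical sketch of the loop idea: ask the same question through several framework prompts and keep the answer most of them agree on. I have not tested this; ask_with_prompt is a placeholder for whatever function actually queries the model (such as the API call sketched earlier), and real answers would need to be normalized before votes could match exactly.

```python
from collections import Counter

# The eight framework prompts, keyed by name (texts shortened here).
PROMPT_FRAMEWORKS = {
    "R-F-T": "Act as a contractor. ...",
    "C-A-R-E": "I need AI assistance for reading the attached construction documents. ...",
    "R-I-S-E": "You are to act as an experienced contractor ...",
}

def ask_with_prompt(framework_prompt: str, question: str) -> str:
    """Placeholder: query the model with this framework prompt and return its answer."""
    raise NotImplementedError

def ensemble_answer(question: str) -> str:
    # One answer per framework, then a simple majority vote across the set.
    answers = [ask_with_prompt(p, question) for p in PROMPT_FRAMEWORKS.values()]
    winner, votes = Counter(answers).most_common(1)[0]
    return f"{winner} ({votes}/{len(answers)} prompts agreed)"
```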
Conclusion
The experiment demonstrated that while ChatGPT 4o has the potential to understand and answer questions about construction documents, the accuracy of its responses is highly dependent on the prompting technique used.
The C-A-R-E framework showed the best performance, indicating that providing context and clear instructions can significantly enhance the model’s understanding. However, there is still a need for further training and refinement, especially for more complex questions.
This experiment highlights the importance of prompt engineering in leveraging the full potential of LLMs for specialized tasks. Future efforts could focus on developing tailored training datasets and refining prompt techniques to further improve the accuracy and reliability of responses in the construction domain.