Efficient data processing is a major challenge for representatives of many companies. This is especially true for companies that manage many financial documents in their operations. I decided to present one of the solutions developed for a client, which is a response to the challenges posed. The mentioned client was looking for an automated way to be able to analyze PDF invoices. The goal was to achieve accurate data extraction of the most important financial information about totals and cost breakdowns.
Processing PDF files presents significant challenges, particularly in extracting structured data from these documents. Accurate data extraction is crucial for ensuring reliable financial information and improving efficiency in invoice management.
I invite you to read more about the challenges, describe our approach as well as the technologies used, and reveal information about the implementation strategy and cost of this solution.
Table of Contents:
1. Discovering and understanding the client's expectation
2. Approach to handling PDF invoices
3. Extracting Text from PDF Files
4. Analyze the PDF content using ChatGPT
5. Deploy the application in the cloud
Introduction to Invoice Automation
Invoice automation is the process of using technology to streamline and automate the extraction of data from invoices, making it easier to manage and process financial transactions. With the rise of artificial intelligence (AI) and machine learning (ML), invoice automation has become more efficient and accurate.
By automating the invoice process, businesses can save significant time and reduce errors associated with manual data entry. This not only improves the accuracy of extracted data but also enables businesses to make more informed decisions. Additionally, invoice automation helps businesses comply with regulatory requirements and reduces the risk of fraud. The ability to quickly and accurately extract data from invoices ensures that financial records are up-to-date and reliable, which is crucial for effective financial management.
Discovering and understanding the client's expectations
The beginning of our cooperation was a request to develop a dedicated solution to analyze invoices in PDF. The client wanted to be able to select the most important information contained in them, such as the total amount of the invoice or the main expense categories. And even though there are dedicated solutions for analyzing PDF files on the market, the assumption was to minimize costs. Therefore, while working on this functionality, we focused on providing a practical solution that would benefit the budget.
Setting Up the Environment
To set up an environment for invoice automation, you will need the following:
-
A computer with a stable internet connection
-
A programming language such as Golang or Python
-
A code editor or IDE
-
A ChatGPT API key
-
A Google Sheets account
First, install the necessary dependencies and libraries for your chosen programming language. For Golang, you can use go get to install required packages. Set up your code editor or IDE and create a new project. Next, obtain a ChatGPT API key from OpenAI and set up your Google Sheets account to store the extracted data.
Here’s a quick guide to get you started:
-
Install Dependencies: Use go get to install libraries like pdftotext-go and openai.
-
Set Up Code Editor: Configure your preferred code editor or IDE for Golang development.
-
Obtain API Key: Sign up for an OpenAI account and get your ChatGPT API key.
-
Google Sheets Setup: Create a new Google Sheets spreadsheet where the extracted data will be saved.
Approach to handling PDF invoices
The main technology we used for the implementation was the Golang language. We chose it for its performance, simplicity and compatibility with many libraries. Our attention was drawn to concurrency support, which works well for handling multiple tasks simultaneously. Golang also has a clean syntax and simple structure. This is an asset that is important for maintaining demanding applications. In addition, the language is compatible with various libraries, which greatly facilitates the analysis of PDF files. And most importantly, the direction we have taken is extremely efficient. As a result, our customer can enjoy low latency when extracting and processing data from invoices.
An important advantage of our solution is its adaptability. The architecture easily handles more than just invoices. With a few changes, the solution can process a variety of document types, including purchase orders, receipts and contracts. This flexibility ensures that the same approach and tools can meet a wide range of document processing needs.
Extracting Text from PDF Files
As I mentioned earlier, the key task was to create an environment that enables text extraction from PDF files. To take care of cost savings on the client side, we reached for an open source solution - Poppler-utlis, and more specifically pdftotext. We used pdftotext-go libraries to integrate the tool with the Golang application. It provides a seamless interface for accessing pdftotext. Below is a sample code snippet.
func OutputPdfText(inputPath string) (string, error) {
pdf, _ := os.ReadFile(inputPath)
pages, err := pdftotext.Extract(pdf)
if err != nil {
fmt.Printf("Error: %v\n", err)
}
invoiceText := ""
for _, page := range pages {
invoiceText = invoiceText + page.Content
}
return invoiceText, nil
}
Analyze the PDF content using ChatGPT
After extracting the invoice content in text form, it was time to analyze the data. For this task, we used the GPT Chat using the OpenAI API. We decided on the GPT-4o Mini model, which is a lighter and economical version compared to GPT-4. This decision was also dictated by the limited budget resources of the project. Before analyzing the content of the PDF document sent to ChatGPT, we performed a procedure to protect against Prompt Injection. In addition, the content of the invoice was stripped of sensitive data like company data.
The goal was to enter the invoice text and retrieve critical information, such as the total amount. Ensuring accurate data extraction and clarity in the original text is crucial for correct interpretation by AI technologies. This process transformed raw invoice text into actionable data, extracting the total invoice amount with minimal latency and cost. The following code snippet shows how we implemented this functionality:
func TotalInvoice(invoiceText string) string {
client := openaiClient()
ctx := context.Background()
chatCompletion, err := client.Chat.Completions.New(ctx, openai.ChatCompletionNewParams{
Messages: []openai.ChatCompletionMessageParamUnion{
openai.UserMessage("This is invoice content:" + invoiceText + " tell total amount of the invoice, tell only value"),
openai.AssistantMessage("You are an account assistant"),
},
Model: openai.ChatModelGPT4oMini2024_07_18,
})
if err != nil {
panic(err.Error())
}
return chatCompletion.Choices[0].Message.Content
}
data:image/s3,"s3://crabby-images/4c4f1/4c4f12bb46f4ddcd0736b8e2631e12ae69cc7ccd" alt="PDF invoices"
Deploy the application in the cloud
To ensure adequate availability and scalability, we decided to deploy the application on Google Cloud Run. This is a serverless platform. which allows for simpler deployment and management. We took steps to containerize the application using Docker. We also provide users with clear instructions on how to download invoices and receipts from the service. Then the Docker image was saved in Google Artifact Registry. In this way, we took care of a secure and centralized storage solution.
Thanks to the serverlessness of Google Cloud Run, we were able to:
-
Automatically scale resources according to load
-
Minimize the cost of inactivity
-
Focus on the application without worrying about infrastructure
Handling Errors and Customization
In this section, we will discuss how to handle errors and customize the invoice automation application.
To handle errors, you can use error handling mechanisms such as try-catch blocks or error types. In Golang, you can use the error type to handle errors and log them using the log package. This will help you identify and debug issues in the application.
To customize the application, you can modify the code to suit your specific requirements. For example, you can add additional functionality to extract specific data from invoices, such as the unit price or payment date, or save the data to a different spreadsheet. Customization allows you to tailor the application to meet the unique needs of your business, ensuring that it correctly processes and extracts the necessary details from your invoices.
Cost sharing
It's time to look at how this solution translates into costs. One of the advantages was the cost-effectiveness, which, by using Poppler-utlis, did not need extra funding. All because it is an open-source tool. In addition, by using the GPT-4o Mini model, each invoice analysis cost about $0.003. The expense of implementing Cloud Run was also not exorbitant. The cost with light to moderate use is only $2 per month. As you can see, this is a favorable proposition for companies processing a large number of invoices.
Conclusion and Future Directions
In this article, we have discussed the concept of invoice automation and how to build a Golang application using ChatGPT to extract data from invoices and save it to a Google Sheets spreadsheet. We have also covered how to handle errors and customize the application to meet specific needs.
Looking ahead, there are several potential enhancements to consider. For instance, integrating machine learning algorithms could further improve the accuracy of extracted data. Additionally, connecting the application with other financial systems could streamline the entire financial process, from invoice receipt to payment. These future directions can help create a more robust and efficient invoice automation system, providing even greater value to businesses.
By leveraging the power of AI and cloud services, businesses can significantly improve their financial operations, saving time and reducing errors. If you are interested in implementing a similar solution for your business, feel free to reach out to us for a tailored approach that meets your specific needs.
Summary
As you can see, the project we have developed offers great opportunities for integrating open source tools, AI and cloud services. A seemingly simple but tedious task combines several modern solutions. At the same time, it translates into time and money savings. The solution we delivered to the client fully met their original expectations. We are currently waiting for further feedback steps to further improve the tool.
If you are interested in the above story and would like to develop your business in a similar way, we invite you to contact us. Fill out a short form and then wait for contact from our representatives. We will offer you solutions tailored to your needs.