
Efficient data processing is a major challenge for many companies, especially those that handle large volumes of financial documents. In this article I present a solution we developed for a client who was looking for an automated way to analyze PDF invoices. The goal was to extract the most important financial information, such as totals and cost breakdowns. Below I describe the challenges, our approach and the technologies we used, as well as the deployment strategy and the cost of the solution.

Table of Contents:

1. Discovering and understanding the client's expectations

2. Approach to handling PDF invoices

3. Extracting Text from PDF Files

4. Analyze the PDF content using ChatGPT

5. Deploy the application in the cloud

6. Cost sharing

7. Summary

Discovering and understanding the client's expectations 

Our cooperation began with a request for a dedicated solution for analyzing PDF invoices. The client wanted to extract the most important information from them, such as the total invoice amount or the main expense categories. Although dedicated solutions for analyzing PDF files exist on the market, the client's priority was to minimize costs. While working on this functionality, we therefore focused on a practical solution that would fit a tight budget.


Approach to handling PDF invoices

The main technology we used for the implementation was Go (Golang). We chose it for its performance, its simplicity and its library ecosystem. We were particularly drawn to its concurrency support, which works well when many invoices have to be handled at the same time. Go also has a clean syntax and a simple structure, which matters when maintaining demanding applications. In addition, libraries are available for working with PDF files, which made the analysis much easier. Most importantly, the approach is efficient: our customer gets low latency when extracting and processing data from invoices.
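
The concurrency support mentioned above can be illustrated with a minimal sketch. It is not the code from the project: `ProcessAll`, `Result` and the `process` callback are hypothetical names, and `process` stands in for whatever handles a single invoice (extracting the text and querying the model).

```go
package main

import "sync"

// Result pairs an input path with the outcome of processing it.
type Result struct {
	Path  string
	Total string
	Err   error
}

// ProcessAll runs process on every path using at most `workers` goroutines.
// process is a hypothetical stand-in for "extract text, then ask the model".
// Results are returned in the same order as paths.
func ProcessAll(paths []string, workers int, process func(path string) (string, error)) []Result {
	if workers < 1 {
		workers = 1
	}
	results := make([]Result, len(paths))
	jobs := make(chan int)
	var wg sync.WaitGroup

	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			// Each index is handled by exactly one worker, so writes never collide.
			for i := range jobs {
				total, err := process(paths[i])
				results[i] = Result{Path: paths[i], Total: total, Err: err}
			}
		}()
	}
	for i := range paths {
		jobs <- i
	}
	close(jobs)
	wg.Wait()
	return results
}
```

Capping the number of workers keeps the number of simultaneous API requests predictable, which helps when the model provider enforces rate limits.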

An important advantage of our solution is its adaptability. The architecture easily handles more than just invoices. With a few changes, the solution can process a variety of document types, including purchase orders, receipts and contracts. This flexibility ensures that the same approach and tools can meet a wide range of document processing needs.

Extracting Text from PDF Files

As I mentioned earlier, the key task was to extract text from PDF files. To keep costs down for the client, we chose an open source solution: Poppler-utils, and more specifically its pdftotext tool. We used the pdftotext-go library to integrate the tool with the Go application. It provides a simple interface for calling pdftotext. Below is a sample code snippet.


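import (
	"fmt"
	"os"
	"strings"

	pdftotext "github.com/heussd/pdftotext-go" // import path assumed from the library name
)

// OutputPdfText reads the PDF at inputPath and returns the text of all its pages.
func OutputPdfText(inputPath string) (string, error) {
	pdf, err := os.ReadFile(inputPath)
	if err != nil {
		return "", fmt.Errorf("reading %s: %w", inputPath, err)
	}
	pages, err := pdftotext.Extract(pdf)
	if err != nil {
		return "", fmt.Errorf("extracting text: %w", err)
	}
	var invoiceText strings.Builder
	for _, page := range pages {
		invoiceText.WriteString(page.Content)
	}
	return invoiceText.String(), nil
}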

Analyze the PDF content using ChatGPT

After extracting the invoice content in text form, it was time to analyze the data. For this task, we used ChatGPT models through the OpenAI API. We chose the GPT-4o mini model, which is lighter and more economical than GPT-4. This decision was also dictated by the limited budget of the project. Before sending the content of the PDF document to the model, we applied safeguards against prompt injection. In addition, sensitive information such as company data was removed from the invoice text.
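
The article does not go into how these safeguards were built, so the following is only a sketch of one common approach, not the project's code. It assumes two hypothetical helpers: `Redact`, which masks e-mail addresses and long digit runs (such as account or tax numbers) with illustrative patterns, and `WrapUntrusted`, which encloses the invoice text in delimiter tags so that the prompt can instruct the model to treat everything inside them as data.

```go
package main

import (
	"regexp"
	"strings"
)

// Illustrative patterns only (assumptions, not the rules used in the project).
var (
	emailRe   = regexp.MustCompile(`[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}`)
	longNumRe = regexp.MustCompile(`\b\d{10,}\b`)
)

// Hypothetical delimiters that mark the invoice text as data inside the prompt.
const (
	openTag  = "<invoice>"
	closeTag = "</invoice>"
)

// Redact replaces e-mail addresses and digit runs of ten or more characters
// with placeholders. Shorter numbers, such as amounts, are left untouched.
func Redact(text string) string {
	text = emailRe.ReplaceAllString(text, "[EMAIL]")
	return longNumRe.ReplaceAllString(text, "[NUMBER]")
}

// WrapUntrusted removes any delimiter tags already present in the text and
// then wraps it in a single pair of tags. The loop repeats the removal because
// deleting one tag can join the surrounding characters into a new one.
func WrapUntrusted(text string) string {
	for strings.Contains(text, openTag) || strings.Contains(text, closeTag) {
		text = strings.ReplaceAll(text, openTag, "")
		text = strings.ReplaceAll(text, closeTag, "")
	}
	return openTag + "\n" + text + "\n" + closeTag
}
```

Delimiters reduce the risk of injected instructions being followed but do not eliminate it, so the model's reply should still be validated before it is used.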

The goal was to pass in the invoice text and get back key information, such as the total amount. This turns raw invoice text into actionable data with minimal latency and cost. The following code snippet shows how we implemented this functionality:


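import (
	"context"
	"errors"
	"fmt"

	"github.com/openai/openai-go" // official OpenAI Go library (import path assumed)
)

// TotalInvoice asks the model for the total amount stated in invoiceText.
func TotalInvoice(invoiceText string) (string, error) {
	client := openaiClient() // helper that builds the OpenAI client (defined elsewhere)
	ctx := context.Background()
	chatCompletion, err := client.Chat.Completions.New(ctx, openai.ChatCompletionNewParams{
		Messages: []openai.ChatCompletionMessageParamUnion{
			// The role instruction goes first, as a system message.
			openai.SystemMessage("You are an accounting assistant"),
			openai.UserMessage("This is invoice content:" + invoiceText + " tell total amount of the invoice, tell only value"),
		},
		Model: openai.ChatModelGPT4oMini2024_07_18,
	})
	if err != nil {
		return "", fmt.Errorf("requesting completion: %w", err)
	}
	if len(chatCompletion.Choices) == 0 {
		return "", errors.New("model returned no choices")
	}
	return chatCompletion.Choices[0].Message.Content, nil
}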
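
The model replies with plain text, so the value usually has to be converted into a number before it can be stored or summed. The sketch below shows one way to do that. `ParseAmount` is a hypothetical helper, not part of the delivered solution, and it assumes the reply contains a single amount, possibly with a currency symbol and either "1,234.56" or "1 234,56" style separators.

```go
package main

import (
	"strconv"
	"strings"
)

// separators removes both kinds of separator from a digit group.
var separators = strings.NewReplacer(".", "", ",", "")

// ParseAmount converts a model reply such as "$1,234.56" or "1 234,56 PLN"
// into a float64. It returns an error when the reply contains no digits.
func ParseAmount(reply string) (float64, error) {
	// Keep only digits and the two possible separator characters.
	var b strings.Builder
	for _, r := range reply {
		if (r >= '0' && r <= '9') || r == '.' || r == ',' {
			b.WriteRune(r)
		}
	}
	s := strings.Trim(b.String(), ".,")

	// Treat the last separator as the decimal point when one or two digits
	// follow it; every other separator is a thousands separator and is dropped.
	last := strings.LastIndexAny(s, ".,")
	if last >= 0 && len(s)-last-1 <= 2 {
		s = separators.Replace(s[:last]) + "." + s[last+1:]
	} else {
		s = separators.Replace(s)
	}
	return strconv.ParseFloat(s, 64)
}
```

A reply that mixes several numbers (for example an invoice number followed by the total) would be misread by this heuristic, which is one more reason to keep the prompt strict about returning only the value. For accounting purposes, storing amounts as integer minor units instead of floating-point values may be the safer choice.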

 


Deploy the application in the cloud

To ensure adequate availability and scalability, we decided to deploy the application on Google Cloud Run, a serverless platform that simplifies deployment and management. We containerized the application with Docker and stored the Docker image in Google Artifact Registry, which gave us secure, centralized storage for it.

Because Google Cloud Run is serverless, we were able to:

  • Automatically scale resources according to load
  • Minimize the cost of inactivity
  • Focus on the application without worrying about infrastructure
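
To give an idea of what a Cloud Run service of this kind can look like, here is a minimal sketch of an HTTP entry point. It is an illustration under stated assumptions, not the deployed code: `Analyzer`, `NewHandler`, `ListenAddr` and `Serve` are hypothetical names, and the 10 MiB upload limit is arbitrary. What it relies on is that Cloud Run tells the container which port to listen on through the `PORT` environment variable.

```go
package main

import (
	"encoding/json"
	"io"
	"net/http"
	"os"
)

// Analyzer is a hypothetical hook for the "extract text, then query the model" pipeline.
type Analyzer func(pdf []byte) (total string, err error)

// NewHandler returns a handler that accepts a PDF in a POST body and
// responds with a JSON object containing the invoice total.
func NewHandler(analyze Analyzer) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if r.Method != http.MethodPost {
			http.Error(w, "POST a PDF file", http.StatusMethodNotAllowed)
			return
		}
		// Cap the upload size (10 MiB here, an arbitrary choice).
		pdf, err := io.ReadAll(http.MaxBytesReader(w, r.Body, 10<<20))
		if err != nil {
			http.Error(w, "cannot read request body", http.StatusBadRequest)
			return
		}
		total, err := analyze(pdf)
		if err != nil {
			http.Error(w, "analysis failed", http.StatusInternalServerError)
			return
		}
		w.Header().Set("Content-Type", "application/json")
		_ = json.NewEncoder(w).Encode(map[string]string{"total": total})
	})
}

// ListenAddr turns the value of the PORT variable into a listen address,
// falling back to 8080 when the variable is empty (for local runs).
func ListenAddr(port string) string {
	if port == "" {
		port = "8080"
	}
	return ":" + port
}

// Serve starts the HTTP server on the port provided by the platform.
func Serve(analyze Analyzer) error {
	return http.ListenAndServe(ListenAddr(os.Getenv("PORT")), NewHandler(analyze))
}
```

A `main` function would only need to call `Serve` with a function that chains the text extraction and the model request. Because the handler keeps no state between requests, the platform can start and stop instances freely, which is what makes scaling to zero possible.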

Cost sharing

It's time to look at how this solution translates into costs. Text extraction required no extra funding, because Poppler-utils is an open-source tool. With the GPT-4o mini model, each invoice analysis cost about $0.003. Running the application on Cloud Run was not expensive either: with light to moderate use, it costs only $2 per month. As you can see, this is a favorable proposition for companies processing a large number of invoices.
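
The figures above can be combined into a rough monthly estimate. The sketch below is only a back-of-the-envelope calculation: `MonthlyCostUSD` is a hypothetical helper, and it assumes that the Cloud Run cost stays flat at $2 regardless of volume, which holds only for light to moderate use.

```go
package main

const (
	costPerInvoiceUSD  = 0.003 // approximate model cost per analyzed invoice
	cloudRunMonthlyUSD = 2.0   // Cloud Run cost at light to moderate use (assumed flat)
)

// MonthlyCostUSD estimates the monthly bill for the given number of invoices.
func MonthlyCostUSD(invoices int) float64 {
	return float64(invoices)*costPerInvoiceUSD + cloudRunMonthlyUSD
}
```

For example, 1,000 invoices per month come to roughly 1,000 × $0.003 + $2 = $5.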

Summary

As you can see, the project shows how well open source tools, AI and cloud services can be integrated. A seemingly simple but tedious task was automated by combining several modern solutions, which translates into savings of both time and money. The solution we delivered fully met the client's original expectations. We are now waiting for further feedback so that we can continue to improve the tool.

If this story interests you and you would like to develop your business in a similar way, we invite you to contact us. Fill out a short form and our representatives will get back to you with solutions tailored to your needs.
