Bank statements can be frustrating documents. Many use cases require well-formatted financial data as input, but more often than not this data is stuck in PDFs, proprietary formats or, even today, hard copy. The vast array of differing formats only serves to complicate matters.
Some banks are getting the idea and providing easy ways for their customers to access well-formatted CSV versions of their data or, better yet, adhering to Open Banking standards, but uptake has been slow. And what if you have no control over the source? Auditors, forensic accountants and financial investigators commonly need to analyse large quantities of bank statements (and other tabular data) provided by their clients (or even subpoenaed during a legal case). Huge numbers of person-hours are needed to process the data into a standard format (often Excel) to allow for further analysis.
Textreme can help. By automatically finding and extracting data formatted as tables, it drastically cuts the time taken from document ingestion to analysis. As an example, take this bank statement:
It has no table lines, multi-line descriptions, and multiple tables per page. These are all things that could trip up other software and force the user to rely on template-based systems, or even manual data entry. Let’s see how Textreme deals with it.
In the Tables tab, upload the document. Once it has been processed, click the document name to see the table data that Textreme found.
Both the summary table and the main transactions table are present, formatted as in the original document. The data can be copied straight to the clipboard (in either CSV or Markdown format) using the buttons at the top of each table. This can drastically cut down the time to analysis: simply paste the data straight into Excel.
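For those who would rather script the next step than paste into Excel, the copied CSV can also be read with a few lines of Python. The following is only a minimal sketch using the standard library: the sample text, the column names (`Date`, `Description`, `Paid out`, `Paid in`, `Balance`) and the helper functions are all assumptions made up for illustration, not Textreme's actual output, so adjust them to match the headers in your own statement.

```python
import csv
import io
from decimal import Decimal

# Hypothetical sample: column names and rows are invented for illustration
# and do not come from Textreme. Note the multi-line descriptions, which
# CSV represents as quoted fields containing a line break.
SAMPLE_CSV = '''Date,Description,Paid out,Paid in,Balance
01/02/2020,"CARD PAYMENT
COFFEE SHOP",3.50,,996.50
03/02/2020,SALARY,,"1,500.00","2,496.50"
05/02/2020,"DIRECT DEBIT
ELECTRICITY",60.00,,"2,436.50"
'''


def parse_amount(text):
    """Convert a cell such as '1,500.00' to a Decimal; empty cells count as zero."""
    cleaned = text.replace(",", "").strip()
    return Decimal(cleaned) if cleaned else Decimal("0")


def load_transactions(csv_text):
    """Read CSV text into a list of dicts with numeric amounts."""
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = []
    for row in reader:
        rows.append({
            "date": row["Date"],
            # Collapse line breaks inside a description into single spaces.
            "description": " ".join(row["Description"].split()),
            "paid_out": parse_amount(row["Paid out"]),
            "paid_in": parse_amount(row["Paid in"]),
        })
    return rows


transactions = load_transactions(SAMPLE_CSV)
total_out = sum(t["paid_out"] for t in transactions)
total_in = sum(t["paid_in"] for t in transactions)
print(f"{len(transactions)} transactions, paid out {total_out}, paid in {total_in}")
```

`Decimal` is used rather than `float` so that monetary totals are not affected by binary rounding, which matters when the figures feed into an audit or investigation.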
Alternatively, the whole document can be downloaded, in CSV, Excel, plain text or Markdown, using the file menu. This is particularly useful for multipage documents, such as the following:
Running it through Textreme and downloading the whole file as an Excel workbook gives the following, ready for further analysis with no manual input required: