Sharing registries in financial systems


In this post, I would like to talk about a different kind of intersystem interaction. In the previous post we discussed how to avoid problems when connecting to external services. Today we will look at a way of exchanging information that is specific to some industries, namely file sharing.

Let me explain what I mean. In financial organizations and fintech, it is common practice to exchange registries, statements and similar files. For example:

  • When processing cards, we receive information from the payment system about a hold on the funds on the user’s card. The system blocks the amount, and after a while the payment system sends a clearing file with transactions. Depending on its contents, the amount is either charged or unblocked.
  • Banks exchange files with information about the movement of money on accounts and reflect these movements in their own systems.

The files are usually ordinary tables with many rows, and they come in different formats: CSV, DBF, Excel and so on.

Below is a short checklist for developing, testing and operating this type of interaction.

Implementation

Use an asynchronous uploading pattern

It’s best to upload files asynchronously. Over HTTP, this may look as follows: an operator uploads a clearing file and receives a file ID in return. The operator then periodically requests a specific URL with this ID to find out the status of the file.

>> POST /registry/upload
>> registry.csv
<< 200 OK
<< {"fileId":"42"}

>> GET /registry/status/42
<< 200 OK
<< {"status":"pending"}

Processing the file while the user waits for the result is inconvenient: for example, the request might time out if the file is too large.
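Here is a minimal in-memory sketch of this pattern in Java, with the HTTP layer left out. The class and method names (RegistryUploadService, upload, status) and the status values other than "pending" are assumptions made for illustration.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical service: accepts a file, returns an ID at once, processes in the background.
class RegistryUploadService {
    private final AtomicLong nextId = new AtomicLong(1);
    private final Map<String, String> statuses = new ConcurrentHashMap<>();
    private final ExecutorService worker = Executors.newSingleThreadExecutor();

    // Would back POST /registry/upload: returns the file ID without waiting for processing.
    String upload(byte[] content) {
        String fileId = Long.toString(nextId.getAndIncrement());
        statuses.put(fileId, "pending");
        worker.submit(() -> {
            try {
                process(content);
                statuses.put(fileId, "done");
            } catch (RuntimeException e) {
                statuses.put(fileId, "failed");
            }
        });
        return fileId;
    }

    // Would back GET /registry/status/{fileId}.
    String status(String fileId) {
        return statuses.getOrDefault(fileId, "unknown");
    }

    // Placeholder for real parsing; rejects empty files as an example of a failure.
    private void process(byte[] content) {
        if (content.length == 0) {
            throw new IllegalArgumentException("empty file");
        }
    }

    // Stops accepting files and waits briefly for queued ones to finish.
    void close() {
        worker.shutdown();
        try {
            worker.awaitTermination(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}
```

The caller gets the ID immediately and polls status(fileId) until it is no longer "pending", so no request has to stay open for the whole processing time.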

Stream instead of loading into memory

If the parser and the data allow it, stream the data instead of loading the entire file into memory (for example, when parsing XML, use SAX instead of building the entire model in memory). In some cases this is not possible, either because of the nature of the data itself (you need to validate all the content at once) or because of the specifics of the parser implementation. In that case, you can limit the file size and organize a processing queue so that the files being processed fit into memory. If a larger file arrives, you can notify the administrator through monitoring or the alerting system.
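For a line-based format such as CSV, streaming can be as simple as reading one line at a time. The sketch below assumes a semicolon-separated file with the amount in the second column; the class name and the file layout are made up for the example.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.math.BigDecimal;

class StreamingRegistryReader {
    // Sums the amount column while holding only the current line in memory.
    static BigDecimal sumAmounts(Reader source) {
        BigDecimal total = BigDecimal.ZERO;
        try (BufferedReader reader = new BufferedReader(source)) {
            String line;
            while ((line = reader.readLine()) != null) {
                if (line.trim().isEmpty()) {
                    continue; // skip blank lines
                }
                String[] fields = line.split(";");
                total = total.add(new BigDecimal(fields[1].trim()));
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return total;
    }
}
```

Memory use here depends on the longest line, not on the size of the file.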

Keep the original file

If an error occurs in production, finding the cause will be much easier: you can simply download the file and write a test or dig into the problem area.

Think of idempotence

To avoid duplicate records and files, you need to think through an idempotence mechanism. Most often, statements and clearing files carry unique operation identifiers that you can rely on. But sometimes the format or the nature of the operations does not provide any identifiers, so you have to generate them yourself.

If individual entries cannot be clearly identified, you can make the file upload API itself idempotent. For example, you can derive an idempotency key from a checksum of the file contents or from the file name, or have the client pass the key explicitly in the delivery protocol.

>> POST /registry/upload
>> X-REQUEST-ID: 34
>> registry.csv
<< 200 OK
<< {"fileId":"42"}

>> POST /registry/upload
>> X-REQUEST-ID: 34
>> registry.csv
<< 409 CONFLICT
<< {"error":"already_exists"}

In the latter case, the responsibility for not uploading duplicates shifts to the user of your system.
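As a minimal sketch of the checksum variant, the code below uses a SHA-256 digest of the file contents as the idempotency key. The class name, the choice of SHA-256 and the in-memory map are assumptions; a real system would keep the keys in persistent storage.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

class IdempotentUploadRegistry {
    // Idempotency key -> file ID of the first upload with that key.
    private final ConcurrentMap<String, String> fileIdByKey = new ConcurrentHashMap<>();

    // Derives the idempotency key from the file contents (SHA-256, hex encoded).
    static String checksum(byte[] content) {
        try {
            byte[] digest = MessageDigest.getInstance("SHA-256").digest(content);
            StringBuilder hex = new StringBuilder();
            for (byte b : digest) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-256 is unavailable", e);
        }
    }

    // Returns true if the file is new; false corresponds to the 409 CONFLICT answer.
    boolean register(byte[] content, String fileId) {
        return fileIdByKey.putIfAbsent(checksum(content), fileId) == null;
    }
}
```

putIfAbsent makes the check and the insert a single atomic step, so two concurrent uploads of the same file cannot both be accepted.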

Set up monitoring

Metrics to monitor may include:

  • File size
  • Number of entries in the file
  • Record processing status
  • Processing time of records and files, and the like

If something goes wrong during upload or processing, you will see it easily. It is also useful to set up notifications for non-standard situations and errors.

Log the errors

If an error occurs while processing a record or a file, log it with as much detail as possible. It’s best to store this information together with the line or the file itself, but this is not required: the goal is to provide quick access to the details of the error.

Not OK

java.lang.NullPointerException: null

OK

An error occurred during processing record #42
Error details:
payload: {...}
parserSettings: {...}
thread: scheduled-job-thread-pool-12
requestId: 34
user: admin
serverId: 46
version: 4.5.4
host: my.service.prod
timestamp: 2020-01-02T00:01:02
root cause: java.lang.NullPointerException: null
stacktrace:

...

Keep as much context as possible

This point overlaps a bit with the previous one. Context includes parameters such as:

  • Time received
  • Processing time
  • Processing status
  • What started the processing (an operator, for example)
  • File sources
  • App version
  • Information about the server where the processing took place and much more

Such information also helps when analyzing incidents and greatly simplifies the developer’s life.

Make manual correction and restart possible

It is quite common to have to restart the processing of individual lines or files because incorrect data got in. For this, it is useful to have a manual control for restarting the processing of lines and/or files. Don’t forget to keep a history of such restarts as well.

Implement the ability to manually disable processing

If incorrect data starts arriving in the system, or you find an error in your algorithm, the ability to disable processing will come in handy. In my practice, I have made exactly this mistake several times and left such a switch out. As a result, we had to sort through and reprocess the affected records manually, because it was not possible to stop processing without disabling the entire system module.
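One simple way to get such a switch is a flag that is checked before every record. The sketch below is an illustration only: the class name is made up, and in a real system the flag would be flipped through an admin endpoint or a configuration change.

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicBoolean;

class PausableProcessor {
    private final AtomicBoolean enabled = new AtomicBoolean(true);

    void disable() {
        enabled.set(false);
    }

    void enable() {
        enabled.set(true);
    }

    // Processes records until the list ends or processing is disabled.
    // Returns the number of records handled, so the rest can be resumed later.
    int process(List<String> records) {
        int handled = 0;
        for (String record : records) {
            if (!enabled.get()) {
                break;
            }
            handle(record);
            handled++;
        }
        return handled;
    }

    private void handle(String record) {
        // Placeholder for the real record processing.
    }
}
```

Because the flag is checked per record, processing stops at a record boundary and the rest of the module keeps running.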

Validate the input

As mentioned in the previous post, it is much cheaper and easier to fix an error at an early stage than to catch an incomprehensible NullPointerException in the bowels of the system.

Another thing I have repeatedly encountered: developers often “nail down” maximum field lengths and various validations right in the database. This narrows the room for manoeuvre. A real-life example: the field that stored the name of a legal entity was, for some reason, limited to 160 characters. And of course, after a short time, data exceeding that limit arrived in the system. Had the check been in the code, it would have been enough to correct the code and release a new version for the new data to be processed successfully. Instead, we had to change the DB schema, which is not that easy.

Another rule, concerning data types: don’t be stricter about a data type than you need to be. For example, for a monetary amount it makes sense to use a BigDecimal field from the start, and for an ID it is better to use a string type, even if the ID itself looks like an integer value.
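The points above can be sketched in Java as follows. The record layout (id;amount;name), the class name and the constant are assumptions for illustration. The length limit lives in the code, the ID stays a string, and the amount is a BigDecimal.

```java
import java.math.BigDecimal;

// Hypothetical three-field record: id;amount;legal entity name.
class RegistryRecord {
    // Assumed limit, kept in code so it can change with a release rather than a schema migration.
    static final int MAX_NAME_LENGTH = 160;

    final String id;              // a string, even though the ID may look numeric
    final BigDecimal amount;      // exact decimal arithmetic for money
    final String legalEntityName;

    private RegistryRecord(String id, BigDecimal amount, String legalEntityName) {
        this.id = id;
        this.amount = amount;
        this.legalEntityName = legalEntityName;
    }

    // Validates on entry and fails with a descriptive message instead of a late NullPointerException.
    static RegistryRecord parse(String line) {
        String[] fields = line.split(";", -1);
        if (fields.length != 3) {
            throw new IllegalArgumentException("expected 3 fields, got " + fields.length);
        }
        if (fields[0].isEmpty()) {
            throw new IllegalArgumentException("id is empty");
        }
        BigDecimal amount;
        try {
            amount = new BigDecimal(fields[1]);
        } catch (NumberFormatException e) {
            throw new IllegalArgumentException("amount is not a number: " + fields[1], e);
        }
        if (fields[2].length() > MAX_NAME_LENGTH) {
            throw new IllegalArgumentException("name is longer than " + MAX_NAME_LENGTH + " characters");
        }
        return new RegistryRecord(fields[0], amount, fields[2]);
    }
}
```

Keeping the ID as a string also preserves details such as leading zeros, which an integer type would silently drop.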

If there’s an error, tell the customer what went wrong

It is much easier for a client to work with a detailed description of an error than with an abstract internal error: which file, which line, and why it failed, but without technical details (a stack trace would be excessive here). In general, when building an API or any other interaction, explain to the client as clearly as possible what went wrong. Otherwise, instead of writing programs, you will be working as technical support.
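One possible shape for this, as a sketch: an exception that carries the file name, the line number and a human-readable reason, and builds the client-facing text from those alone. The class name and the message format are assumptions.

```java
// Hypothetical exception: keeps the technical cause for the logs,
// but exposes only file, line and reason to the client.
class RegistryProcessingException extends RuntimeException {
    final String fileName;
    final int lineNumber;
    final String reason;

    RegistryProcessingException(String fileName, int lineNumber, String reason, Throwable cause) {
        super(reason, cause);
        this.fileName = fileName;
        this.lineNumber = lineNumber;
        this.reason = reason;
    }

    // Client-facing text: which file, which line, why. No stack trace or internals.
    String clientMessage() {
        return "File " + fileName + ", line " + lineNumber + ": " + reason;
    }
}
```

The original cause is still attached to the exception, so the full details can go to the log while the client sees only clientMessage().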

Testing

Now let’s look at the things you should look out for when testing.

Check edge cases

Test and correctly handle the cases where you receive:

  • Empty file
  • File in the wrong format
  • File with missing columns

For some reason, this is also often overlooked.

Test with incorrect data

Make sure your parsers fail when fields contain incorrect data. In my practice, there were cases when a date in the wrong format was accepted by the parser as valid, and the result was a complete mess.

The same happened with integer overflow: in the code the field was an int, while the actual values needed a long. We expected the parser to throw an exception and fail, but it turned out that it swallowed the error and simply converted the long to an int, naturally with significant data loss.
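Both failure modes have strict counterparts in the Java standard library, shown below as a sketch. The class name and the date pattern are assumptions. A formatter built with DateTimeFormatter.ofPattern uses the SMART resolver by default, which quietly adjusts a date such as 2020-02-30 to the last valid day of the month, and a plain (int) cast silently truncates.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.time.format.ResolverStyle;

class StrictParsers {
    // "uuuu" rather than "yyyy": in STRICT mode "yyyy" (year-of-era) would also require an era.
    private static final DateTimeFormatter DATE =
            DateTimeFormatter.ofPattern("uuuu-MM-dd").withResolverStyle(ResolverStyle.STRICT);

    // Throws DateTimeParseException for impossible dates such as 2020-02-30.
    static LocalDate parseDate(String text) {
        return LocalDate.parse(text, DATE);
    }

    // Throws ArithmeticException on overflow instead of silently truncating like an (int) cast.
    static int toInt(long value) {
        return Math.toIntExact(value);
    }
}
```

A test suite can then assert that parseDate("2020-02-30") and toInt(3_000_000_000L) throw, rather than checking only the happy path.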

Test the system using real data

If possible, ask for real data covering a significant period of time to use for testing. You will likely come across cases that are not described in the documentation.

Check the performance of your system

Some business processes are sensitive to how fast such batches are processed. For example, when closing the banking day, you need to process N records in M minutes. Make sure your system meets the performance requirements. It also makes sense to provide a backpressure mechanism during processing in case the system starts choking.
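A bounded queue between the reader and the processor is one simple form of backpressure. In the sketch below the class name, the capacity and the timeout are assumptions. When the queue is full, the producer waits for a while and is then told to slow down instead of piling up records in memory.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.TimeUnit;

class BoundedRecordQueue {
    private final BlockingQueue<String> queue;

    BoundedRecordQueue(int capacity) {
        this.queue = new ArrayBlockingQueue<>(capacity);
    }

    // Producer side: waits up to the timeout for free space and returns
    // false if the consumer has not caught up, so the caller can slow down.
    boolean submit(String record, long timeoutMillis) {
        try {
            return queue.offer(record, timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    // Consumer side: returns the next record, or null if none is queued.
    String poll() {
        return queue.poll();
    }
}
```

The capacity caps how many unprocessed records can sit in memory at once, which ties this back to the file size limits discussed earlier.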

In conclusion

A task that looks simple at first glance contains many pitfalls. Take the time to implement the things described above. It is not that difficult, and the time spent is sure to pay off in operation.
