Tips for Programming Accounting Features in Web Apps

Learn about representing money, rounding algorithms, and anticipating oddball transactions.

This article is part of my Confessions of an Unintentional CTO book, which is currently available to read for free online.

At what point in your life did you realise you were no longer a child but rather an adult? Was it losing your virginity? Donning a suit for the first time? Or being obliged to make and pay rent, month after month? The web application developer equivalent of all this growing up is….don’t get your hopes up…accounting.

I want to share some advice on handling that most first world of problems, that of counting all your money. But first, why would anyone want to count their money? Isn’t it enough to just have a business bank account? Nope…There are two categories of reasons for needing regular accounts: boring ones and frustrating ones. The boring reasons are that accounts are a good internal metric–useful for measuring business health, debugging overspending, and foreseeing cashflow problems. The frustrating reasons are that accounts are often legally stipulated by the implicit social contract you made with your governing state. Whether for taxation or corporate reporting or old-fashioned bureaucratic maliciousness, revenue departments relentlessly need to see your accounts. The only non-boring and non-frustrating reason for accounts is when your business model works with commissions and the like and you need to calculate these figures. This use of accounts is merely irritating.

Tip 1. It’s Not Just Sales Your Software Needs To Handle

Years back, I monetised my law notes website in the most straightforward way I could think of: by charging customers. I fumbled around looking for payment processing code on Stack Overflow and eventually cranked out a SUPER bare-bones buying system. The only thing it could do was sell.

As the business—and I—matured, I realised that my simplistic implementation couldn’t handle even the standard needs of any business that processes payments. Specifically, many of the transactions I had considered “complete” could derail days, weeks, or even months later through means such as:

  1. standard refund requests

  2. credit card chargebacks (i.e customers honestly or, more typically, bogusly telling their bank that their credit card was used to buy goods on my site without their knowledge)

  3. rare, payment-platform-specific oddities like “eCheque payments”

You might think that refunds and the like won’t affect your business, assuming, say, that you have a “no refunds” policy. The reality, however, is that there will always be edge cases requiring the same refunding machinery, two examples being customers who accidentally buy the wrong item or who inadvertently buy the same item twice. Furthermore, you’ll always have a certain percentage of your transactions forcibly refunded by financial companies through chargebacks, the frustrating process I mentioned earlier. (By the way, did you know that if you get too many chargebacks, your payment processor will shut down your account?)

Initially, I handled all these “after-the-fact” transactions from the control panel of my payment provider’s web app. This saved me the initial hassle of programming new web application features to handle these scenarios. But, as with many shortcuts, this one led to long-term problems: specifically, the financial data in my payment provider’s platform no longer matched the data in my web application. This had all sorts of suboptimal consequences, such as:

  1. financial summaries that differed depending on whether they were printed from the payment provider platform or from my web application

  2. paying suppliers overly high royalties, due to our calculating these amounts based on the incomplete story told by my web application’s database which knew nothing about later refunds. Ditto for sales tax.

  3. customers who were refunded for supposedly accidental purchases retaining access to those goods

These problems might seem merely finicky and esoteric when presented here, but in fact they manifested themselves to me rather forcibly in the form of emails from enraged customers, tax documents returned for not balancing, and lost profits (due to overpaid commissions and overpaid sales tax).

I eventually eliminated all these problems through three measures: The first was calculating all figures for taxation purposes based on what was actually present in the payment provider account (e.g. in Stripe or PayPal). The second was handling the refunds (and other weird transactions) in my web application (with extra code, yes), so that my software called the payment provider’s API to refund and also triggered all the correct local state updates. Alternatively, I could have stuck with refunding through the payment provider’s panel but supplemented this with software “listeners” on my server that read PayPal’s notifications and took appropriate local action. The third helpful measure was one I applied to rare, oddball transactions, like E-cheques—I simply started refusing to accept payments of this form, as the hassle in administration and the potential sacrifice in automation wasn’t worth the few percentage points of extra sales. (Indeed, for similar reasons, I stopped allowing customers to pay via direct bank transfer. Given the lack of integration solutions available with my bank, it wasn’t worth my while to handle this extra, out-of-website admin.)

Tip 2. Decide How to Represent Money: Denominations, Significant Digits, Rounding, and Floating-Point Error

It’s crucial that your software adopts a project-wide policy on how to represent money. At the most insane end of the inconsistency spectrum is inconsistent denomination: I’ve seen some seriously amateur open source ecommerce libraries that represent product prices in cents but then go off and represent the order totals on order models and payment provider request models in full dollars! This inconsistent denomination is just asking for trouble—for example when some joyfully naive, future programmer sees the product code and (reasonably) assumes money is always represented as pence in this project, and then goes off and writes payment code that bills customers £4,000 instead of 4,000 pence.

=> Lesson 1 (for the dangerously bad coders): Be consistent with your denomination size

Next up in our perusal of accounting is precision (or significant digits in math speak). Put bluntly, your business would look sloppy—and worryingly unprofessional—if the commission readout figures it displayed to suppliers had arbitrary and inconsistent precision. For example:

Your Earnings This Month

Anatomy Notes: £37.5
Genetics Notes: £33.33333333333333333
Surgery Notes: £6

Instead, your readout should really look like nice and consistent, like this:

Your Earnings This Month

Anatomy Notes: £37.50
Genetics Notes: £33.33
Surgery Notes: £6.00

Differing degrees of precision across money representations may eventually add up to serious problems, such as when core calculation code gives you results that are ever-so-slightly off.

=> Lesson 2: Harmonise your precision across your website. This applies both to formatting code for figures on the frontend and also for representations in the database (e.g. configure the “scale” setting on your SQL numeric type).

Let’s move on to an even more insidious problem with representing money on computers: floating-point error. Broadly speaking, if you represent money as a float data type, then arithmetic, upon its binary representation, will yield small yet upsetting inconsistencies. This happens because the binary representations of floats are essentially pixelated approximations, and as such, they’re both limited and deformed by the number of bits allocated to represent that float.

Practical example: one of my suppliers once complained that I had “ripped her off” because I had promised her £33.34 in royalties that month but only transferred £33.33 into her bank account. The reason for this disparity was that I had, stupidly, represented monetary figures as floating-point numbers in some parts of the code but as arbitrary precision numerical types (e.g. Decimals) in others. Performing rounding on these numerical representations led to divergent results:


=> 33.33


=> 33.34

=> Lesson 3: When representing money, ignore floats and instead reach for an arbitrarily precise numerical type or a dedicated money library.

Think that with this advice you’ve steered clear of a floating-point error? Think again—at least if your programming language has implicit type conversions. Even if you’ve religiously represented money with the proper numerical types, there’s still scope aplenty for inaccuracies to occur due to arithmetical interactions between money and other numbers that you may not consider as money. For example, imagine you give Irish customers a 33% discount on their birthdays. You’d be forgiven to think that this figure isn’t really money per se, and so you might represent it, without too much thought, as a float:


Unfortunately for you, your kindness will be your undoing. As soon as you calculate the discounted order’s total (by multiplying the money-typed total attribute in the order model with the BIRTHDAY_DISCOUNT_SIZE constant in float form), implicit type conversion rules in your chosen programming languages may very well transform your result into a big, ugly float, thereby destroying your accuracy.

=> Lesson 4: Like diseases, floating-point numbers are contagious. Thus any number in your system that arithmetically interacts with your monetary types ought to be treated with care (this usually entails requiring these interacting numbers to be represented as arbitrary precision numerical types).

Now we come to our last stop in our tour of ways in which money representations can go bust: rounding issues. The long and short of it all is that there isn’t only one way of rounding a number. Rounding is just a way of doing things, a convention if you will. As such, you’re expected to get with the program. There are a ton of algorithms out there, each catering to differing needs and producing differing results. To name but a few: round-down, round-half-up, truncate-last-figure, round-half-even, round-ceiling.

There are two points to be made here. First, ensure that your entire system uses the same rounding algorithm. Watch out especially for interactions between different programs or between different programming languages. The second point here is to choose a rounding algorithm that your customers/suppliers/accountants expect. Absent any special requirements, your best bet is to go with the symmetric round-half-up algorithm, which corresponds to what most of us learned about rounding in high school. Briefly, this algorithm will round 6.7 up to 7 and 6.2 down to 6. When the round-half-up algo is fed halfway figures (like 6.5), it rounds them up to 7, thereby justifying the “round-half-up” component of its name. Before we start feeling too complacent, let’s consider the difficulties with rounding negative amounts (i.e. amounts representing refunds or discounts). How does one round -6.5 “up”? Normally “up” would mean that the rounding algo should return a larger number (i.e. one that is closer to positive infinity). Since -6 is greater than -6.5, -6.5 must would up to -6, an alarming result that is asymmetric to the 7 we got when rounding positive 6.5. If you rounded your money figures this way, you’d end up with trouble balancing your books—but at least you’d have the basis for salami slicing, the scam featured in Superman III and Office Space. Luckily, the “symmetric” set of rounding algorithms avoid this issue, such that -6.5 will be rounded to -7, as expected.

=> Lesson 5: Be sure to check that your rounding algorithms conform to your expectations in both positive and negative calculations.

Tip 3. Gather your accounting requirements from a tax professional before writing a single line of payment code

I built my first payments system without quizzing my accountant, thoroughly, about what exactly he needed. For example, I read online that I needed to print something called “invoices”—even though my business sells exclusively to private consumers. Fresh-out-of-college-Jack downloaded some over-the-top invoicing library and wrote code that produced PDF invoices for every single purchase. At the end of quarter, I hauled a knee-high stack of these beauties over to my accountant’s office. Turns out, all those PDFs were unnecessary—I learned that even in the never-ending bureaucratic punch-to-the-face that is the German taxation system, a summary CSV file would have been sufficient, as long as the figures in my PayPal account corroborated this information. The CSV was not only faster for me to program, but also cheaper to print, and less work for my accountant to process.

That’s just one example of my jumping the gun on accounting code. Let me share a few more:

  1. I had to adapt my accounting reports to the number formatting norms in my (current) country of residence—six dollars and seventy cents can be written in English style ($6.70) or in mainland European style ($6,70). Once I was a resident in Germany, I needed to produce reports in European-mainland style—otherwise my accountant would have to convert everything and charge me absurd amounts of money for the pleasure.

  2. I built my payment checkout process unaware of what customer information I was legally required to collect. Specifically, tax authorities within the EU demand that you collect the country and address of every one of your customers. Because I didn’t know these requirements in advance, I had to rebuild parts of my checkout flow, meddling with sensitive parts of my codebase that I’d rather stay away from.

In summary: When it comes to accounts-related features, agile is out and waterfall is in. Get a detailed list of requirements from your accountant before you write a single line of payment code. Optimally ask them to send you an ideal example of the future spreadsheet that they’d like you to send them.

Tip 4. “Denormalise” computed data about financial payables (such as order totals, royalties, and commissions owed)

In programming, generally, there’s a best-practice principle that advises, “if you can calculate a piece of data from its precursor constituents and you have those precursor constituents available in your database, then there’s no need to also store the result of this repeatable calculation”. Programs adhering to this principle would calculate a user’s age on the fly, every time, based on the date_of_birth column stored in the database. Absent heavy performance restraints, programmers following this principle would resist the temptation to append a new age column that stores already calculated results. The point of this practice is to reduce redundancy and to minimise the chances that your database becomes inconsistent (say if you change one user’s date_of_birth without changing their age column too).

This design principle is sensible—except when you’re working with financial payables data. Let me recount an awful headache I ran into: My code on Oxbridge Notes needs to calculate royalties for the authors selling notes there. This calculation happens in many sectors of the codebase—e.g. in the emails alerting authors of their earnings, the author dashboards displaying historical totals, and the backend code that executes the actual commission payments. Sticking to the above design principle, I calculated the commissions afresh every time, based on data in the precursor orders, line_items, and products tables.

This turned out to be a mistake because:

  1. It requires you to maintain legacy calculation code: I radically changed up how I calculated commissions during my third year of business. While this change was problem-free going forward, there were still years’ worth of commissions calculated under the old system and I still needed to be able to reproduce these amounts in various parts of my website. The consequence was that I had to maintain a messy lump of legacy code solely for remembering what I had previously paid in commissions; there were if statements and alternate class definitions strewn everywhere.

  2. It was slow: My commission calculation system was fiercely slow, owing to its being burdened with a many-table join that sometimes summed thousands of disparate transactions in its roundabout way. Because of my unwillingness to implement a warm caching layer, this led to a dreadful user experience—e.g. whenever an author perused their royalties page.

Sooooooo, what best practice should one follow instead? Here’s my thinking: Financial payables are a special type of data, in that they are typically not modified after their creation. Whereas a database column holding a customer’s age will fall out of date with passing time (thereby carrying risks of database inconsistency with respect both to the date_of_birth column and to reality), the same risk isn’t present with historical financial payables: these amounts have already become due or been paid—and, assuming the code is working correctly, that’s the end of it.

With this in mind, I rewrote my web application in a way that dramatically simplified this whole sector. Essentially, I denormalised into a new transactions table that got appended every time a commission, refund, or bonus etc. was accrued. Results were saved to the database instead of calculated anew every time. Now the calculations necessary to power commission readout pages on author dashboards were as simple as summing the entries in the transactions table for that particular person—a highly performant operation. My overall codebase was simplified because there was no longer any need for me to maintain legacy calculation logic. Instead, I could rely on the pre-calculated entries within the transactions table. What’s more is that rarely occurring exotic financial payables, like sign-on bonuses, could be easily added as I expanded the scope of my system.

Tip 5. Generate Accounts Based On Payment Processor Data Rather Than Your Own Invoicing Data

One of the most basic tenets of business accounting is that the figures in your bookkeeping records must match the cash balances you hold in your various accounts. If you start the month of January with a balance of £10,000 in your PayPal account and end the month with £22,400, then your bookkeeping entries for January (showing all the incoming and outgoing transactions) ought to sum to a net surplus of £12,400.

This sort of balancing act is harder than you might expect; all sorts of complicating factors conspire to trip you up—be they financial fees and their refunds, withdrawals or transfers to other bank accounts, once-off corrective payments to compensate for oversights and bugs, accidental personal transactions made through your business account, or currency exchange winnings and losses that vary from day to day.

The biggest systemic precaution you can take to help ensure you don’t overlook a transaction is to generate your accounts from data downloaded from your payment provider—be that PayPal, Stripe, or your business bank account’s API. You should be literally pulling down their data (or uploading their CSV) as part of your accounting process. This recommended workflow is meant to stand in contrast to generating accounts by scanning your database for orders and the like. While you obviously need to reference your internal data in preparing your accounts, they shouldn’t be your final source of truth—there’s too high a chance they miss out on some quirky transaction.

Tip 6 [Probably only relevant to those running two-sided platforms] Track whether amounts have been paid

If you pay suppliers through your platform, you will typically either (1) pay them immediately after each and every sale; or (2) settle accumulated amounts at the end of every week/month/quarter. Regardless of your choice of rhythm, there’s a potential snag: Sometimes these intended transfers do not complete correctly. Maybe the supplier provided you with invalid account details, or maybe their account was temporarily suspended—who knows. The point is that, whenever any of these problems crop up, your web application ought to clearly mark that the payment failed. (Conversely, you should track when payments complete successfully.) Having this information at hand is important, because it enables you to easily retry any failed payment in the future.

This might seem like an incredibly obvious feature, but I’d like to bring it up nonetheless because some payment processing APIs return web-code 200 responses which suggest everything is OK, only for you to later discover that while the web request was validly formed and went through just fine, the actual money is “on hold” in some strange limbo halfway between success and outright failure. The supplier, not seeing their expected payment land in their account, freaks out, sending worried emails that lead to time-consuming investigations on your side.

The lessons to be learned here are twofold: (1) carefully dig into the output details of API responses from payment processors so that you can confidently determine whether the money reached the other party’s account or not; (2) architect your database such that your web application is always aware of how much money has been paid vs. how much is still owed. In my case, that meant adding a paid column to my denormalised transactions table (as per point 4 above), and filling in this column as needed. Before I added this column, a failed batch payment to my suppliers would have necessitated painstaking, manual searches through my payment provider’s dashboard in an attempt to figure out which amounts were paid and which ones derailed.

More Articles:

What Happens If A User Clicks A Button Twice

The most common source of bugs in web applications?

Project-Level Attribute Normalisation

Why you should introduce a software layer close to your object relational mapper that formats attributes into clean, standardised forms

Textmate to VIM

Where I explain how to reproduce over 110 commands from Textmate in VIM.