The spreadsheet conundrum
Spreadsheets are everywhere.
A huge number of us use Excel or Google Sheets on a near-daily basis for expense tracking, budgeting, keeping up with loan or mortgage payments, to-do lists, scheduling events and managing projects and activities.
There's no question that spreadsheets are a fantastic tool for many jobs involving small to moderate amounts of data.
Although the volume of data that can be manipulated in spreadsheets has grown in recent years, with advances in software and hardware, that is not the whole story. Long before you reach the theoretical limits of your spreadsheet, you are likely to begin experiencing usability issues.
Firstly, there’ll be the size of the file itself - even without images or graphics in the files, you'll find long load and save times make for awkward use.
It’s true that high-speed broadband, computers with SSD drives, large amounts of memory and fast multi-core CPUs can mitigate this, but not every business has this infrastructure or wants to buy powerful computers for all their staff, particularly when companies today rely so much on remote workers.
As your spreadsheet accumulates more and more data, you may see errors with add-ons and formulas as they try to run lookups, calculations, or transformations on the huge data sets. If you don't have a high-spec computer, these problems will become apparent much sooner.
Considered also what happens if you need to work with such spreadsheets over a 3G or 4G signal on a mobile device. One can imagine how using a small screen for this would be an exercise in low productivity and frustration.
Mind you, even on a powerful desktop or tablet, users are likely to get frustrated with the sheer volume of data right in front of their face, as they scroll this way and that, trying to find what they’re looking for.
Having been around for decades, spreadsheets have some built-in mechanisms to tackle some of these issues. You could get around some problems using VBA (a programming language for Microsoft Office) or Google's App Scripts, and essentially change your spreadsheet into a kind of application, complete with basic elements of UI, business logic and of course data - all in one package.
I think this is a viable solution in many cases.
I also think there is a better way, because spreadsheets have some downsides that can't be ignored.
- All your data, (and logic) is in a single file. If you file is lost or corrupted - you potentially lose it all. (Subtle hint: Back up your files!)
- Security is awkward. You can certainly password-protect your file, hide columns or password-protect cells and logic. But usually most of the information in a spreadsheet is intended to be visible and accessible. Hint: if your spreadsheets contain highly confidential data, use the masking and cell-protection features.
- Portability - yes, you can copy your spreadsheet easily from one place to another, download it easily from the cloud and share it without much trouble. But so could anyone else.
- Relationships are awkward. You can easily create any number of sets of data across worksheets, but how do you set up and maintain relations between each of them? This is vital to limiting data corruption, duplication, limiting redundancy, improving data consistency and the ability to consume the data.
So, what’s the alternative then?
Well, the ideal scenario is to take all your data over to a relational database. (The 'relational' bit just means that sets of data can be 'related' to each other, and these relationships will be enforced on all existing data and future additions and changes.)
How does a database address the issues we raised about spreadsheets? What other benefits do you get?
- Your data is managed through a database 'engine' and not simply saved to a single file. The engine’s primary purpose is to robustly store and organise data. This organisation makes it easier to search for information when you need it.
- Security is a huge aspect of database design, and your data can be secured in multiple ways. You can bring your business rules and logic into the database - which can be protected with layers of built-in security. An entire database can be encrypted, and it is even possible to encrypt only particular pieces of data – it’s how e-commerce sites store credit card details.
- Although it is certainly possible to copy databases, migrate or move them around, this process is non-trivial. You can't just copy it to a USB drive, and even if you did, there’s no guarantee that you can access the data.
- Nearly all databases have a mechanism whereby data access or data changes can be audited. If you need to know who has looked at or changed something in the database – you’ll be able to find out.
- Databases typically have a built-in tool or have extensions to support scheduled full and incremental backups. Enterprise-level databases also have ‘transactional logging’ which maintains a record of all operations that take place on the database.
- Whereas spreadsheet sizes typically range up to perhaps tens of megabytes, databases at the commercial end can be truly huge – in the order of terabytes. This means that there is no physical reason not to consolidate all your business data into your database.
- All your data sets can be related to other sets using common keys and rules which can be used to maintain data integrity and consistency, and limit data redundancy.
- Databases are designed for ‘scalability’. That is, they can be ‘scaled’ to enable concurrent use from not just multiple users, but also other applications – such as desktop applications, websites or mobile apps. And when the usage grows, so can the database platform.
So, what next for your spreadsheets?
If you are managing small sets of data, that don’t need to be linked to each other, stick to spreadsheets. If, however you have large or fast-expanding volumes of data, or you need to apply strong rules over your data you may want to consider migrating your data to a database.
In my next post I’ll explain how you can start planning for this.