You are currently viewing Four Ways to Store Your Data Other than Excel

Four Ways to Store Your Data Other than Excel

Pop quiz: Do you like the analytic tools you have? 

If the answer is “No,” “Not very much,” or “I didn’t know I had a choice!” it’s time to re-evaluate the tools you have. This isn’t a popularity contest. I‘m not going to rain hate down on any particular programs because I believe they all have their place in analytic work, no matter how loudly the angry data scientist mob chants, “Down with Excel!” What I would like to do instead is to challenge you to think about what you use – and what you aren’t using – to see if there might be a better fit with your needs. What is the best analytic software for you, your non-profit or small business, and your ultimate goals?

Let’s start with some simple questions about your data and analytic needs. If you really want to take this little exercise seriously, you can jot down your answers for reference. If you’re more just curious to learn more about options out there, you can skip ahead.

  1. How much data do you have? Roughly, how many rows (thousands, hundreds or millions) and how many columns (a few, several dozen, hundreds?)
  2. Are they in one place or dozens of separate places? In one file or many?
  3. How often, if ever, are those datasets updated or changed?
  4.  Do you need to clean or change these files before you can use them for you analytic work (e.g. fix errors, combine two different files, create calculated columns)?
  5. Is it necessary to filter to a subset of your data or summarize (e.g. add up) parts of your data to get the information you need?
  6. Are you or do you want to create visual reports from these datasets?
  7. If yes, is all you need a static image or should people be able to interact with the visualization (choosing what years or subcategories to look at, drilling down, etc.)?
  8. How many people need to receive or view the results of your analytics?
  9. Do the underlying data or the reports themselves need to be only made available to certain people?

Now that we’ve thought through the key elements of what we might need from analytic software, it’s time to tackle the plethora of options. Today, we’ll start with Data Storage. It’s the foundational piece: you need a place to keep your data. Many organizations don’t even realize they’ve made a choice here.

Data storage is wherever your data live. Raise your hand if that’s in Excel files! It’s okay. Excel does technically count as data storage because it’s holding data for you to come back to later. However, there are many other ways for you to store your data, and some may be much better for your particular data needs.

Note: If your “data” are in images, emails, Word, or something else that isn’t actually directly accessible then you have an extraction problem you need to solve first.

Excel

Excel can absolutely manage keeping values in columns and rows. It’s often the default choice – the one folks don’t realize they’re making. While Excel can do the job, there are some serious limitations that you should think about before continuing down the “but it’s what we’ve always done” path. 

Excel is a poor choice is if you have data in the 6-figure row range, as it has file size limits. It’s also not a great choice if you need to merge multiple files together, as copying/pasting is very error-prone and functions like VLOOKUPs (joining two tables on a shared key field) have issues like dropping anything from either table that doesn’t match or breaking if a column is added to a table. Finally, the biggest threat from Excel is its auto-formatting habits. If you have a column with ZIP codes or other numbers that need to be stored as text, you can bid your leading zeroes a fond farewell! I’ve also heard horror stories about date fields getting corrupted – suddenly you’ve got things from May 21, 8352.

For this reason alone, my default recommendations is to use CSVs (discussed next) as your basic storage file. Even if Excel is your sole analysis tool, you can still keep the original data in CSV files and only copy over the elements you need to graph or analyze into Excel.

If your datasets are small enough and you’ve done what you can to limit errors, there is a lot you can do in Excel. However, the risk of typos, misplaced columns and “Excel ghosts” leads me to recommend other solutions.

Good for: relatively small data files that can stand alone

PROS:

  • Can analyze data directly in the same file
  • Easy to share
  • Easy to correct/update single values
  • Included in Microsoft Office

CONS

  • Errors from formatting
  • Difficult and error-prone to update the entire data set
  • Merging files is also challenging, requiring manual copying-and-pasting or unstable formulas like VLOOKUPS
  • Proprietary to Microsoft
 

CSV

The humble CSV (“comma separated values”) file stores data in what might look like an Excel file but isn’t. It keeps everything as text, so you don’t have formatting errors, and it can handle much, much larger files than .xslx files can. Converting your data to .csv is an easy step to take. Text files like CSVs serve strictly as data storage and will need to be combined with some kind of analysis tool.

Good for: large, standalone datasets and long-term, stable storage

Note: if your data values themselves have commas in them, consider a version of a csv that separates values with tabs or pipes (|) instead of commas.

PROS:

  • Much larger file size possible
  • Avoid data corruption from formatting
  • Update individual values directly
  • Can be opened in Excel if needed
  • Free

CONS:

  • Can’t analyze or manipulate data with functions, as you can in Excel
  • No ability to create relationships between files (e.g. joining on key fields)
 

Access

If managing the connections between your datasets is a problem, consider Microsoft Access. It’s a database program that comes included in Office 365 and can handle large and diverse datasets. You can use it to enter data through structured forms, you can create relationships between datasets based on a matching field, and you can connect Access databases to analysis tools.

A database program, in comparison to a data file, allows you to organize multiple discreet datasets in a single location, as well as define how those datasets should or should not relate to each other.

Good for: data that exist in multiple tables/files and need to be connected together. Also good for structured manual data collection. 

PROS

  • Included in Microsoft Office
  • Organize related files together
  • Write queries (“asks” of your data)
  • Collect and Report on data
  • Templates can get you started with table structures, input forms, and more
  • Can view and edit tables in a view similar to Excel

CONS 

  • A little more complicated than Excel or CSV to manage
 

SQL

There are many flavors of SQL and SQL storage programs. The main idea is that SQL is a way of talking to big datasets that lets you ask most things. SQL databases are organized and can be connected into relationships through shared columns. Since SQL is technically the language used to boss your data around, you’ll need to select which software you want to use to SQL in. Common examples are SQL Server, mySQL and SQL Developer. 

Consider SQL if you have many tables and multiple people accessing those tables. SQL platforms offer integrated security options for controlling permissions. SQl platforms can also allow you to create different views of your data, combine tables, create calculated columns, and much, much more. There are both free and paid versions of SQL platforms.

The downside to all this functionality is that SQL is a language you have to know. This requires specialized skills and training. 

Good for: complex, large datasets with many tables and dynamic relationships. 

PROS

  • Very robust
  • Manage even complex data structures easily
  • Relational database to connect as many tables as you need
  • Create dynamic data columns and tables

CONS

  • May need to pay for desired platform
  • Required technical skills and experience
 

Cloud-Based Interactive Databases

If your data needs center around business activities rather than direct analysis (think contact/client or project management) consider a web-based data management program like Airtable or Knack. These platforms offer easy-to-use, code-free options for data storage that help organize and manage diverse datasets. They also offer other application-like functions to link business activities to the underlying databases. For example, maybe you need someone to be emailed when certain customer records change, or you need a company table to be updated when the contact table changes. 

The examples listed above are very flexible and can be built out to suit a huge range of functions, from product development to HR and beyond. However, if your business need is very specific you may want to consider a database application that is tailors specifically for want you want to do with it. For example, Improvado is tailored towards managing and working with marketing data. As you might expect, customer relationship managers abound, from Salesforce to Nimble, as do project management systems like Monday or Asana. Even niche fields, like scholarship application management, have programs devoted just to them (e.g. WizeHive or GradSnapp).

These platforms require subscriptions, but this cost can be well-worth it if the investment allows you to do your work more efficiently, eliminate errors in your data, and expand your ability to collect and analyze.

Good for: complex datasets that feed into dynamic business workflows, without technical expertise available

PROS:

  • Easily handle multiple datasets and complex relationships
  • Build functions to automatically update tables under certain conditions or carry out certain business functions based on results
  • Low to no coding needed

CONS:

  • Subscription fee
  • Limited to functions available through the interface

Other SaaS enterprise data management products like Snowflake, Terradata, AWS and Azure typically aren’t necessary for small to medium sized organizations. Additionally, like SQL, they require technical expertise to manage. I only recommend these when you have huge amounts of complex data across a large enterprise, and can afford a database administrator. 

Better than Tradition

Hopefully you either have found a new data storage solution that better fits your needs, or you have a better answer for the one you currently use than “it’s tradition!” If you need any help figuring out the right way for you to store your data, please don’t hesitate to contact me