Excel in Life Sciences
Excel. Love it or hate it, it is ubiquitous in the life sciences. From initial data capture and analysis, to plotting and presentation, workflow management, inventory, and even as makeshift “databases,” Excel permeates all aspects of life science informatics.
- Excel is a godsend to scientists. It provides just enough structure and functionality to empower anyone to collect, organize, and analyze their data.
- Excel is the bane of every data engineer’s existence. It provides just enough structure and functionality to empower anyone to create infinite ways to collect, organize, and analyze their data.
Countless projects and products have attempted to replace Excel within organizations and across the industry, claiming better data management, access, security, user experiences, provenance, and so on. Despite the millions (billions?) spent, none have succeeded. Excel continues to thrive, always finding new niches to occupy in the informatics stack.
At BBC, we know this all too well. Not only are some of us responsible for products trying to dethrone Excel, we also happily encourage the use of Excel for many applications. So, how did we come to stop worrying and love Excel?
In this post, we’ll take a deeper look into why Excel reigns supreme, how Excel is used in the life sciences (and why it sometimes worries us), and what your Excel usage reveals about your informatics needs.
Why We Love Excel
Let’s cut to the chase: Excel is so widely utilized simply because it’s easy to use and readily available. It provides the right abstractions for organizing data and the essential functionalities for most common analysis tasks. Excel doesn’t prescribe how to organize or analyze your data; instead, it provides intuitive tools and a simple framework to do it. Most importantly, there is no right or wrong way to use Excel. It’s a blank canvas waiting for your data to paint a picture.
What contributes to Excel’s widespread adoption?
- Intuitive Interface: Excel’s grid-like structure aligns with how we naturally think, organize, and communicate data in lists and tables. Excel provides a natural interface that works well with us humans.
- Availability: Nearly everyone has access to Excel and some level of familiarity with it. That familiarity starts as early as high school (if not earlier) and continues into everyday jobs and home organization activities. Excel is so ubiquitous that even Google mimics it with Google Sheets.
- Flexibility: While not a full-fledged programming language, Excel offers enough features and functionalities to perform work beyond simple lists. Its semi-structured nature allows for organizing data and content while still maintaining some level of rigor in data management. Excel can be tailored to specific needs via its adaptable structure, customization options, and diverse features.
- Functionality: Specifically, basic statistics and plotting capabilities. By providing a standard set of analysis and visualization tools, Excel gives non-technical users the ability to process and explore their data directly, no math or computer science degree needed.
- Communication Tool: Excel facilitates data communication between scientists and informaticians. It provides a common platform for organizing, analyzing, visualizing, sharing, and discussing findings. Excel provides a means to communicate simple and complex data in a clear and concise manner.
It’s rare that a software application gets it right, but Excel, and spreadsheets as a whole, nailed it. Other than word processors, web browsers, and operating systems, it’s hard to find other examples of widely used tools that have truly stood the test of time.
Why We Worry About Excel: Common Applications of Excel in Life Sciences
Excel offers a solution for numerous applications across various stages of life science research and development. It’s easy to make Excel the foundation of your informatics infrastructure. However, with each new application, data management gets more complex and challenging.
Let’s look at some common uses of Excel in the life sciences and what keeps us up at night (see Table 1).
| Applications | Details | Benefits | Challenges |
|---|---|---|---|
| Scratch Pad | Excel serves as a quick and easy tool for jotting down ideas, preliminary observations, and experimental notes | Easy to set up; no formalized structure required; easy to iterate through analysis ideas | Data integrity risk; sharing; version control |
| Data Capture | Excel is frequently used to record raw data from experiments, observations, and measurements | Familiar user interface; easy to define basic formulas and summarize results; visualizations aid in informal QC | Manual data entry errors; limited to no data validation; lack of automation |
| Data Processing | Excel is often used to perform basic raw data processing and modeling | Standard methods for basic statistics; data and analysis directly auditable | Yeah, we do lose some sleep over this; anything beyond basic calculations gets complex and error-prone; limited modeling options |
| Data Exchange | Excel workbooks are a common method of sharing data between organizations | Human-readable; readily shareable; generally machine-readable | Lack of standard formats; provenance and version tracking; upstream human error |
| Workflow Management | Excel spreadsheets can help track tasks, deadlines, and progress in research projects | Familiar user interface; no additional software needed; easy to adapt to changing requirements | Limited collaboration support; lack of integrated features; version control |
| Inventory Management | Excel can be used to maintain records of samples, reagents, and other laboratory supplies | Standard user interface; easy to integrate reports and dashboards; support for arbitrary inventory types | Limited scalability; lack of or limited real-time updates; limited reporting and analytics |
| Database/Registry | In some cases, Excel is used as a basic database to store and manage information | Simple entry for tabular data; WYSIWYG data; easy to export to external databases | Data integrity issues; limited security; performance issues |
Table 1: Common Applications of Excel in Life Sciences.
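Several of the challenges in Table 1, such as manual entry errors and limited data validation, can be caught with a small check outside of Excel. The Python below is a minimal sketch of that idea, not a recommended implementation. The column names (`sample_id`, `concentration_uM`), the validation rules, and the sample rows are all assumptions made up for illustration, and the sheet is represented as CSV text so the example stays self-contained.

```python
import csv
import io

# Hypothetical export of a data-capture worksheet (CSV text keeps this self-contained).
SHEET = """sample_id,concentration_uM
S-001,12.5
S-002,
S-003,abc
S-001,7.0
"""

def validate_rows(csv_text):
    """Return a list of (row_number, message) for rows that fail basic checks."""
    problems = []
    seen_ids = set()
    reader = csv.DictReader(io.StringIO(csv_text))
    # Row 1 is the header, so data rows start at 2 (matching Excel's row numbers).
    for row_number, row in enumerate(reader, start=2):
        sample_id = (row.get("sample_id") or "").strip()
        raw_value = (row.get("concentration_uM") or "").strip()

        # Assumed rule: every row needs a unique, non-empty sample ID.
        if not sample_id:
            problems.append((row_number, "missing sample_id"))
        elif sample_id in seen_ids:
            problems.append((row_number, f"duplicate sample_id {sample_id}"))
        seen_ids.add(sample_id)

        # Assumed rule: concentration must be present, numeric, and non-negative.
        if not raw_value:
            problems.append((row_number, "missing concentration"))
        else:
            try:
                if float(raw_value) < 0:
                    problems.append((row_number, "negative concentration"))
            except ValueError:
                problems.append(
                    (row_number, f"non-numeric concentration {raw_value!r}")
                )
    return problems

for row_number, message in validate_rows(SHEET):
    print(f"row {row_number}: {message}")
```

A check like this can run every time a workbook is shared or loaded into another system, which keeps Excel as the entry tool while catching upstream human error early.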
Show Us Your Spreadsheets: “The Whole Point of Capturing Data is Lost if You Keep it a Secret!”
Excel’s widespread use provides valuable insights into an organization’s informatics needs. It’s often deployed as a foundational tool, especially in the early stages of system setup and optimization, even before more sophisticated formal informatics systems are in place.
As such, one of the first things we do with new clients is to ask for their Excel files. This helps us understand how scientists use Excel to collect key data points, identify established processes that need support, determine what information needs to be captured, and pinpoint high-priority processes.
This allows us to strategically engage and develop optimized solutions that mirror an organization’s workflow:
- Excel as the Initial Informatics System: Excel is frequently the first informatics system a company deploys. By observing how scientists use it, we can uncover crucial data points to be collected and identify established processes that need support.
- Template Analysis and Data Modeling: Shared Excel workbooks, such as experiment templates, reveal what information scientists consider important to capture. These workbooks and worksheets are excellent starting points for building data models when migrating to more sophisticated, formal informatics systems.
- Workflow Mapping through Worksheets: The structure of an Excel worksheet often mirrors a scientist’s workflow, giving us a visual representation of their processes. By mapping individual scientists’ approaches to data capture and analysis, we can translate these into a common process to be supported. Variations between different scientists’ Excel workbooks can also highlight nuances in individual workflows, which is helpful for defining system requirements.
- Highlighting Bottlenecks: Frequently used Excel workbooks can point to high-priority processes that would significantly benefit from specialized software solutions.
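The template analysis and workflow mapping described above can be partly automated. As a rough sketch, the Python below compares column headers from several scientists' worksheets to find the fields everyone captures (candidates for a shared data model) and the fields only some capture (variations worth a conversation). The file names and headers are invented for illustration. In practice, the headers would be read from real workbooks with a spreadsheet library.

```python
from collections import Counter

# Hypothetical column headers taken from three scientists' assay worksheets.
workbooks = {
    "alice.xlsx": ["Compound ID", "Batch", "Conc (uM)", "IC50"],
    "bob.xlsx": ["compound id", "Conc (uM)", "IC50", "Notes"],
    "chen.xlsx": ["Compound ID", "Batch", "IC50", "Plate"],
}

def normalize(header):
    """Fold case and whitespace so 'Compound ID' and 'compound id' match."""
    return " ".join(header.lower().split())

def summarize_headers(books):
    """Split normalized headers into those shared by all workbooks and the rest."""
    counts = Counter()
    for headers in books.values():
        # Use a set so a header repeated within one workbook is counted once.
        counts.update({normalize(h) for h in headers})
    common = sorted(h for h, n in counts.items() if n == len(books))
    partial = {h: n for h, n in sorted(counts.items()) if n < len(books)}
    return common, partial

common, partial = summarize_headers(workbooks)
print("Shared by all:", common)
for header, n in partial.items():
    print(f"In {n} of {len(workbooks)} workbooks: {header}")
```

Simple name matching will miss synonyms (for example, "Conc" versus "Concentration"), so the output is a starting point for discussion with the scientists rather than a finished data model.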
Let’s revisit the applications in Table 1 and see what they can tell us about your informatics needs (see Table 2).
| Applications | Use Case | Alternatives |
|---|---|---|
| Scratch Pad | Excel serves as a quick and easy tool for jotting down ideas, preliminary observations, and experimental notes | Wikis allow for semi-structured and indexable notes; electronic notebooks provide scientifically relevant tools, such as structure and reaction builders |
| Data Capture | Excel is frequently used to record raw data from experiments, observations, and measurements | For standard processes, LIMS, LIS, and MES products help standardize data capture |
| Data Processing | Excel is often used to perform basic raw data processing and modeling | Domain-aware products, such as Signals Inventa, DataWarrior, and JMP; bespoke solutions using scientific software libraries from languages such as R and Python |
| Data Exchange | Excel workbooks are a common method of sharing data between organizations | Actually, Excel isn’t too bad at this; use standard formats for structures and sequences, such as SD files and FASTA |
| Workflow Management | Excel spreadsheets can help track tasks, deadlines, and progress in research projects | Software productivity tools such as Jira and Zendesk; no-code, Excel-like products such as Smartsheet or Airtable |
| Inventory Management | Excel can be used to maintain records of samples, reagents, and other laboratory supplies | Standalone inventory packages; inventory tools integrated into ELNs and LIMS applications |
| Database/Registry | In some cases, Excel is used as a basic database to store and manage information | ELNs with chemical registries such as CDD Vault or Signals Notebook; bespoke database applications |
Table 2: Alternatives to Excel in Life Sciences.
While it’s not always necessary to replace Excel-based solutions with targeted applications, it’s helpful to understand the options so you can evaluate the tradeoffs. Does your inventory consist of a few hundred compounds? Excel might be fine. Do you have 10,000 compounds with multiple batches per compound? You probably want a proper inventory system.
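To make the compounds-and-batches tradeoff concrete, here is a minimal sketch of the one-to-many structure that a dedicated inventory system typically provides and a flat worksheet does not enforce. It uses Python's built-in `sqlite3` module with an in-memory database. The table names, column names, and sample records are assumptions chosen for illustration, not a schema from any particular product.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this pragma is enabled.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE compound (
    compound_id TEXT PRIMARY KEY,
    name        TEXT NOT NULL
);
CREATE TABLE batch (
    batch_id    TEXT PRIMARY KEY,
    compound_id TEXT NOT NULL REFERENCES compound(compound_id),
    amount_mg   REAL NOT NULL CHECK (amount_mg >= 0)
);
""")

# Hypothetical records: one compound with two batches, one with a single batch.
conn.executemany(
    "INSERT INTO compound VALUES (?, ?)",
    [("CPD-1", "Example A"), ("CPD-2", "Example B")],
)
conn.executemany(
    "INSERT INTO batch VALUES (?, ?, ?)",
    [("B-1", "CPD-1", 25.0), ("B-2", "CPD-1", 10.0), ("B-3", "CPD-2", 5.0)],
)

# A batch that points at an unknown compound is rejected by the database,
# whereas a spreadsheet would silently accept the typo.
try:
    conn.execute("INSERT INTO batch VALUES (?, ?, ?)", ("B-4", "CPD-9", 1.0))
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

# Per-compound batch count and total remaining amount.
totals = conn.execute(
    "SELECT compound_id, COUNT(*), SUM(amount_mg) "
    "FROM batch GROUP BY compound_id ORDER BY compound_id"
).fetchall()

print("Orphan batch rejected:", orphan_rejected)
for compound_id, n_batches, total_mg in totals:
    print(f"{compound_id}: {n_batches} batch(es), {total_mg} mg total")
```

With a few hundred compounds, the same bookkeeping is manageable by hand in a worksheet. The constraints start to pay off as the number of batches per compound grows.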
“Excel usage insights lead to valuable formal system/tool requirements.”
Concluding Thoughts
Excel has undeniably earned its place as a ubiquitous and invaluable tool in life sciences and many other fields. Its availability and flexibility make it ideal for initial data management and analysis tasks. However, as data complexity and volume grow, and as the demand for sophisticated analysis and AI integration intensifies, it’s crucial to optimize and integrate Excel usage within an organization.
We believe that “Excel earns its place, not to erase—but to embrace its space.” By integrating Excel with specialized informatics solutions, we can continue to leverage its strengths while also utilizing the sophisticated, scalable, and robust systems necessary for advanced needs. This approach is particularly important as the future of life sciences will increasingly rely on scalable, secure, and AI-ready data management and analysis systems. Integrating Excel effectively will be crucial for driving innovation and achieving scientific breakthroughs.
“It’s not about abandoning Excel, but rather understanding its place in the broader informatics ecosystem and embracing a hybrid approach that combines its ease of use with the power of dedicated tools.”