It’s Time to Build New Systems for Scientists (Particularly Life Scientists)
Society and the Cambridge Innovation Cluster Will Benefit
Scientists are potentially the most important technology end-users on the planet. They are the people who are conducting research that has the potential to improve and even save lives. Yet, for the most part, scientists have been almost criminally under-served by information technologists and the broad technology community.
Interestingly, life sciences have had a tremendous impact on computer science. Take, for example, Object Oriented Programming (OOP), developed by Dr. Alan Kay. A biologist by training, Dr. Kay based the fundamental concepts of OOP on microbiology: “I thought of objects being like biological cells and/or individual computers on a network, only able to communicate with messages (so messaging came at the very beginning — it took a while to see how to do messaging in a programming language efficiently enough to be useful).”
It’s time to return the favor.
Pockets of Revolution and Excellence
Of course, there are others who are trying to advance information technology for scientists, like visionaries Jim Gray, Mike Stonebraker, and Ben Fry.
Throughout his career, Jim Gray, a brilliant computer scientist and database systems researcher, articulated the critical importance of building great systems for scientists. At the time of his disappearance, Jim was working with the astronomy community to build the world-wide telescope. Jim was a vocal proponent of getting all the world’s scientific information online in a format that could easily be shared to advance scientific collaboration, discourse and debate. The Fourth Paradigm is a solid summary of the principles that Jim espoused.
My partner and Jim’s close friend Mike Stonebraker started a company — Paradigm4, based on an open source project called SciDB — to commercialize an “array-native” database system that is specifically designed for scientific applications and was inspired by Jim’s work. One of my favorite features in P4/SciDB is data provenance, which is essential for many scientific applications. If the business information community would wake up from its 30-year “one-size-fits-all” love affair with traditional DBMS, it would realize that new database engines that have provenance as an inherent feature can create better audit-ability than any of the many unnatural acts they currently do with their old school RDBMS.
Another fantastic researcher who is working at the intersection of the life sciences and computer science is Ben Fry. Ben is truly a renaissance man who works at the intersection of art, science and technology. He’s a role model for others coming out of the MIT Media Lab and a poster child for why the Lab is so essential. Cross-disciplinary application of information technology to applications in science is perhaps the most value-creating activity that smart young people can undertake over the next 20 years. (At least it’s infinitely better than going to Wall Street and making a ton of dough for a bunch of people that already have too much money).
Time to Step Up IT for Scientists
But ambitious and visionary information technology projects that are focused on the needs of scientists are too rare. I think that we need to do more to help scientists — and I believe that consumer systems as well as traditional business applications also would benefit radically.
As a technologist and entrepreneur working in the life sciences for the past 10+ years, I’ve watched the information technology industry spend hundreds of billions of dollars on building systems for financial people, sales people, marketers, manufacturing staff, and administrators. And, more recently, on systems that enable consumers to consume more advertising and retail products faster. Now the “consumerization of IT” that drove the last round of innovation in information technologies — Google, Twitter, Facebook, and so on — is being integrated into the mainstream of corporate systems. (In my opinion, however, this is taking 5 to 10 times longer than it should because traditional corporate IT folks can’t get their heads around the new tech.)
Meanwhile, scientists have been stuck with information technologies that are ill-suited to their needs — retrofitted from non-science applications and use cases. Scientists have been forced to write their own software for many years and develop their own hardware in order to conduct their research. My buddy Remy Evard was faced with this problem — primarily in managing large-scale physics information — while he was the CIO at Argonne National Labs. Now he and I share the problem in our work together at the Novartis Institutes for Biomedical Research (NIBR).
When I say “systems,” I am talking about systems that capture the data and information generated by scientists’ instruments and let scientists electronically analyze this data as they conduct their experiments and also ensure the “repeatability” of their experiments. (Back to the value of provenance). With the right kind of systems, I believe we could:
- Radically increase the velocity of experimentation. Make it much easier for scientists to do their experiments quickly and move on to the next experiment sooner
- Significantly improve the re-usability of experimentation — and help eliminate redundancy in experiments
- Ensure that experiments — both computational and wet lab — can be easily replicated
- Radically improve the velocity of scientific publication
- Radically improve the ability of peers to review and test the reproducibility of their colleagues’ experiments
Essentially, with radically better systems, we would vastly improve the productivity and creativity of science. I also believe there would be immeasurably large benefits to society as a whole — not only from the benefits of more effective and efficient science, but also as these systems improvements are applied to consumer and business information systems.
How We Got Here
So, what’s holding us back?
Scientific applications are highly analytical, technical, variable and compute-intensive — making them difficult and expensive to build.
Scientific processes don’t lend themselves to traditional process automation. Often the underlying ambiguity of the science creates ambiguity in the systems the scientists need, making development a real moving target. Developers need to practice extreme Agile/Scrum methods when building systems for scientists –traditional waterfall methods just won’t work.
Developers need to treat the core data as the primary asset and the applications as disposable or transitory. They must think of content, data and systems as integrated and put huge effort into the management of meta-data and content.
Great systems for scientists also require radical interoperability. But labs often operate as fiercely independent entities. They often don’t want to share. This is a cultural problem that can torpedo development projects.
These challenges demand high-powered engineers and computer scientists. But the best engineers and computer scientists are attracted to organizations that value their skills. These organizations have traditionally not been in life sciences, where computer scientists usually rank near the bottom of the hierarchy of scientists.
When drug companies have a few million dollars to invest in their lab programs, most will usually choose to invest it in several new new chemists instead of an application that improves the productivity of their existing chemists. Which would you choose? Some companies have just given up entirely on information systems for scientists.
So, scientists are used to going without — or they resort to hacking things together on their own. Some of them are pretty good hackers [watch]. But clearly, hacking is a diversion from their day jobs — and a tremendous waste of productivity in an industry where productivity is truly a life-and-death matter.
No More Excuses
It’s also completely unnecessary, given the changes in technology. We can dramatically lower the cost and complexity of building great systems for scientists by using Web 2.0 technologies, HTML5, cloud computing, software-as-a-service application models, data-as-a-service, big data and analytics technologies, social networking and other technologies.
For example, with the broad adoption of HTML5, apps that used to require thick clients can now be built with thin clients. So we can build them more quickly and maintain them more easily over time. We can dramatically lower the cost of developing and operating flexible tools that can handle the demanding needs of scientists.
Using Web technologies, we can make scientific systems and data more interoperable. With more interoperable systems based on the Web, we can capitalize on thousands of scientists sharing their research and riffing off of it — whether inside one company or across many. There’s tremendous benefit to large scale in the scientific community — if you can make it easier for people to work together.
Many of my friends have been asking me why I’ve been spending so much time at the Novartis Institute for Biomedical Research (NIBR) over the past three years. The simple answer is that I believe that building systems for scientists is important — and I find it incredibly rewarding.
One of the lessons I learned while we were starting Infinity Pharmaceuticals was that while scientists needed much better systems, small biotech companies didn’t have the resources to build these systems. So, every time we looked at spending $1 Million on a software project at Infinity, we always decided it was better to hire another five medicinal chemists or biologists. The critical mass required to build great systems for scientists is significant — and arguably even exceeds the resources of a 6,000+person research group like NIBR. The quality of third-party solutions available for scientists and science organizations such as NIBR are pitiful: there have been only a handful of successful and widely adopted systems, Spotfire being the most notable example. Scientists need better 3rd party systems and delivering these systems as web services might finally provide a viable business model for “informatics” companies. Two great new examples of these types of companies are Wingu and Syapse — check them out.
Calling All Cambridge Technologists
So here’s my challenge to the technology industry: How about investing $3 to $4 billion on systems for scientists? And have the Cambridge Innovation Cluster take the lead?
If you work in Kendall or Harvard Squares, you can’t throw a Flour Bakery & Café sticky-bun without hitting a scientist. It’s one huge research campus with millions of square feet of laboratory space and scientists everywhere.
Get engaged with scientist-as-the-end-user — they’ll welcome you, trust me. Build something in one of these labs. If you build something compelling in a thoughtful way, it’s going to be noticed, sought and adopted by others.
Since Cambridge has one of highest concentrations of scientists in the world, technologists here should focus on building systems for scientists. I’m betting that they can do it better than anyone else in the world.
What do you think?
Originally published: Apr 3, 2012