Teaming Up to Help Solve Complex Problems in Biology
Brookhaven Lab software engineer Arfath Pasha is part of a collaborative multi-lab project focused on advancing our knowledge of plants and microbes to optimize sustainable energy production and improve the environment
May 22, 2018
Arfath Pasha is a software engineer in Brookhaven Lab's Computational Science Initiative, where he is developing the infrastructure for a computational platform designed to enable scientists to predict—and ultimately design and engineer—biological functions for sustainable bioenergy and environmental solutions. In his free time, he coaches a youth robotics team through FIRST LEGO League.
Working with others to solve problems is what Arfath Pasha does as an advanced applications engineer in the Computational Science Initiative at the U.S. Department of Energy’s (DOE) Brookhaven National Laboratory. Since joining Brookhaven Lab in January 2017, Pasha has been helping build out the infrastructure for a web-based bioinformatics platform for predictive biology. This open-source data and software platform, DOE’s Systems Biology Knowledgebase, or KBase, has been under development for the past six years through a collaboration involving more than 30 researchers from four DOE national laboratories—Argonne, Brookhaven, Lawrence Berkeley, and Oak Ridge—and several partnering organizations, including Cold Spring Harbor Laboratory and the University of Tennessee.
KBase integrates systems biology data for plants and microbes, as well as tools for managing, analyzing, and sharing these data. Systems biology refers to the study of the structures and interactions of complex biological systems, from the smallest units (atoms, molecules, cells) to the largest (organisms, populations, species). KBase was designed to facilitate collaboration among biologists and bioinformaticians, advancing our understanding of the biological functions within plant and microbial systems. The ultimate goal is to apply this understanding to produce sustainable biofuels, sequester carbon in the ecosystem, clean up polluted environments, and solve other energy and environmental challenges.
KBase combines information about plants, microbes, and the complex biomolecular interactions that take place inside these organisms into a single, integrated knowledgebase along with computational analysis tools.
“KBase’s integration of data and tools has the potential to empower scientists in a broad range of application areas for systems biology, including environmental analysis, biosystems design, and bioenergy,” explained Nomi Harris, KBase communications lead and bioinformatics project manager at Lawrence Berkeley National Laboratory. “Its sharing capabilities amplify this potential by enabling scientists with differing expertise to easily work together and leverage each other’s work. KBase users have applied the system to address a range of scientific problems, including comparing genomes of plant species, predicting microbiome interactions, and modeling the metabolism of environmental and engineered microbes.”
Finding a needle in a haystack
To get the most out of KBase, users need to leverage the relationships within the diverse datasets shared on the platform and make connections between entities. For instance, they may need to find publications that discuss certain genomes containing particular genes with specific protein functions that depend on one particular type of environment. Microbial genomes consist of several million DNA base pairs (the bacterium E. coli, for example, has about 4.6 million), and plants have significantly larger genomes, with some containing well over 100 billion base pairs. Sifting through these DNA sequences to find genes encoding enzymes that are unique to specific metabolic pathways presents a needle-in-a-haystack kind of problem.
Here is where Pasha comes in. He is working with a team of software engineers to build a knowledge engine so users can easily pinpoint the information they are looking for. According to Pasha, building out the backend infrastructure for this searching capability presents a number of engineering challenges: “Bioinformatics experiments require large amounts of data, lots of customization, and multiple steps to reach the final results. Showing these data trails and allowing for customization adds a level of complexity to the underlying system, as each step may require the application of several configurations.”
While a search capability already exists within KBase, Pasha and fellow software engineers are working together to improve the query response time, user experience, and quality of the search results. This work involves translating product descriptions, which are based on user needs, into working functionality and regularly communicating with scientists who possess the appropriate domain knowledge to ensure that the platform produces meaningful results.
For Pasha, KBase provides an opportunity not only to apply his engineering skills but also to learn. He spent the past year taking introductory online courses in genomics. The knowledge he gained about RNA sequencing ended up helping improve KBase applications in this area.
“The fact that DNA operates like a tiny machine and holds information required to perform so many functions at such a microscopic level is simply mind-boggling,” said Pasha. “It is very rewarding to be a part of this project and to learn of all the scientific discoveries enabled by KBase.”
According to Harris, science performed within KBase has been published in more than 30 peer-reviewed publications. This science includes the reconstruction of more than 8000 models of core metabolism across bacterial species and the reconstruction of semi-curated metabolic models for 773 human gut microbes. Some of the research has been publicly shared as reproducible workflows called Narratives that any user can view, copy, and re-run.
“Through these public Narratives, scientists can rapidly follow the examples set by their peers to apply similar approaches to new data and scenarios,” said Harris. “Thus, KBase goes beyond supporting reproducible science to enable rapid re-purposing, re-application, and extension of scientific techniques.”
“The KBase project is all about teamwork,” said Pasha. “What we are building here can only be achieved through collaboration, and the platform itself is designed to facilitate collaboration. The endeavor to both build KBase and use it to advance our understanding of plants and microbes can’t be done alone. No matter how good you are as an individual, if you can’t work in a team, a successful outcome is impossible.”
Applying lessons learned in teamwork
Pasha’s strong belief in teamwork comes in part from his experiences coaching elementary school children, including his son, who are participating in the FIRST LEGO League. This international robotics competition brings together teams of children in fourth through eighth grade and adult coaches to research real-world scientific problems and develop potential solutions. Part of the competition involves building and programming an autonomous LEGO Mindstorms robot to complete certain missions related to the chosen theme for that year.
Pasha and co-coach Joshua Peskay have been coaching their sons’ team of fifth graders for the past five years under the Forest Hills Robotics League, a nonprofit volunteer-based organization. In March, the team—named H7O—competed against 64 teams during the New York City FIRST LEGO League Robotics Championship held at City College of New York. At this citywide competition, which is only one level below the world championship, H7O won the first-place award for programming.
The 2017–2018 theme was Hydro Dynamics, and H7O built a question-and-answer game designed to educate players about water misuse. To research the topic, they spoke to game designers, a military officer who went without a shower for 40 days, and an individual who was trying to get fresh water to the residents of Puerto Rico after the recent hurricane. The competition has since ended for this season, but the team continues to work on the project.
Though the technology components of the competition are challenging, Pasha finds that the hardest lessons to teach the young minds relate to core values. The teams are judged not only on the quality of their research and robot design but also on their performance as a team.
The H7O team with their trophy after winning the first-place award for programming at the NYC FIRST LEGO League Citywide Championship on March 11.
“It is not easy for 10-year-olds to learn how to cooperate and function in a collaborative environment, but the ability to work with others forms the foundation for everything else that they will go on to do,” said Pasha. “The experience in teaching them core values is very enriching. I can relate to a lot of the areas they struggle with in my own work, and I find myself reflecting on these areas and reinforcing certain ideas to improve my teamwork skills. This experience has taught me how much you can benefit by listening to and being encouraging of others’ ideas and exercising patience.”
Seeking out creative engineering opportunities
Robotics has always been a passion of Pasha’s. Ever since he was a kid growing up in India, he has been fascinated with the mechanics of systems, such as how car wheels turn. He studied mechanical engineering as an undergraduate at the University of Mysore in India and went on to obtain master’s degrees in computer science and mechanical and aerospace engineering at the University of Florida. He specialized in robotics as a graduate student, serving as a research assistant at the Center for Intelligent Machines and Robotics, where he developed algorithms for autonomous systems used in nuclear facilities for the removal of contaminated waste.
After graduating, he built unmanned systems for seven years as a senior software engineer in the Advanced Robotics Group at the Air Force Research Laboratory, Tyndall Air Force Base. He then worked briefly in industry on cloud infrastructure development before joining Columbia University’s Center for Computational Learning Systems. There, he led a team of developers building commercial natural language processing tools for Arabic. Three years later, he reentered industry as a software engineer at a startup software security firm.
“I like working in areas where I can exercise my creativity,” said Pasha. “Over my career, I have sought out projects that I felt would be a lot of fun in terms of learning new things. KBase is one of those projects. I had not touched biology since high school. My previous work in robotics has no connection with biology, but I am carrying over my engineering skills and creatively applying them to improve the performance, efficiency, simplicity, and maintainability of KBase. It is exhilarating to be part of a team made up of really talented people all working toward the same goal of solving complex biological problems.”
KBase is funded by the DOE Office of Science.
Brookhaven National Laboratory is supported by the Office of Science of the U.S. Department of Energy. The Office of Science is the single largest supporter of basic research in the physical sciences in the United States, and is working to address some of the most pressing challenges of our time. For more information, please visit science.energy.gov.