UTM’s High-Performance Computing cluster drives data-driven research
Of the quadrillions of bits of data UTM’s High-Performance Computing (HPC) cluster has probably processed for the campus’s researchers since its debut in 2018, one fact has emerged as the most important: Using this mega-computation system is more accessible than you might think.
“When I first meet (potential users), it’s like a deer in the headlights,” says Brian Novogradac, a UTM I&ITS senior system analyst who worked with Dell to design the cluster. He now helps walk researchers through the system. “There’s a lot of information there, it can really intimidate users. I tell them from the beginning that I’m going to be using terminology that won’t make sense to you, but you’ll soon get used to using it.”
With just under 500 core systems and 3.5 terabytes of RAM, the HPC offers UTM researchers high-power data processing using shared Linux-based software such as Python, R and other open-source tools, with improved security and data storage.
Novogradac reassures new users they will only need to create the scripts specific to their research — I&ITS will handle the heavy lifting.
“We took the computer science out of the research for most users,” says Novogradac, who has been on UTM’s IT team for almost 15 years. “We want them to be able to just get in there and get going. We take care of all the administration, the hardware, the software.”
The HPC system provides the high-power data processing capabilities faculty need for their research. The cluster now has just under 100 registered users and, although the science departments are its heaviest users, it also supports studies in other fields. Novogradac says he is intrigued by the innovative ways researchers use the cluster.
“I am always open to new ideas,” he says. “You bring something new to me and I’m excited to play with it.”
The cluster is designed to give UTM’s faculty researchers a ‘step up’ from computing data on desktops or standalone systems in their labs, while still leaving them the option to use supercomputer consortiums like U of T’s SciNet or the national Compute Canada.
UTM’s HPC system is restricted to UTM faculty, or students who are directly sponsored by faculty. Its exclusivity has made it a source of envy in the research community, Novogradac says.
The cluster may also help attract prospective faculty to UTM.
“If new researchers want to come on campus they are going to ask, ‘Do you have this or that research component?’,” he says. “(The cluster has) become a selling technique to bring them to UTM.”
The research cluster is funded by the Office of the Vice-Principal, Research, with ongoing support from UTM’s I&ITS team. It is also growing through additions from the cluster’s users themselves, who often receive grants that include funds for equipment.
Tapping into the resources already in the HPC allows researchers to stretch their grant money further: they can contribute to a central resource rather than purchasing standalone equipment that is not scalable and requires its own service, support and administration.
By contributing to the central resource, they gain access not just to their own equipment but to the cluster’s services as a whole. These contributions expand the cluster’s resources, which is one of the reasons the HPC has advanced so quickly, Novogradac says. The arrangement also ensures their equipment continues to receive updates and maintenance.
The researchers pool not only their computers, but also their knowledge, Novogradac says, through a shared list of resources, cheat sheets and a wiki.
“Once I introduce them to the system and basic workflows, if they pick up additional tips or tricks, we have the wiki to share with others as it could assist others,” he says, noting many even share the coding they have created for their research.
The researchers are recognizing the value of Novogradac’s guidance and of the HPC in their work: some invite him to their thesis interviews, and others cite him in their research. Among them is biology professor Adriano Senatore, who used the HPC to examine the genomes of many species in search of a gene that plays a crucial role in structuring synapses in both vertebrates and invertebrates. The results were featured in a cover story in the journal Genome Biology and Evolution, complete with a credit to Novogradac.
“It was a real surprise and honour,” says Novogradac, who partly attributes his passion for coaching the researchers to his experience as a referee with Ontario Soccer. “But even without my name, it would be exciting, because it shows what the cluster can do.”
Novogradac says the acquisition of knowledge is a two-way street.
“I’ve learned so many things helping these researchers,” he says. “Now, I’m an IT administrator using bioinformatics terminology…and actually understanding it. It’s a great experience for all of us.”