dc.contributor.advisor: Radhakrishnan, Sridar
dc.contributor.author: Ray, Randy
dc.date.accessioned: 2022-12-08T22:32:05Z
dc.date.available: 2022-12-08T22:32:05Z
dc.date.issued: 2022-12
dc.identifier.uri: https://hdl.handle.net/11244/336896
dc.description.abstract: One of the fastest-growing concerns in the technology sector is the increased demand for power in the world's data centers. Global data center electricity use in 2021 was estimated at between 220 and 320 terawatt-hours (TWh), as much as 1.3% of global electricity demand. As the data center industry continues to expand, so too will its power usage, and with it the need for greater energy efficiency in software development. This thesis introduces a methodology that evaluates a set of programming languages on three key metrics: performance, expressiveness, and energy use, giving fair consideration to each language's strengths and weaknesses. The framework presented implements a collection of string-matching algorithms, applied to DNA sequences, to demonstrate the capabilities of each language and draw out their distinctiveness. DNA sequencing was chosen because its uses and applications are growing as technology evolves and makes such sequencing faster and less expensive. This in turn has led to a growing percentage of compute time being spent in this field. Using the methodology presented here, it will be shown that a newer language, like Rust, has advantages that help it balance speed, ease of use, and power consumption when used for advanced scientific computing. A key part of this work introduces a novel approximate-matching algorithm to aid in this evaluation process. This new algorithm differs from the algorithms currently in use in its ability to hold the gap between nucleotides to a specific maximum while allowing other gaps to exist. It offers an alternative to current approximate-matching algorithms and is intended to give researchers another tool to consider for sequence-matching problems. The expectation is that this research will show how testing and evaluating via performance, expressiveness, and energy use metrics allows for rating and ranking programming languages in a consistent and reproducible manner. This will enable developers to make educated choices when selecting a language for a project. The methods described here will be applicable to other languages as well, given similar data to work with. This research will benefit the programming field by providing methods and techniques that can be used in the language selection process, particularly when energy efficiency is as important as overall performance. (en_US)
dc.language: en_US (en_US)
dc.subject: string matching (en_US)
dc.subject: Rust (en_US)
dc.subject: energy efficiency (en_US)
dc.subject: software performance (en_US)
dc.subject: dna sequences (en_US)
dc.subject: approximate matching (en_US)
dc.subject: Biology (en_US)
dc.subject: Bioinformatics (en_US)
dc.title: Evaluating Languages for Bioinformatics: Performance, Expressiveness and Energy (en_US)
dc.contributor.committeeMember: Grant, Christan
dc.contributor.committeeMember: Gruenwald, Le
dc.date.manuscript: 2022-12
dc.thesis.degree: Master of Science (en_US)
ou.group: Gallogly College of Engineering::School of Computer Science (en_US)

