The “Unknome” Database: Cataloging Lesser-Known Human Proteins for Advanced Research

by Liam O'Connor

Scientists in the United Kingdom have built a public database called the “Unknome” that catalogs thousands of under-researched proteins encoded by human genes. The platform assigns each protein a “knownness” score based on existing scientific evidence, helping investigators decide which poorly understood proteins to study. Some of these proteins turn out to be crucial to cellular activities.

Accelerating Scientific Inquiry by Zeroing In on Unidentified Proteins

The researchers hope the database will eventually shrink: it catalogs the many proteins encoded by the human genome whose functions remain largely unknown, so it should contract as those functions are worked out.

Developed by Matthew Freeman of the Dunn School of Pathology at the University of Oxford, England, Sean Munro of the MRC Laboratory of Molecular Biology in Cambridge, England, and their colleagues, the “Unknome” is described in the open-access journal PLOS Biology. Tests on a selection of proteins from the database show that a substantial number are involved in vital cellular processes, including development and resilience to stress.

The sequencing of the human genome revealed thousands of likely protein-coding sequences whose identities and functions are still unknown. Several factors explain this gap, including the tendency to direct scarce research funding toward well-known targets and the lack of suitable research tools such as antibodies.

The researchers stress that overlooking these lesser-known proteins carries considerable risk, arguing that many could be essential to critical cellular mechanisms and could offer both new insights and targets for medical treatment.

To speed up research into these proteins, the “Unknome” database assigns each one a “knownness” score based on what the scientific literature records about its function, conservation across species, subcellular localization, and other factors.

Under this scoring system, a significant number of proteins have a “knownness” score near zero. The database includes proteins from model organisms as well as from the human genome, and it is configurable: researchers can apply their own weighting criteria to set priorities for their investigations.
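
As a rough illustration of how a configurable, weighted score of this kind could work, the Python sketch below sums user-weighted annotation counts for each protein and ranks the least-known proteins first. The evidence categories, the weights, the protein names, and the `knownness` function are all assumptions invented for this example; they are not the Unknome's actual scoring scheme or data.

```python
from typing import Dict

# Hypothetical evidence categories and weights (assumptions, not Unknome values).
DEFAULT_WEIGHTS: Dict[str, float] = {
    "experimental": 1.0,
    "author_statement": 0.5,
    "computational": 0.2,
}


def knownness(annotation_counts: Dict[str, int],
              weights: Dict[str, float] = DEFAULT_WEIGHTS) -> float:
    """Weighted sum of annotation counts; unlisted categories count as zero."""
    return sum(weights.get(category, 0.0) * count
               for category, count in annotation_counts.items())


# Made-up proteins with made-up annotation counts.
proteins = {
    "PROT_A": {"experimental": 12, "computational": 30},
    "PROT_B": {"computational": 3},
    "PROT_C": {},
}

# Score with the default weights and list the least-known proteins first.
scores = {name: knownness(counts) for name, counts in proteins.items()}
ranked = sorted(scores, key=scores.get)

# A researcher who trusts only experimental evidence can supply other weights.
custom_weights = {"experimental": 1.0, "author_statement": 0.0, "computational": 0.0}
custom_scores = {name: knownness(counts, custom_weights)
                 for name, counts in proteins.items()}

print(ranked)
print(custom_scores)
```

Changing the weights changes which proteins count as poorly understood: under the custom weights in this sketch, a protein supported only by computational predictions scores zero.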

To assess the database’s usefulness, the authors examined 260 human genes that have comparable genes in flies, all with “knownness” scores of 1 or lower, meaning that almost nothing is known about them. Fully deactivating many of these genes in flies proved lethal, while partial or tissue-specific deactivation showed that a large portion had a substantial impact on functions such as fertility, growth, protein regulation, and stress tolerance.
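
The selection step described above amounts to a simple filter, sketched here in Python. The `GeneRecord` fields, the placeholder gene names, and the scores are hypothetical and serve only to show the two criteria (a fly counterpart and a “knownness” score of 1 or lower); they do not reflect the database's real schema or contents.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class GeneRecord:
    symbol: str             # placeholder name, not a real gene
    knownness: float        # hypothetical score
    has_fly_ortholog: bool  # whether a comparable fly gene exists


def select_candidates(records: List[GeneRecord],
                      max_knownness: float = 1.0) -> List[GeneRecord]:
    """Keep genes that have a fly counterpart and score at or below the cutoff."""
    return [r for r in records
            if r.has_fly_ortholog and r.knownness <= max_knownness]


# Made-up records for illustration.
records = [
    GeneRecord("GENE_1", 0.0, True),
    GeneRecord("GENE_2", 1.0, True),
    GeneRecord("GENE_3", 0.5, False),  # excluded: no fly counterpart
    GeneRecord("GENE_4", 7.5, True),   # excluded: already well studied
]

candidates = select_candidates(records)
print([r.symbol for r in candidates])
```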

The study concludes that, despite decades of extensive research, a large number of genes in flies remain unexplored, and the same holds true for the human genome. As Munro points out, these uncharacterized genes do not deserve their current neglect. The “Unknome” offers an efficient, versatile way to identify and prioritize important genes of unknown function, speeding up efforts to close the knowledge gap that the database represents.

Munro further states, “Given that the function of thousands of human proteins remains ambiguous, current research disproportionately centers around those that are well-defined. Our Unknome database seeks to counter this by ranking proteins according to their level of obscurity and then subjecting a subset of these enigmatic proteins to functional testing, thereby showing how gaps in understanding can propel biological discovery.”

The research was funded by the Medical Research Council, as part of United Kingdom Research and Innovation. Additional funding came from the Engineering and Physical Sciences Research Council and the Alan Turing Institute through a Turing Fellowship. The funders had no role in the study design, data collection and interpretation, decision to publish, or preparation of the manuscript.
DOI: 10.1371/journal.pbio.3002222

Frequently Asked Questions (FAQs) about the Unknome database

What is the “Unknome” database?

The “Unknome” is a publicly accessible database developed by scientists in the United Kingdom. It lists thousands of under-researched proteins that are encoded by human genes. The database aims to facilitate and accelerate research into these lesser-known proteins, some of which play critical roles in cellular functions.

Who developed the “Unknome” database?

The database was developed through a collaboration between Matthew Freeman of the Dunn School of Pathology, University of Oxford, England, and Sean Munro of the MRC Laboratory of Molecular Biology in Cambridge, England, along with their colleagues.

Where is the “Unknome” database described?

The “Unknome” database is described in the open-access journal PLOS Biology.

What is a “knownness” score?

A “knownness” score is a metric assigned to each protein listed in the “Unknome” database. This score is based on the existing scientific literature and reflects the extent to which the function, conservation across species, and other attributes of a protein are understood.

What is the purpose of the “knownness” score?

The purpose of the “knownness” score is to aid researchers in prioritizing their investigation of proteins. Proteins with lower “knownness” scores represent areas where little is known, hence offering avenues for new research and discovery.

Why do the database’s developers hope to see it shrink over time?

The creators of the database would welcome its contraction because it would mean that many of the currently understudied proteins have been researched and understood, reducing the number of proteins with low “knownness” scores.

How can the “Unknome” database be useful for therapeutic intervention?

The database identifies proteins that are less studied but may play important roles in cellular processes. Understanding these proteins could offer new targets for therapeutic interventions, providing both insights and avenues for medical treatments.

Is the database customizable?

Yes, the “Unknome” database is open to the public and is customizable. Researchers can apply their own weighting criteria to the “knownness” scores, thereby tailoring the database to prioritize their specific research interests.

Who funded this research?

The research was funded by the Medical Research Council as part of United Kingdom Research and Innovation. Additional funding came from the Engineering and Physical Sciences Research Council and the Alan Turing Institute through a Turing Fellowship.

7 comments

John Doe September 15, 2023 - 9:38 pm

Wow, this Unknome thing is game changing. I mean, how often do you see a database that actually wants to get smaller? Really flips the script on what success looks like in science.

Mike Johnson September 16, 2023 - 12:45 am

Are they sure they wanna go public with it? What if, like, bad actors use the database for something dodgy?

Laura Davis September 16, 2023 - 12:58 am

I’m curious how they’ll keep the database updated. With new research coming out, those knownness scores could change real fast.

Sarah Williams September 16, 2023 - 2:27 am

science moves fast, but it’s awesome that they’re looking at the areas we’re behind in. Filling gaps that we didn’t even know we had. Props to the team!

Emily Brown September 16, 2023 - 4:50 am

This is like the Google for under-researched proteins. The knownness score is brilliant, gives you a roadmap to the unknown. The future of medicine’s right here, people.

Tom Clark September 16, 2023 - 8:26 am

i didn’t get the point of a database getting smaller. but hey, if it means we learn more bout these proteins, that’s a win, right?

Jane Smith September 16, 2023 - 10:30 am

fascinating stuff but what caught my eye is how they got funding from multiple sources including Turing Fellowship. Those folks don’t throw money around, so this must be legit.
