Jonathan Lawson, Author at Terra
https://terra.bio/author/jlawson/

Discover How AnVIL Empowers Compliance with NIH DMS Policy – Join Our Webinar!
https://terra.bio/discover-how-anvil-empowers-compliance-with-nih-dms-policy-join-our-webinar/
Tue, 05 Sep 2023 12:00:47 +0000

Summary:

In the rapidly evolving landscape of scientific research, data sharing has emerged as a key driver of innovation and collaboration. Recognizing this, the National Institutes of Health (NIH) put its Data Management and Sharing (DMS) Policy into effect on January 25, 2023. This policy mandates the sharing of all scientific data funded or generated by NIH, with the goal of enhancing the findability, accessibility, reusability, and impact of research. To help researchers navigate the DMS Policy’s requirements effectively, we are thrilled to introduce AnVIL, an NIH-supported Scientific Data Repository. Join us for a webinar on “How the AnVIL Helps Meet DMS Policy Requirements” to learn more about the platform.

Webinar Details:

Date: November 8th

Time: 1:00 PM ET

Topic: How the AnVIL Helps Meet DMS Policy Requirements

Registration Link

Webinar Agenda:

  1. Brief Background on the NIH Data Management and Sharing Policy
  2. Intro to AnVIL and its key components to meet DMS requirements (Terra and DUOS)
  3. Interactive Q&A and discussion session with hosts and representatives from other institutions

 

Background:

With the explosion of data generation from various sources, including genomics and single-cell profiling, the need for comprehensive data sharing policies became evident. NIH has been at the forefront of supporting robust data sharing practices, and the new DMS Policy further strengthens its commitment to responsible data management. By adhering to the DMS Policy, researchers contribute to the acceleration of biomedical research, validation of results, and accessibility to high-value datasets.

Introducing AnVIL:

AnVIL is an NIH-supported Scientific Data Repository that facilitates the management, sharing, and accessibility of scientific data. Already renowned as a gold-standard repository for large genomics datasets, AnVIL has expanded its offerings to cater to the broader requirements of the DMS Policy. This platform is open to all Institutions, Initiatives, Consortia, and Researchers seeking to meet DMS Policy requirements efficiently.

Key Components of AnVIL for DMS:

Cloud-Native Data and Metadata Storage

Terra provides intuitive self-service tools for secure, cloud-native data storage and management. Researchers can organize and document their data, ensuring it adheres to the FAIR principles (Findable, Accessible, Interoperable, and Reusable) with ease.

Data Sharing & Controlled-Access Management

DUOS is a data sharing platform that allows researchers to register datasets with UIDs/PIDs in a public catalog and leverage controlled-access management software to support Data Access Committees (DACs) when needed. Researchers can customize data access controls based on specific Data Use Limitations (DULs), ensuring secure and responsible data sharing.

Why Attend the Webinar?

This webinar presents a unique opportunity to explore AnVIL’s capabilities and understand how it can streamline your data management and sharing journey. Whether you’re a researcher, part of an institution, or representing an initiative or consortium, the webinar will equip you with valuable insights and resources. Engage in real-time discussions with our team and peers from other institutions facing similar challenges.

Don’t Miss Out! Register Now:

Join us on November 8th at 1:00 PM ET to unlock the potential of AnVIL and optimize your data management and sharing practices. Register here and secure your spot for this exciting event.

Conclusion:

The NIH Data Management and Sharing Policy ushers in a new era of transparent and collaborative research. With AnVIL, researchers gain a powerful ally in meeting the DMS Policy requirements, making data sharing easier, more secure, and more impactful. Join us at the webinar and be part of the data-driven future that promises to revolutionize biomedical research. See you there!

Streamlining Data Access: DUOS Introduces Automatic Access Permission Updates
https://terra.bio/streamlining-data-access-duos-introduces-automatic-access-permission-updates/
Mon, 14 Aug 2023 12:00:31 +0000

Introduction

Data access for researchers is critical to modern scientific endeavors. But new data access requests are hampered by an often cumbersome and manual process. Now, DUOS (Data Use Oversight System) automatically updates users’ access permissions, simplifying and expediting the process for researchers, ensuring they can swiftly and securely access the data they need. In this blog post, we explore the benefits and implications of the new automatic access permission update feature in DUOS, and link to resources to leverage DUOS to streamline data access.

The Current Data Access Landscape

Data access committees (DACs) and Data Custodians play a crucial role in managing the flow of valuable research data, ensuring the data is used ethically and responsibly. Traditionally, researchers had to manually request updates to their access permissions from the DAC when they were granted secondary access to new datasets. DACs, the centerpiece of the approval process, were overwhelmed by exponentially increasing amounts of data and data requests. The cumbersome process often led to delays and hindered the timely advancement of research initiatives.

DUOS’ Automatic Access Permission Updates

DUOS has emerged as a leading software platform for facilitating data access and management. Now, DUOS automatically updates researchers’ access permissions when they are approved for secondary access to requested research data. This eliminates the need for manual intervention, allowing researchers to seamlessly access the approved data without any delays. By streamlining the data access process, DUOS empowers researchers to focus on their work rather than navigating administrative barriers.
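As a rough illustration of the idea, the sketch below applies a DAC decision to an access list the moment it is recorded, so no separate manual request is needed. It is a minimal model only: `AccessRegistry`, `on_dac_decision`, and the shape of the decision record are hypothetical names chosen for this example and do not describe how DUOS is implemented.

```python
class AccessRegistry:
    """Hypothetical store of which researchers may read which datasets."""

    def __init__(self):
        self._readers = {}

    def grant(self, dataset_id, researcher):
        self._readers.setdefault(dataset_id, set()).add(researcher)

    def can_read(self, dataset_id, researcher):
        return researcher in self._readers.get(dataset_id, set())


def on_dac_decision(registry, decision):
    """Apply a DAC decision as soon as it is recorded.

    Returns the dataset ids that were opened to the researcher.
    """
    if not decision["approved"]:
        return []
    for dataset_id in decision["dataset_ids"]:
        registry.grant(dataset_id, decision["researcher"])
    return list(decision["dataset_ids"])


registry = AccessRegistry()
opened = on_dac_decision(
    registry,
    {"researcher": "alice", "dataset_ids": ["ds-1", "ds-2"], "approved": True},
)
denied = on_dac_decision(
    registry,
    {"researcher": "bob", "dataset_ids": ["ds-1"], "approved": False},
)
```

The point of the design is that the permission change is a side effect of the approval itself rather than a follow-up task for a person.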

How Automatic Access Updates Benefit Researchers

  • Researchers can access new data promptly upon approval, which expedites the research process
  • Automation reduces the administrative burden on researchers, freeing up more time for analysis and experimentation
  • Increased efficiency lets researchers focus on their science

Strengthening Data Security and Privacy

DACs and Data Custodians know that manual access permission updates carry an inherent risk of human error and, with it, of data breaches. DUOS’ automated permission updates are applied accurately, promptly, and in compliance with privacy regulations, reducing the chance of data mishandling and safeguarding sensitive information.

Conclusion

DUOS’ automatic access permission update marks a significant milestone in data access and management. By simplifying the process and enhancing security, the feature lets researchers focus on their core scientific pursuits and lets data custodians ensure efficient, broad and secure access to their datasets. As DUOS continues to innovate and streamline data access, it promises to empower scientists worldwide in their quest for knowledge and breakthrough discoveries.

DACs and Data Custodians can learn more about how to leverage this new feature here.

Introducing Terra Data Repository public preview
https://terra.bio/introducing-terra-data-repository-public-preview/
Thu, 08 Dec 2022 18:44:51 +0000

Discover the newest component of the Terra platform, designed to provide data storage and access management capabilities tailored for the life sciences.

Jonathan Lawson is a Senior Software Product Manager in the Broad Institute Data Sciences Platform, overseeing data management products including the Terra Data Repository and the Data Use Oversight System. In this guest blog post, Jonathan announces the public preview phase of the Terra Data Repository, a new component of the Terra platform designed to provide data storage and access management capabilities tailored for the life sciences.


 

Life sciences research has entered an age of extraordinary opportunity thanks to the rapid technological developments of the past decade. We are now able to generate vast amounts of molecular information, such as genomic sequencing, and we can put that molecular data in the context of phenotypes and clinical history to probe the biology of both health and disease in unprecedented detail. These capabilities are already starting to revolutionize how we approach everything from fundamental research into population genetics to diagnostics and drug development.

Yet these technological advances also bring new technical challenges. The resulting datasets are complex, combining enormous files of molecular data with structured information (such as phenotypic data) that is best stored in database form. In addition, data assets collected from human participants are subject to various constraints on how they can be shared, and with whom.

Solving these challenges calls for data storage and sharing solutions that empower data owners and custodians to make their datasets available to the research community for analysis securely, responsibly and effectively.

Today, we are excited to introduce the Terra Data Repository (TDR), a new component of the Terra platform designed to provide data storage and access management capabilities tailored for the life sciences. It is already actively being used for large collaborative projects including the Human Cell Atlas and the NHGRI’s AnVIL. 

The system supports using formal schemas to represent relationships between different data entities, and generating versioned snapshots that can be used to grant collaborators access to specific subsets of data depending on research purpose and authorizations. Data snapshots are immutable, making it possible to release continuous updates to datasets while ensuring reproducibility of analyses over time. 
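To make the snapshot idea concrete, here is a minimal sketch, not the TDR API: a mutable `Dataset` keeps receiving rows, while each `Snapshot` is a read-only, versioned copy of the subset selected at creation time. All class, method, and field names here are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field
from types import MappingProxyType


@dataclass(frozen=True)
class Snapshot:
    """Read-only, versioned view of a subset of a dataset."""
    dataset: str
    version: int
    tables: MappingProxyType


@dataclass
class Dataset:
    """Hypothetical mutable dataset: table name -> list of row dicts."""
    name: str
    tables: dict = field(default_factory=dict)
    snapshot_count: int = 0

    def add_row(self, table, row):
        self.tables.setdefault(table, []).append(dict(row))

    def create_snapshot(self, keep):
        """Copy and freeze the rows for which keep(table, row) is true."""
        self.snapshot_count += 1
        frozen = {
            table: tuple(
                MappingProxyType(dict(row)) for row in rows if keep(table, row)
            )
            for table, rows in self.tables.items()
        }
        return Snapshot(self.name, self.snapshot_count, MappingProxyType(frozen))


def only_gru(table, row):
    # Example selection rule: rows consented for general research use.
    return row["consent"] == "GRU"


ds = Dataset("demo")
ds.add_row("sample", {"id": "s1", "consent": "GRU"})
ds.add_row("sample", {"id": "s2", "consent": "HMB"})
v1 = ds.create_snapshot(only_gru)

# The dataset keeps growing; v1 is unaffected by later additions.
ds.add_row("sample", {"id": "s3", "consent": "GRU"})
v2 = ds.create_snapshot(only_gru)

v1_ids = [row["id"] for row in v1.tables["sample"]]
v2_ids = [row["id"] for row in v2.tables["sample"]]
```

An analysis pinned to `v1` sees the same rows every time it runs, even after the dataset has been updated and `v2` has been released.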

For a complete overview of features, usage instructions and detailed technical information, please visit the TDR documentation in the Terra knowledge base. 

The Terra Data Repository is available as a public preview to all registered users of Terra. Please note that the graphical user interface is still under active development, and many operations can currently only be performed through API calls. During this time, we recommend reaching out to the Terra support team to discuss whether the Terra Data Repository might be a good fit for your project’s needs.

 

Expediting scientific discovery by streamlining data access with DUOS
https://terra.bio/expediting-scientific-discovery-by-streamlining-data-access-with-duos/
Thu, 24 Jun 2021 14:24:17 +0000

Jonathan Lawson explains how cumbersome data access processes that create major bottlenecks in accessing genomic data and limit scientific impact can be streamlined through ontologies and automation.

Jonathan Lawson is a Senior Software Product Manager in the Broad Institute Data Sciences Platform, Vice Chair of the Broad Data Access Committee, and Co-Lead for Data Use in the GA4GH DURI workstream. In this DUOS feature blog post, Jonathan explains how cumbersome data access processes that create major bottlenecks in accessing genomic data and limit scientific impact can be streamlined through the use of DUOS, a system developed by the Broad Institute to send and receive data access requests.

 


 

The amount of available genomic data and the number of skilled researchers with the right tools to analyze it have both increased significantly, and requests for this data have increased exponentially as a result. However, the world is losing out on valuable opportunities to transform this data into scientific insights that improve human health due to unnecessarily cumbersome data access processes.

 

A major bottleneck to scientific impact

Genomic data for human participants is highly identifiable and therefore often categorized as controlled-access data by virtue of the data sharing language in its consent forms. This means access to the data must be regulated by a data access committee (DAC) that ensures the data will be shared and used in accordance with the data use terms in the consent forms.

Therefore, researchers must apply for access to such datasets through their designated DAC. In turn, the DAC must review the application to ensure that the researcher’s intended use of the data aligns with the permitted uses of the data, per the participants’ consent. Unfortunately, DACs receive the language describing permitted use and the language describing intended use from different parties (IRBs and participants on one side, researchers on the other), each using unique and sometimes ambiguous wording. They are left with an apples-to-oranges comparison that is difficult to decipher.

 

Current data access processes rely on unique, custom narrative text to describe data uses in consent forms and the datasets they represent, as well as in data access requests for those datasets. This makes adjudicating requests difficult if not impossible for DACs in certain cases.

 

The added weight of the liability involved in granting someone access to data under uncertain terms often leads DACs to undershare rather than overshare, meaning data is unnecessarily locked down, delaying or blocking significant scientific impact entirely. 

 

Facilitating data access for researchers, by enhancing DAC workflows

To address this issue and others like it, the Broad Institute developed DUOS (Data Use Oversight System), a suite of software services and policies for managing data access and sharing (more detail here). Principally, DUOS aims to consistently and transparently answer the overarching question posed to DACs: “Does the researcher’s intended use align with the permitted use of the dataset?”

DUOS aims to do this in a semi-automated fashion by leveraging the GA4GH Data Use Ontology (DUO), which provides both human-readable and machine-readable terms and definitions for data use. DUOS uses DUO terms to tag datasets with their permitted uses and data access requests with their intended uses. With both sides expressed in the same vocabulary, the DAC can make an apples-to-apples comparison, and the DUOS matching algorithm can suggest a decision to the DAC. The DUOS team is working to bring the algorithm’s suggestions into closer agreement with the decisions DACs actually make.

 

The GA4GH DUO allows for standardizing data use in consent forms and the datasets they represent, as well as in data access requests for those datasets. With data use language expressed in standardized and machine-readable terms, adjudicating requests is much easier for DACs and even possible to automate algorithmically.
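The comparison described above can be sketched in a few lines once both sides are expressed as DUO terms. The term IDs below are quoted from the published GA4GH DUO as best understood and should be checked against the ontology itself; the matching rules, class names, and the exact-string disease comparison are simplifying assumptions for illustration and are not the DUOS matching algorithm.

```python
from dataclasses import dataclass
from typing import Optional

# DUO term IDs (verify against the published ontology before relying on them).
GRU = "DUO:0000042"     # general research use
HMB = "DUO:0000006"     # health or medical or biomedical research
DS = "DUO:0000007"      # disease specific research
NPUNCU = "DUO:0000018"  # not for profit, non commercial use only


@dataclass(frozen=True)
class PermittedUse:
    """How a dataset may be used, as tagged by its data custodian."""
    primary: str
    disease: Optional[str] = None
    modifiers: frozenset = frozenset()


@dataclass(frozen=True)
class AccessRequest:
    """What a researcher intends to do with the data."""
    purpose: str
    disease: Optional[str] = None
    commercial: bool = False


# Assumed rule: a broader permission covers any narrower research purpose.
_COVERS = {GRU: {GRU, HMB, DS}, HMB: {HMB, DS}, DS: {DS}}


def suggest_decision(permitted, request):
    """Return (suggestion, reason); the DAC still makes the final call."""
    if NPUNCU in permitted.modifiers and request.commercial:
        return False, "dataset is limited to non-commercial use"
    if request.purpose not in _COVERS[permitted.primary]:
        return False, "research purpose is broader than the permitted use"
    if permitted.primary == DS and request.disease != permitted.disease:
        return False, "requested disease does not match the permitted disease"
    return True, "intended use falls within the permitted use"


dataset = PermittedUse(DS, disease="diabetes", modifiers=frozenset({NPUNCU}))
ok = suggest_decision(dataset, AccessRequest(DS, disease="diabetes"))
too_broad = suggest_decision(dataset, AccessRequest(HMB))
commercial = suggest_decision(
    dataset, AccessRequest(DS, disease="diabetes", commercial=True)
)
```

A real matcher would also need a disease hierarchy, so that a request about a subtype of a disease can match a dataset consented for the broader disease; the exact-match comparison here stands in for that.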

 

While this automated matching can alleviate a serious bottleneck, other hurdles in data access and sharing remain. We aim to address those concerns in future articles and through our work on DUOS. In the meantime, please visit our website at duos.broadinstitute.org for more information and detailed documentation. Thanks for reading and we hope DUOS serves you well!

Ontology and automation: a powerful DUO for streamlining data access
https://terra.bio/ontology-and-automation-a-powerful-duo-for-streamlining-data-access/
Wed, 25 Sep 2019 18:49:32 +0000

Life as a biomedical researcher is a rollercoaster of emotions. One moment you’re marvelling at how technology has revolutionized the field — SO. MUCH. DATA. — and the next, you’re tearing your hair out at how antiquated some of our processes still are — SO. MUCH. PAPERWORK.

Yes, there is a veritable cornucopia of data out there now, but most of the human data is subject to complex access restrictions that can be a huge pain to deal with. To be clear, I’m not talking about basic information security like checking identity and credentials. I’m referring to restrictions on future use and sharing of the data based on the consent of study participants – generally established through study-specific consent forms. These restrictions exist for legitimate reasons, but the way they’ve traditionally been implemented leaves much to be desired.

 

Data use restrictions and the data access dating game

Due to the historical lack of standardized consent forms, each study typically uses unique language to describe their data use restrictions. There are usually no templates or standards for researchers who seek to request access to the data; they must describe in their own words what kind of research they intend to do with the data, at times not even knowing which data use restrictions apply.

Who, then, is responsible for granting or denying access to the data? That gatekeeping role belongs to a Data Access Committee (DAC), which is tasked with interpreting the data use requests from researchers and evaluating whether they match or conflict with applicable restrictions. Given the lack of consistency in how data use restrictions are formulated across the biomedical ecosystem, it can be challenging for DAC members to make these decisions confidently and consistently, so the process takes time and effort. In addition, there’s a lot of variability in the composition and governance of these committees. Some DACs are part of large organizations like NIH and have a formal structure, dedicated staff, and a regular cadence for reviewing access requests, but many within research institutions are simply composed of research faculty members who are expected to perform these duties on top of their full-time jobs.

To be fair, this cumbersome system predates the advent of cloud-based data sharing, and is just as much of an obstacle in the context of traditional data sharing systems. What’s new is that cloud-based initiatives are contributing to a sharp increase in the number of researchers who are able to find datasets of interest and request access to data. As a result, DACs are increasingly unable to respond to the onslaught of requests they are receiving. When it comes down to it, human committees are simply not going to be able to scale.

[Figure: All NIH DACs, DAR decision responses, June 2007 to August 2019]

Introducing the GA4GH Data Use Ontology… and using it to automate access

To tackle this challenge, the Global Alliance for Genomics and Health (GA4GH) Data Use and Researcher Identities (DURI) workstream developed an ontological standard for defining data use restrictions. The Data Use Ontology (DUO) provides a controlled vocabulary that data generators can use to formulate data use restrictions and that researchers can use to express their intended purposes when they apply for access.

In addition to solving the challenge of consistent interpretation of data access requests, DUO can also greatly enhance the searchability of access-restricted datasets. By tagging datasets with their usage restrictions, we can enable search algorithms to take into account whether the researcher is likely to be approved to use the data based on their stated research purpose. This empowers researchers to filter out data that would be out of bounds to them anyway, which is a huge time-saver given the increasingly large numbers of available datasets.
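A search filter of this kind could look roughly like the sketch below. The catalog entries, field names, and filtering rules are hypothetical; the DUO term IDs are quoted from the published ontology as best understood and should be verified there.

```python
# DUO term IDs (verify against the published ontology).
GRU = "DUO:0000042"  # general research use
HMB = "DUO:0000006"  # health or medical or biomedical research
DS = "DUO:0000007"   # disease specific research

# Hypothetical catalog of datasets tagged with their usage restrictions.
CATALOG = [
    {"id": "dataset-a", "permitted": GRU, "disease": None},
    {"id": "dataset-b", "permitted": HMB, "disease": None},
    {"id": "dataset-c", "permitted": DS, "disease": "diabetes"},
    {"id": "dataset-d", "permitted": DS, "disease": "asthma"},
]


def likely_accessible(entry, purpose, disease=None):
    """Assumed rule set: would this stated purpose fit the dataset's tag?"""
    permitted = entry["permitted"]
    if permitted == GRU:
        return True
    if permitted == HMB:
        return purpose in (HMB, DS)
    if permitted == DS:
        return purpose == DS and disease == entry["disease"]
    return False  # unknown tag: leave the dataset out of the results


def search(catalog, purpose, disease=None):
    """Return ids of datasets the researcher could plausibly be approved for."""
    return [e["id"] for e in catalog if likely_accessible(e, purpose, disease)]


diabetes_hits = search(CATALOG, DS, "diabetes")
general_hits = search(CATALOG, GRU)
```

Filtering this way only hides datasets that look out of bounds; it does not grant access, which still goes through the request and review process.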

Taking this idea one step further, we can move from consistency in the human review of data access requests to automating the data access decision-making process entirely — our team built the Data Use Oversight System (DUOS) to do just that. Using DUOS, you can search for datasets with computer-readable data use restrictions, then apply for access to those datasets with a system-generated, computer-readable data access request. Having computer-readable DUO codes for both the data use restrictions and the data access request, DUOS is able to algorithmically evaluate whether or not the researcher should be granted access to the data!


What is DUOS?

  • Interfaces to transform data use restrictions and data access requests to machine-readable code (GA4GH Standard)
  • A matching algorithm that checks if data access requests are compatible with data use restrictions
  • Interfaces for the Data Access Committee to adjudicate whether structuring and matching has been done appropriately
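The third component, DAC review of the algorithm's output, suggests a simple way to monitor how well the matching works: record the algorithm's suggestion next to the committee's vote and measure how often they agree. The record layout and function name below are hypothetical and serve only to illustrate that bookkeeping.

```python
def agreement_rate(records):
    """Fraction of decided requests where the suggestion matched the DAC vote.

    Requests the DAC has not voted on yet (dac_vote is None) are ignored.
    Returns None when nothing has been decided.
    """
    decided = [r for r in records if r["dac_vote"] is not None]
    if not decided:
        return None
    agreed = sum(1 for r in decided if r["suggested"] == r["dac_vote"])
    return agreed / len(decided)


# Hypothetical review log: algorithm suggestion vs. final DAC vote.
records = [
    {"request": "dar-1", "suggested": True, "dac_vote": True},
    {"request": "dar-2", "suggested": True, "dac_vote": False},
    {"request": "dar-3", "suggested": False, "dac_vote": False},
    {"request": "dar-4", "suggested": True, "dac_vote": None},  # pending
]

rate = agreement_rate(records)
disagreements = [
    r["request"]
    for r in records
    if r["dac_vote"] is not None and r["suggested"] != r["dac_vote"]
]
```

The disagreements are the interesting cases: each one points either at a dataset or request that was structured incorrectly or at a rule the matcher is missing.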

For more details:

If you’d like to learn more about this powerful duo of ontology and automation, watch the recording of this past GA4GH webinar: Automating access to controlled datasets: the GA4GH Data Use Ontology in action, here on their YouTube channel. I presented our team’s DUOS implementation and talked about how researchers can already use it to access real datasets, and you’ll also hear more about the history of DUO and the work of several other implementers of DUO including the GEnome Medicine Alliance Japan (GEM Japan), Australian Genomics Health Alliance (AGHA), European Genome/Phenome Archive (EGA), DNAStack, Elixir, and the Wellcome Sanger Institute.

[Figure: DUO presenters]
