Kyle Vernest, Author at Terra
https://terra.bio/author/kvernest/
Science at Scale
Wed, 27 Dec 2023 04:55:54 +0000

Introducing Terra on Azure
https://terra.bio/terra-on-azure-preview/
Wed, 25 Jan 2023 14:30:25 +0000

We have been working closely with our partners at Microsoft to expand Terra to a multi-cloud offering. We are now launching the preview phase of Terra on Azure. Sign up today to become an early adopter of Terra on Azure.

The post Introducing Terra on Azure appeared first on Terra.

Kyle Vernest is Head of Product in the Data Sciences Platform at the Broad Institute. In this blog post, Kyle announces the availability of Terra on Azure.


 

As you may have read previously, we have been working closely with our partners at Microsoft to expand Terra to a multi-cloud offering in order to support storing data and running analyses on Microsoft Azure. It’s with tremendous excitement that we are now launching the preview phase of Terra on Azure.

For additional context about Terra and the significance of this milestone, check out the Microsoft Research blog about Terra on Azure.

Starting today, you can request access to join Terra on Azure. To register for access, visit the Terra welcome screen, select Login from the menu in the top left, and select the “Sign in with Microsoft” option. Once your request for access to Terra on Azure has been granted, you can expect to receive an email notification that will include pointers to resources for getting started, including Azure-specific Terra documentation and instructional videos. There may be a waiting period before you are granted access; letting in new users gradually will enable us to tune our system to ensure you have an optimized experience.

 

What’s in the box?

The initial scope of functionality offered by the preview of Terra on Azure is focused on enabling efficient and scalable data storage in workspaces, multi-modal analysis using JupyterLab, and global collaboration. In the coming weeks, we’ll roll out additional features, such as workflow execution on Azure.

We’ll also be making rapid iterative improvements to the user experience based on the feedback we receive, so if you join us as an early adopter, please don’t hesitate to tell us how it’s working for you, either in the community forum or through the Helpdesk.

 

Sign up today to try it out

We’re excited to see the amazing work you’ll do with Terra on Microsoft Azure. Subscribe to the Terra Blog to stay up to date on the latest feature announcements, and sign up today to become an early adopter of Terra on Azure!

Celebrating a year of progress — and a sneak peek at what’s coming next
https://terra.bio/celebrating-a-year-of-progress-and-a-sneak-peek-at-whats-coming-next/
Thu, 15 Dec 2022 17:29:02 +0000

Highlights from Terra's development and growth in 2022, heading into the multi-cloud future of 2023.

The post Celebrating a year of progress — and a sneak peek at what’s coming next appeared first on Terra.

Kyle Vernest is Head of Product in the Data Sciences Platform at the Broad Institute. In this guest blog post, Kyle takes a look back at how Terra has grown over the past year, and gives us a preview of what to expect in the first quarter of 2023. 


 

It’s been an incredible year for Terra, with a lot of new users coming to the platform as more labs, groups, and organizations move their computational work to the cloud. We’re also thrilled to see user growth being fueled by scientific consortia such as the Human Cell Atlas, and NIH-driven programs such as AnVIL, rallying their communities around Terra as a platform for secure data sharing and collaboration. 

The Terra development teams spanning the Broad Institute, Microsoft, and Verily have worked tirelessly to continue expanding the platform’s capabilities in service of these growing communities. Highlights of the year’s releases include an improved user interface for managing cloud environments for interactive analysis, increased scalability of the workflow management system, and better tooling for uploading and organizing data in workspaces. We also rolled out numerous usability improvements, like email notifications for workflow status and better organization of the list of workspaces. Most recently, we launched the public preview of the Terra Data Repository, a new component of the Terra platform designed to provide data storage and access management capabilities tailored for the life sciences.

Yet all these upgrades are in many ways only the tip of the iceberg. Behind the scenes, an enormous amount of work has gone into laying the groundwork for a major development that will come to fruition in the first quarter of 2023: support for storing data and running analyses on Microsoft Azure. 

 

Coming soon to a cloud near you

We have been working closely with our partners at Microsoft to expand Terra to a multi-cloud offering, and we are nearing the launch of Terra on Azure, planned for early in the new year. Leading up to the launch, you may notice a new “Sign in with Microsoft” option on the Terra welcome screen (which will take you to a “Coming Soon” page until the preview phase starts).

But don’t worry if you’re planning to stick with Terra on Google; we have plenty of upgrades in store for you as well! In particular, you can look forward to taking advantage of WDL 1.1’s workflow language updates, and switching from Jupyter Notebook to JupyterLab for a more full-featured code development experience.

Whether you’re using Terra on Google or on Azure, you’ll be presented with a new version of the Terra Terms of Service, which we’ve updated to reflect the expanded functionality and new multi-cloud nature of the platform.

☁

Finally, as we close out this brief tour of the year’s achievements, we’re especially proud to celebrate the many scientific successes that Terra has already enabled. These have covered an impressive range of domains, from the Telomere-to-Telomere reference genome project to the CDC’s efforts to empower public health labs across the country to adopt genomics for biosurveillance. We look forward to many more in the coming year, featuring even greater variety — including more ‘omics data technologies beyond genomics.

 

 

Update your Terra Notebooks Utilities for continued access to data via DRS/DOS
https://terra.bio/update-your-terra-notebooks-utilities-for-continued-access-to-data-via-drs-dos/
Tue, 17 Nov 2020 15:03:05 +0000

If you have been using the Terra Notebook Utilities (TNU) to access data through DRS/DOS URIs, you need to update your version of TNU before December 1, 2020, as described further in this article. Read on to learn more.

The post Update your Terra Notebooks Utilities for continued access to data via DRS/DOS appeared first on Terra.

TL;DR: If you have been using the Terra Notebook Utilities (TNU) to access data through DRS/DOS URIs, you need to update your version of TNU before December 1, 2020, as described further below.

Some of the many data repositories that are accessible through Terra use systems called Data Repository Service (DRS, pronounced “duhrs”) and Data Object Service (DOS) to manage file locations in a way that allows you to get access to data without having to know exactly where it is stored. In other words, you can run an analysis on the data without actually knowing the exact path to where it lives. Without going into the details of how this sorcery works, the basic idea is that you give the system a unique identifier, and it gives whatever tool you’re using access to the data.

You can learn more about the DRS/DOS system and how to use it in Terra in this documentation article.
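To make the "identifier in, data access out" idea a little more concrete, here is a minimal Python sketch of the first step of resolving an identifier. It assumes the hostname-based URI convention from the GA4GH DRS specification, where the host in the URI is the server to query. The function name and the example.org host are made up for illustration. In Terra, resolution goes through Martha, so this is not what TNU or Cromwell actually run.

```python
from urllib.parse import urlparse


def drs_uri_to_url(drs_uri: str) -> str:
    """Map a hostname-based DRS URI to the HTTPS URL that describes the object.

    Assumption: the URI has the form drs://<hostname>/<object_id>. Other
    identifier forms (such as compact identifiers) are not handled here.
    """
    parsed = urlparse(drs_uri)
    if parsed.scheme != "drs" or not parsed.netloc:
        raise ValueError(f"not a hostname-based DRS URI: {drs_uri!r}")

    object_id = parsed.path.lstrip("/")
    if not object_id:
        raise ValueError(f"no object ID found in: {drs_uri!r}")

    # The server at this URL answers with metadata about the object,
    # including how to access the underlying file.
    return f"https://{parsed.netloc}/ga4gh/drs/v1/objects/{object_id}"


if __name__ == "__main__":
    # Hypothetical identifier, for illustration only.
    print(drs_uri_to_url("drs://drs.example.org/314159"))
```

The point of the indirection is that the identifier stays stable even if the file is moved; only the server's answer changes.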

If you’ve already been using a dataset that’s accessed through DRS/DOS, you’ve probably had to use identifiers to work with some or all of the data.

For workflows this is pretty transparent; you just point to the identifiers listed in the data table as your inputs, and the Cromwell workflow manager will work with Terra’s DRS/DOS processing system, which is called Martha, to get the files localized at runtime. All the relevant components are managed for you behind the scenes so you don’t need to do anything to stay up to date.

For notebooks there’s an extra layer involved; you have to use a Python package called Terra Notebook Utilities (TNU) to connect to Martha and access the files. TNU is a package you install yourself in your notebook environment, so you may occasionally need to update your version of the package to keep up with system updates.

Which is where today’s ACTION ITEM comes in. Our engineering team has recently updated the Martha service to provide new functionality, including better error messages! This is valuable progress, but it involves some functional changes that require updating the Terra Notebook Utilities to use the new version. Importantly, the old version will stop working on December 1, 2020, so you must update your installed version of the TNU package (to version 0.5.0 or later) if you want to continue accessing DRS/DOS-mediated datasets. 

The good news is that the update process is fairly straightforward; you just need to use the command corresponding to the environment you’re working in:

From any Jupyter notebook in Terra (be sure to include the leading “%”):

%pip install --upgrade --no-cache-dir terra-notebook-utils

From the CLI on standard Terra-provided Notebook environments:

/usr/local/bin/pip install --upgrade --no-cache-dir terra-notebook-utils

Note that all standard Notebook environments on Terra are based on this Docker image.

For other environments, the following should be enough:

pip install --upgrade --no-cache-dir terra-notebook-utils
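If you want to confirm that the upgrade took effect, one option is to check the installed version from Python. The sketch below is our own suggestion rather than an official TNU tool: it uses the standard library's importlib.metadata (Python 3.8 or later) to look up the terra-notebook-utils package and compares it to the 0.5.0 minimum mentioned above. The helper names are made up for illustration, and the comparison only looks at the numeric major.minor.patch parts.

```python
from importlib.metadata import PackageNotFoundError, version
from itertools import takewhile

PACKAGE = "terra-notebook-utils"
MINIMUM = "0.5.0"


def release_tuple(version_string):
    """Turn '0.5.0' into (0, 5, 0), ignoring suffixes such as 'rc1'."""
    parts = []
    for piece in version_string.split("."):
        digits = "".join(takewhile(str.isdigit, piece))
        parts.append(int(digits) if digits else 0)
    # Pad or trim to exactly three components so tuples compare cleanly.
    return tuple((parts + [0, 0, 0])[:3])


def meets_minimum(installed, minimum=MINIMUM):
    """Return True if the installed version is at least the minimum."""
    return release_tuple(installed) >= release_tuple(minimum)


def installed_version(package=PACKAGE):
    """Return the installed version string, or None if it is not installed."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_version()
    if found is None:
        print(f"{PACKAGE} is not installed in this environment")
    elif meets_minimum(found):
        print(f"{PACKAGE} {found} is recent enough")
    else:
        print(f"{PACKAGE} {found} is older than {MINIMUM}; please upgrade")
```

In a notebook, you may need to restart the kernel after upgrading so that the new version is the one that gets imported.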

If you run into any trouble, please reach out to the Terra support team through the Helpdesk form or the community forum. For more information on how to use the Terra Notebook Utilities to access data through DRS/DOS, see this documentation article.

Coming Soon – Faster, cheaper workflows
https://terra.bio/coming-soon-faster-cheaper-workflows/
Tue, 25 Aug 2020 12:50:15 +0000

Whether you’re processing ten data files or ten thousand, making your workflows run faster and cost less is always a goal. The Terra Workflow (aka “Batch”) team has been working on some cost and performance improvements. These aren’t available quite yet, but we wanted to give you a preview of two of the [...]

The post Coming Soon – Faster, cheaper workflows appeared first on Terra.

Whether you’re processing ten data files or ten thousand, making your workflows run faster and cost less is always a goal. The Terra Workflow (aka “Batch”) team has been working on some cost and performance improvements. These aren’t available quite yet, but we wanted to give you a preview of two of the improvements that are in progress:

Less wait time to load Broad’s public genome references

One performance improvement that’s almost ready will speed things up if your workflow uses any of the Broad Institute’s public genome reference data. Until now, having to copy gigabytes of references to the virtual machine meant waiting a few minutes before a task started. To reduce the wait time, we’re adding a reference disk that is automatically mounted when a workflow uses one or more of the Broad references (gs://gcp-public-data--broad-references/). Since most references are quite large, it’s usually much faster to read a file from an attached disk than to copy it onto the virtual machine’s disk. Best of all, you don’t need to change anything to take advantage of this improvement.
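If you’re curious whether one of your workflows is likely to benefit, a quick check is to see which inputs point into the Broad references bucket. The Python sketch below does only that prefix check. It assumes the bucket name is written with a double hyphen, and the function name and example inputs are made up for illustration. It does not reproduce how the workflow system itself decides when to mount the reference disk.

```python
from typing import Dict, List

# Assumption: the public Broad references bucket, written with a double hyphen.
BROAD_REFERENCES_PREFIX = "gs://gcp-public-data--broad-references/"


def broad_reference_inputs(inputs: Dict[str, object]) -> List[str]:
    """Return the names of workflow inputs that point into the Broad references bucket."""
    return sorted(
        name
        for name, value in inputs.items()
        if isinstance(value, str) and value.startswith(BROAD_REFERENCES_PREFIX)
    )


if __name__ == "__main__":
    # Hypothetical workflow inputs, for illustration only.
    example_inputs = {
        "ref_fasta": BROAD_REFERENCES_PREFIX + "hg38/v0/Homo_sapiens_assembly38.fasta",
        "input_bam": "gs://my-workspace-bucket/sample1.bam",
        "threads": 4,
    }
    print(broad_reference_inputs(example_inputs))
```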

Run workflows optimized with Google Pipelines API V1 on Terra

Using pipelines developed or optimized with Google Pipelines API V1? You’ll soon be able to use these pipelines in Terra. With Google Pipelines API V1, there were fewer machine options available, and some pipelines were developed and optimized with this in mind. A new option is in the works that will allow you to leverage Google Pipelines API V1 machine optimization within V2 – Terra’s current version.

Other improvements

Along with the aforementioned items, the team is also looking at Docker image caching. We hope you enjoyed this sneak peek of what’s coming next for workflows. Look for more on these improvements in future updates.

Delete intermediates option now available for Workflows in Terra
https://terra.bio/delete-intermediates-option-now-available-for-workflows-in-terra/
Tue, 21 Jul 2020 13:12:15 +0000

Intermediate files generated by your workflow may be an unexpected source of storage costs. Fortunately, you now have an easy option to delete these files immediately after your workflow runs. Just select the new “delete intermediate outputs” box on your workflow configuration. Saving money on your workflow analysis is always a top [...]

The post Delete intermediates option now available for Workflows in Terra appeared first on Terra.

Summary: Intermediate files generated by your workflow may be an unexpected source of storage costs. Fortunately, you now have an easy option to delete these files immediately after your workflow runs. Just select the new “delete intermediate outputs” box on your workflow configuration.

Saving money on your workflow analysis is always a top priority, but you may not be aware of one unexpected source of workflow costs: storing intermediate workflow files. Intermediate files have benefits: they enable call caching of partial results, which saves you time and money when troubleshooting unsuccessful workflows. But they also take up storage and often have no use once the workflow’s outputs have been computed. This is why we want to give you flexibility in choosing whether to store them. Terra now provides a new workflow configuration option to delete intermediate files after a successful run, saving you money on Google Cloud storage costs.

The new “Delete intermediate outputs” option is available as a checkbox (highlighted in image below) on your workflow configuration page.

[Screenshot: workflow configuration page with the “Delete intermediate outputs” checkbox highlighted]

When you check the “Delete intermediate outputs” box, your intermediate workflow files are deleted after the workflow successfully completes.

As a reminder, selecting the delete intermediate outputs option will preclude call caching on your workflow once it has successfully completed. However, if the option is selected and the workflow fails partway through, the results can still be cached. If you want to keep intermediate files, simply leave the box unchecked. You can always delete these files manually later.
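The interplay between the checkbox and call caching can be summarized as a small decision table. The Python sketch below encodes only the behavior described in this post. The names are made up for illustration, and it is a mental model rather than Terra’s actual implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RunOutcome:
    intermediates_kept: bool  # are intermediate files still in storage?
    call_cache_usable: bool   # can a later run reuse this run's results?


def run_outcome(delete_intermediates: bool, succeeded: bool) -> RunOutcome:
    """Summarize what happens to intermediate files for one workflow run."""
    # Files are only deleted when the box is checked AND the run succeeds.
    deleted = delete_intermediates and succeeded
    # Call caching relies on the intermediate files still being there.
    return RunOutcome(intermediates_kept=not deleted, call_cache_usable=not deleted)


if __name__ == "__main__":
    for checked in (True, False):
        for succeeded in (True, False):
            print(f"box checked={checked}, succeeded={succeeded}:",
                  run_outcome(checked, succeeded))
```

In short, the only case where you give up call caching is a successful run with the box checked, which is also the case where you save on storage.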

For more information about the delete intermediates option, see the following help article.

Coming Soon:

We hope to offer a manual option for deleting intermediate files in a workspace at a later time, allowing you to complete your analyses first and then look to delete the intermediate files when you are ready to share your workspace and to reduce ongoing storage costs.
