Have you ever uploaded a bunch of files but forgotten to set the format, and then had to go and manually reconfigure each dataset individually? Wouldn’t it be nice if there were an easier way to set these attributes on multiple datasets at a time?
The workflow trick tip
There is an open issue requesting the ability to do this using the existing multiple dataset operations that will hopefully be addressed soon. In the meantime, here is a fast workaround that you can use today to quickly set the format on multiple datasets at once. We’ll walk through a quick example where I uploaded 10 fastq files, but their format was not set to the more specific fastqsanger format that is required by many tools.
First, we are going to put all of our datasets into a collection, if they are not already. If you were not planning on using these datasets in a collection, just choose to create a simple list.
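If you would rather script this step, the simple list can be sketched in Python. This is only an illustration under assumptions: it assumes the history contents are available as a list of dictionaries with `id`, `name`, `extension`, `deleted`, and `history_content_type` keys (roughly what the Galaxy API returns), and the helper names `select_by_extension` and `build_list_collection` are made up for this example.

```python
def select_by_extension(contents, extension):
    """Return the non-deleted datasets in `contents` whose format is `extension`."""
    return [
        item
        for item in contents
        if item.get("history_content_type") == "dataset"
        and item.get("extension") == extension
        and not item.get("deleted", False)
    ]


def build_list_collection(name, datasets):
    """Describe a simple list collection holding the given history datasets."""
    return {
        "collection_type": "list",
        "name": name,
        "element_identifiers": [
            # "hda" marks each element as a plain history dataset.
            {"id": d["id"], "name": d["name"], "src": "hda"}
            for d in datasets
        ],
    }


# Possible usage with BioBlend (assumes a reachable Galaxy server and API key):
#   from bioblend.galaxy import GalaxyInstance
#   gi = GalaxyInstance(url="https://galaxy.example.org", key="YOUR_API_KEY")
#   contents = gi.histories.show_history(history_id, contents=True)
#   description = build_list_collection(
#       "fastq files", select_by_extension(contents, "fastq")
#   )
#   gi.histories.create_dataset_collection(history_id, description)
```

The helpers only build the description, so you can inspect which datasets would be included before anything is sent to the server.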
Now we are going to create a new workflow. Click “Workflow” in the masthead at the top of your browser, then click the “Create” button. Name and save your new workflow.
In the workflow editor, on the left-hand side, click to expand the Collection Operations tool section and click to add the Filter failed tool to your workflow. Click on the newly created Filter failed tool within the workflow editor. On the right-hand side of the workflow interface, you are able to edit the configuration of this tool.
In the Filter failed tool configuration, click Configure Output: 'input dataset(s) (filtered failed datasets)' to enable post job actions, and select Change datatype. Set it to fastqsanger. Click to save the workflow at the top right, and click the play button to open the workflow run interface.
Make sure the correct collection is selected as input and then click Run Workflow. After the workflow completes, you will have a new collection that contains new versions of your datasets with the datatype properly set to fastqsanger. These new datasets reference the original underlying file content and as a result do not add to your disk usage.
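For the curious, the Change datatype setting ends up stored on the workflow step as a post job action. The sketch below builds the kind of entry you would typically see in an exported workflow file; the helper name `change_datatype_action` is hypothetical, and the output name `output` for the Filter failed tool is an assumption, so check an export of your own workflow before relying on it.

```python
def change_datatype_action(output_name, new_type):
    """Build a post job action entry that changes the datatype of one step output."""
    # Exported workflows key each action by its type followed by the output name.
    key = "ChangeDatatypeAction" + output_name
    return {
        key: {
            "action_type": "ChangeDatatypeAction",
            "output_name": output_name,
            "action_arguments": {"newtype": new_type},
        }
    }


# Assumption: the Filter failed tool exposes its result under the name "output".
post_job_actions = change_datatype_action("output", "fastqsanger")
```

The tool itself passes every successful dataset through unchanged; it is this attached action that does the format change.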
Using the converted datasets outside a Collection
If you want to have access to these datasets outside of the collection, they are available in your history as hidden datasets. To expose them, click hidden beneath the history name, activate the multiple dataset actions (click the checkbox), and select the datasets that you want to unhide (we can just choose All in this case). Once the desired datasets are selected, click For all selected, and choose Unhide datasets. The hidden datasets (22-31 in our example) are now visible in your history and can be easily selected in tools as needed.
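The unhide step can also be scripted. Here is a minimal sketch, assuming the history contents are a list of dictionaries with `id`, `visible`, `deleted`, and `history_content_type` keys, and that the client object offers an `update_dataset(history_id, dataset_id, visible=True)` method, as BioBlend's `gi.histories` does. The function names are illustrative only.

```python
def hidden_dataset_ids(contents):
    """Return the ids of datasets that are hidden but not deleted."""
    return [
        item["id"]
        for item in contents
        if item.get("history_content_type") == "dataset"
        and not item.get("visible", True)
        and not item.get("deleted", False)
    ]


def unhide_datasets(histories_client, history_id, contents):
    """Mark every hidden dataset in `contents` as visible; return the ids changed."""
    ids = hidden_dataset_ids(contents)
    for dataset_id in ids:
        histories_client.update_dataset(history_id, dataset_id, visible=True)
    return ids


# Possible usage with BioBlend (assumes a reachable Galaxy server and API key):
#   from bioblend.galaxy import GalaxyInstance
#   gi = GalaxyInstance(url="https://galaxy.example.org", key="YOUR_API_KEY")
#   contents = gi.histories.show_history(history_id, contents=True)
#   unhide_datasets(gi.histories, history_id, contents)
```

Note that this unhides every hidden dataset in the history, the same as choosing All in the interface, so filter `contents` first if you only want some of them.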