Workflows
[!NOTE] All data files in the src/ethos/tokenize/maps directory are under the CC0 public domain waiver.
ETHOS - EHR foundation model
This repository implements the Adaptive Risk Estimation System (ARES) for Hospital Mortality, ICU Admission, Prolonged Length of Stay, and Composite (HM+IU+PLoS). In addition, it contains all the experiments conducted in our paper (preprint). It builds on our previous work on EHR foundation models by completely reimplementing ...
GENome EXogenous (GENEX) sequence detection
This is a computational workflow for detecting coordinates of microbial-like or human-like sequences in eukaryotic and prokaryotic reference genomes. The workflow accepts a reference genome in FASTA format and outputs coordinates of microbial-like (human-like) regions in BED format. The workflow builds a Bowtie2 index of the reference genome and aligns pre-computed microbial (GTDB v.214 or NCBI RefSeq release 213) or human (hg38) pseudo-reads to the ...
FREEPII (Feature Representation Enhancement End-to-end Protein Interaction Inference) is an end-to-end learning method that combines autonomous feature extraction and feature representation enhancement for inferring PPIs and protein complexes.
Portable genotype-free demultiplexing benchmarking pipeline.
A portable pipeline for benchmarking genotype-free single-cell demultiplexing methods on simulated data.
The pipeline is designed to be generalisable to different datasets with arbitrary numbers of simulated multiplexed samples. All software in the pipeline is run through Apptainer containers to ensure reproducibility and ease of use. By default, the pipeline is configured to run on a cluster with a SLURM scheduler, but can be ...
Type: Nextflow
Creators: Michael P Lynch, Leverages scripts developed by Weber et al (2021) DOI: https://doi.org/10.1093/gigascience/giab062
Submitter: Michael Lynch
demux_doublet_sim
Repository for the Nextflow pipeline used in the demuxSNP demultiplexing paper
Overall workflow
- Simulate doublets
- Add per-sample suffix to barcodes in BAM
- Merge per-sample BAMs
- Generate lookup of barcodes to rename to reach a set % doublets
- Rename barcodes in BAM as per lookup
- Benchmark methods
- Experiment 1: Vary doublet rate
- Experiment 2: Vary SNP subsetting
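The doublet-simulation steps listed above (suffixing barcodes, building a lookup, renaming) can be illustrated with a minimal Python sketch. It is not the pipeline's own code: the function names, the `-<sample>` suffix format, and the rule used to pick how many barcodes to rename are all assumptions made for illustration. The sketch assumes that renaming k of N barcodes onto k other barcodes leaves N - k distinct barcodes, of which k are doublets, so a target doublet rate r gives k = r * N / (1 + r).

```python
import random
from collections import Counter


def add_sample_suffix(barcodes, sample_id):
    """Append a per-sample suffix so barcodes stay unique after merging.

    The "-<sample_id>" format is a hypothetical choice for this sketch.
    """
    return [f"{bc}-{sample_id}" for bc in barcodes]


def build_doublet_lookup(barcodes, doublet_rate, seed=0):
    """Return a dict mapping barcodes to rename onto the barcodes they adopt.

    Assumes unique input barcodes. Renaming k barcodes leaves
    len(barcodes) - k distinct barcodes, k of which are doublets,
    so k / (N - k) approximates the requested doublet rate.
    """
    if not 0 <= doublet_rate < 1:
        raise ValueError("doublet_rate must be in [0, 1)")
    rng = random.Random(seed)
    n = len(barcodes)
    k = round(n * doublet_rate / (1 + doublet_rate))
    # Draw 2k distinct barcodes: the first k are renamed onto the last k.
    chosen = rng.sample(barcodes, 2 * k)
    return dict(zip(chosen[:k], chosen[k:]))


def apply_lookup(barcodes, lookup):
    """Rename barcodes according to the lookup; others are left unchanged."""
    return [lookup.get(bc, bc) for bc in barcodes]


# Two hypothetical samples of 55 cells each, merged after suffixing.
sample_a = add_sample_suffix([f"BC{i:03d}" for i in range(55)], "s1")
sample_b = add_sample_suffix([f"BC{i:03d}" for i in range(55)], "s2")
merged = sample_a + sample_b

lookup = build_doublet_lookup(merged, doublet_rate=0.1)
renamed = apply_lookup(merged, lookup)
counts = Counter(renamed)
n_doublets = sum(1 for c in counts.values() if c == 2)
print(len(lookup), len(counts), n_doublets)
```

With 110 merged barcodes and a target rate of 0.1, this renames 10 barcodes, leaving 100 distinct barcodes of which 10 are shared by two cells. In the actual pipeline the renaming is applied to barcode tags in the merged BAM rather than to an in-memory list.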
Inputs
Most inputs are specified in nextflow.config: container__souporcell: path to souporcell apptainer ...
Type: Nextflow
Creators: Michael Lynch, Leverages scripts developed by Weber et al (2021) DOI: https://doi.org/10.1093/gigascience/giab062
Submitter: Michael Lynch
PaSTa is a Nextflow-based end-to-end image analysis pipeline for decoding image-based spatial transcriptomics data. It performs imaging cycle registration, cell segmentation and transcript peak decoding. It currently supports analysis of three types of ST technology:
- in-situ sequencing-like encoding
- MERFISH-like encoding
- RNAScope-like labelling
Prerequisites:
- Nextflow. Installation guide: https://www.nextflow.io/docs/latest/getstarted.html
- Docker or Singularity. Installation guide: ...
A workflow for performing alignment and phylogeny using protein sequences when studying genes/gene families.
EC-Earth3 workflow with wrappers running in MeluXina with Autosubmit v3.15.14, used to assess the effects of task aggregation on queueing times. Workflow configuration is based on the Auto-EC-Earth3's testing suite [1].
In order to reduce the size of the workflow, the /tmp directory has been deleted. Additionally, the experiment has been cleaned up with the Autosubmit clean command. The ...
Type: Autosubmit
Creators: Pablo Goitia, Eric Ferrer, Alejandro Garcia, Genis Bonet, Gilbert Montane, Miguel Castrillo
Submitter: Pablo Goitia
EC-Earth3 workflow without wrappers running in MeluXina with Autosubmit v3.15.14, used to assess the effects of task aggregation on queueing times. Workflow configuration is based on the Auto-EC-Earth3's testing suite [1].
In order to reduce the size of the workflow, the /tmp directory has been deleted. Additionally, the experiment has been cleaned up with the Autosubmit clean command. The ...
Type: Autosubmit
Creators: Pablo Goitia, Eric Ferrer, Alejandro Garcia, Genis Bonet, Gilbert Montane, Miguel Castrillo
Submitter: Pablo Goitia
EC-Earth3 workflow without wrappers running in MareNostrum 5 with Autosubmit v3.15.18, used to assess the effects of task aggregation on queueing times. Workflow configuration is based on the Auto-EC-Earth3's testing suite [1].
In order to reduce the size of the workflow, the /tmp directory has been deleted. Additionally, the experiment has been cleaned up with the Autosubmit clean command. ...
Type: Autosubmit
Creators: Pablo Goitia, Eric Ferrer, Alejandro Garcia, Genis Bonet, Gilbert Montane, Miguel Castrillo
Submitter: Pablo Goitia