Background

Overview of Taverna workflow structure

Workflow, processor, activity, ports, datalinks
Taverna Reloaded and architecture
Brief overview of engine architecture (iterations, dispatch)
Some picture
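
Sketch of the structure above as plain data classes. This is only an illustration of how workflow, processor, activity, ports and datalinks relate; all class and field names are assumptions made for this outline, not the Taverna API, and workflow-level ports are left out.

```python
from dataclasses import dataclass, field


@dataclass
class Port:
    name: str


@dataclass
class Activity:
    kind: str  # illustrative label only, e.g. "wsdl" or "beanshell"


@dataclass
class Processor:
    name: str
    inputs: list = field(default_factory=list)      # list of Port
    outputs: list = field(default_factory=list)     # list of Port
    activities: list = field(default_factory=list)  # list of Activity


@dataclass
class DataLink:
    source: tuple  # (processor name, output port name)
    sink: tuple    # (processor name, input port name)


@dataclass
class Workflow:
    name: str
    processors: list = field(default_factory=list)
    datalinks: list = field(default_factory=list)

    def dangling_links(self):
        """Return datalinks whose source or sink port is not declared."""
        outs = {(p.name, port.name) for p in self.processors for port in p.outputs}
        ins = {(p.name, port.name) for p in self.processors for port in p.inputs}
        return [link for link in self.datalinks
                if link.source not in outs or link.sink not in ins]
```

With explicit ports, a link to a port that nobody declared can be detected instead of silently creating that port.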

Earlier formats: scufl and t2flow

scufl: Simple XML format - describing only the dataflow
- Much is implied - e.g. a datalink between two ports means those ports exist
- Almost no metadata
- No structured information about services - "Insert XML here" logic
- Easily generated by third parties
- Somewhat executable by third parties
  - .. but they usually get the workflow semantics wrong
- Easy to edit by hand
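
Sketch of the "datalink implies ports" point above: a consumer of a dataflow-only format has to infer the ports (and sometimes the processors) from the links. The XML below is a hypothetical, simplified shape invented for this example; it is not the real scufl schema, and the `processor:port` notation is an assumption.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hypothetical dataflow-only document (NOT the real scufl schema).
DOC = """
<workflow>
  <processor name="fetch"/>
  <processor name="align"/>
  <link source="fetch:sequence" sink="align:input"/>
  <link source="align:result" sink="report:text"/>
</workflow>
"""


def infer_ports(xml_text):
    """Derive input/output ports per processor from the links alone."""
    root = ET.fromstring(xml_text)
    inputs = defaultdict(set)
    outputs = defaultdict(set)
    for link in root.iter("link"):
        src_proc, src_port = link.get("source").split(":", 1)
        dst_proc, dst_port = link.get("sink").split(":", 1)
        outputs[src_proc].add(src_port)
        inputs[dst_proc].add(dst_port)
    declared = {p.get("name") for p in root.iter("processor")}
    # Processors that only exist because a link mentions them.
    undeclared = (set(inputs) | set(outputs)) - declared
    return dict(inputs), dict(outputs), undeclared


inputs, outputs, undeclared = infer_ports(DOC)
```

Here `report` is never declared, yet the second link implies both the processor and its `text` port; every third-party reader has to make the same guess.
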
t2flow: Not-so-simple XML format - describing everything
- Built around T2 engine and its implementation
  - e.g. support for multiple activities, dispatch stack, richer iteration strategies
- XMLBeans serialization of engine state
- Execution engine not separated from design workbench
- Stronger annotation support
- Parsing takes a long time as it recreates the engine state - even if you just need to edit
- Hard to consume - very noisy
- Exposes the complete structure of the engine
- Very hard to generate - requires template-based copy-and-paste
- Impossible to edit by hand

Motivations for SCUFL2

Sharable
- .. and rerunnable workflows
Modular
Workbench vs Command line vs Server vs Grid
Independence from engine implementation
Programmatic access outside Taverna
- Reuse existing formats like ZIP and XML
Semantic annotations
Semantic inspection
Flexibility - should not need to say everything
Translations - load/save other workflow formats
Embedding - add resources without having to serialize them within a massive XML
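
Sketch of what "programmatic access outside Taverna" could look like when the container is plain ZIP and XML: standard-library tools are enough to write and inspect a bundle, and embedded resources sit next to the workflow description rather than inside it. The file names and XML layout here are hypothetical, chosen only for this example; they are not the SCUFL2 bundle layout.

```python
import io
import zipfile
import xml.etree.ElementTree as ET


def write_bundle(buf):
    """Write a tiny ZIP bundle: one XML description plus one embedded resource."""
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
        zf.writestr("workflow.xml",
                    '<workflow name="demo"><processor name="fetch"/></workflow>')
        # The resource is stored as its own entry, not serialized into the XML.
        zf.writestr("resources/script.txt", "print('hello')")


def read_bundle(buf):
    """Return (workflow name, processor names, embedded resource paths)."""
    with zipfile.ZipFile(buf) as zf:
        root = ET.fromstring(zf.read("workflow.xml"))
        processors = [p.get("name") for p in root.iter("processor")]
        resources = sorted(n for n in zf.namelist() if n.startswith("resources/"))
    return root.get("name"), processors, resources


buf = io.BytesIO()
write_bundle(buf)
buf.seek(0)
summary = read_bundle(buf)
```

No engine state is recreated here: the reader touches only the entries it needs.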

Not a requirement:

Editing by hand (still need to know a lot about the services)
Too much implicitness - implicit only where appropriate (e.g. useful defaults)

Review of workflow languages

Galaxy
Knime
WINGS and OPM-W
Airavata
BPEL (?)
(others?)

Related languages and technologies