Background

Overview of Taverna workflow structure

Earlier formats: scufl and t2flow

  • scufl: Simple XML format - describing only the dataflow
    • Much is implied - e.g. a datalink between two ports means those ports exist
    • Almost no metadata
    • No structured information about services - "Insert XML here" logic
    • Easily generated by third parties
    • Somewhat executable by third parties
      • .. but they usually get the workflow semantics wrong
    • Easy to edit by hand
  • t2flow - not so simple XML with everything
    • Built around T2 engine and its implementation
      • e.g. support for multiple activities, dispatch stack, richer iteration strategies
    • XMLBeans serialization of engine state
    • Execution engine not separated from design workbench
    • Stronger annotation support
    • Parsing takes a long time as it recreates the engine state - even if you just need to edit
    • Hard to consume - very noisy
    • Exposes the complete structure of the engine
    • Very hard to generate - requires template-based copy-and-paste
    • Impossible to edit by hand
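
The scufl point that "a datalink between two ports means those ports exist" can be illustrated with a minimal sketch. This is not the scufl parser; the tuple layout, the processor and port names, and the `infer_ports` helper are all assumptions made up for illustration. It only shows how port declarations could be derived from the datalinks alone.

```python
from collections import defaultdict


def infer_ports(datalinks):
    """Derive each processor's ports from the datalinks alone.

    Each datalink is a pair ((source_processor, source_port),
    (sink_processor, sink_port)). This layout is hypothetical and
    is not the scufl schema.
    """
    inputs = defaultdict(set)
    outputs = defaultdict(set)
    for (src_proc, src_port), (dst_proc, dst_port) in datalinks:
        # A link leaving a port implies an output port on the source.
        outputs[src_proc].add(src_port)
        # A link arriving at a port implies an input port on the sink.
        inputs[dst_proc].add(dst_port)
    return dict(inputs), dict(outputs)


# Hypothetical workflow: no port is declared anywhere, only links.
links = [
    (("fetch", "sequence"), ("align", "query")),
    (("fetch", "id"), ("report", "title")),
    (("align", "hits"), ("report", "body")),
]
inputs, outputs = infer_ports(links)
```

The trade-off is the one listed above: such a format is easy to write and generate, but a port that is not linked to anything cannot be expressed, and every consumer has to reimplement the same inference.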

Motivations for SCUFL2

  • Sharable
    • .. and rerunnable workflows
  • Modular
  • Workbench vs Command line vs Server vs Grid
  • Independence from engine implementation
  • Programmatic access outside Taverna
    • Reuse existing formats like ZIP and XML
  • Semantic annotations
  • Semantic inspection
  • Flexibility - should not need to say everything
  • Translations - load/save other workflow formats
  • Embedding - add resources without having to serialize them within a massive XML
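
As a rough sketch of what "programmatic access outside Taverna" by reusing ZIP and XML could look like, the following uses only the Python standard library. The entry names (`workflow.xml`, `resources/notes.txt`), the XML element names and the `processor_names` helper are hypothetical and do not describe the actual SCUFL2 layout.

```python
import io
import xml.etree.ElementTree as ET
import zipfile

# Build a small archive in memory. Entry names and XML vocabulary
# are invented for this example, not taken from SCUFL2.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as bundle:
    bundle.writestr(
        "workflow.xml",
        '<workflow name="demo">'
        '<processor name="fetch"/>'
        '<processor name="align"/>'
        "</workflow>",
    )
    # An embedded resource lives beside the XML, not inside it.
    bundle.writestr("resources/notes.txt", "Large embedded resource")


def processor_names(archive_bytes):
    """List processor names, reading only the small XML entry."""
    with zipfile.ZipFile(io.BytesIO(archive_bytes)) as bundle:
        root = ET.fromstring(bundle.read("workflow.xml"))
    return [p.get("name") for p in root.iter("processor")]


names = processor_names(buffer.getvalue())
```

A tool that only needs the workflow structure reads one small entry and never touches the embedded resources, which is the point of the "Embedding" motivation.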

Not a requirement:

  • Editing by hand (Still need to know a lot about the services)
  • Too much implicit information - only implicit where appropriate (e.g. useful defaults)

Review of workflow languages

  • Galaxy
  • KNIME
  • WINGS and OPM-W
  • Airavata
  • BPEL!! ?? Uuuh..
  • ..?

Related languages and technologies