ScraperFlow is a framework that enables declarative, flow-based programming. It is built on two main components: the core, which translates the declarative description (JSON or YAML) into a format the framework understands, and the nodes, which are used to construct workflows. The architecture is plugin-based, so nodes can be implemented independently and provided to the framework.
The main goal of this framework is to facilitate code reuse (nodes) and to simplify control-flow management through declarative workflow specification.
The workflow specification is statically checked to ensure that the configuration is well-typed against the composition of nodes.
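For illustration, here is a hypothetical two-node workflow in this style. Only the `log` node and its `f`/`log` keys are taken from the quickstart example below; the flow name and messages are placeholders:

```yaml
# Hypothetical workflow sketch: a named flow ("start") is a list of node
# configurations, where `f` selects the node type and the remaining keys
# configure that node. A misspelled key or a wrongly typed value here
# would be rejected by the static type check before the workflow runs.
start:
  - {f: log, log: step one}
  - {f: log, log: step two}
```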
- ScraperFlow Node Documentation
- The documentation covers all nodes, including extra nodes, not only those in the core framework
- ScraperFlow Wiki
- ScraperFlow Editor (prototype, deprecated)
- Example Workflows
The documentation can be found at the ScraperFlow Wiki.
A minimal specification that can be used for any of the quickstart sections:
start:
- {f: log, log: hello world}

ScraperFlow is deployed to Docker Hub.
To run a ScraperFlow container once, use
docker run -v "$PWD":/rt -v "$PWD":/nodes -v "$PWD":/plugins -v "$PWD":/runtime-nodes --rm albsch/scraperflow:latest help
and place your workflow in the current working directory. '$PWD' can be changed to another working directory if needed. To supply custom nodes or plugins (such as dev-nodes), place the jar(s) in the current working directory (or change '$PWD') as well.
ScraperFlow is fully modularized.
Get the latest modular jar bundle and any plugin or additional node jars you like.
Place the additional plugin and node Java modules in a var folder next to the run script.
Use the provided run script to run ScraperFlow.
ScraperFlow will look for workflows relative to the working directory.
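The layout described above can be sketched as follows; the top-level directory and jar names are placeholders, not names from the ScraperFlow distribution:

```shell
# Sketch of the expected layout: the run script at the top level,
# plugin and node jars in a var folder beside it.
# All names below are placeholders.
mkdir -p scraperflow/var
touch scraperflow/var/extra-nodes.jar    # a downloaded node jar
touch scraperflow/var/some-plugin.jar    # a downloaded plugin jar
ls scraperflow/var
```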
Execute ./gradlew installDist. This will install ScraperFlow in your home directory at ~/opt/scraperflow.
The scraperflow start script can then be executed via
~/opt/scraperflow/scraperflow.
Additional plugin jars can be put into ~/opt/scraperflow/var.
Using
gradle clean build codeCov
will
- compile the project
- test the project
- package the project at application/build/distributions
- generate a code coverage report at build/reports/jacoco/codeCoverageReport/html/index.html
Specification parsers are plugins and need to be provided on the module path. Executing ScraperFlow in an IDE requires the module path to be extended with the following JVM parameter:
--add-modules ALL-MODULE-PATH