Task Manager¶

Task Manager schedules long-running tasks for other services.

This documentation describes the installation, administration and usage of Task Manager. Like all JChem Microservices modules, it is available in two modes:

As part of a microservices system
As a standalone web application

In both modes, it has to be connected to one or more managed modules.

Managed chemical functional modules¶

Task Manager does not provide chemical functionality on its own, but rather manages other chemical functional modules, so these modules must be running along with Task Manager.

The managed modules you have to start depend on the endpoints of Task Manager that you want to use:

endpoints	managed module
/rest-v1/work-queue/db/*	DB Web Services
/rest-v1/work-queue/reactor/*	Reactor Web Services

Note: more modules will be supported in the future.

Microservices system mode¶

In microservices system mode, Task Manager runs together with the Config, Discovery and Gateway services. These three services are mandatory, and optionally other services can also be part of the system, including the chosen managed modules. All configuration must be done in the Config service.

The default configuration applies to the microservices system mode.

The web application runs on host <server-host> and listens on port <gateway-server-port>.

In microservices system mode, Task Manager and the managed modules connect automatically.

Standalone web application mode¶

In standalone web application mode, Task Manager and the managed modules run without the Config, Discovery and Gateway services (however, the installer installs them as well).

The default configuration must be changed for standalone web application mode: set eureka.client.enabled=false in the application.properties file of Task Manager and of each managed module.

The addresses of the managed modules must be set in the application.properties file of Task Manager, in <server-host>:<server-port> format.

managed module	application.properties	example
DB Web Services	com.chemaxon.taskmanager.service.jwsdb	http://localhost:8062
Reactor Web Services	com.chemaxon.taskmanager.service.jws-reactor	http://localhost:8067

The managed modules also need access to Task Manager. The com.chemaxon.taskmanager.service property in their application.properties file needs to be set in <server-host>:<server-port> format. Example: http://localhost:8068
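
The standalone settings described above can be collected in one place. The following Python sketch renders the relevant application.properties lines for Task Manager and for a managed module. The property keys and example URLs come from this page; the helper names are hypothetical and are not part of the product.

```python
# Sketch: render the standalone-mode properties described above.
# Property keys come from this page; helper names and URLs are illustrative.

TASK_MANAGER_URL = "http://localhost:8068"  # example value from this page

MANAGED_MODULE_KEYS = {
    "DB Web Services": "com.chemaxon.taskmanager.service.jwsdb",
    "Reactor Web Services": "com.chemaxon.taskmanager.service.jws-reactor",
}


def task_manager_properties(module_urls):
    """Lines for Task Manager's application.properties in standalone mode."""
    lines = ["eureka.client.enabled=false"]
    for module, url in module_urls.items():
        lines.append(f"{MANAGED_MODULE_KEYS[module]}={url}")
    return lines


def managed_module_properties(task_manager_url=TASK_MANAGER_URL):
    """Lines for a managed module's application.properties in standalone mode."""
    return [
        "eureka.client.enabled=false",
        f"com.chemaxon.taskmanager.service={task_manager_url}",
    ]


if __name__ == "__main__":
    print("\n".join(task_manager_properties({"DB Web Services": "http://localhost:8062"})))
    print("\n".join(managed_module_properties()))
```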

Download¶

See here.

System requirements¶

See here.

Installation¶

See here.

The module is installed into the folder jws/jws-taskmanager.

Licenses¶

See here.

Logging¶

See here.

Configuration¶

Default configuration:

application.properties
server.port=8068
logging.file.name=../logs/jws-taskmanager.log
eureka.client.enabled=true

bootstrap.properties
spring.cloud.config.failFast=true
spring.cloud.config.uri=${CONFIG_SERVER_URI:http://localhost:8888/}
spring.cloud.config.retry.initialInterval=3000
spring.cloud.config.retry.multiplier=1.2
spring.cloud.config.retry.maxInterval=60000
spring.cloud.config.retry.maxAttempts=100

For more configuration options, see the Spring documentation page.

Database configuration¶

The Task Manager service has its own database, which stores the added structures and the task statuses. H2 and PostgreSQL databases are supported.

Default configuration:

application.properties	description
${CXN_TASK_JDBC_URL:jdbc:h2:file:./data/task_db}
spring.datasource.driverClassName=${CXN_TASK_DRIVER:org.h2.Driver}	Use org.postgresql.Driver for PostgreSQL
spring.datasource.username=${CXN_TASK_JDBC_USER:user}
spring.datasource.password=${CXN_TASK_JDBC_PASSWORD:password}
spring.jpa.database-platform=${CXN_TASK_DIALECT:org.hibernate.dialect.H2Dialect}	Use org.hibernate.dialect.PostgreSQLDialect for PostgreSQL
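
Because the defaults above are written as ${NAME:default} placeholders, they can be overridden through environment variables without editing the file. The sketch below assembles such overrides for PostgreSQL. The variable names and the driver and dialect values come from the table above; the function name, host, port, database name and credentials are placeholders chosen for illustration.

```python
# Sketch: environment overrides for running Task Manager against PostgreSQL.
# Variable names and driver/dialect values come from the table above;
# host, port, database name and credentials are placeholders.


def postgres_env(host, port, database, user, password):
    """Return the environment variables that replace the H2 defaults."""
    return {
        "CXN_TASK_JDBC_URL": f"jdbc:postgresql://{host}:{port}/{database}",
        "CXN_TASK_DRIVER": "org.postgresql.Driver",
        "CXN_TASK_JDBC_USER": user,
        "CXN_TASK_JDBC_PASSWORD": password,
        "CXN_TASK_DIALECT": "org.hibernate.dialect.PostgreSQLDialect",
    }


if __name__ == "__main__":
    # Print shell-style export lines for the hypothetical database below.
    for name, value in postgres_env("localhost", 5432, "task_db", "user", "password").items():
        print(f"export {name}={value}")
```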

File and S3 import configuration¶

Task Manager can import structures from tables exported from DB Web Services. The exported data can be uploaded from a file or from an S3 bucket.

AWS credential configuration details can be found here.

The import can be configured with the properties below.

application.properties	description
chemaxon.microservices.db.import-export.strategy=FILE	Specifies whether to use FILE-based or S3-based import. Default: FILE
chemaxon.microservices.db.import-export.dir=./data/export	Import file path used with the FILE strategy
chemaxon.microservices.db.import-export.s3-bucket-base-url=s3://export-bucket/	Import S3 bucket used with the S3 strategy

DB Web Services configuration¶

The following properties can be added to the DB Web Services application.properties file to configure communication with Task Manager.

application.properties	description
com.chemaxon.taskmanager.scheduler.enabled=false	When Task Manager starts a job in DB Web Services, only one DB service instance starts processing, regardless of how many instances are running. If this property is true, all DB service instances check for running tasks and join the processing (e.g. molecule import). If a DB Web Services instance is restarted while a task is in progress, processing continues only if this property is true. Default value is false. Enabling it is highly recommended when Task Manager is used.
com.chemaxon.taskmanager.scheduler.frequency=60000	Interval, in milliseconds, at which DB Web Services tries to join a running task if `com.chemaxon.taskmanager.scheduler.enabled=true`.
com.chemaxon.taskmanager.service=http://${TASKMANAGER_SERVICE_URL}	When DB Web Services runs in standalone mode, it can connect to Task Manager via this URL.
com.chemaxon.taskmanager.db.batchSize=5000	Number of structures DB Web Services requests from Task Manager per batch for processing.

Reactor Web Services configuration¶

The Task Manager service creates batches with its own algorithm, governed by the requestLimit and responseLimit properties. The tighter of the two constraints applies to each batch.

Default configuration:

application.properties	description
com.chemaxon.taskmanager.reactor.requestLimit=1000	Maximum number of molecules in a batch
com.chemaxon.taskmanager.reactor.responseLimit=1000	Maximum number of results an executed batch can produce

The following properties can be added to the Reactor Web Services application.properties file to configure communication with Task Manager.

application.properties	description
com.chemaxon.reactor.task.service.scheduler.enabled=false	When Task Manager starts a job in Reactor Web Services, only one Reactor service instance starts processing, regardless of how many instances are running. If this property is true, all Reactor service instances check for running tasks and join the processing (e.g. react). If a Reactor Web Services instance is restarted while a task is in progress, processing continues only if this property is true. Default value is false.
com.chemaxon.reactor.task.service.scheduler.frequency=60000	Interval, in milliseconds, at which Reactor Web Services tries to join a running task if `com.chemaxon.reactor.task.service.scheduler.enabled=true`.
com.chemaxon.taskmanager.service=http://${TASKMANAGER_SERVICE_URL}	When Reactor Web Services runs in standalone mode, it can connect to Task Manager via this URL.

Retry mechanism¶

Task Manager retries requests when it communicates with a managed module. The retry behavior can be adjusted with these two properties in the application.properties file of Task Manager:

application.properties	description
com.chemaxon.retrytemplate.backOffPeriod=10000	The time in milliseconds the service waits between two attempts to send the request to the other service again, if no response was given.
com.chemaxon.retrytemplate.maxAttempts=20	The maximum number of attempts. This count includes the very first call.

With the default values above, the service keeps trying to get a response for about 3 minutes.
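
The figure of about 3 minutes follows from the two properties: since maxAttempts includes the first call, there is one back-off wait between each pair of consecutive attempts. The short sketch below works through that arithmetic. It assumes a fixed back-off and ignores the time each request itself takes, so the real window can be somewhat longer; the function name is illustrative.

```python
# Sketch: estimate how long Task Manager keeps retrying, assuming a fixed
# back-off between attempts and ignoring the duration of each request.


def total_retry_window_ms(max_attempts, back_off_period_ms):
    """Total waiting time: one back-off between consecutive attempts."""
    return max(max_attempts - 1, 0) * back_off_period_ms


# Default values from the table above.
window_ms = total_retry_window_ms(max_attempts=20, back_off_period_ms=10000)
print(window_ms / 1000, "seconds")  # 19 waits of 10 s each
```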

When a managed module is restarted during task execution, structures can get stuck in the IN_PROGRESS state. Task Manager reprocesses these structures according to the configuration below.

application.properties	description
com.chemaxon.taskmanager.scheduler.frequency=60000	Interval, in milliseconds, at which the Task Manager job that identifies stuck structures runs.
com.chemaxon.taskmanager.scheduler.timelimit=600000	Time limit after which Task Manager tries to reprocess a stuck structure.
com.chemaxon.taskmanager.scheduler.retrylimit=3	Maximum number of times the processing of a structure is retried.

High Availability (HA)¶

Running multiple instances of the Task Manager service provides HA and load balancing.

HA mode requires a PostgreSQL database; it is not supported with H2.

Running the server¶

Prerequisites in case of microservices system mode:

Config service is running
Discovery service is running
Gateway service is running

Run the service from the command line in the folder jws/jws-taskmanager:

jws-taskmanager-service start (on Windows)

jws-taskmanager-service start (on Linux)

or

run-jws-taskmanager.exe (on Windows)

run-jws-taskmanager (on Linux)

API documentation¶

Find and try out the API on the Swagger UI.

Mode	URL of Swagger UI	Default URL of Swagger UI
microservices system	<serverhost>:<gateway-port>/jws-taskmanager/API/	localhost:8080/jws-taskmanager/API/
standalone web application mode	<serverhost>:<server-port>/API/	localhost:8068/API/

Usage¶

The Swagger UI API documentation of your installed module describes the methods and syntax for accessing the functionality of Task Manager.

DB Web Services tasks¶

Task Manager can be used to import structures into an already existing table. The table must be created at the /rest-v1/db/additional/createTable/{tableName} endpoint of DB Web Services.

The structures can be added via a REST endpoint, or can be uploaded from file or from an S3 bucket.

All uploaded structures have an automatically generated, unique key, which is only used by Task Manager and does not affect the behavior of the import. It can be used to query or delete structures.

If the structures have their own ID property, this property can also be set as an identifier in the table. The Add structure endpoint (PUT - /rest-v1/work-queue/db/task/{taskId}) has an optional id attribute for this purpose. It is recommended to provide the id property.

**Example - Import structure**

Create task

PUT - /rest-v1/work-queue/db/task

{
  "params": {
    "isDuplicateFiltering": "true",
    "tableName": "mytable"
  },
  "task": "batchInsert"
}
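
As an illustration, the create-task call above can be prepared with the Python standard library. This is a minimal sketch: the base URL is the standalone default from this page, the function name is hypothetical, and the shape of the response (for example how the task ID is returned) is not covered here and should be checked on the Swagger UI.

```python
# Sketch: build the "create task" request for a DB import task.
# The request is only constructed here, not sent.
import json
import urllib.request

BASE_URL = "http://localhost:8068"  # standalone default from this page


def build_create_db_task_request(table_name, duplicate_filtering=True, base_url=BASE_URL):
    """Return a PUT request that creates a batchInsert task for table_name."""
    body = {
        "params": {
            # The example on this page passes the flag as a string.
            "isDuplicateFiltering": "true" if duplicate_filtering else "false",
            "tableName": table_name,
        },
        "task": "batchInsert",
    }
    return urllib.request.Request(
        url=f"{base_url}/rest-v1/work-queue/db/task",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


request = build_create_db_task_request("mytable")
print(request.get_method(), request.full_url)
```

Passing the request to urllib.request.urlopen sends it to a running Task Manager instance.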

Add structures with ID

PUT - /rest-v1/work-queue/db/task/{taskId}

{
  "inputFormat": "smiles",
  "structures": [
    {
      "id": 1,
      "structure": "O=C1CCCC=C1"
    },
    {
      "id": 2,
      "structure": "C1CCCCC1"
    }
  ]
}

Note: the provided structure ID is used as the ID in the DB Web Services import.
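
When structures come from an existing data set, the request body above can be assembled programmatically. The helper below is a hypothetical sketch that builds the payload from (id, structure) pairs and leaves out the optional id attribute when no ID is available.

```python
# Sketch: build the "add structures" body from (id, structure) pairs.
# The id attribute is optional, so it is omitted when the ID is None.
import json


def build_add_structures_payload(structures, input_format="smiles"):
    """structures: iterable of (id, structure) pairs; id may be None."""
    items = []
    for structure_id, structure in structures:
        item = {}
        if structure_id is not None:
            item["id"] = structure_id
        item["structure"] = structure
        items.append(item)
    return {"inputFormat": input_format, "structures": items}


payload = build_add_structures_payload([(1, "O=C1CCCC=C1"), (2, "C1CCCCC1")])
print(json.dumps(payload, indent=2))
```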

Manage task (Start and Pause)

You can start the task with the POST call below.

POST - /rest-v1/work-queue/db/task/{taskId}/manage

{
  "action": "START"
}

Once a task is started, its status becomes "WAITING", which means it is in the execution queue. When the process starts executing the task, its status becomes "IN_PROGRESS". A task in "WAITING" or "IN_PROGRESS" status can be paused with the call below; it then transitions to the "PAUSED" state and its execution stops. A "PAUSED" task can be resumed by calling start again as shown above.

Pause call:

POST - /rest-v1/work-queue/db/task/{taskId}/manage

{
  "action": "PAUSE"
}
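
A client can check the rules described above before sending a manage call. The sketch below encodes only the transitions stated on this page: pausing is possible from WAITING or IN_PROGRESS, and a PAUSED task can be started again. The function names are hypothetical, and the service may accept other actions or statuses that are not modeled here.

```python
# Sketch: client-side checks for the START and PAUSE actions described above.
PAUSABLE_STATUSES = {"WAITING", "IN_PROGRESS"}
KNOWN_ACTIONS = {"START", "PAUSE"}


def can_pause(status):
    """A task can be paused while it is queued or being executed."""
    return status in PAUSABLE_STATUSES


def can_resume(status):
    """A paused task is resumed by sending START again."""
    return status == "PAUSED"


def manage_body(action):
    """Body of the POST .../task/{taskId}/manage call."""
    if action not in KNOWN_ACTIONS:
        raise ValueError(f"unsupported action: {action}")
    return {"action": action}


print(can_pause("IN_PROGRESS"), can_resume("PAUSED"), manage_body("PAUSE"))
```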

Monitor task progress

GET - /rest-v1/work-queue/db/task/{taskId}

While the task is in progress, the response contains progress information.

{
  "status": "IN_PROGRESS",
  "processedPercentage": 12
}

When the import is done, the task is placed in READY status.

{
  "status": "READY"
}

If the task has any issues, the status is updated to ERROR, and the response contains the number of processed and failed inputs.

{
  "status": "ERROR",
  "processedSuccess": 0,
  "processedError": 2
}
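
Monitoring typically means polling the status endpoint until the task reaches READY or ERROR. The sketch below shows one way to structure such a loop. The fetch_status callable is a stand-in for whatever performs the GET call and returns the parsed JSON; the polling interval and budget are arbitrary illustrative values. A PAUSED task is not treated as finished, so the loop keeps polling it until the budget runs out.

```python
# Sketch: poll a task until it reaches a final status.
# fetch_status is a caller-supplied function that performs the GET call
# and returns the parsed JSON response as a dict.
import time

TERMINAL_STATUSES = {"READY", "ERROR"}


def wait_for_task(fetch_status, poll_interval_s=5.0, max_polls=120, sleep=time.sleep):
    """Return the first response whose status is READY or ERROR."""
    for _ in range(max_polls):
        response = fetch_status()
        if response.get("status") in TERMINAL_STATUSES:
            return response
        sleep(poll_interval_s)
    raise TimeoutError("task did not finish within the polling budget")


if __name__ == "__main__":
    # Simulated responses instead of real HTTP calls.
    fake = iter([
        {"status": "WAITING"},
        {"status": "IN_PROGRESS", "processedPercentage": 12},
        {"status": "READY"},
    ])
    print(wait_for_task(lambda: next(fake), poll_interval_s=0))
```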

Task result

GET - /rest-v1/work-queue/db/task/{taskId}/results?resultIdType=ALL

When the task has finished, the results can be retrieved.

Example result in case of success:

{
  "failedIds": [],
  "failedKeys": [],
  "duplicatedIds": [],
  "duplicatedKeys": [],
  "successfulIds": [
    1,
    2
  ],
  "successfulKeys": [
    "s100",
    "s105"
  ]
}

Note: if IDs were not provided when the structures were added, the "successfulIds" list will contain the automatically
generated IDs.

Example result in case of error:

{
  "failedIds": [
    1,
    2
  ],
  "failedKeys": [
    "s100",
    "s105"
  ],
  "duplicatedIds": [],
  "duplicatedKeys": [],
  "successfulIds": [],
  "successfulKeys": []
}
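
For reporting, the result lists can be reduced to simple counts. The following sketch assumes the response has the shape shown in the two examples above; the function name is illustrative.

```python
# Sketch: summarize a DB task result of the shape shown above.


def summarize_db_results(result):
    """Count successful, duplicated and failed structures by their IDs."""
    return {
        "successful": len(result.get("successfulIds", [])),
        "duplicated": len(result.get("duplicatedIds", [])),
        "failed": len(result.get("failedIds", [])),
    }


success_example = {
    "failedIds": [], "failedKeys": [],
    "duplicatedIds": [], "duplicatedKeys": [],
    "successfulIds": [1, 2], "successfulKeys": ["s100", "s105"],
}
print(summarize_db_results(success_example))
```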

Reactor Web Services tasks¶

Task Manager can be used to execute a reaction on structures that have already been added.

The structures can be added via a REST endpoint, or can be uploaded from file or from an S3 bucket.

All uploaded structures have an automatically generated, unique key, which is only used by Task Manager and does not affect the behavior of the import. It can be used to query or delete structures.

The structures may have their own ID property, which is the identifier in the reaction results.
The Add structure endpoint (PUT - /rest-v1/work-queue/reactor/task/{taskId}) has an optional id attribute for this purpose. It is recommended to provide the id property.

When uploading structures, it is mandatory to specify the position, which indicates the position of the uploaded reactant lists in relation to each other.

**Example - Import structure**

Create task

PUT - /rest-v1/work-queue/reactor/task

{
  "params": {
    "copyPropertyByReactant": [
      {
        "copyAs": "NewPropertyName",
        "copyFrom": 1,
        "propertyName": "PropertyName"
      }
    ],
    "outputFormat": "smiles",
    "productIndexes": [
      1
    ],
    "ratio": [
      1
    ],
    "reactantInputFormat": "smiles",
    "reaction": "[#6:8]-[#6:1](=[O:3])-[#6:2](-[#6:9])=[O:6]>>[H:5][#8:4]-[#6:1](=[O:3])[C:2]([#6:8])([#6:9])[#8:6][H:7]",
    "resultType": "product",
    "showUnsuccessfulReactions": true,
    "unambiguousOnly": false
  },
  "task": "react"
}

Add structures with ID and position

PUT - /rest-v1/work-queue/reactor/task/{taskId}

{
  "inputFormat": "smiles",
  "position": 0,
  "structures": [
    {
      "id": 1,
      "structure": "OC(=O)CC(=O)C(=O)CC(O)=O"
    },
    {
      "id": 2,
      "structure": "C1CCCCC1"
    }
  ]
}

Note: the provided structure ID appears in the result.
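
For a reaction with several reactant lists, each list is uploaded with its own position. The sketch below builds one request body per list. It assumes that positions are zero-based and assigned in list order, which matches the single-list example above (position 0) but is not stated explicitly on this page; the function name is hypothetical.

```python
# Sketch: build one "add structures" body per reactant list.
# Assumption: positions are zero-based and follow the order of the lists.


def build_reactant_payloads(reactant_lists, input_format="smiles"):
    """reactant_lists: list of lists of (id, structure) pairs."""
    payloads = []
    for position, structures in enumerate(reactant_lists):
        payloads.append({
            "inputFormat": input_format,
            "position": position,
            "structures": [
                {"id": structure_id, "structure": structure}
                for structure_id, structure in structures
            ],
        })
    return payloads


payloads = build_reactant_payloads([
    [(1, "OC(=O)CC(=O)C(=O)CC(O)=O"), (2, "C1CCCCC1")],
])
print(payloads[0]["position"], len(payloads[0]["structures"]))
```

Each body would then be sent with a separate PUT call to /rest-v1/work-queue/reactor/task/{taskId}.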

Manage task (Start and Pause)

You can start the task with the POST call below.

POST - /rest-v1/work-queue/reactor/task/{taskId}/manage

{
  "action": "START"
}

Once a task is started, its status becomes "WAITING", which means it is in the execution queue. When the process starts executing the task, its status becomes "IN_PROGRESS". A task in "WAITING" or "IN_PROGRESS" status can be paused with the call below; it then transitions to the "PAUSED" state and its execution stops. A "PAUSED" task can be resumed by calling start again as shown above.

Pause call:

POST - /rest-v1/work-queue/reactor/task/{taskId}/manage

{
  "action": "PAUSE"
}

Monitor task progress

GET - /rest-v1/work-queue/reactor/task/{taskId}

While the task is in progress, the response contains progress information.

{
  "status": "IN_PROGRESS",
  "processedPercentage": 12
}

When the reaction is done, the task is placed in READY status.

{
  "status": "READY"
}

If the task has any issues, the status is updated to ERROR, and the response contains the number of processed and failed inputs.

{
  "status": "ERROR",
  "processedSuccess": 0,
  "processedError": 2
}

Task result

GET - /rest-v1/work-queue/reactor/task/3/results?resultIdType=SUCCESS&pageNumber=0&pageSize=20

When the task has finished, the results can be retrieved.
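
The results endpoint takes its paging and filter options as query parameters. The sketch below builds such a URL with the standard library; the parameter names and values come from the calls shown in this section, while the function name and the default base URL (the standalone default port) are illustrative assumptions.

```python
# Sketch: build the URL of the Reactor results endpoint with paging.
from urllib.parse import urlencode


def reactor_results_url(task_id, result_id_type="SUCCESS", page_number=0,
                        page_size=20, base_url="http://localhost:8068"):
    """Return the results URL for one page of a Reactor task."""
    query = urlencode({
        "resultIdType": result_id_type,
        "pageNumber": page_number,
        "pageSize": page_size,
    })
    return f"{base_url}/rest-v1/work-queue/reactor/task/{task_id}/results?{query}"


print(reactor_results_url(3))
```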

Example result in case of success:

{
  "products": [
    "{\"result\":\"OC(=O)CC(O)(CC(O)=O)C(O)=O\",\"reactantIds\":[\"1\"]}"
  ]
}

Note: if IDs were not provided when the structures were added, the "reactantIds" list will contain null values.
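
In the success example above, each entry of "products" is itself a JSON document encoded as a string, so it needs a second parsing step. A minimal sketch, assuming the response shape shown above (the function name is illustrative):

```python
# Sketch: decode the string-encoded entries of the "products" list.
import json


def decode_products(response):
    """Parse each product entry, which is a JSON document stored as a string."""
    return [json.loads(item) for item in response.get("products", [])]


example_response = {
    "products": [
        "{\"result\":\"OC(=O)CC(O)(CC(O)=O)C(O)=O\",\"reactantIds\":[\"1\"]}"
    ]
}

for product in decode_products(example_response):
    print(product["result"], product["reactantIds"])
```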

Example result in case of error:

GET - /rest-v1/work-queue/reactor/task/3/results?resultIdType=ERROR&pageNumber=0&pageSize=20

{
  "results": [
    {
      "errorMessage": "Unable to use specified reaction.",
      "structures": [
        {
          "id": 1,
          "structure": "OC(=O)CC(=O)C(=O)CC(O)=O",
          "position": 0
        },
        {
          "id": 2,
          "structure": "C1CCCCC1",
          "position": 0
        }
      ]
    }
  ]
}