Choral Installation on Amazon Oracle RDS and Fargate

    Prerequisite

    An RDS database must be running, with a user that the Choral service can use to create its backend data. In this document, the credentials of this user are referred to as <RDS user> and <RDS password>.

    Fargate service CPU and memory capacity is limited, so check that the suggested values do not exceed Fargate's capacity. If memory is insufficient, you can uncheck caching options to lower the memory requirements. If the requirements still exceed Fargate's capacity, we recommend running the Choral service on an EC2 instance instead, as described in this documentation.
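    The capacity check described above can be sketched in a few lines of Python. This is a minimal illustration, not part of the Choral tooling: the function name and the size table are assumptions. The table reflects the Fargate task sizes published in the AWS documentation at the time of writing, so verify it against the current AWS limits before relying on it.

```python
# Fargate task sizes: CPU units -> allowed memory values in MiB.
# Assumption: copied from the AWS documentation at the time of writing;
# verify against the current limits for your region and platform version.
FARGATE_MEMORY_MIB = {
    256: [512, 1024, 2048],
    512: list(range(1024, 4096 + 1, 1024)),
    1024: list(range(2048, 8192 + 1, 1024)),
    2048: list(range(4096, 16384 + 1, 1024)),
    4096: list(range(8192, 30720 + 1, 1024)),
    8192: list(range(16384, 61440 + 1, 4096)),
    16384: list(range(32768, 122880 + 1, 8192)),
}


def smallest_fargate_size(required_memory_mib, min_cpu_units=256):
    """Return the smallest (cpu_units, memory_mib) that fits the request.

    Returns None when no Fargate task size offers enough memory.
    """
    for cpu in sorted(FARGATE_MEMORY_MIB):
        if cpu < min_cpu_units:
            continue
        for memory in FARGATE_MEMORY_MIB[cpu]:
            if memory >= required_memory_mib:
                return cpu, memory
    return None
```

    Pass the suggested memory requirement in MiB. A result of None means the requirement does not fit any Fargate task size in the table, which is the case where lowering the caching options or moving to an EC2 instance applies.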

    System architecture diagram

    By default, the Choral service stores its index data on the filesystem. On Fargate, however, the Choral service can write the index data back to the RDS database, as depicted in the diagram below. This prevents data loss on service failure and makes backups easier.

    Installation steps

    1. Initialize Choral DB: The RDS database must be initialized before the Choral Fargate service starts to operate on it. Initialization can be run from an EC2 instance. See our detailed description here.
    2. Docker image build and upload: See the provided Dockerfile and entrypoint.sh examples. These can be used as customizable templates to build a Docker image suitable for the AWS Fargate service. The Docker image uses entrypoint.sh to initialize the Choral application properties from fixed values and environment variables. Set <RDS user> and <RDS password> as the JDBC_USER and JDBC_PASSWORD environment variables, and the RDS database endpoint as JDBC_URL. The default values of the caching properties match the defaults of the JChem Engines cache and memory calculator. A container created from this image also starts the Choral service and can send logs to AWS if, for example, CloudWatch is configured and the LOG environment variable is set to true. The time between sending log messages can be set with the SLEEP_TIME_WHEN_LOG environment variable.
    3. Create Fargate service: Create the Fargate task for Choral from an image pushed to ECR. Beforehand, a health check for the Choral service can be set using the http://localhost:<CHORAL_SERVICE_PORT>/health endpoint. By default, the port number is 8128, which is also exposed in the Dockerfile.
    4. Link Choral service to RDS: The RDS database requires a static hostname for the Choral service, but a Fargate service is only reachable through a dynamic address. This can be resolved with, for example, a load balancer in front of the Fargate service. Once a static hostname is available, update the settings in the database in accordance with this documentation.
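    As a rough pre-flight aid for steps 2 and 3, the following Python sketch checks that the environment variables named above are set and probes the health endpoint. It is not the provided entrypoint.sh. The function names are hypothetical, and treating an HTTP 200 response as healthy is an assumption, since the response format of the /health endpoint is not described here.

```python
import http.client
import urllib.request

# Environment variables named in step 2.
REQUIRED_VARS = ("JDBC_URL", "JDBC_USER", "JDBC_PASSWORD")
# Default Choral service port named in step 3.
DEFAULT_PORT = 8128


def missing_variables(env):
    """Return the required variable names that are unset or empty in env."""
    return [name for name in REQUIRED_VARS if not env.get(name)]


def health_url(port=DEFAULT_PORT, host="localhost"):
    """Build the health check URL for the given port and host."""
    return f"http://{host}:{port}/health"


def is_healthy(url, timeout=5.0):
    """Return True if the endpoint answers with HTTP 200.

    Assumption: a 200 status means the service is healthy. Connection
    errors, timeouts, and HTTP error statuses are reported as False.
    """
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except (OSError, http.client.HTTPException):
        return False
```

    Inside the container, calling missing_variables(os.environ) before the service starts gives an early, readable failure when a JDBC setting is absent. Once the service is running, is_healthy(health_url()) probes the default port 8128.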

    Known issue

    The duration of chemical searches in which communication between Oracle and the Choral Server accounts for the larger share of the search time depends on the communication between RDS and EC2/Fargate. Examples of such searches are limited searches and similarity searches. The speed of Choral Server-intensive searches, on the other hand, depends mainly on the speed of the Choral Server itself.