Web API Gateway

The Web API Gateway is a standalone application that serves as the entry point for Web API Endpoints. It redirects and load-balances traffic to the responsible Web API Agents, and it uses the AI Hub Server's registry (see the WEBAPI_REGISTRY_ properties) to get information about connected instances.

To increase reliability and availability, multiple Gateway instances can be deployed behind an external proxy, e.g., nginx.

Property	Default	Description
`WEBAPI_REGISTRY_PROTOCOL`	`http`	The internal registry protocol used by the Gateway to retrieve information about routes and available Web API Agents
`WEBAPI_REGISTRY_HOST`		The internal registry host used by the Gateway to retrieve information about routes and available Web API Agents
`WEBAPI_REGISTRY_PORT`	`8080`	The internal registry port used by the Gateway to retrieve information about routes and available Web API Agents
`WEBAPI_REGISTRY_USERNAME`	`admin`	The internal registry username used by the Gateway to retrieve information about routes and available Web API Agents
`WEBAPI_REGISTRY_PASSWORD`		The internal registry password used by the Gateway to retrieve information about routes and available Web API Agents
`WEBAPI_REGISTRY_URL`	See below	Comma-separated list of registry URLs (for high availability) used by the Gateway to retrieve information about routes and available Web API Agents
`LOADBALANCER_METRIC_STYLE`	`CPU_MEMORY`	The metric value load balancing is based on (options: CPU, MEMORY, CPU_MEMORY)
`LOADBALANCER_REQUEST_INTERVAL`	`10s`	The interval at which metric values are requested for load balancing
`LOADBALANCER_REQUEST_TIMEOUT`	`10s`	The maximum time to wait for the metric values requested for load balancing
`LOADBALANCER_CLEAN_UP_INTERVAL`	`10s`	The interval at which route changes are checked in order to dispose of load balancers that are no longer needed
`RETRY_AGENT_RETRIES`	`3`	The number of times one request is retried on a Web API Agent
`RETRY_GROUP_RETRIES`	`3`	The number of times one request is retried on a Web API Group
`RETRY_SERIES`	`server_error`	The HTTP status code series which is retried on a Web API Agent (server_error, client_error, redirection, successful, informational)
`RETRY_STATUS`	`not_found`	The HTTP status code which is retried on a Web API Agent e.g. unauthorized, forbidden
`RETRY_METHODS`	`post`	The HTTP method which is retried on a Web API Agent
`RETRY_EXCEPTIONS`	`java.io.IOException, org.springframework.cloud.gateway.support.TimeoutException`	The exception classes which are retried on a Web API Agent
`RETRY_BACKOFF_ENABLED`	`off`	Whether the Web API Gateway should back off before retrying a request
`RETRY_BACKOFF_INTERVAL`		The time the Web API Gateway backs off before retrying a request, in duration format, e.g. 100ms
`SPRING_CLOUD_GATEWAY_HTTPCLIENT_CONNECT_TIMEOUT`	`15000`	The connect timeout in milliseconds
`SPRING_CLOUD_GATEWAY_HTTPCLIENT_RESPONSE_TIMEOUT`	`5m`	The response timeout in duration format

WEBAPI_REGISTRY_URL: $WEBAPI_REGISTRY_PROTOCOL://$WEBAPI_REGISTRY_USERNAME:$WEBAPI_REGISTRY_PASSWORD@$WEBAPI_REGISTRY_HOST:$WEBAPI_REGISTRY_PORT/eureka
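
As an illustration of the template above, the following sketch composes a registry URL from its parts and joins two of them into a comma-separated list for a high-availability setup. The host names and password are placeholders, the helper `build_registry_url` is hypothetical, and percent-encoding the credentials is an assumption of this sketch rather than documented Gateway behavior.

```python
from urllib.parse import quote


def build_registry_url(host, protocol="http", port=8080,
                       username="admin", password=""):
    """Compose one registry URL following the documented template."""
    # Assumption: credentials are percent-encoded so that characters such
    # as '@' or ':' do not break the URL. The documentation does not state
    # whether the Gateway expects this.
    user = quote(username, safe="")
    secret = quote(password, safe="")
    return f"{protocol}://{user}:{secret}@{host}:{port}/eureka"


# Placeholder hosts for two registries.
primary = build_registry_url("aihub-1.example.internal", password="s3cret")
secondary = build_registry_url("aihub-2.example.internal", password="s3cret")

# WEBAPI_REGISTRY_URL accepts a comma-separated list of registries.
registry_url = ",".join([primary, secondary])
print(registry_url)
```

The resulting string could then be supplied as the value of WEBAPI_REGISTRY_URL.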

Loadbalancer

The Web API Gateway supports smart load balancing, which is enabled by default. When a request is routed to a Web API Group, the least used Web API Agent services the request. Usage is determined by LOADBALANCER_METRIC_STYLE, which can be one of CPU, MEMORY, or CPU_MEMORY. The Web API Gateway periodically requests the metric from all Web API Agents and caches a ranked list of the instances. The interval of this schedule is set with LOADBALANCER_REQUEST_INTERVAL. To disable or change the smart load balancing, specific Spring profiles can be added to the SPRING_PROFILES_ACTIVE environment variable. Options include:

load-balancer-round-robin
- This disables the smart load balancing and uses round-robin load balancing instead
load-balancer-prometheus
- If this profile is set, the Web API Gateway will use the Prometheus endpoint of the Web API Agents
- Make sure that Prometheus is enabled on the Web API Agents
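
The "least used Web API Agent" selection described in this section can be pictured with a small sketch. It is not the Gateway's implementation: the `AgentMetrics` type, the agent names, and the utilization values are hypothetical, and combining CPU and memory as an equal-weight average for CPU_MEMORY is an assumption, since the actual formula is not documented here.

```python
from dataclasses import dataclass


@dataclass
class AgentMetrics:
    name: str
    cpu: float     # CPU utilization in [0, 1]
    memory: float  # memory utilization in [0, 1]


def usage(agent, metric_style="CPU_MEMORY"):
    """Return the usage value an agent is ranked by."""
    if metric_style == "CPU":
        return agent.cpu
    if metric_style == "MEMORY":
        return agent.memory
    if metric_style == "CPU_MEMORY":
        # Assumption: equal-weight average of both metrics.
        return (agent.cpu + agent.memory) / 2
    raise ValueError(f"unknown metric style: {metric_style}")


def rank_agents(agents, metric_style="CPU_MEMORY"):
    """Return the agents ordered from least used to most used."""
    return sorted(agents, key=lambda a: usage(a, metric_style))


# Hypothetical snapshot of one Web API Group.
agents = [
    AgentMetrics("agent-a", cpu=0.80, memory=0.30),
    AgentMetrics("agent-b", cpu=0.20, memory=0.60),
    AgentMetrics("agent-c", cpu=0.50, memory=0.10),
]

for style in ("CPU", "MEMORY", "CPU_MEMORY"):
    ranked = rank_agents(agents, style)
    print(style, [a.name for a in ranked])
```

In this sketch the first entry of the ranked list is the agent that would receive the next request; the cached list would be refreshed every LOADBALANCER_REQUEST_INTERVAL.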

Retry

The Web API Gateway supports retries with Web API Agents belonging to the same Web API Group. To enable retries, the Spring profile retry can be added to the SPRING_PROFILES_ACTIVE environment variable. The environment variable RETRY_AGENT_RETRIES sets how often one Web API Agent is retried, and the environment variable RETRY_GROUP_RETRIES sets how often other Web API Agents of this Web API Group should be retried. Requests are first retried on the same Web API Agent. If RETRY_GROUP_RETRIES is larger than the actual number of Web API Agents, the first Web API Agent is retried again. To fine-tune what is retried, RETRY_SERIES, RETRY_STATUS, and RETRY_METHODS can be set to select specific HTTP status code series, status codes, and methods. If there should be a backoff between retries, RETRY_BACKOFF_ENABLED can be set to true and RETRY_BACKOFF_INTERVAL can be set to a duration, e.g. 10s.
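
The retry order described above can be sketched as follows. This is one possible reading of the description, not the Gateway's actual algorithm: it assumes that each group retry is a single attempt on the next Web API Agent of the group and that the sequence wraps around to the first agent when the group is exhausted. The function `attempt_order` and the agent names are hypothetical.

```python
def attempt_order(agents, agent_retries=3, group_retries=3):
    """List the agents a failing request would be sent to, in order."""
    if not agents:
        raise ValueError("a Web API Group needs at least one agent")

    first = agents[0]
    order = [first]                    # the initial request
    order += [first] * agent_retries   # retries on the same agent first

    # Assumption: each group retry is one attempt on the next agent,
    # wrapping around to the first agent when the group is exhausted.
    for i in range(1, group_retries + 1):
        order.append(agents[i % len(agents)])
    return order


# Two agents, one retry on the same agent, three group retries.
print(attempt_order(["agent-a", "agent-b"], agent_retries=1, group_retries=3))
```

With two agents and three group retries, the wrap-around means the first agent is tried again after the second one has failed.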

Using the Gateway in retry mode involves a tradeoff between memory consumption and high availability. The retry mechanism caches the requests and therefore requires memory. Under the hood, the cache uses two layers of memory. The first layer is the Java heap. If the Java heap is not sufficient, the requests are written to native memory. Both can be configured via JVM parameters. For example, set the Java heap to 1024 MB with -Xmx1024m, or the native memory to 2 GB with -XX:MaxDirectMemorySize=2G. Since native memory is not managed by the GC and unused memory should be cleaned up directly, the -Dio.netty.allocator.type=unpooled parameter can be used. If the cached requests are processed too slowly, an out-of-memory error can occur. This means that when using the Gateway in retry mode, a suitable number of Web API Agents and enough cache memory should be available.
