Healthchecks and Monitoring

The Status Page and JSON

Opening the host and port of a running sequins node in your browser shows a simple status page with sharding and version information for each database.

In the distributed case, the information presented represents the whole cluster and should be the same no matter which node you ask. You can also ask for a specific db by visiting /<db> (in this example: localhost:9599/flights).

Finally, you can get a JSON representation of the same information by fetching with Accept: application/json:

$ http localhost:9599 'Accept:application/json'
{
    "dbs": {
        "flights": {
          ...
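
The same request can be made from a script. The following is a minimal sketch using only the Python standard library; the helper names build_status_request and fetch_status are hypothetical, and it assumes that the per-db path /<db> honours the same Accept header as the root page.

```python
import json
import urllib.request


def build_status_request(base_url, db=None):
    """Build a GET request for the status of the cluster, or of a single db."""
    url = base_url.rstrip("/") + "/"
    if db is not None:
        # Assumption: /<db> returns JSON when asked, like the root page does.
        url += db
    # The Accept header asks for JSON instead of the HTML status page.
    return urllib.request.Request(url, headers={"Accept": "application/json"})


def fetch_status(base_url, db=None, timeout=5.0):
    """Send the request and decode the JSON body into Python objects."""
    request = build_status_request(base_url, db)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

For example, fetch_status("http://localhost:9599") would return the decoded document whose top-level "dbs" key is shown above, provided a node is listening on that address.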

A simplified healthcheck interface is available at the /healthz and /healthcheck endpoints, which return a JSON representation of the state of each node. The status code is 200 if every node has the status AVAILABLE or ACTIVE, and 404 if at least one node does not:

$ http localhost:9599/healthz
HTTP/1.1 200 OK
Content-Length: 46
Content-Type: application/json
Date: Wed, 26 Jul 2017 21:49:52 GMT

{
    "baby-names": {
        "1": {
            "localhost": "ACTIVE"
        }
    }
}
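
When the healthcheck fails, it can help to know which node is responsible. The sketch below walks a response body and lists every node that is neither AVAILABLE nor ACTIVE. It assumes the nesting seen in the sample above (db, then version, then node, then state); the function name unhealthy_nodes is hypothetical.

```python
import json

# States that the healthcheck treats as healthy, per the description above.
HEALTHY_STATES = {"AVAILABLE", "ACTIVE"}


def unhealthy_nodes(body):
    """Return (db, version, node, state) tuples for nodes that are not healthy.

    Assumes the body is nested as db -> version -> node -> state, as in the
    sample response; an empty list means the endpoint should have returned 200.
    """
    report = json.loads(body)
    problems = []
    for db, versions in report.items():
        for version, nodes in versions.items():
            for node, state in nodes.items():
                if state not in HEALTHY_STATES:
                    problems.append((db, version, node, state))
    return problems
```

Passing the sample body above yields an empty list, which matches the 200 status code in that response.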

Version States

Each version of a database on a Sequins node is in one of five states. A state is specific to a version on a particular node, so the same version can be in different states on different nodes.

  • ACTIVE: The version is actively being served. Only one version within a database should hold this state at a time. A Sequins node won't upgrade a version from AVAILABLE to ACTIVE until all of its peers also mark the same version as AVAILABLE. As a result, all of the nodes in a cluster should agree on which version is ACTIVE.

  • AVAILABLE: The version has been fully built and is capable of being served by the node.

  • BUILDING: The node has noticed the version and is currently downloading and indexing it.

  • REMOVING: The version is being removed from Sequins because a newer version has taken its place and is ACTIVE.

  • ERROR: The version has problems and is unable to be processed. For example, one of the blocks could be invalid.
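
The five states and the promotion rule for ACTIVE can be modelled in a few lines. This is an illustrative sketch only, not the Sequins implementation: the names VersionState and can_activate are hypothetical, and treating a peer that has already reached ACTIVE as ready is an assumption that the text above does not state.

```python
from enum import Enum


class VersionState(Enum):
    """The five states a version can be in on a single node."""
    ACTIVE = "ACTIVE"
    AVAILABLE = "AVAILABLE"
    BUILDING = "BUILDING"
    REMOVING = "REMOVING"
    ERROR = "ERROR"


def can_activate(local_state, peer_states):
    """Decide whether this node may upgrade a version to ACTIVE.

    The local copy must be AVAILABLE and every peer must have the same
    version ready. Assumption: a peer that is already ACTIVE counts as ready.
    """
    ready = {VersionState.AVAILABLE, VersionState.ACTIVE}
    if local_state is not VersionState.AVAILABLE:
        return False
    return all(state in ready for state in peer_states)
```

With this model, a single peer that is still BUILDING keeps every other node at AVAILABLE, which is what keeps the cluster in agreement about the ACTIVE version.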

Expvars

You can bind the sequins "debug" HTTP server to a different port, and it'll publish Go "expvars" at /debug/vars.

In addition to the built-in expvars, sequins publishes the following sequins-specific ones:

  • sequins.Qps.ByStatus: A map of HTTP status to a count of requests in the past second.

  • sequins.Qps.Total: A total of the above.

  • sequins.Latency: A histogram of latency values for the last second, with Max, Mean, and PXX keys.

  • sequins.DiskUsed: The amount of local storage used by sequins.
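
Go's expvar handler serves all published variables as a single JSON object, alongside the built-in cmdline and memstats entries. A minimal sketch for pulling out only the sequins ones follows; the function names are hypothetical, and because the exact layout of the sequins keys in the JSON is an assumption, the filter simply keeps any top-level name that starts with "sequins".

```python
import json
import urllib.request


def sequins_vars(all_vars):
    """Keep only the top-level expvars whose name starts with "sequins"."""
    return {
        name: value
        for name, value in all_vars.items()
        if name.startswith("sequins")
    }


def fetch_sequins_vars(debug_url, timeout=5.0):
    """Fetch /debug/vars from the debug server and filter the result.

    debug_url is the base address the debug server is bound to; the caller
    supplies it, since the port depends on local configuration.
    """
    url = debug_url.rstrip("/") + "/debug/vars"
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return sequins_vars(json.load(response))
```

Polling this once a second and forwarding the result is one simple way to feed the counters into a monitoring system that has no native expvar support.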

Datadog

At Stripe, we use Datadog for statsd-like monitoring with lots of bells and whistles. We've open-sourced our Datadog plugin for sequins on GitHub.

Sequins can also report metrics concerning file downloads from S3 using the DogStatsD protocol if the datadog.url (by default localhost:8200) is set.
