Healthchecks and Monitoring
The Status Page and JSON
Visiting the host and port that a sequins node is running on with your browser will give you a simple status page with sharding and version information for each database.
The information presented represents the whole cluster in the distributed case,
and should be the same no matter which node you ask. You can also ask for a
specific db by visiting /<db> (in this example: localhost:9599/flights).
Finally, you can get a JSON representation of the same information by fetching
with Accept: application/json:
$ http localhost:9599 'Accept:application/json'
{
"dbs": {
"flights": {
...
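The same request can be made from a script. The following is a minimal sketch using only the Python standard library; the function names are hypothetical, and the optional db argument assumes that the /<db> path honors the same Accept header as the root path.

```python
import json
import urllib.request


def build_status_request(base_url, db=None):
    """Build a GET request asking for the status document as JSON.

    base_url is the node's host and port, e.g. "http://localhost:9599".
    db optionally narrows the request to a single database (/<db>).
    """
    url = base_url.rstrip("/")
    if db is not None:
        url = f"{url}/{db}"
    return urllib.request.Request(url, headers={"Accept": "application/json"})


def fetch_status(base_url, db=None, timeout=5.0):
    """Send the request and decode the JSON body into Python objects."""
    request = build_status_request(base_url, db)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

For example, fetch_status("http://localhost:9599") would return the decoded document whose top-level "dbs" key appears in the output above.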
A simplified healthcheck interface is available at the /healthz and
/healthcheck endpoints, which return a JSON representation of the state
of each node. The status code will be 200 if they all have the status
AVAILABLE or ACTIVE, or 404 if at least one node does not:
$ http localhost:9599/healthz
HTTP/1.1 200 OK
Content-Length: 46
Content-Type: application/json
Date: Wed, 26 Jul 2017 21:49:52 GMT
{
"baby-names": {
"1": {
"localhost": "ACTIVE"
}
}
}
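When the healthcheck fails, the body can be used to find which node is responsible. The sketch below assumes the body always has the shape shown in the example above (database, then version, then node, then state); the function name is hypothetical.

```python
HEALTHY_STATES = {"ACTIVE", "AVAILABLE"}


def unhealthy_entries(healthz_body):
    """Return (db, version, node, state) for every node not ACTIVE or AVAILABLE.

    healthz_body is the decoded JSON body of /healthz, assumed to be nested
    as {db: {version: {node: state}}}.
    """
    problems = []
    for db, versions in healthz_body.items():
        for version, nodes in versions.items():
            for node, state in nodes.items():
                if state not in HEALTHY_STATES:
                    problems.append((db, version, node, state))
    return problems
```

An empty result corresponds to the 200 case described above. Note that urllib.request.urlopen raises urllib.error.HTTPError for a 404 response, so a caller using it would read the body from the exception object.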
Version States
Each version within a database on a Sequins node can exist in one of five states. A state is specific to a version on a particular node, so the same version can be in different states on different nodes.
ACTIVE: The version is actively being served. Only one version within a database should hold this state at a time. A given Sequins node won't upgrade its version from AVAILABLE to ACTIVE until all of its peers also mark the same version as AVAILABLE. As a result, all of the nodes in a cluster should be in agreement as to which version is ACTIVE.
AVAILABLE: The version has been fully built and is capable of being served by the node.
BUILDING: The version has been discovered and is currently being downloaded and indexed.
REMOVING: The version is being removed from Sequins because a newer version has taken its place and is ACTIVE.
ERROR: The version has problems and cannot be processed. For example, one of the blocks could be invalid.
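The promotion rule for ACTIVE can be modelled in a few lines. This is an illustrative sketch of the rule as described above, not the code sequins runs; the function name is hypothetical, and counting a peer that has already promoted to ACTIVE as ready is an assumption made so that nodes promoting at slightly different times do not block each other.

```python
READY_STATES = {"AVAILABLE", "ACTIVE"}  # treating ACTIVE peers as ready is an assumption


def next_state(local_state, peer_states):
    """Return the state a node should hold for a version after checking its peers.

    local_state is this node's state for the version; peer_states holds the
    states that the other nodes report for the same version.
    """
    all_peers_ready = all(state in READY_STATES for state in peer_states)
    if local_state == "AVAILABLE" and all_peers_ready:
        return "ACTIVE"
    # In every other case the node keeps its current state.
    return local_state
```

With a single peer still in BUILDING, next_state("AVAILABLE", ...) stays at AVAILABLE, which is why the whole cluster switches to a new version at roughly the same time.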
Expvars
You can bind the sequins "debug" HTTP server to a different port, and
it'll publish Go "expvars" at /debug/vars.
In addition to the built-in expvars, sequins publishes the following sequins-specific ones:
sequins.Qps.ByStatus: A map of HTTP status to a count of requests in the past second.
sequins.Qps.Total: A total of the above.
sequins.Latency: A histogram of latency values for the last second, with Max, Mean, and PXX keys.
sequins.DiskUsed: The amount of local storage used by sequins.
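Once the /debug/vars document has been fetched and decoded, individual values can be read by name. The helper below is a sketch with a hypothetical name; whether the sequins values appear under flat dotted keys or as nested objects is not stated here, so it tries both layouts.

```python
def lookup_expvar(expvars, dotted_name):
    """Look up a name such as "sequins.Qps.Total" in a decoded /debug/vars document.

    Tries the flat key first, then walks nested objects one component at a
    time. Returns None if the name is not present in either layout.
    """
    if dotted_name in expvars:
        return expvars[dotted_name]
    value = expvars
    for part in dotted_name.split("."):
        if not isinstance(value, dict) or part not in value:
            return None
        value = value[part]
    return value
```

For example, lookup_expvar(doc, "sequins.Latency") would return the histogram object with its Max, Mean, and PXX keys, if the node publishes it.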
Datadog
At Stripe, we use Datadog for statsd-like monitoring with lots of bells and whistles. We've open-sourced our Datadog plugin for sequins on GitHub.
Sequins can also report metrics concerning file downloads from S3 using the
DogStatsD protocol if the datadog.url (by default localhost:8200) is set.