Configuration Reference

The sequins configuration file uses the TOML format. Sequins looks for a sequins.conf file in the local directory, and falls back to /etc/sequins.conf if that file doesn't exist.

Below is a full list of the configuration properties. Some properties are nested under headers, like [s3]. See the sequins.conf.example file that ships with releases (also on GitHub) for an example of the file layout.

A few properties below are durations. These are strings with a shorthand unit, like "1s" or "20m". Valid units are ns, us (or µs), ms, s, m, and h.
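The unit list above matches the duration syntax of Go's standard library. As a rough illustration of how such strings map to a length of time, here is a small Python sketch. Note that parse_duration is a hypothetical helper written for this page, not part of sequins, and it may treat edge cases differently than sequins does.

```python
import re

# Seconds per unit, for the units listed above. Both the micro sign (µ) and
# the Greek letter mu (μ) are accepted for microseconds.
_UNIT_SECONDS = {
    "ns": 1e-9,
    "us": 1e-6,
    "µs": 1e-6,
    "μs": 1e-6,
    "ms": 1e-3,
    "s": 1.0,
    "m": 60.0,
    "h": 3600.0,
}

# One "<number><unit>" component, e.g. "20m" or "1.5s".
_PART = re.compile(r"(\d+(?:\.\d+)?)(ns|us|µs|μs|ms|s|m|h)")


def parse_duration(text):
    """Convert a shorthand duration such as "20m" to seconds (hypothetical helper)."""
    pos = 0
    total = 0.0
    while pos < len(text):
        match = _PART.match(text, pos)
        if match is None:
            raise ValueError(f"invalid duration: {text!r}")
        total += float(match.group(1)) * _UNIT_SECONDS[match.group(2)]
        pos = match.end()
    if pos == 0:
        raise ValueError("empty duration")
    return total
```

With this sketch, "1s" comes out as one second and "20m" as 1200 seconds; a string such as "10 minutes" is rejected.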

Top Level Properties


Type Default
string unset (eg "hdfs://<namenode>:<port>/path/to/stuff")

The URL or directory where the sequencefiles are. This can be a local directory, an HDFS URL of the form hdfs://<namenode>:<port>/path/to/stuff, or an S3 URL of the form s3://<bucket>/path/to/stuff. This should be a directory of directories of directories: each first-level directory represents a 'database', and each subdirectory therein represents a 'version' of that database. This must be set, but can be overridden from the command line with --source.
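To make the expected layout concrete, the following Python sketch walks a local source directory and reports which versions exist for each database. It only illustrates the directory structure described above; list_versions is a hypothetical helper, not something sequins provides, and it does not handle HDFS or S3 URLs.

```python
import os


def list_versions(source_dir):
    """Map each database name to its sorted version names (hypothetical helper)."""
    layout = {}
    for database in sorted(os.listdir(source_dir)):
        database_path = os.path.join(source_dir, database)
        if not os.path.isdir(database_path):
            continue  # only first-level directories count as databases
        layout[database] = sorted(
            version
            for version in os.listdir(database_path)
            if os.path.isdir(os.path.join(database_path, version))
        )
    return layout
```

For a source directory containing users/v1, users/v2 and orders/20240101, this returns a mapping with two databases, "users" and "orders", each listing its version directories.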


Type Default
string ""

The address to bind on. This can be overridden from the command line with --bind.


Type Default
string "/var/sequins"

This is where sequins will store its internal copy of all the data it ingests. This can be overridden from the command line with --local-store.


Type Default
int unset (eg 4)

If this flag is set, sequins will only update this many databases at a time, minimizing disk usage while new data is being loaded. If you set this to 1, then loads will be completely serialized.


Type Default
string unset (eg "800μs")

If this flag is set, sequins will sleep this long between writes while loading data, artificially slowing down loads and reducing disk i/o. If you are using disks where the latency is extremely sensitive to activity, then loading large amounts of data can negatively impact your latency, and you may want to experiment with this setting.


Type Default
string unset (eg "10m")

If this flag is set, sequins will periodically download new data this often. If you enable this, you should also enable require_success_file, or sequins may start automatically downloading a partially-created set of files.


Type Default
bool false

If this flag is set, sequins will only ingest data from directories that have a _SUCCESS file (which is produced by hadoop when it completes a job).
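The effect of this setting can be sketched as a filter over version directories. This is an illustration under assumptions, not sequins' own code: ingestible_versions is a hypothetical helper, it only looks at a local database directory, and it treats the presence of a file named _SUCCESS as the only criterion.

```python
import os


def ingestible_versions(database_dir, require_success_file=False):
    """List version directories that would be eligible for ingestion (sketch)."""
    versions = []
    for version in sorted(os.listdir(database_dir)):
        version_path = os.path.join(database_dir, version)
        if not os.path.isdir(version_path):
            continue
        has_marker = os.path.isfile(os.path.join(version_path, "_SUCCESS"))
        if require_success_file and not has_marker:
            continue  # job may still be writing this version
        versions.append(version)
    return versions
```

With the setting off, every version directory is returned; with it on, a half-written version without a _SUCCESS file is skipped.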


Type Default
string unset (eg "application/json")

If this is set, sequins will set this Content-Type header on responses.



Type Default
string "snappy"

This can be either 'snappy' or 'none', and defines how data is compressed on disk.


Type Default
int 4096

This controls the block size for on-disk compression.



Type Default
string unset (eg "us-west-2")

The S3 region for the bucket where your data is. If unset, and sequins is running on EC2, this will be set to the instance region.


Type Default
string see below (eg "AKIAIOSFODNN7EXAMPLE")


Type Default
string see below (eg "wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY")

The access key and secret to use for S3. If unset, the environment variables AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY will be used, or IAM instance role credentials if they are available.
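One reading of the lookup order described above is: configured values first, then the environment variables, then IAM instance role credentials. The Python sketch below encodes that reading. The function name, its parameters, and the returned labels are all hypothetical, and the real precedence inside sequins may differ in detail.

```python
import os


def resolve_s3_credentials(configured_key=None, configured_secret=None, environ=None):
    """Return (source, access_key, secret) following the order described above."""
    environ = os.environ if environ is None else environ
    if configured_key and configured_secret:
        return ("config", configured_key, configured_secret)
    env_key = environ.get("AWS_ACCESS_KEY_ID")
    env_secret = environ.get("AWS_SECRET_ACCESS_KEY")
    if env_key and env_secret:
        return ("environment", env_key, env_secret)
    # Nothing explicit was found: fall back to IAM instance role credentials,
    # which are only available when running on a suitably configured instance.
    return ("iam-instance-role", None, None)
```

Passing a dictionary as environ makes the behaviour easy to check without touching the real process environment.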



Type Default
bool false

If true, sequins will attempt to connect to zookeeper at the specified addresses (see zk.servers), and coordinate with peer instances to shard datasets. For a complete description of the sharding algorithm, see the manual.


Type Default
int 2

This is the number of replicas responsible for each partition.


Type Default
int 1

This is the minimum number of replicas required for sequins to switch to a new version. Set this to a higher value to ensure data redundancy before upgrading.

You probably don't want this to be equal to replication, or sequins will never upgrade versions if any node at all is down.
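The warning above can be illustrated with a small check. This sketch assumes that a new version becomes eligible once every partition has at least the minimum number of live replicas; that is an interpretation of the description, and can_switch_version is a hypothetical helper rather than sequins' actual logic.

```python
def can_switch_version(live_replicas_per_partition, min_replication):
    """True if every partition has at least min_replication live replicas (sketch)."""
    return all(count >= min_replication for count in live_replicas_per_partition)


# Three partitions with two replicas each; one node is down, so one
# partition currently has a single live replica.
live = [2, 1, 2]
```

Here can_switch_version(live, 1) is true, while can_switch_version(live, 2) is false: with the minimum equal to the replica count, a single missing node blocks the upgrade.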


Type Default
string "10s"

Upon startup, sequins will wait this long for the set of known peers to stabilize.


Type Default
string "100ms"

This is the total timeout (connect + request) for proxied requests to peers in a sequins cluster. You may want to increase this if you're running on particularly cold storage, or if there are other factors significantly increasing request time.


Type Default
string see below (eg "50ms")

After this interval, sequins will try another peer concurrently with the first, as long as there are other peers available and the total time is less than proxy_timeout. If left unset, this defaults to proxy_timeout divided by the replication factor, which leaves enough time for all peers to be tried within the total timeout.
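The default is simple arithmetic, sketched here in Python with times in milliseconds. Both functions are hypothetical helpers for illustration; the second models when each additional peer would be tried, assuming attempts start at fixed multiples of the stage interval.

```python
def default_stage_timeout_ms(proxy_timeout_ms, replication):
    """Default stage interval: the total timeout split evenly across replicas."""
    return proxy_timeout_ms / replication


def attempt_start_times_ms(proxy_timeout_ms, peers, stage_timeout_ms):
    """Start time of each attempt, stopping at the total timeout (sketch)."""
    starts = []
    for attempt in range(peers):
        start = attempt * stage_timeout_ms
        if start >= proxy_timeout_ms:
            break  # no time left within proxy_timeout for another peer
        starts.append(start)
    return starts
```

With a total timeout of 100ms and two replicas, the default interval is 50ms, so the first peer is tried at 0ms and the second at 50ms.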


Type Default
string "sequins"

This defines the root prefix to use for zookeeper state. If you are running multiple sequins clusters using the same zookeeper for coordination, you should change this so they can't conflict.


Type Default
string see below (eg "")

This is the hostname sequins uses to advertise itself to peers in a cluster. It should be resolvable by those peers. If left unset, it will be set to the hostname of the server.


Type Default
string see below (eg "sequins1")

The shard ID is used to determine which partitions the node is responsible for. By default, it is the same as advertised_hostname. Unlike the hostname, however, it doesn't have to be unique; two nodes can have the same shard_id, in which case they will download the same partitions. This can be useful if you don't have stable hostnames, but want to be able to rebuild a server to take the place of a dead or decommissioning one.
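The chain of defaults for the advertised hostname and the shard ID can be sketched as follows. The function and its parameter names are hypothetical; the sketch simply applies the fallbacks described above, using the machine's hostname when nothing is configured.

```python
import socket


def effective_identity(advertised_hostname=None, shard_id=None, system_hostname=None):
    """Return (advertised_hostname, shard_id) after applying defaults (sketch)."""
    if system_hostname is None:
        system_hostname = socket.gethostname()
    # The advertised hostname falls back to the server's own hostname.
    hostname = advertised_hostname or system_hostname
    # The shard ID falls back to the advertised hostname.
    return hostname, (shard_id or hostname)
```

Two rebuilt servers with different hostnames but the same explicit shard ID would therefore end up responsible for the same partitions.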



Type Default
array of string ["localhost:2181"]

If set and 'sharding.enabled' is true, sequins will connect to zookeeper at the given addresses.


Type Default
string "1s"

This specifies how long to wait while connecting to zookeeper.


Type Default
string "10s"

This specifies the session timeout to use with zookeeper. The actual timeout is negotiated between server and client, but will never be lower than this number.



Type Default
string "localhost:8200"

If set, sequins will send metrics concerning S3 file downloads using the DogStatsD protocol to this address.



Type Default
string unset (eg "localhost:6060")

If set, binds the golang debug http server, which can serve expvars and profiling information, to the specified address.


Type Default
bool true

If set, this adds expvars to the debug HTTP server, including the default ones and a few sequins-specific ones.


Type Default
bool false

If set, this adds the default pprof handlers to the debug HTTP server.
