
Scaling

As your database grows...

Qbix apps are built to scale with the number of users and the amount of information added to the system, using a technique called sharding. To achieve this in your own apps, you simply need to design your models to:

  1. use Qbix's database layer for storage,

  2. perform relational "joins" in the app layer instead of the database layer (sketched below),

  3. include a primary key in every table that you will want to shard on.

The plugins that come bundled with Qbix, including Users and Streams, have models that do the above three things, so out of the box they can literally grow to support millions of users posting billions of messages to millions of streams.
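
As an illustration of point 2 above, here is a rough sketch of an "app-layer join" in plain PHP with PDO. The table and column names (stream, user, publisherId) are hypothetical, and this is not the Qbix model API itself; it only shows the general pattern of running two separate queries and combining the results in PHP, so that the two tables never need to live in the same database.

    <?php
    // A minimal sketch of an "app-layer join", assuming two PDO connections
    // and hypothetical tables `stream` (with a publisherId column) and `user`.
    // Because the queries never join inside the database, each table can live
    // on a different server or shard.

    function fetchStreamsWithPublishers(PDO $db1, PDO $db2): array
    {
        // Query 1: fetch the streams (could hit one connection or shard)
        $streams = $db1->query('SELECT * FROM stream LIMIT 100')
                       ->fetchAll(PDO::FETCH_ASSOC);

        // Collect the foreign keys we need to resolve
        $publisherIds = array_values(array_unique(
            array_column($streams, 'publisherId')
        ));
        if (!$publisherIds) {
            return $streams;
        }

        // Query 2: fetch the related users (could hit a different shard)
        $placeholders = implode(',', array_fill(0, count($publisherIds), '?'));
        $stmt = $db2->prepare("SELECT * FROM user WHERE id IN ($placeholders)");
        $stmt->execute($publisherIds);
        $usersById = [];
        foreach ($stmt->fetchAll(PDO::FETCH_ASSOC) as $user) {
            $usersById[$user['id']] = $user;
        }

        // "Join" in the app layer by attaching each user to its stream
        foreach ($streams as &$stream) {
            $stream['publisher'] = $usersById[$stream['publisherId']] ?? null;
        }
        return $streams;
    }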

How it works technically

Sharding is one of the most ambitious and complex features that Qbix's database layer supports. Your app's local/app.json config file holds the information needed to access the various database connections. In addition, information about routing queries to shards is stored in the config.

By looking at each field in the WHERE clause of your queries, Q's database layer is able to determine which shard(s) to run the query on. The config also specifies whether a given field should be hashed or normalized (the latter is used if, for example, preserving alphabetical order is important).
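
The routing idea can be sketched in a few lines of plain PHP. The config array and function below are illustrative only (the real shard configuration lives in your app's config and is richer than this); the point is just that the value of an indexed field from the WHERE clause, either hashed or normalized, determines which shard a query is sent to.

    <?php
    // Illustrative only: a toy router that maps a WHERE-clause value to a shard.
    // Hashing spreads values evenly but loses ordering; normalizing keeps
    // ordering (e.g. alphabetical ranges) at the cost of less even distribution.

    $shardsConfig = [
        'field'     => 'userId',
        'partition' => 'hash',                       // or 'normalize'
        'shards'    => ['shard0', 'shard1', 'shard2', 'shard3'],
    ];

    function pickShard(array $config, string $value): string
    {
        $shards = $config['shards'];
        if ($config['partition'] === 'hash') {
            // Hash the value and map it onto the list of shards
            $index = abs(crc32($value)) % count($shards);
        } else {
            // Normalize: map the first letter onto alphabetical buckets
            $first  = strtolower(substr($value, 0, 1));
            $bucket = max(0, min(25, ord($first) - ord('a')));
            $index  = intdiv($bucket * count($shards), 26);
        }
        return $shards[$index];
    }

    // A query like "... WHERE userId = 'abc123'" would be routed like this:
    echo pickShard($shardsConfig, 'abc123'), "\n";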

When one of the shards becomes too "busy" (for example, when it begins to contain too many rows, as determined by your monitoring software), you can tell the Qbix database layer to "split" that shard into one or more new shards, while the entire app continues to be responsive online. This involves the following steps:

  1. First of all, more machines are brought online and install.php is run on each of them.

  2. Then, a temporary config is generated, which redirects queries to the fresh new shards instead of the one being split.

  3. A transaction is performed with this new config, selecting all the rows on the shard being split and inserting them into the fresh new shards.

  4. Meanwhile, all queries to the shard being split are also written to a query log for that shard.

  5. When the operation has completed, we replay this query log on all the new shards, using the temporary config.

  6. We repeat steps 4 and 5 until the operation takes less than 10 seconds. Then we finally take the shard offline and do the operation one last time.

  7. The shard has now been split. The temporary configuration is merged onto the app's config server, and within a minute, all the app servers will have the updated config.

  8. When an app server hits a shard that is offline, it tries to download a new configuration, and uses that.

  9. In the end, all app servers are issuing queries to the new shards instead of the one that was split.
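
The copy-and-replay loop in steps 3 through 6 can be sketched roughly as follows. The helper functions are empty stubs standing in for whatever your tooling actually does at each step; only the control flow is the point, namely replaying the query log over and over until a single pass is fast enough that the final cutover, with the shard briefly offline, takes only seconds.

    <?php
    // A rough sketch of the split-and-replay control flow from steps 3 through 6.
    // The helpers below are hypothetical stubs, not Qbix functions.

    const CUTOVER_THRESHOLD_SECONDS = 10;

    function copyRowsToNewShards(string $old, array $new): void { /* step 3 */ }
    function replayQueryLog(string $old, array $new): void { /* step 5 */ }
    function takeShardOffline(string $shard): void { /* step 6 */ }
    function mergeTemporaryConfig(): void { /* step 7 */ }

    function splitShard(string $oldShard, array $newShards): void
    {
        // Step 3: copy all existing rows to the new shards, while the old shard
        // keeps serving traffic and every incoming query is also logged (step 4).
        copyRowsToNewShards($oldShard, $newShards);

        // Steps 4-6: replay the log accumulated during the copy, then the log
        // accumulated during that replay, and so on, until a pass is quick.
        do {
            $started = microtime(true);
            replayQueryLog($oldShard, $newShards);
            $elapsed = microtime(true) - $started;
        } while ($elapsed >= CUTOVER_THRESHOLD_SECONDS);

        // Final cutover: take the old shard offline and replay one last time,
        // so the new shards are fully caught up before traffic moves over.
        takeShardOffline($oldShard);
        replayQueryLog($oldShard, $newShards);

        // Step 7: merge the temporary config onto the app's config server.
        mergeTemporaryConfig();
    }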
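
Steps 8 and 9 do not require any central coordination: each app server simply reacts when it hits an offline shard. A hedged sketch of that retry path is below; the exception class and helper functions are hypothetical placeholders, but the pattern is just to catch the failure, re-download the config, and re-issue the query.

    <?php
    // Illustrative sketch of steps 8 and 9: when a query hits a shard that has
    // gone offline, the app server refreshes its configuration and retries.
    // The exception class and helpers are hypothetical placeholders.

    class ShardOfflineException extends RuntimeException {}

    function downloadLatestConfig(): array { return []; /* fetch from config server */ }
    function runQuery(array $config, string $sql): array { return []; /* route and run */ }

    function runQueryWithConfigRefresh(array $config, string $sql): array
    {
        try {
            return runQuery($config, $sql);
        } catch (ShardOfflineException $e) {
            // The shard we routed to has been split and taken offline:
            // pull the merged config and retry against the new shards.
            $config = downloadLatestConfig();
            return runQuery($config, $sql);
        }
    }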


The above system is different from the "pre-sharding" that is done with traditional sharding schemes, at places such as Facebook. Its advantage is that we can partition the data on demand, splitting any shard into as many parts as we need, instead of being limited to the "logical shards" that were created before the first data was even saved. With this scheme, the system can scale almost indefinitely, being constrained only by the number of database connections that can be open at one time.
