Global ID Generation

IDs are usually generated by database engines. However, this direct approach is not usable in distributed systems where many database instances can be deployed on physically different hosts.

Pseudo-random IDs such as GUIDs/UUIDs are often used as unique IDs, but they are a poor fit for database keys: they are not monotonic, so inserts scatter across database pages, which hurts performance. IDs are usually used in database indexes, and sequential IDs fit much better because the B-tree index pages stay better organized. For example, index-organized tables keyed by the primary key (PK) perform quick lookups by PK, but need to reorganize records on insert if PKs are not consecutive.

Another important benefit of monotonically increasing IDs is range partitioning, which organizes large volumes of data by ID ranges. This would be impossible with scattered IDs.

The NFX and Agni libraries provide a solution to the aforementioned problems that is suitable for large-scale distributed systems: generation of Global Distributed Identifiers (GDIDs).

GDID Structure and Properties

GDIDs meet all requirements for IDs in a distributed system:

  • Global Uniqueness within the system
  • Monotonically increasing homogeneous (1,2,3,4,5…) segments
  • Large resolution - Named sequences in scopes each having 2^96 resolution
  • Ability to obtain consecutive GDIDs in batches (e.g. request 25 sequential IDs)
  • No single point of failure guaranteed by up to 16 independent ID authorities (see below)
  • Compact design - only 12 bytes (era (32 bits) + ID (64 bits))
  • Stored as byte[12] - good performance for keys in MongoDB and MySQL (and others)
  • Compressible - as the majority of business entity IDs are "small" (less than 1 billion), and due to the structured nature of the GDID, variable-bit encodings (e.g. LEB(uint)+LEB(ulong)) can compress "small" IDs to 3-5 bytes (instead of 12)
+---------+---------------+-------------------------------------+
|   Era   |   Authority   |               Counter               |
+---------+---------------+-------------------------------------+
   32 bit       4 bit                     60 bit
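
The layout above can be sketched as plain bit packing. This is only an illustration, not the library's GDID type: the names `Pack` and `Unpack` are hypothetical, and it assumes that the authority occupies the top 4 bits of the 64-bit ID and that the 12 bytes are written big-endian (so that byte-wise comparison follows numeric order).

```csharp
using System;
using System.Buffers.Binary;

const int CounterBits = 60;
const ulong CounterMax = (1UL << CounterBits) - 1;

// Packs era (32 bits), authority (4 bits) and counter (60 bits) into 12 bytes.
// Assumption: big-endian byte order, authority in the top 4 bits of the ID.
byte[] Pack(uint era, int authority, ulong counter)
{
    if (authority < 0 || authority > 15) throw new ArgumentOutOfRangeException(nameof(authority));
    if (counter > CounterMax) throw new ArgumentOutOfRangeException(nameof(counter));

    ulong id = ((ulong)authority << CounterBits) | counter;
    var bytes = new byte[12];
    BinaryPrimitives.WriteUInt32BigEndian(bytes.AsSpan(0, 4), era);
    BinaryPrimitives.WriteUInt64BigEndian(bytes.AsSpan(4, 8), id);
    return bytes;
}

// Reverses Pack: splits 12 bytes back into the three fields.
(uint Era, int Authority, ulong Counter) Unpack(byte[] bytes)
{
    if (bytes.Length != 12) throw new ArgumentException("expected 12 bytes", nameof(bytes));
    uint era = BinaryPrimitives.ReadUInt32BigEndian(bytes.AsSpan(0, 4));
    ulong id = BinaryPrimitives.ReadUInt64BigEndian(bytes.AsSpan(4, 8));
    return (era, (int)(id >> CounterBits), id & CounterMax);
}

var packed = Pack(era: 0, authority: 3, counter: 1234);
var (e, a, c) = Unpack(packed);
Console.WriteLine($"{packed.Length} bytes -> era={e}, authority={a}, counter={c}");
```

Four authority bits give the 16 possible authorities mentioned in the list, and the remaining 60 bits are the per-authority counter.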

To illustrate what a GDID is: 12 bytes = 96 bits. A 96-bit integer can hold:

   2^96 = 79,228,162,514,264,337,593,543,950,336 combinations.

To illustrate the resolution of this number, suppose that we have 1,000,000 clients, each constantly consuming 1,000 GDIDs per second. This would still be enough for about 2,500,000,000,000 years of operation (two and a half trillion years).

Another example, from an IoT application:

2^96 / (10B users * 100 devices * 100 msg/sec * 86400 sec/day * 366 d/yr) ≈ 25,000,000 years
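
Both estimates can be re-derived with exact integer arithmetic. The snippet below is only a sanity check of the figures quoted above; the 365-day year in the first scenario is an assumption, since the text does not say which year length it used.

```csharp
using System;
using System.Numerics;

BigInteger total = BigInteger.Pow(2, 96);   // all 96-bit combinations

// Scenario 1: 1,000,000 clients, each consuming 1,000 GDIDs per second.
BigInteger perSecond1 = new BigInteger(1_000_000) * 1_000;
BigInteger secondsPerYear = 86_400 * 365;   // assumption: 365-day year
BigInteger years1 = total / (perSecond1 * secondsPerYear);

// Scenario 2 (IoT): 10 billion users * 100 devices * 100 msg/sec,
// with a 366-day year as in the formula above.
BigInteger perYear2 = new BigInteger(10_000_000_000) * 100 * 100 * 86_400 * 366;
BigInteger years2 = total / perYear2;

Console.WriteLine($"2^96       = {total:N0}");
Console.WriteLine($"Scenario 1 = about {years1:N0} years");
Console.WriteLine($"Scenario 2 = about {years2:N0} years");
```

The results land slightly above 2.5 trillion and 25 million years respectively, so the rounded figures in the text are conservative.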

GDID Generation

GDIDs are generated by dedicated services - GDID Authorities (the agdida service). Authority services are declared in the cluster Metabase root $.acmb file:

  gdid
  {
    authority
    {
      host="World/US/Center/SH.chi2/SBOX1/app0001"
      network="internoc"
    }
    // other authorities ...
  }

There can be up to 16 different authorities in an Agni OS instance, virtually eliminating any single point of failure for GDID generation. The GDID authority service implements the following contract:

  [Glued]
  [LifeCycle(ServerInstanceMode.Singleton)]
  public interface IGDIDAuthority : IClusterService
  {
    GDIDBlock AllocateBlock(string scopeName, 
                            string sequenceName, 
                            int blockSize,
                            ulong? vicinity = GDID.COUNTER_MAX);
  }

where GDIDBlock is a unit of allocation. The IGDIDAuthority service is used in Agni.Identification.GDIDGenerator - the main class responsible for GDID generation in cluster apps.
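
To make the idea of block allocation concrete, here is a toy in-memory stand-in for an authority. It is not the agdida implementation: the tuple return type, the zero-based counters and the local `AllocateBlock` function are assumptions for illustration, and era handling, the vicinity parameter, persistence and networking are left out.

```csharp
using System;
using System.Collections.Generic;

// Next free counter per (scope, sequence) pair.
// A real authority would persist this state; this sketch keeps it in memory.
var nextCounter = new Dictionary<(string, string), ulong>();
var gate = new object();

// Reserves blockSize consecutive counters and returns the first one plus the count.
(ulong Start, int Count) AllocateBlock(string scopeName, string sequenceName, int blockSize)
{
    if (blockSize <= 0) throw new ArgumentOutOfRangeException(nameof(blockSize));
    lock (gate)
    {
        var key = (scopeName, sequenceName);
        nextCounter.TryGetValue(key, out ulong start);   // start is 0 for a new sequence
        nextCounter[key] = start + (ulong)blockSize;     // the next caller begins after this block
        return (start, blockSize);
    }
}

var first = AllocateBlock("Shop", "order", 25);
var second = AllocateBlock("Shop", "order", 10);
var other = AllocateBlock("Shop", "user", 5);
Console.WriteLine($"order: first block starts at {first.Start}, second at {second.Start}");
Console.WriteLine($"user:  first block starts at {other.Start}");
```

Blocks handed out for the same (scope, sequence) pair never overlap, while different sequences count independently.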

GDIDs are generated within logical ‘Scopes’ and ‘Sequences’ and are unique within a (scope, sequence) pair. One can think of a scope and a sequence as a database name and a table name inside that database, respectively.

A consumer of GDIDs obtains them via a GDIDGenerator instance. AgniOS exposes a global GDID generation service to any app:

  /// <summary> References distributed GDID provider </summary>
  public static IGDIDProvider GDIDProvider { get; }

Used like this:

var gdid = AgniSystem.GDIDProvider.GenerateOneGDID("My Namespace", "Sequence A");

The provider automatically selects the authority closest to the host which originates the call, and retries on the next closest authority if the first call fails. GDIDProvider also caches the ID block and adjusts the block size dynamically: if the process consumes IDs infrequently, the system allocates only a few IDs; if consumption picks up, the generator asks for larger blocks to make fewer calls. GDIDGenerator replenishes blocks asynchronously when the remaining IDs in a block drop below the LWM (low water mark) level.
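
The caching behaviour described above can be approximated in a few lines. This is a simplified sketch, not the GDIDGenerator code: replenishment is synchronous rather than asynchronous, authority failover is omitted, and the doubling/halving rule, the one-second threshold, the quarter-block low water mark and all names are assumptions.

```csharp
using System;
using System.Collections.Generic;

const int MinBlock = 4, MaxBlock = 1024;
int blockSize = MinBlock;          // grows when consumption is fast
ulong authorityNext = 0;           // stand-in for the remote authority's counter
int authorityCalls = 0;            // how many "remote" calls were needed
var cached = new Queue<ulong>();   // IDs fetched but not handed out yet
DateTime lastFetch = DateTime.MinValue;

// Stand-in for a remote AllocateBlock call: caches `size` consecutive IDs.
void FetchBlock(int size)
{
    authorityCalls++;
    for (int i = 0; i < size; i++) cached.Enqueue(authorityNext++);
}

// Hands out one ID, topping up the cache once it falls to the low water mark.
ulong GenerateOne(DateTime now)
{
    int lowWaterMark = blockSize / 4;
    if (cached.Count <= lowWaterMark)
    {
        // Assumed heuristic: refills less than a second apart mean heavy use,
        // so the block doubles; otherwise it shrinks back toward the minimum.
        bool busy = (now - lastFetch) < TimeSpan.FromSeconds(1);
        blockSize = busy ? Math.Min(blockSize * 2, MaxBlock) : Math.Max(blockSize / 2, MinBlock);
        FetchBlock(blockSize);
        lastFetch = now;
    }
    return cached.Dequeue();
}

var burstTime = new DateTime(2024, 1, 1);
ulong last = 0;
for (int i = 0; i < 1000; i++) last = GenerateOne(burstTime);   // a burst within one second
Console.WriteLine($"last id = {last}, authority calls = {authorityCalls}, block size = {blockSize}");
```

During the burst the block size keeps doubling, so a thousand IDs cost only a handful of authority calls, which is the effect dynamic block sizing aims for.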

[Figure: the GDID generation process]

ID leaks are expected. For example, a process asks for IDs, the system allocates a block of 10 IDs, and the process then uses only a few of them. This is normal: because of dynamic block sizing, it is unlikely that the system obtains large blocks and does not use them to the fullest.

These are low-level ways of obtaining GDIDs and should rarely be used. Instead, unique IDs are usually injected in a declarative fashion via attributes:

  ///<summary> Represents User root record data </summary>
  [Table(targetName: SysConsts.Myi_DS_MYSQL_TARGET, name: "tbl_user")]
  [UniqueSequence(SysConsts.MDB_AREA_USER, "user")] //<--- UNIQUE ID Sequence
  public sealed class UserRow : RowWithGdidPKAndInUse
  {
    public UserRow():base(){}
    ...
  }