IRaft API

IRaft is the primary application-facing interface.

| Area | Members |
| --- | --- |
| Lifecycle | JoinCluster, LeaveCluster, UpdateNodes |
| Membership | GetMembership, LocalRole, OnMembershipChanged |
| Cluster state | Joined, IsInitialized, GetNodes, GetLocalEndpoint, GetLocalNodeId, GetLocalNodeName, GetLastNodeActivity, GetActiveNodes, GetFollowerLagAsync |
| Leadership | AmILeaderQuick, AmILeader, WaitForLeader, WaitForLeaderStableAsync |
| Replication | ReplicateLogs, ReplicateCheckpoint, CommitLogs, RollbackLogs |
| Elastic partitions | CreatePartitionAsync, RemovePartitionAsync, SplitPartitionAsync, MergePartitionsAsync, GetPartitionGeneration, GetPartitionMap, RegisterStateMachineTransfer |
| Partition routing | GetPartitionKey, GetPrefixPartitionKey |
| Transport entry points | Handshake, RequestVote, Vote, AppendLogs, CompleteAppendLogs |
| Components | WalAdapter, Communication, Discovery, Configuration, HybridLogicalClock, ReadScheduler, WalScheduler |
| Events | OnRestoreStarted, OnRestoreFinished, OnReplicationError, OnLogRestored, OnReplicationReceived, OnLeaderChanged, OnPartitionMapChanged |

The transport entry points are intended for communication adapters and HTTP/gRPC endpoint handlers. Normal application writes should use the replication APIs.

RaftManager also exposes system-partition callbacks on the concrete type for internal configuration replication. They are not part of IRaft.

Lifecycle Notes

JoinCluster accepts an optional cancellation token:

using CancellationTokenSource joinTimeout = new(TimeSpan.FromSeconds(30));
await raft.JoinCluster(joinTimeout.Token);

Even if you do not pass a cancellation token, RaftManager still applies an internal 60-second timeout while it waits for cluster initialization to complete.
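To make the interaction between a caller-supplied token and a fallback timeout concrete, here is a minimal sketch of one common way to combine the two with a linked CancellationTokenSource. It is not RaftManager's implementation; the class name, the helper, and the idea that a timeout surfaces as OperationCanceledException are assumptions made for illustration.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class JoinTimeoutSketch
{
    // Hypothetical helper: runs `operation` under a token that fires when either
    // the caller's token is cancelled or the fallback timeout elapses,
    // whichever happens first.
    public static async Task<bool> RunWithFallbackTimeoutAsync(
        Func<CancellationToken, Task> operation,
        TimeSpan fallbackTimeout,
        CancellationToken callerToken = default)
    {
        using CancellationTokenSource linked =
            CancellationTokenSource.CreateLinkedTokenSource(callerToken);
        linked.CancelAfter(fallbackTimeout);

        try
        {
            await operation(linked.Token);
            return true;   // completed before either deadline
        }
        catch (OperationCanceledException)
        {
            return false;  // caller cancelled, or the fallback timeout elapsed
        }
    }
}
```

Assuming JoinCluster returns a Task, a caller could wrap it as `RunWithFallbackTimeoutAsync(ct => raft.JoinCluster(ct), TimeSpan.FromSeconds(60))` to get a boolean outcome instead of an exception.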

There is also a seed-based overload:

await raft.JoinCluster(
    new[] { "node-a:7000", "node-b:7000" },
    joinTimeout.Token
);

In current membership-capable builds, a new node joins as a learner first, and JoinCluster returns only once the node has been promoted to a committed voter.

Membership

GetMembership returns a point-in-time snapshot of the committed cluster roster.

LocalRole reports the local node's current role, which is one of:

  • Voter
  • Learner
  • Leaving
  • NotMember

ClusterMembership roster = raft.GetMembership();
ClusterMemberRole localRole = raft.LocalRole;

Use OnMembershipChanged to observe roster version changes:

raft.OnMembershipChanged += membership =>
{
};

MembershipVersion is monotonic for the life of the cluster and is the main fence for membership updates.
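The fencing idea can be sketched in isolation. The example below uses a hypothetical RosterSnapshot record as a stand-in for the real roster type (only the version field is modelled), and shows two checks a monotonic version makes possible: rejecting an update planned against an older roster, and ignoring notifications that arrive out of order. It is an illustration of the pattern, not Kommander's internal logic.

```csharp
using System;

public static class MembershipFenceSketch
{
    // Hypothetical stand-in for a roster snapshot; only the version matters here.
    public sealed record RosterSnapshot(long MembershipVersion);

    public enum FenceResult { Accepted, Stale }

    // Accepts an update only if the caller planned it against the current version.
    public static FenceResult Check(RosterSnapshot current, long expectedVersion) =>
        expectedVersion == current.MembershipVersion
            ? FenceResult.Accepted
            : FenceResult.Stale;

    // Because versions never decrease, an observer can safely drop any
    // notification whose version is not strictly newer than the last one seen.
    public static bool IsNewer(long lastSeenVersion, RosterSnapshot incoming) =>
        incoming.MembershipVersion > lastSeenVersion;
}
```

A Stale result corresponds to the situation the StaleMembership status describes: re-read the roster and retry against the current version.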

Cluster Activity

GetLastNodeActivity returns the HLC timestamp at which the local node last observed activity from a specific endpoint.

GetActiveNodes returns non-local endpoints seen within a time window. This is useful for diagnostics, health displays, and tests that need to confirm recent follower activity.

HLCTimestamp lastSeen = raft.GetLastNodeActivity("node-b:2070");
IReadOnlyList<string> activeNodes = raft.GetActiveNodes(TimeSpan.FromSeconds(2));

GetFollowerLagAsync returns the observed lag for a follower on a partition when the local node has that progress information:

long? lag = await raft.GetFollowerLagAsync(
    partitionId: 1,
    followerEndpoint: "node-b:2070"
);

null means there is no recorded lag value for that follower and partition on this node.
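Because null and 0 mean different things, callers should usually keep them apart rather than collapsing null to zero. The helper below is a small sketch of that distinction; the LagState enum and the threshold parameter are hypothetical, and the unit of the lag value is whatever GetFollowerLagAsync reports.

```csharp
using System;

public static class FollowerLagSketch
{
    public enum LagState { Unknown, CaughtUp, Lagging }

    // Classifies a nullable lag reading without treating "no data" as "no lag".
    public static LagState Classify(long? lag, long maxAcceptableLag)
    {
        if (lag is null)
        {
            return LagState.Unknown;   // this node has no recorded value
        }

        return lag.Value <= maxAcceptableLag
            ? LagState.CaughtUp
            : LagState.Lagging;
    }
}
```

A health display could, for example, render Unknown as "no data" instead of showing a misleading zero.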

Events

Subscribe before JoinCluster if you need restore callbacks.

raft.OnRestoreStarted += partitionId => { };
raft.OnRestoreFinished += partitionId => { };

raft.OnLogRestored += (partitionId, log) =>
{
    return Task.FromResult(true);
};

raft.OnReplicationReceived += (partitionId, log) =>
{
    return Task.FromResult(true);
};

raft.OnReplicationError += (partitionId, log) => { };

raft.OnLeaderChanged += (partitionId, leaderEndpoint) =>
{
    return Task.FromResult(true);
};

raft.OnPartitionMapChanged += ranges =>
{
};

raft.OnMembershipChanged += membership =>
{
};

Leadership Helpers

Use WaitForLeader when you need the current leader endpoint before routing a request. Use WaitForLeaderStableAsync when you need the same non-empty leader to remain stable for a minimum duration.

string leader = await raft.WaitForLeader(1, cancellationToken);

string stableLeader = await raft.WaitForLeaderStableAsync(
    1,
    TimeSpan.FromMilliseconds(500),
    cancellationToken
);

WaitForLeaderStableAsync is especially useful in tests and operational flows where you want to avoid reacting to a leader that is still flapping.
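To show what "the same non-empty leader for a minimum duration" means in practice, here is a self-contained polling sketch. It is not how WaitForLeaderStableAsync is implemented; the readLeader probe, the poll interval, and the class name are assumptions made for this example.

```csharp
using System;
using System.Diagnostics;
using System.Threading;
using System.Threading.Tasks;

public static class StableLeaderSketch
{
    // Polls `readLeader` until it has returned the same non-empty value
    // continuously for at least `minStable`, then returns that value.
    public static async Task<string> WaitForStableAsync(
        Func<string> readLeader,
        TimeSpan minStable,
        TimeSpan pollInterval,
        CancellationToken cancellationToken = default)
    {
        string candidate = "";
        Stopwatch stableFor = Stopwatch.StartNew();

        while (true)
        {
            cancellationToken.ThrowIfCancellationRequested();
            string observed = readLeader() ?? "";

            if (observed != candidate)
            {
                // Leader changed (or disappeared): restart the stability clock.
                candidate = observed;
                stableFor.Restart();
            }
            else if (candidate.Length > 0 && stableFor.Elapsed >= minStable)
            {
                return candidate;
            }

            await Task.Delay(pollInterval, cancellationToken);
        }
    }
}
```

Any change in the observed leader, including a drop to "no leader", resets the clock, which is what filters out a flapping leader.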

Test Hooks

IRaft exposes several advanced members marked with EditorBrowsable(EditorBrowsableState.Never):

  • ForceLeaderForTestingAsync
  • StepDownAsync
  • TransferLeadershipAsync
  • SuspendHeartbeatsAsync
  • ResumeHeartbeatsAsync

These are intended for deterministic tests and fault-injection scenarios, not ordinary application traffic or public API endpoints.

Operation Status Values

| Status | Meaning |
| --- | --- |
| Success | Operation completed successfully. |
| Errored | Operation failed with an internal error. |
| NodeIsNotLeader | The local node is not leader for the requested partition. |
| LeaderInOldTerm | A request came from a leader with an old term. |
| LeaderAlreadyElected | A leader was already known for the term. |
| LogsFromAnotherLeader | A follower received logs from a node other than the expected leader. |
| ActiveProposal | Another proposal is still active. |
| ProposalNotFound | The supplied proposal ticket was not found. |
| ProposalTimeout | The proposal did not complete in time. |
| ReplicationFailed | Replication failed before commit. |
| Pending | Internal state used while asynchronous work is in progress. |
| ProposalQueueFull | The per-partition client proposal queue is full. Retry with backoff. |
| RestoreInProgress | The partition is still restoring from the WAL. Retry after a short delay. |
| PartitionMoved | The partition generation changed. Refresh the partition map and retry on the current owner. |
| StaleMembership | The roster version changed. Re-read membership and retry against the current version. |
| ConcurrentMembershipChange | Another membership change is already in flight. Retry after it commits. |
| InsufficientVoters | The requested removal would leave the cluster unavailable. Do not retry blindly. |
| LogMismatch | A follower rejected an anchored backfill append because its log did not match PrevLogIndex / PrevLogTerm. The leader backs up and retries. |
| SnapshotRequired | The follower needs entries below the leader's compaction floor. Ordinary log backfill cannot catch it up. |
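The retry guidance in these status descriptions can be turned into a small decision helper. The sketch below uses a local Status enum that copies a subset of the names listed above (it is a stand-in, not the library's own type), a hypothetical NextStep enum, and a capped exponential backoff. Statuses for which no retry guidance is given are surfaced to the caller rather than retried, which is a policy choice of this example.

```csharp
using System;

public static class StatusRetrySketch
{
    // Local stand-in mirroring some of the status names documented above.
    public enum Status
    {
        Success, Errored, NodeIsNotLeader, ProposalQueueFull, RestoreInProgress,
        PartitionMoved, StaleMembership, ConcurrentMembershipChange, InsufficientVoters
    }

    public enum NextStep { Done, RetryWithBackoff, RefreshAndRetry, Surface }

    public static NextStep Decide(Status status) => status switch
    {
        Status.Success => NextStep.Done,
        // Transient conditions: wait, then try again.
        Status.ProposalQueueFull => NextStep.RetryWithBackoff,
        Status.RestoreInProgress => NextStep.RetryWithBackoff,
        Status.ConcurrentMembershipChange => NextStep.RetryWithBackoff,
        // Stale view of the cluster: refresh the map or roster first.
        Status.PartitionMoved => NextStep.RefreshAndRetry,
        Status.StaleMembership => NextStep.RefreshAndRetry,
        // Everything else (including InsufficientVoters) goes back to the caller.
        _ => NextStep.Surface
    };

    // Capped exponential backoff: baseDelay * 2^attempt, never above maxDelay.
    public static TimeSpan Backoff(int attempt, TimeSpan baseDelay, TimeSpan maxDelay)
    {
        double ms = baseDelay.TotalMilliseconds * Math.Pow(2, Math.Max(0, attempt));
        return TimeSpan.FromMilliseconds(Math.Min(ms, maxDelay.TotalMilliseconds));
    }
}
```

In production code it is common to add random jitter to the backoff so that many clients do not retry in lockstep.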

Elastic Partition APIs

Kommander also exposes runtime partition lifecycle operations:

RaftPartitionLifecycleResult created = await raft.CreatePartitionAsync(10);
RaftPartitionLifecycleResult split = await raft.SplitPartitionAsync(2);
RaftPartitionLifecycleResult merged = await raft.MergePartitionsAsync(2, 3);
RaftPartitionLifecycleResult removed = await raft.RemovePartitionAsync(10);

Useful companion APIs:

long generation = raft.GetPartitionGeneration(2);
IReadOnlyList<RaftPartitionRange> map = raft.GetPartitionMap();
raft.RegisterStateMachineTransfer(new MyTransfer());

See Elastic Partitions for the full behavior and application responsibilities.

Replication Signature Note

ReplicateLogs takes expectedGeneration before cancellationToken in the optional-parameter list.

That makes named arguments the safest style for most callers:

RaftReplicationResult result = await raft.ReplicateLogs(
    partitionId: 1,
    type: "OrderCreated",
    data: payload,
    expectedGeneration: generation,
    cancellationToken: cancellationToken
);
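The effect of parameter order on call sites can be seen with a toy method. The signature below is hypothetical and only mimics the "generation before token" ordering; in particular, the -1 default for expectedGeneration is an assumption of this sketch, not a documented value.

```csharp
using System;
using System.Threading;

public static class NamedArgumentSketch
{
    // Hypothetical signature with two optional parameters, generation first.
    public static string Describe(
        int partitionId,
        string type,
        long expectedGeneration = -1,
        CancellationToken cancellationToken = default)
    {
        string fence = expectedGeneration < 0 ? "none" : expectedGeneration.ToString();
        return $"partition={partitionId} type={type} generation={fence} " +
               $"cancellable={cancellationToken.CanBeCanceled}";
    }

    // Named argument: skips expectedGeneration and still passes the token.
    //   Describe(1, "OrderCreated", cancellationToken: token);
    //
    // Positional attempt: does not compile, because the third positional
    // parameter is a long and a CancellationToken cannot convert to it.
    //   Describe(1, "OrderCreated", token);
}
```

With named arguments, a caller that only wants cancellation does not have to invent a generation value just to reach the later parameter.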