Rejourney Swift Package Is Now in Open Beta
The session state machine, two start paths, URLProtocol swizzle, visual capture backpressure, ANR ping-pong, and crash recovery checkpoints — how the native iOS SDK actually works.
The native Rejourney Swift Package is now in open beta. This article covers how the recorder actually works: the session state machine, the two start paths, how we intercept network traffic without intercepting our own uploads, what happens to a session that dies mid-recording, and why the ANR sentinel lives on a separate thread.
The package targets iOS 15.1+, requires Swift tools 5.9, and links only libz. There is no CocoaPods podspec, no JavaScript runtime, and no React Native dependency.
RejourneyNativeController is a @MainActor singleton that owns all session transitions. Its state is a Swift enum with five cases:
private enum SessionState: Equatable {
    case idle
    case starting(sessionId: String)
    case active(sessionId: String)
    case paused(sessionId: String, backgroundedAt: TimeInterval)
    case terminated
}
The starting case uses a "pending_\(timestampMs)" placeholder ID. A 5-second poll loop (50 iterations × 100 ms) waits for ReplayOrchestrator.shared.replayId to become non-nil before transitioning to active. If the orchestrator never produces an ID, usually because credential fetch failed, the controller drops back to idle and disables URL interception.
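The bounded wait described above can be sketched as a small function. This is a minimal illustration, not the controller's actual code: StartOutcome, awaitReplayId, and the injected probe and wait closures are hypothetical names, and the loop is written synchronously so it can be exercised without a run loop (on the main actor one would await between attempts rather than block).

```swift
import Foundation

/// Outcome of waiting for the orchestrator to publish a replay ID (hypothetical type).
enum StartOutcome: Equatable {
    case active(sessionId: String)
    case idle // timed out; the caller would also disable URL interception
}

/// Polls `probe` up to `attempts` times, waiting `intervalMs` between tries.
/// `wait` is injected so the loop can run in tests without real delays.
func awaitReplayId(
    attempts: Int = 50,
    intervalMs: Int = 100,
    probe: () -> String?,
    wait: (Int) -> Void = { Thread.sleep(forTimeInterval: Double($0) / 1000) }
) -> StartOutcome {
    for _ in 0..<attempts {
        if let id = probe() {
            return .active(sessionId: id)
        }
        wait(intervalMs)
    }
    return .idle
}
```

With the defaults, a probe that never yields an ID gives up after 50 × 100 ms = 5 seconds of waiting.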
Background/foreground is handled by two NotificationCenter observers wired in setupLifecycleListeners(). When the app backgrounds, state moves to paused(sessionId:backgroundedAt:) with the current Unix timestamp. On foreground the controller computes the elapsed time and compares it against a 60-second timeout. Under the threshold the session resumes; over it, the controller races two triggers, a 2-second DispatchWorkItem grace timer and the endReplayWithReason("background_timeout") completion callback, and whichever fires first starts a fresh session, so the restart never blocks on the prior session's upload.
var restartStarted = false
let triggerRestart: (String) -> Void = { source in
    guard !restartStarted else { return }
    restartStarted = true
    Task { @MainActor in await self.startNewSessionAfterTimeout() }
}
// Grace path: fire after 2s if callback hasn't arrived yet
DispatchQueue.main.asyncAfter(deadline: .now() + 2) {
    triggerRestart("grace_timeout")
}
// Callback path: fire as soon as old session is finalized
DispatchQueue.global(qos: .utility).async {
    ReplayOrchestrator.shared.endReplayWithReason("background_timeout") { _, _ in
        triggerRestart("end_replay_callback")
    }
}
Every call to Rejourney.start() first hits /api/sdk/config to fetch remote configuration: sampleRate, recordingEnabled, maxRecordingMinutes, and billing state. The response determines whether visual capture runs at all. A 401/403/404 is treated as a hard denial and returns RejourneyStartResult(success: false, error: "access_denied_\(statusCode)"). A network failure falls back to RejourneyRemoteConfig.defaultConfig and continues with local defaults.
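The config handling above amounts to a three-way decision, sketched here as a pure function. RemoteConfig, ConfigFetch, StartDecision, and decideStart are hypothetical names, the fallback values are placeholders (the article does not list the real defaults), and treating other statuses without a usable body as a fallback case is an assumption.

```swift
struct RemoteConfig: Equatable {
    var sampleRate: Int
    var recordingEnabled: Bool
    var maxRecordingMinutes: Int
    // Placeholder defaults; the real RejourneyRemoteConfig.defaultConfig values are not given.
    static let fallback = RemoteConfig(sampleRate: 100, recordingEnabled: true, maxRecordingMinutes: 10)
}

enum ConfigFetch {
    case response(status: Int, config: RemoteConfig?)
    case networkFailure
}

enum StartDecision: Equatable {
    case proceed(RemoteConfig)
    case denied(error: String)
}

func decideStart(_ fetch: ConfigFetch) -> StartDecision {
    switch fetch {
    case .response(let status, _) where [401, 403, 404].contains(status):
        // Hard denial: surface the status code in the error string.
        return .denied(error: "access_denied_\(status)")
    case .response(_, let config?):
        return .proceed(config)
    case .response, .networkFailure:
        // Assumption: anything without a usable config continues with local defaults.
        return .proceed(.fallback)
    }
}
```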
After remote config is resolved, the orchestrator needs upload credentials. There are two code paths:
The first path calls DeviceRegistrar.shared.obtainCredential, which performs a credential handshake and stores the result in Keychain. Only then does it start an NWPathMonitor and wait for a satisfied network path before _beginRecording is called.
The second path uses a cached Keychain credential directly. It skips the credential fetch and the network monitor startup entirely and calls _beginRecording synchronously on the main queue, which is measurably faster for the session-rollover case after a background timeout.
Sample rate enforcement happens in RejourneySessionPolicy.derive. It draws a Double.random(in: 0..<100) and compares it against the remote sampleRate integer. Sessions that are sampled out return before replay, network interception, or capture is started, so the SDK avoids the native recording path for that launch.
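A minimal sketch of that draw follows. The function name isSampledIn is hypothetical, and the comparison direction (a draw below sampleRate means the session is recorded) is an assumption consistent with a rate of 100 recording everything and a rate of 0 recording nothing.

```swift
/// Returns true when a session should be recorded for the given percentage rate.
/// The generator is a parameter so the decision can be tested deterministically.
func isSampledIn<G: RandomNumberGenerator>(sampleRate: Int, using rng: inout G) -> Bool {
    let draw = Double.random(in: 0..<100, using: &rng) // uniform in [0, 100)
    return draw < Double(sampleRate)
}

// Usage: decide once per start, before any recorder component is touched.
var systemRng = SystemRandomNumberGenerator()
let recordThisSession = isSampledIn(sampleRate: 25, using: &systemRng)
_ = recordThisSession
```

Because the draw is taken from the half-open range 0..<100, a rate of 100 always samples in and a rate of 0 never does.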
URLProtocol.registerClass(RejourneyURLProtocol.self) covers URLSession.shared and any session created from the default configuration. It does not cover sessions built with a custom URLSessionConfiguration — which is exactly what SDWebImage, Alamofire, and most third-party SDKs use. To reach those, we swizzle the protocolClasses getter on URLSessionConfiguration itself.
let didAdd = class_addMethod(
    URLSessionConfiguration.self,
    swizzledSel,
    method_getImplementation(swizzledMethod),
    method_getTypeEncoding(swizzledMethod)
)
if didAdd, let addedMethod = class_getInstanceMethod(configClass, swizzledSel) {
    originalProtocolClassesIMP = method_getImplementation(originalMethod)
    method_exchangeImplementations(originalMethod, addedMethod)
}
The replacement getter calls through to the original IMP via a saved function pointer, then inserts RejourneyURLProtocol at index 0 if not already present. This means every URLSessionConfiguration instance, existing or future, gets the protocol injected at the point it queries its class list.
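The list manipulation inside the replacement getter is simple enough to isolate. The sketch below shows only that step, with no runtime swizzling; injecting is a hypothetical helper, and MarkerProtocol stands in for RejourneyURLProtocol.

```swift
final class MarkerProtocol {} // stands in for RejourneyURLProtocol
final class OtherProtocol {}  // stands in for a protocol class the host app registered

/// Returns the class list with `marker` at index 0, without duplicating it.
/// `original` models the value returned by the saved original getter.
func injecting(_ marker: AnyClass, into original: [AnyClass]?) -> [AnyClass] {
    var classes = original ?? []
    if !classes.contains(where: { $0 == marker }) {
        classes.insert(marker, at: 0)
    }
    return classes
}
```

The presence check is what keeps the getter idempotent: a configuration that is read many times still lists the interceptor once, ahead of any protocol classes the host app added.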
Self-interception is prevented by stamping forwarded requests with a property under the key "co.rejourney.handled". canInit(with:) returns false immediately if that property is set. The forwarding session itself is initialized from URLSessionConfiguration.ephemeral with protocolClasses = [], so even the swizzled getter produces an empty list for our internal session.
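The stamping guard can be sketched with Foundation's URLProtocol property API. The helper names stamped and shouldIntercept are hypothetical; the key string is the one quoted above, and the sketch assumes Apple's Foundation, where URLRequest bridges to NSMutableURLRequest.

```swift
import Foundation

let handledKey = "co.rejourney.handled"

/// Returns a copy of the request marked as already handled by the interceptor.
func stamped(_ request: URLRequest) -> URLRequest {
    let mutable = (request as NSURLRequest).mutableCopy() as! NSMutableURLRequest
    URLProtocol.setProperty(true, forKey: handledKey, in: mutable)
    return mutable as URLRequest
}

/// The check a canInit(with:) override would perform before claiming a request.
func shouldIntercept(_ request: URLRequest) -> Bool {
    URLProtocol.property(forKey: handledKey, in: request) == nil
}
```

The property travels with the request object rather than as an HTTP header, so nothing extra is sent to the server.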
The original implementation created a new URLSession per intercepted request, which leaked 1–3 MB per request under heavy traffic. The current design uses one shared forwarding session with a SessionDelegateAdapter that routes callbacks through an NSMapTable<URLSessionTask, RejourneyURLProtocol>.strongToWeakObjects(). The weak value side means protocol instances that are stopped by the URL loading system get collected without a leak, and the map never accumulates stale entries.
UIKit requires that drawHierarchy(in:afterScreenUpdates:) runs on the main thread. There is no way around this. What we can control is how much time we spend there and how we handle encode and upload without blocking the render pipeline.
Screenshots are taken at a configurable interval (default 0.33s, translating to roughly 3 fps) and immediately handed off to a serial OperationQueue named "co.rejourney.encode" with .utility QoS. JPEG compression runs entirely on that queue. The main thread is only involved for the initial pixel read — not for compression, buffering, or upload.
Two backpressure limits protect against queue growth under slow network conditions: 50 pending encode batches and 500 buffered screenshots. Frames are dropped, not queued indefinitely, when either limit is reached. The capture scale is 1.25, applied as a divisor: the framebuffer is read at 1/1.25 = 80% of linear screen size before JPEG encoding, matching the ratio used by the Android recorder.
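A drop-on-saturation policy with these two limits might look like the sketch below. CaptureBackpressure, offerFrame, and captureSize are hypothetical names, and how frames are grouped into batches is not described in the article, so the two counters are simply tracked independently here.

```swift
struct CaptureBackpressure {
    static let maxPendingBatches = 50
    static let maxBufferedFrames = 500

    var pendingBatches = 0
    var bufferedFrames = 0
    private(set) var droppedFrames = 0

    var isSaturated: Bool {
        pendingBatches >= Self.maxPendingBatches || bufferedFrames >= Self.maxBufferedFrames
    }

    /// Accepts a frame if there is room; otherwise counts it as dropped.
    mutating func offerFrame() -> Bool {
        if isSaturated {
            droppedFrames += 1
            return false
        }
        bufferedFrames += 1
        return true
    }
}

/// Capture dimensions for a screen size in points, using the 1.25 divisor.
func captureSize(width: Double, height: Double, scale: Double = 1.25) -> (width: Double, height: Double) {
    (width / scale, height / scale)
}
```

For a 390-point-wide screen the divisor gives a 312-point-wide capture, which is the 80% figure quoted above.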
One non-obvious guard: we skip drawHierarchy while the keyboard is animating. Calling it during a keyboard transition causes UIKit to stall the main thread — we measured 7+ seconds — while it resolves conflicting layout constraints between the keyboard window and the app window. We observe both keyboardWillShow and keyboardWillHide, and only resume capture 0.45 seconds after keyboardDidShow or keyboardDidHide fires.
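The keyboard guard reduces to a small gate driven by the four notifications. The sketch below keeps only the timing logic and takes timestamps as parameters; KeyboardCaptureGate and its method names are hypothetical, and wiring the methods to the actual UIKit notifications is left out.

```swift
import Foundation

struct KeyboardCaptureGate {
    static let settleDelay: TimeInterval = 0.45

    private(set) var animating = false
    private(set) var resumeAt: TimeInterval = 0

    /// Call from keyboardWillShow / keyboardWillHide.
    mutating func keyboardWillChange() {
        animating = true
    }

    /// Call from keyboardDidShow / keyboardDidHide with the current time.
    mutating func keyboardDidChange(at now: TimeInterval) {
        animating = false
        resumeAt = now + Self.settleDelay
    }

    /// True when a drawHierarchy call is considered safe.
    func canCapture(at now: TimeInterval) -> Bool {
        !animating && now >= resumeAt
    }
}
```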
View hierarchy snapshots run on a separate Timer scheduled in the default run loop mode — intentionally not .common. This lets the timer pause during scrolling, preventing main-thread pressure from a hierarchy walk through deep subviews while the user is actively scrolling. Hierarchy snapshots are also skipped when MapKit is visible and actively animating; the Metal and OpenGL subview tree under an animating map adds meaningful main-thread cost to a full hierarchy scan. Deduplication uses a cheap hash of the current screen name and root child count — if neither changes, the snapshot is not uploaded.
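The deduplication check can be illustrated with the standard library's Hasher. This is a sketch under assumptions: HierarchyDeduper is a hypothetical type, and the article does not say which hash function the SDK uses.

```swift
struct HierarchyDeduper {
    private(set) var lastSignature: Int?

    /// Returns true when the snapshot differs from the previous one and should be uploaded.
    mutating func shouldUpload(screenName: String, rootChildCount: Int) -> Bool {
        var hasher = Hasher()
        hasher.combine(screenName)
        hasher.combine(rootChildCount)
        let signature = hasher.finalize()
        defer { lastSignature = signature } // runs after the comparison below
        return signature != lastSignature
    }
}
```

Hasher is seeded per process, so the signature is only meaningful for comparisons within one launch, which is all a consecutive-snapshot check needs.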
AnrSentinel runs a watch loop on a dedicated Thread named "co.rejourney.anr" at .utility QoS. Every 2 seconds it posts a block to DispatchQueue.main.async and records the dispatch time. When the main queue actually executes the block it stamps ProcessInfo.processInfo.systemUptime as the response time. If 2 seconds pass without a response and the delta exceeds the 5-second freeze threshold, the sentinel declares an ANR.
State shared between the watch thread and the main thread is protected by os_unfair_lock, which is appropriate here because the critical sections are short (a handful of struct assignments) and the lock is never held across I/O. A lastAnrReport timestamp prevents duplicate reports while a single long freeze persists — if the freeze hasn't cleared for another 5-second window, the sentinel stays quiet.
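The detection and suppression rules can be sketched as a value type with explicit timestamps. AnrTracker and its methods are hypothetical names, and the reading of the suppression rule (no second report within 5 seconds of the previous one) is an assumption. Locking is omitted; in the sentinel, every access to this state would happen under the lock.

```swift
import Foundation

struct AnrTracker {
    static let freezeThreshold: TimeInterval = 5

    private(set) var pingSentAt: TimeInterval?   // earliest ping still unanswered
    private(set) var lastAnrReport: TimeInterval?

    /// Watch thread: a ping block was posted to the main queue.
    mutating func pingSent(at now: TimeInterval) {
        if pingSentAt == nil { pingSentAt = now }
    }

    /// Main thread: the ping block ran, so the main queue is responsive.
    mutating func pongReceived() {
        pingSentAt = nil
    }

    /// Watch thread: returns true when an ANR should be reported now.
    mutating func check(at now: TimeInterval) -> Bool {
        guard let sent = pingSentAt, now - sent > Self.freezeThreshold else { return false }
        // Assumed suppression rule: stay quiet within one threshold window of the last report.
        if let last = lastAnrReport, now - last < Self.freezeThreshold { return false }
        lastAnrReport = now
        return true
    }
}
```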
On ANR detection, Thread.callStackSymbols is captured and the incident is handed to StabilityMonitor, which persists it to a JSON file in the caches directory. This mirrors how crash reports survive process termination: if the app is killed while an ANR is in progress, the next session start will find and upload the stored incident.
When a session starts, the orchestrator writes a checkpoint to rejourney_recovery.json in the app's Documents directory. The file contains the session ID, start timestamp, API token, endpoint, upload credential, and a timingVersion field (currently 3). A background DispatchSourceTimer fires every 5 seconds on a .utility queue to update lastActiveCheckpointMs and re-write the file. The timer does not fire while the app is backgrounded, so the file always reflects the last known foreground timestamp.
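A checkpoint file of this kind is straightforward to model with Codable. The sketch below is illustrative only: RecoveryCheckpoint and its field names other than lastActiveCheckpointMs and timingVersion are hypothetical, the credential fields are left out, and the periodic timer is not shown.

```swift
import Foundation

struct RecoveryCheckpoint: Codable, Equatable {
    var sessionId: String
    var startedAtMs: Int64
    var lastActiveCheckpointMs: Int64
    var timingVersion: Int = 3
}

/// Serializes the checkpoint and replaces the file atomically,
/// so a crash mid-write cannot leave a truncated file behind.
func writeCheckpoint(_ checkpoint: RecoveryCheckpoint, to url: URL) throws {
    let data = try JSONEncoder().encode(checkpoint)
    try data.write(to: url, options: .atomic)
}

/// Returns nil when the file is missing or unreadable.
func readCheckpoint(from url: URL) -> RecoveryCheckpoint? {
    guard let data = try? Data(contentsOf: url) else { return nil }
    return try? JSONDecoder().decode(RecoveryCheckpoint.self, from: data)
}
```

Each 5-second tick would update lastActiveCheckpointMs and call writeCheckpoint again, so the file on disk never lags the last foreground activity by more than one interval.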
On the next app launch, recoverInterruptedReplay reads the file, re-hydrates SegmentDispatcher with the stored credentials, and calls VisualCapture.shared.uploadPendingFrames for any frames that were buffered to disk but not uploaded. Only after those frames are confirmed uploaded does it call SegmentDispatcher.concludeReplay with endReason: "recovery_finalize".
The closeAnchorAtMs parameter in the finalize call is where timingVersion matters. Version 3 semantics: for a "background_timeout" end reason, the close anchor is set to lastBackgroundEntryMs — the exact moment the app last entered the background — rather than the crash recovery time. This keeps the session duration accurate in the replay timeline even when the finalize call happens minutes or hours later.
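The anchor choice can be written as a small pure function. The function name and the fallback branch are assumptions: the article only specifies the "background_timeout" case, so the fallback to the last active checkpoint is a placeholder for whatever the SDK does for other end reasons.

```swift
/// Picks the timestamp used as the session's close anchor under version 3 semantics.
func closeAnchorMs(
    endReason: String,
    lastBackgroundEntryMs: Int64?,
    lastActiveCheckpointMs: Int64
) -> Int64 {
    if endReason == "background_timeout", let enteredBackground = lastBackgroundEntryMs {
        // The session effectively ended when the app left the foreground.
        return enteredBackground
    }
    // Assumption: other end reasons fall back to the last recorded activity.
    return lastActiveCheckpointMs
}
```

For example, a session that entered the background at 120,000 ms and is finalized on a launch hours later still closes at 120,000 ms, so the idle gap never counts toward its duration.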
The Rejourney enum is @MainActor and exposes both async and callback-based overloads for start() and stop() so UIKit apps without Swift concurrency adoption can still call it from an AppDelegate.
// Configure: call before start, safe to call multiple times
Rejourney.configure(publicKey: "rj_...", options: .init(
    wifiOnly: false,
    captureANR: true,
    autoTrackNetwork: true
))
// Async start: returns RejourneyStartResult with sessionId + telemetryOnly flag
let result = await Rejourney.start()
// Identity: persisted to UserDefaults, restored across sessions
Rejourney.identify("user_abc123")
// Screen tracking: queued before session is ready, replayed on active
Rejourney.trackScreen("Checkout")
// Custom events: typed properties accept Swift literals directly
Rejourney.logEvent("checkout_started", properties: ["plan": "pro"])
// View-level redaction: registered in VisualCapture's RedactionMask
Rejourney.mask(sensitiveLabel)
// Graceful stop: drains and finalizes the session
let stopResult = await Rejourney.stop()
Custom event properties use RejourneyMetadataValue, an indirect enum with ExpressibleByStringLiteral, ExpressibleByIntegerLiteral, ExpressibleByFloatLiteral, ExpressibleByBooleanLiteral, and ExpressibleByNilLiteral conformances. You can pass a string, int, double, bool, array, nested object, or nil literal directly without wrapping.
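To show how literal conformances make this work, here is a stand-in enum. MetadataValue is a hypothetical type, not the SDK's RejourneyMetadataValue; its case names, and the array and dictionary literal conformances, are assumptions added so that nested literals compile.

```swift
indirect enum MetadataValue: Equatable {
    case string(String), int(Int), double(Double), bool(Bool)
    case array([MetadataValue])
    case object([String: MetadataValue])
    case null
}

extension MetadataValue: ExpressibleByStringLiteral {
    init(stringLiteral value: String) { self = .string(value) }
}
extension MetadataValue: ExpressibleByIntegerLiteral {
    init(integerLiteral value: Int) { self = .int(value) }
}
extension MetadataValue: ExpressibleByFloatLiteral {
    init(floatLiteral value: Double) { self = .double(value) }
}
extension MetadataValue: ExpressibleByBooleanLiteral {
    init(booleanLiteral value: Bool) { self = .bool(value) }
}
extension MetadataValue: ExpressibleByNilLiteral {
    init(nilLiteral: ()) { self = .null }
}
extension MetadataValue: ExpressibleByArrayLiteral {
    init(arrayLiteral elements: MetadataValue...) { self = .array(elements) }
}
extension MetadataValue: ExpressibleByDictionaryLiteral {
    init(dictionaryLiteral elements: (String, MetadataValue)...) {
        // Later duplicates win instead of trapping on a repeated key.
        self = .object(Dictionary(elements, uniquingKeysWith: { _, last in last }))
    }
}

// Mixed literals type-check against the enum without explicit wrapping.
let exampleProperties: [String: MetadataValue] = [
    "plan": "pro", "seats": 3, "trial": false, "ratio": 0.5,
    "tags": ["a", "b"], "coupon": nil
]
```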
Screen names tracked before start() returns are queued in RejourneySessionContext (capped at 50 entries, consecutive duplicates removed). When the session becomes active, the queue is drained and each screen is replayed as a telemetry view transition event, so pre-start navigation appears correctly in the replay timeline.
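A queue with those two properties can be sketched in a few lines. PendingScreenQueue is a hypothetical type, and the behavior at the cap (dropping the oldest entry) is an assumption; the article states only the limit of 50.

```swift
struct PendingScreenQueue {
    static let capacity = 50
    private(set) var names: [String] = []

    mutating func enqueue(_ name: String) {
        // Collapse consecutive duplicates such as repeated onAppear calls.
        guard names.last != name else { return }
        // Assumption: when full, the oldest entry is discarded.
        if names.count == Self.capacity { names.removeFirst() }
        names.append(name)
    }

    /// Returns the queued names in order and empties the queue.
    mutating func drain() -> [String] {
        defer { names.removeAll() }
        return names
    }
}
```

On the transition to active, the drained names would be replayed in order as view transition events.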
The recorder, the ingest protocol, the session lifecycle semantics, and the privacy defaults are production-quality — they have been exercised through the React Native SDK at scale. What we are collecting signal on in this beta:
UserDefaults and Keychain access groups behave differently under extension sandboxing.
Without a viewDidAppear equivalent in SwiftUI, we want to understand how teams prefer to wire trackScreen: .onAppear, NavigationStack path observation, or a custom modifier.
Whether the PrivacyInfo.xcprivacy manifest is being picked up correctly by App Store submission pipelines.
Native iOS versioning is independent from the React Native package. Tags follow plain semver (v0.2.0). A CI check validates that packages/ios/VERSION and RejourneySDKInfo.version are in sync before a tag is created.