Receive Per-Record Duplicate Checks During Bulk Registration
When a coordinator submits a bulk registration batch, the Duplicate Detection Service evaluates each activity record in the batch individually before the batch insert is executed. For any record that exceeds the duplicate confidence threshold, the coordinator is shown the details of the conflicting existing record. The coordinator can then exclude the flagged record from the batch, override the warning for that specific record, or cancel the entire batch. This per-record approach ensures that a single duplicate in a large batch does not silently corrupt reporting data, and that an otherwise valid batch is not rejected because of one problematic entry.
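The detection pass described above could be sketched roughly as follows. This is a minimal illustration, not the Duplicate Detection Service's actual interface: `check_batch`, `find_best_match`, `DuplicateMatch`, `FlaggedRecord`, and the default threshold of 0.8 are all hypothetical names and values chosen for the example.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class DuplicateMatch:
    existing_id: str    # id of the conflicting existing activity
    confidence: float   # duplicate confidence score for this pairing


@dataclass(frozen=True)
class FlaggedRecord:
    index: int              # position of the record in the submitted batch
    record: dict
    match: DuplicateMatch   # details shown to the coordinator


def check_batch(
    records: list[dict],
    find_best_match: Callable[[dict], Optional[DuplicateMatch]],
    threshold: float = 0.8,  # assumed value; the real threshold is not given here
) -> tuple[list[int], list[FlaggedRecord]]:
    """Evaluate every record individually; nothing is saved in this pass."""
    clean: list[int] = []
    flagged: list[FlaggedRecord] = []
    for index, record in enumerate(records):
        match = find_best_match(record)
        # "Exceeds the threshold" is read as strictly greater than.
        if match is not None and match.confidence > threshold:
            flagged.append(FlaggedRecord(index, record, match))
        else:
            clean.append(index)
    return clean, flagged
```

The function returns both lists only after the loop has visited every record, so a caller cannot begin inserting until the whole batch has been evaluated.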
User Story
Acceptance Criteria
- Given a coordinator submits a bulk registration batch, when the batch is processed, then each record is individually evaluated by the Duplicate Detection Service before any record is committed
- Given a bulk batch contains one or more records with confidence scores above the duplicate threshold, when detection completes, then the coordinator is shown the flagged records with details of each conflicting existing activity
- Given flagged records are shown, when the coordinator reviews them, then the coordinator can selectively exclude specific flagged records from the batch while allowing the rest to proceed
- Given flagged records are shown, when the coordinator chooses to override for a specific record, then that record is saved with an audit marker indicating coordinator-confirmed override in a bulk context
- Given a bulk batch where no records exceed the duplicate threshold, when detection completes, then the entire batch is committed without interruption
- Given a bulk batch of N records, when duplicate checking runs, then all N records are checked before the first record is saved, and no partial commit occurs before all checks complete
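The criteria on exclusion, override, and uninterrupted commit can be illustrated with a small planning step that runs only after every record has been checked. This is a sketch under assumptions: `plan_commit`, `Decision`, the `audit_marker` field, and the marker value `coordinator_override_bulk` are hypothetical, and the actual insert is left to the caller.

```python
from enum import Enum


class Decision(Enum):
    EXCLUDE = "exclude"    # drop the flagged record from the batch
    OVERRIDE = "override"  # save it despite the duplicate warning


# Hypothetical marker value; the criteria only require that a marker exists.
BULK_OVERRIDE_MARKER = "coordinator_override_bulk"


def plan_commit(
    records: list[dict],
    flagged_indices: set[int],
    decisions: dict[int, Decision],
) -> list[dict]:
    """Build the rows to insert, in batch order, once all checks are done."""
    unresolved = sorted(i for i in flagged_indices if i not in decisions)
    if unresolved:
        # Refuse to produce a commit plan while any flagged record is undecided.
        raise ValueError(f"flagged records without a decision: {unresolved}")

    rows: list[dict] = []
    for index, record in enumerate(records):
        if index not in flagged_indices:
            rows.append({**record, "audit_marker": None})
        elif decisions[index] is Decision.OVERRIDE:
            rows.append({**record, "audit_marker": BULK_OVERRIDE_MARKER})
        # Decision.EXCLUDE: the record is left out of the commit plan.
    return rows
```

When no records are flagged, the plan contains the whole batch and no decisions are needed. Cancelling the batch corresponds to never calling `plan_commit`, so nothing is written. Inserting the returned rows inside a single database transaction is one way to keep the commit itself all-or-nothing.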
Business Value
Bulk and proxy registration workflows are specifically identified as high-risk for duplication because coordinators registering on behalf of peer mentors may be unaware of what the peer mentor has already submitted directly. A per-record check in bulk flows prevents silent data corruption at scale, where the impact on Bufdir statistics and grant calculations would be proportionally larger than that of a single duplicate.