TAP – The Advancement Services
Created By: BI DRS
Created Date: 19 March 2024
Uploading Employment Record
De-duplication Rules
Records deduplication involves Deletion and Consolidation activities. In principle, any deletion requires checking all data fields to avoid orphan records or information lost. Below is a table of the rules for the deduplication activities.
Scenario | Explanation | Action |
R001 Same (Record Type, Contact ID, Employer, Job Title, Job Status)keep the oldest StartDate record. | ||
Count of Employment Records with Duplicates (27/03.2024) | ||
Segment | Count of Unique Employment | Count of Duplicates |
Record Type: Linked Employer | 20,505 | 411 |
Record Type: Non-Linked Employer | 140,924 | 3,368 |
Check Contact ID (No linked Contact) | If no Contact ID, ensure no lost of important data or generating orphan record. Requires a reviewer’s review and validation before deletion
| Reviewer to cross check linkages and the root cause of the missing Contact ID to decide one of the following actions: Keep, Merge, or Delete
|
Check Source and Comment | Check Source AQB__Source__c and Comment AQB__EmploymentComment__c for unique and useful information such as a $ amount | Flag to keep the record or merge the field. Sample: |
2. Check ownership | Check the record owner CreatedById (e.g. head of TAP, development, donation, gift…) to avoid losing important information created by key stakeholders.
| Flag to keep the record and link the existing or add new Contact to link |
Check Contact ID (Has linked Contact) | If has Contact ID, do the following |
|
3. Check StartDate | Which Employment record iteration has the oldest Start date AQB__StartDate__c
| Keep the record with the oldest start date |
4. Check Comment | If holds data AQB__EmploymentComment__c | Concatenate the comments (to be kept in the oldest record)
|
5 Check Earnings | If holds data | Go to the oldest record, and:
|
6. Check Primary | If the to-be-deleted record/s is set to YES and Ended field AQB__Ended__c IS NOT “No longer at this position”
| Go to the oldest StartDate record, and: Update AQB__IsPrimaryEmployer__c to TRUE
|
7. Check Occupation | If the field AQB__Occupation__c holds data | Go to the oldest record, and:
|
8. Check Industry | If holds data in USYD_Global_Industry__c | Go to the oldest record, and:
|
9. Check Level | If holds data AQB__Level__c | Go to the oldest record, and:
|
10. Check End Date | If holds data AQB__Ended__c | Go to the oldest record, and:
|
11. Check Business phone, Fax Number, Corporate Phone | If holds data AQB__BusinessPhone__c AQB__Fax_Number__c AQB__Corporate_Phone__c | Go to the oldest record, and:
|
12. Check Assistant Name, Assistant Phone, Assistant Email | If holds data AQB__Assistant_s_Name__c AQB__Assistant_s_Phone__c AQB__Assistant_s_Email__c
| Go to the oldest record, and: |
Dedicated view for duplicates review is available in Snowflake: https://app.snowflake.com/zp65878/sydneyuni/#/data/databases/SP_ADVANCEMENT_DB/schemas/TDA_SCH/view/EMPLOYMENT_DQ_VW | ||
R002 xxx |
|
|
|
|
|
Contacts
TAP BI DRS Team
DRS Manager: Yeng Sembrano (yeng.sembrano@sydney.edu.au)
Senior Data Quality Analyst: Ping Xiao (ping.xiao@sydney.edu.au)
Data Steward Assistant: Vanessa Pinheiro (vanessa.pinheirodesouza@sydney.edu.au)