TAP – The Advancement Services
Created By: BI DRS
Created Date: 19 March 2024
Employment upload process
Identify Duplicates
Remove duplicate rows from Employment_file.
Verify unique information for each Contact ID. If there are discrepancies, the file should be reviewed by the requester.
Check for Newer Data
Check if the primary employment of that Contact was created before or after the Employment_file Date collection.
Download all the data from the Employment object in an Excel sheet (where
AQB__SetAsPrimary__c = TRUE
) with Contact ID, Start Date, and Created Date.Create a VLOOKUP column in Employment_file to retrieve the Created Date from Employment object sheet by running a VLOOKUP on Contact IDs.
Create an “Action” column with the following criteria:
If the Created Date is greater than or equal to the Employment_file Date: No Action (remove from your sheet as newer data is recorded).
If the Created Date is less than the Employment_file Date: Data Matching.
If the result is #N/A, it indicates no employment exists in the system for that Contact ID: Insert Data.
* Correct date format if needed (dd/mm/yyyy)
Data Cleansing
For free text fields, correct typos and inconsistent capitalization.
Field Mapping
Map values from the file to the correct picklist values contained in the specified field.
For the Industry field, please map to
USYD_Global_Industry__c
.Do not forget:
AQB__SetAsPrimary__c = TRUE
AQB__Status__c = Current or Retired
(in accordance the person’s circumstances).AQB__StartDate__c = If information is not given by the requester, the field should be populated with the day the data was received followed by
AQB__Started__c = On or Before`.AQB__Source__c = File source + year
.AQB__EmployerName__c
: If it is blank, replace with “Not Provided”AQB__Title__c
: If it is blank, replace with “Not Provided”.
Action Folder Instructions
Insert Employment folder:
Create a folder named “Insert Employment.”
Filter the Employment_file by “Insert Data” from the Action Column.
Save the filtered file as Employment_file_insert_data.
Data Matching folder:
Create a folder named “Data Matching.”
Filter the Employment_file by “Data Matching” from the Action Column.
Save the filtered file as Employment_file_data_matching.
Data Matching Instructions
Upload Data: Request an Analyst or Manager to upload the Employment_file_data_matching to Snowflake as a table.
Query for Duplicates: Use the Employment_file_data_matching worksheet in Snowflake to query the table. Ensure you modify the script to include the correct fields and table name based on the uploaded file.
Export Results: Download the query results to an Excel sheet for further analysis.
Data Matching Input File (Manual Check)
Filter by Review Flag:
100% Match: Use the employment ID to update Jarvis employment data if necessary (e.g., new information from Employment_file_data_matching where the field in Jarvis is blank).
No further action is needed if the data is already complete.> 80% Match or Between 60-80% Match: Conduct a sanity check.
If it’s a match, follow the process outlined in step 1.
If it’s not a match, add it to Employment_file_insert_data.
Between 60-80% Match or Isolate for Review: Likely not a match. Manually review to confirm the results and follow the appropriate process as mentioned in step 2.
Employment Name Matching for Employment_file_insert_data
* Employer Name Matching Pathway: Data Visualisation and Reporting\Power BI\Current Power BI Dashboards\Power BI Data Matching\TAP Data Matching\Ad-Hoc Requests\Employment Name Matching
Check Employer Account in Jarvis:
Before inserting new employment data in Jarvis, verify if the employer has an account in Jarvis to determine the Record Type (Linked Employer or Non-Linked Employer).
Input Employer Name:
Use the Employment Name Template to input Employer Name from Employment_file_insert_data and Row Number . It’s recommended to use the
=UNIQUE
function to ensure only unique Employer Names are included.
Perform Data Matching:
Utilize the Employment Data Matching dashboard to perform data matching. Change the source in the transform data step and refresh the table as per the usual process.
Retrieve Data Matching Input File:
Obtain the Data Matching input file, which should include the Row Number, Employer Name, Jarvis Account, Jarvis Account ID, Result Comparison, and Record Type ID.
Review Matching Results:
If the name is not an exact match, the Matching Result column will display “Review”. Filter by this result to determine the appropriate Record Type and Record Type ID.
Getting Information to “Insert Data” and Upload
Linked Employer:
Use the VLOOKUP function with Employer Name as the key.
Retrieve Account ID, and add Linked Employer Record Type ID.
e.g. Data Loader Mapping:
Non-Linked Employer:
Only include Record Type ID and use Employer Name from Employment_file_insert_data.
e.g. Data Loader Mapping:
Insert (Data Loader):
Upload the files separately.
De-duplication Rules
Records deduplication involves Deletion and Consolidation activities. In principle, any deletion requires checking all data fields to avoid orphan records or information lost. Below is a table of the rules for the deduplication activities.
Scenario | Explanation | Action |
R001 Same (Record Type, Contact ID, Employer, Job Title, Job Status)keep the oldest StartDate record. | ||
Count of Employment Records with Duplicates (27/03.2024) | ||
Segment | Count of Unique Employment | Count of Duplicates |
Record Type: Linked Employer | 20,505 | 411 |
Record Type: Non-Linked Employer | 140,924 | 3,368 |
Check Contact ID (No linked Contact) | If no Contact ID, ensure no lost of important data or generating orphan record. Requires a reviewer’s review and validation before deletion
| Reviewer to cross check linkages and the root cause of the missing Contact ID to decide one of the following actions: Keep, Merge, or Delete
|
Check Source and Comment | Check Source AQB__Source__c and Comment AQB__EmploymentComment__c for unique and useful information such as a $ amount | Flag to keep the record or merge the field. Sample: |
2. Check ownership | Check the record owner CreatedById (e.g. head of TAP, development, donation, gift…) to avoid losing important information created by key stakeholders.
| Flag to keep the record and link the existing or add new Contact to link |
Check Contact ID (Has linked Contact) | If has Contact ID, do the following |
|
3. Check StartDate | Which Employment record iteration has the oldest Start date AQB__StartDate__c
| Keep the record with the oldest start date |
4. Check Comment | If holds data AQB__EmploymentComment__c | Concatenate the comments (to be kept in the oldest record)
|
5 Check Earnings | If holds data | Go to the oldest record, and:
|
6. Check Primary | If the to-be-deleted record/s is set to YES and Ended field AQB__Ended__c IS NOT “No longer at this position”
| Go to the oldest StartDate record, and: Update AQB__IsPrimaryEmployer__c to TRUE
|
7. Check Occupation | If the field AQB__Occupation__c holds data | Go to the oldest record, and:
|
8. Check Industry | If holds data in USYD_Global_Industry__c | Go to the oldest record, and:
|
9. Check Level | If holds data AQB__Level__c | Go to the oldest record, and:
|
10. Check End Date | If holds data AQB__Ended__c | Go to the oldest record, and:
|
11. Check Business phone, Fax Number, Corporate Phone | If holds data AQB__BusinessPhone__c AQB__Fax_Number__c AQB__Corporate_Phone__c | Go to the oldest record, and:
|
12. Check Assistant Name, Assistant Phone, Assistant Email | If holds data AQB__Assistant_s_Name__c AQB__Assistant_s_Phone__c AQB__Assistant_s_Email__c
| Go to the oldest record, and: |
Dedicated view to review Duplicates is available in Snowflake: https://app.snowflake.com/zp65878/sydneyuni/#/data/databases/SP_ADVANCEMENT_DB/schemas/TDA_SCH/view/EMPLOYMENT_DQ_VW | ||
R002 xxx |
|
|
|
|
|
Contacts
TAP BI DRS Team
DRS Manager: Yeng Sembrano (yeng.sembrano@sydney.edu.au)
Senior Data Quality Analyst: Ping Xiao (ping.xiao@sydney.edu.au)
Data Steward Assistant: Vanessa Pinheiro (vanessa.pinheirodesouza@sydney.edu.au)