Finding unique records
Setting the scene: We have a set of records - in this case, company names - and we want to weed out the duplicates and find a unique set of names (by implication, we’re going to end up with the “first” occurrence of any company name and discard second and subsequent occurrences). Our data looks like this:
Here’s our script:
let companiesT = base.getTable('Companies');
let companiesQ = await companiesT.selectRecordsAsync();
const records = companiesQ.records
const seen = new Set();
const filteredArr = records.filter(el => {
const duplicate = seen.has(el.getCellValue('Name'));
seen.add(el.getCellValue('Name'));
return !duplicate;
});
output.table(filteredArr)
Let’s walk though this line by line to explain what is going on. The first 3 lines set up our table, a query and we assign the query records to the variable records
:
let companiesT = base.getTable('Companies');
let companiesQ = await companiesT.selectRecordsAsync();
const records = companiesQ.records
The next line creates a new Set
:
const seen = new Set();
A Set is an object that stores unique values and a value in a Set
can only occur once.
Next, we’re going to filter the records returned by the query from our table. Here’s the code:
// set up the resulting filtered array using records.filter
const filteredArr = records.filter(el => {
// use the `has` method on the Set to check if the
// current record's Name value is in the array
// `seen.has` returns true or false, so the variable duplicate will be true or false
const duplicate = seen.has(el.getCellValue('Name'));
// Add the current Name value to the set
// If the value is already in the Set
// then it won't be created a second time
seen.add(el.getCellValue('Name'));
// return the opposite duplicate
return !duplicate;
});
The last part of this filter is where the magic happens! If duplicate
is true
(so we’ve seen the Name value before), then return the opposite of duplicate
, so in this case false. Returning false excludes the current record from the final filtered array. Conversely, is duplicate
is false
(so we haven’t seen this Name value before), then return true
. Returning true includes the current record in the filtered array.
Finally, we output the filtered array to view using output.table
.
From here we can use the filtered array to mark these records as the originals or mark anything not in this array as a duplicate.