abstract |
An apparatus includes a data access circuit that interprets data records, each having a number of data fields; a record parsing circuit that determines a number of n-grams from terms of each of the data records and maps the n-grams to a corresponding number of mathematical vectors; and a record association circuit that determines whether a similarity value between a first mathematical vector for a first data record and a second mathematical vector for a second data record is greater than a threshold similarity value, and associates the first and second data records in response to the similarity value exceeding the threshold similarity value. An example apparatus includes a reporting circuit that provides a catalog entity identifier, associates each of a first term and a second term with the catalog entity identifier, and provides a summary of activity for an entity. |
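The parse, vectorize, and compare flow described above can be illustrated in software. The following is a minimal sketch, not the claimed implementation. It makes several assumptions the abstract does not state: character trigrams as the n-grams, a sparse count vector per record, cosine similarity as the similarity value, and a threshold of 0.5. The function names (`char_ngrams`, `record_vector`, `cosine_similarity`, `associate`) are hypothetical.

```python
from collections import Counter
from math import sqrt


def char_ngrams(term, n=3):
    """Return the character n-grams of one term (assumed: lowercase, space-padded)."""
    padded = f" {term.lower()} "
    return [padded[i:i + n] for i in range(len(padded) - n + 1)]


def record_vector(record, n=3):
    """Map a record (dict of field name -> value) to a sparse n-gram count vector."""
    counts = Counter()
    for value in record.values():
        for term in str(value).split():
            counts.update(char_ngrams(term, n))
    return counts


def cosine_similarity(a, b):
    """Cosine similarity of two sparse vectors; 0.0 if either vector is empty."""
    dot = sum(a[g] * b[g] for g in a.keys() & b.keys())
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def associate(first, second, threshold=0.5):
    """Associate two records when their similarity exceeds the (assumed) threshold."""
    similarity = cosine_similarity(record_vector(first), record_vector(second))
    return similarity > threshold


# Illustrative records; the field names and values are made up for this sketch.
r1 = {"vendor": "Acme Corp", "city": "Boston"}
r2 = {"vendor": "ACME Corporation", "city": "Boston"}
r3 = {"vendor": "Zyx Ltd", "city": "Quito"}
```

Under these assumptions, `associate(r1, r2)` returns `True` because the two records share most of their trigrams, while `associate(r1, r3)` returns `False` because they share none. A reporting step could then assign one catalog entity identifier to every record in an associated group.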