How DataSense Works

  • DATA SIMILARITY MATCHING
  • APPLICATION TO ANY TYPE OF STRING DATA
  • DATASENSE USES THE JACCARD INDEX TO COMPUTE SIMILARITY
  • EASY TO INTEGRATE INFRASTRUCTURE
  • DATASENSE USES HTTP REQUEST/RESPONSE
  • BATCH PROCESSING CAPABILITY

FINDING SIMILARITY

  • Text search is neither accurate nor scalable.
  • Most vendor solutions are not real time.
  • Batch processing solutions are neither scalable nor easily integrated.
  • How do we get rid of false differences?

  Field     Data Sample A        Data Sample B
  Name      Smith Corporation    Smith Corporate Enterprise
  Address   582 Ahmad Pike       56 Adah Walk
  City      Estellburgh          Estellburgh
  State     AL                   AL
  Zip       48474                48474
  Phone     776-731-0514         669-365-7169

DataSense: Jaccard Index

DataSense offers a real-time search solution that matches similar records using set similarity, measured by the Jaccard index.

Data Sample A

  • Smith Corporation
  • 582 Ahmad Pike
  • Estellburgh
  • AL
  • 48474
  • 776-731-0514

78% Similarity

Data Sample B

  • Smith Corporate Enterprise
  • 56 Adah Walk
  • Estellburgh
  • AL
  • 48474
  • 669-365-7169
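
To make the idea concrete, here is a minimal Python sketch of the Jaccard index (the size of the intersection of two sets divided by the size of their union) applied to the two samples above. The word-level tokenization and the `tokens` and `jaccard` helper names are assumptions made for illustration; how DataSense actually splits strings into sets is not described here, so this sketch should not be expected to reproduce the 78% figure.

```python
import re

def tokens(record_fields):
    """Lowercase the fields and split them into a set of alphanumeric tokens.

    Word-level tokens are an assumption; DataSense may tokenize differently.
    """
    text = " ".join(record_fields).lower()
    return set(re.findall(r"[a-z0-9]+", text))

def jaccard(a, b):
    """Jaccard index: size of intersection / size of union (0.0 if both are empty)."""
    union = a | b
    if not union:
        return 0.0
    return len(a & b) / len(union)

sample_a = ["Smith Corporation", "582 Ahmad Pike", "Estellburgh",
            "AL", "48474", "776-731-0514"]
sample_b = ["Smith Corporate Enterprise", "56 Adah Walk", "Estellburgh",
            "AL", "48474", "669-365-7169"]

score = jaccard(tokens(sample_a), tokens(sample_b))
print(f"word-token Jaccard index: {score:.2f}")
```

The score depends heavily on how the strings are broken into sets. Finer-grained tokens, such as short character sequences, generally tolerate misspellings better than whole words.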

DATASENSE INFRASTRUCTURE IN A NUTSHELL

  • Client 1
  • Client 2
  • Client 3

The DataSense API serves HTTP search requests in JSON format and returns any matching records in a JSON response.

  • Node 1
  • Node 2

DataSense API Nodes

DataSense API nodes are separated so that each serves a different index of the same data. For example, Node A can index name, address, and city, while Node B can index name, address, city, state, and zip.
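
The following Python sketch illustrates how two nodes might index the same record differently. The `NODE_INDEX_FIELDS` mapping and the `index_string` helper are hypothetical names, and joining fields into one string is an assumption suggested by the single-string `indexColumn` in the example response, not a documented DataSense behavior.

```python
# Hypothetical node-to-fields mapping, mirroring the example in the text.
NODE_INDEX_FIELDS = {
    "Node A": ["name", "address", "city"],
    "Node B": ["name", "address", "city", "state", "zip"],
}

def index_string(record, fields):
    """Concatenate the chosen fields of a record into one searchable string."""
    return " ".join(str(record[field]) for field in fields)

record = {
    "name": "Smith Corporation",
    "address": "582 Ahmad Pike",
    "city": "Estellburgh",
    "state": "AL",
    "zip": "48474",
}

# Each node sees the same record through its own set of fields.
for node, fields in NODE_INDEX_FIELDS.items():
    print(node, "->", index_string(record, fields))
```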

DataSense Index Engine

The DataSense index engine builds indexes over the data to be searched and keeps them ready for real-time search.
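
As a rough illustration of indexing ahead of time, the Python sketch below precomputes a token set per record and then scores a query against those sets. The function names, the word-level tokenization, and the linear scan are all assumptions made to keep the example short; they are not a description of the DataSense engine, which would need a more scalable lookup structure to handle millions of records.

```python
import re

def tokenize(text):
    """Split text into a set of lowercase alphanumeric tokens (an assumed scheme)."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def build_index(records):
    """Precompute the token set for every record once, ahead of any search."""
    return {record_id: tokenize(text) for record_id, text in records.items()}

def search(index, query, min_score):
    """Return (record_id, score) pairs scoring at least min_score, best first."""
    query_tokens = tokenize(query)
    hits = []
    for record_id, record_tokens in index.items():
        union = query_tokens | record_tokens
        score = len(query_tokens & record_tokens) / len(union) if union else 0.0
        if score >= min_score:
            hits.append((record_id, score))
    return sorted(hits, key=lambda hit: hit[1], reverse=True)

records = {
    1: "Smith Corporation 582 Ahmad Pike Estellburgh AL 48474",
    2: "Smith Corporate Enterprise 56 Adah Walk Estellburgh AL 48474",
}
index = build_index(records)  # done once, before any query arrives
results = search(index, "Smith Corp 582 Ahmad Pike Estellburgh AL", min_score=0.4)
print(results)
```

Because the token sets are built in advance, each query only pays for its own tokenization and the set comparisons.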

Enterprise Data

Enterprise data is exported to the DataSense database. This can be done in a variety of ways, including:

  • SSIS
  • Linked Server Queries
  • Batch Processing
  • CSV File export/import
  • And more
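
For the CSV route, a minimal Python sketch of reading exported rows into records might look like this. The column names and the `load_records` helper are hypothetical, and the step that writes the records into the DataSense database is omitted because its interface is not described here.

```python
import csv
import io

def load_records(csv_file):
    """Read rows from a CSV file object into a list of dicts keyed by column name."""
    return [dict(row) for row in csv.DictReader(csv_file)]

# In practice this would be an exported file opened with open(path, newline="").
# An in-memory file with assumed column names keeps the example self-contained.
sample_csv = io.StringIO(
    "name,address,city,state,zip\n"
    "Smith Corporation,582 Ahmad Pike,Estellburgh,AL,48474\n"
    "Smith Corporate Enterprise,56 Adah Walk,Estellburgh,AL,48474\n"
)
records = load_records(sample_csv)
print(len(records), "records loaded")
```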

DataSense Database

Key Features

  • DataSense can index millions of records.
  • DataSense indexes can contain any number of data points and can run concurrently in your environment.
  • Integration is easy because DataSense is a web service.
  • Request and response data formats are JSON.
  • DataSense can run scheduled jobs to index your data periodically so that searches stay accurate.

DATASENSE EXAMPLE JSON REQUEST/RESPONSE

POST v1/search/dataset

    {
        "MatchString": "ILenna Paporocki 639 Main Street Anchorage County Anchorage AK 99501",
        "Likelihood": "40",
        "DataIndexSetID": "4"
    }

The response is returned in real time:

    {
        "jaccardIndex": 0.71111232,
        "matchingRecord": {
            "recordID": 71423232,
            "dataSetID": 2,
            "refID": "9",
            "indexColumn": "Ileina Paporuki 629 main St. Anchorage County anochrage AK 99510"
        }
    }
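
A client could issue this request with nothing more than the Python standard library, as in the sketch below. The host in `BASE_URL` is a placeholder, the helper names are hypothetical, and the `Content-Type` header is an assumption; only the path, the HTTP method, and the three request fields come from the example above.

```python
import json
import urllib.request

BASE_URL = "https://datasense.example.com/"  # placeholder host, not a real endpoint

def build_search_request(match_string, likelihood, data_index_set_id):
    """Build the POST request for v1/search/dataset with a JSON body."""
    body = json.dumps({
        "MatchString": match_string,
        # The example sends these two values as strings, so this sketch does too.
        "Likelihood": str(likelihood),
        "DataIndexSetID": str(data_index_set_id),
    }).encode("utf-8")
    return urllib.request.Request(
        BASE_URL + "v1/search/dataset",
        data=body,
        headers={"Content-Type": "application/json"},  # assumed header
        method="POST",
    )

def send_search(request, timeout=10):
    """Send the request and decode the JSON response (requires a reachable server)."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))

request = build_search_request(
    "ILenna Paporocki 639 Main Street Anchorage County Anchorage AK 99501", 40, 4
)
print(request.get_method(), request.full_url)
```

Calling `send_search(request)` against a real deployment would return the parsed response, from which `jaccardIndex` and `matchingRecord` can be read as ordinary dictionary keys.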

DATASENSE KEYWORD EXCLUSIONS

You can specify keywords to ignore when building indexes and processing batches. This yields cleaner data and reduces false positives.
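
The effect of an exclusion list can be sketched in a few lines of Python. The keyword list, the made-up company names, and the word-level tokenization are all assumptions for illustration; the actual exclusion mechanism in DataSense may differ.

```python
import re

# Hypothetical exclusion list; real keywords would be chosen per data set.
EXCLUDED_KEYWORDS = {"corporation", "corporate", "enterprise", "inc", "llc"}

def tokenize(text, excluded=frozenset()):
    """Split text into lowercase alphanumeric tokens, dropping excluded keywords."""
    return set(re.findall(r"[a-z0-9]+", text.lower())) - set(excluded)

def jaccard(a, b):
    """Jaccard index: size of intersection / size of union (0.0 if both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

# Two unrelated, made-up companies that share only a generic word.
first = "Smith Corporation"
second = "Jones Corporation"

before = jaccard(tokenize(first), tokenize(second))
after = jaccard(tokenize(first, EXCLUDED_KEYWORDS),
                tokenize(second, EXCLUDED_KEYWORDS))
print(f"without exclusions: {before:.2f}, with exclusions: {after:.2f}")
```

Without exclusions the shared generic word gives the two names a nonzero score. With the word excluded, nothing is left in common, so the pair no longer shows up as a candidate match.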

BATCH PROCESSING SOLUTION
