Global Patent Index - EP 3707612 B1

EP 3707612 B1 20220511 - DUPLICATIVE DATA DETECTION

Title (en)

DUPLICATIVE DATA DETECTION

Title (de)

ERKENNUNG VON DOPPELTEN DATEN

Title (fr)

DÉTECTION DE DONNÉES DUPLIQUÉES

Publication

EP 3707612 B1 20220511 (EN)

Application

EP 18782819 A 20180828

Priority

  • US 201715805047 A 20171106
  • US 2018048376 W 20180828

Abstract (en)

[origin: US2019138628A1] In some implementations, a computer-implemented method includes analyzing first data from a first data source to determine a first schema of the first data source, and analyzing second data from a second data source to determine a second schema of the second data source. The method can further include generating a first two-dimensional aggregation of a first time data series having a time dimension and a dimension corresponding to aggregated values of a first metric, and generating a second two-dimensional aggregation of a second time data series having a time dimension and a dimension corresponding to aggregated values of a second metric. The method can also include computing a correlation value between the first two-dimensional aggregation and the second two-dimensional aggregation, and providing an indication of duplicated data between the first data source and the second data source if the correlation value meets a threshold.
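The flow described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the claimed method: the function names (`infer_schema`, `aggregate_by_time`, `pearson`, `looks_duplicated`), the field names, the sample rows, the use of a sum as the aggregate, the choice of Pearson correlation, and the 0.95 threshold are all assumptions made for the example. The abstract specifies only "a correlation value" and "a threshold".

```python
from collections import defaultdict
from math import sqrt


def infer_schema(rows):
    """Map each field name to the Python type name of its first seen value.
    A stand-in for the 'determine a schema' step (assumed, simplified)."""
    schema = {}
    for row in rows:
        for field, value in row.items():
            schema.setdefault(field, type(value).__name__)
    return schema


def aggregate_by_time(rows, time_key, metric_key):
    """Sum a metric per time bucket, giving a (time, aggregated value) series."""
    totals = defaultdict(float)
    for row in rows:
        totals[row[time_key]] += row[metric_key]
    return dict(totals)


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences; 0.0 if undefined."""
    n = len(xs)
    if n < 2:
        return 0.0
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def looks_duplicated(agg_a, agg_b, threshold=0.95):
    """Correlate two aggregations over their shared time buckets and
    report whether the correlation meets the (assumed) threshold."""
    shared = sorted(set(agg_a) & set(agg_b))
    r = pearson([agg_a[t] for t in shared], [agg_b[t] for t in shared])
    return r, r >= threshold


# Hypothetical sample data: two sources with different field names.
source_a = [
    {"day": "2018-01-01", "page_views": 10},
    {"day": "2018-01-01", "page_views": 5},
    {"day": "2018-01-02", "page_views": 20},
    {"day": "2018-01-03", "page_views": 12},
]
source_b = [
    {"date": "2018-01-01", "hits": 15},
    {"date": "2018-01-02", "hits": 20},
    {"date": "2018-01-03", "hits": 12},
]

schema_a = infer_schema(source_a)
schema_b = infer_schema(source_b)
agg_a = aggregate_by_time(source_a, "day", "page_views")
agg_b = aggregate_by_time(source_b, "date", "hits")
correlation, duplicated = looks_duplicated(agg_a, agg_b)
print(schema_a, schema_b, duplicated)
```

In this sketch the two sources use different field names but produce the same daily totals, so the aggregations correlate strongly and the pair is flagged. A real system would also need a way to choose which time and metric fields to compare from the inferred schemas, which is left out here.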

IPC 8 full level

G06F 16/00 (2019.01)

CPC (source: EP US)

G06F 16/215 (2018.12 - EP US); G06F 16/2365 (2018.12 - US); G06F 16/2379 (2018.12 - US); G06F 16/254 (2018.12 - EP US); G06F 40/295 (2020.01 - US)

Designated contracting state (EPC)

AL AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HR HU IE IS IT LI LT LU LV MC MK MT NL NO PL PT RO RS SE SI SK SM TR

DOCDB simple family (publication)

US 10789240 B2 20200929; US 2019138628 A1 20190509; EP 3707612 A1 20200916; EP 3707612 B1 20220511; WO 2019089116 A1 20190509

DOCDB simple family (application)

US 201715805047 A 20171106; EP 18782819 A 20180828; US 2018048376 W 20180828