integration - How to keep track of which rows have been imported in SQL?
Let's say I want to import customers (or rows from some other specific table) into an external system. Not all at once, but each one after it has been created in the database. I have to keep a record of which rows have been reported, because I want to find the ones that have not been reported yet. Is it better to add a column or to create some kind of batch log table?
I'm using MS SQL Server, if that is relevant.
A simplified example:
SELECT * FROM customer WHERE reportedtoexternalsystem IS NULL
or
SELECT * FROM customer WHERE cus_id NOT IN (SELECT cus_id FROM integrationbatchlog)
Or are there maybe other ways that might be better? This is my first time doing this, so I don't know the best practice yet.
The simple solution is to add a column that marks a row as imported: a status int (0/1) or, if you want to keep track of when it was imported, an imported date. That solution has limitations:
You can only import a row once. What if you need to import the customer again when the record is updated? Who is going to clear the status field when the customer is updated?
It causes the row to be locked when you update the row's status. Are you sure the application that inserts the customer record will be happy with your code locking its records?
On some systems it causes the entire row to be written to the log for system recovery. Depending on the size of the row, that can be a lot of log writing for one field.
In a highly parallel import system you can have a lot of contention for resources. If one import program locking the table is bad, think how bad it is when many import programs are locking the table at the same time.
If the customer data is updated several times between import polling intervals, you will only see the latest data and skip over the intermediate updates. This is only an issue if you care about the intermediate updates. For customers you might not care; for order statuses you might care a lot.
You have to modify the table structure. This might not be allowed by the source application due to data/support/political issues.
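For reference, here is a minimal T-SQL sketch of the status-column approach, with the limitations above still applying. The table and column names follow the question; the stand-in table, the cus_name column, the datetime2 type and the batch size of 100 are my assumptions, not anything from a real system. GO is the batch separator used by SSMS and sqlcmd.

```sql
-- Stand-in for the existing table; skip this batch if you already have one.
CREATE TABLE dbo.customer (
    cus_id   int           NOT NULL PRIMARY KEY,
    cus_name nvarchar(100) NOT NULL   -- hypothetical data column
);
GO

-- The structural change: NULL means "not reported yet".
ALTER TABLE dbo.customer ADD reportedtoexternalsystem datetime2 NULL;
GO

-- Mark up to 100 unreported rows and return them in the same statement,
-- so finding and flagging happen atomically.
-- READPAST skips rows that another session has locked instead of waiting.
UPDATE TOP (100) c
SET    c.reportedtoexternalsystem = SYSUTCDATETIME()
OUTPUT inserted.cus_id, inserted.cus_name
FROM   dbo.customer AS c WITH (ROWLOCK, READPAST)
WHERE  c.reportedtoexternalsystem IS NULL;
```

Note that OUTPUT without INTO is rejected if the target table has enabled triggers; in that case you would OUTPUT ... INTO a table variable and select from that.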
Besides putting a status column in the table, one technique that works is to put a trigger on the table and mirror the import data to a second table. You then 'consume' the data in the second table. This has several advantages:
It keeps the locking issues contained to the second table.
It allows you to process every update to the main table.
You can add an index to the second table that is used to keep track of update statuses, without the issues of changing the main table.
If you delete the rows from the second table (either as they are consumed or after a short audit period), the size of the table/index is kept to a minimum.
When I use this technique in SQL Server, I put the second table in a separate schema. Since most apps store their tables in dbo, you can end up with dbo.customers and import.customers. That way you can keep track of which tables you are importing, and it keeps you from having to come up with new names for the import tables.
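A rough T-SQL sketch of the trigger-plus-second-table idea could look like the following. Only dbo.customers and import.customers come from the description above; the column names (cus_id, cus_name, queue_id, queued_at), the trigger name and the batch size are hypothetical, and the stand-in dbo.customers table is only there so the script runs on an empty database.

```sql
-- Stand-in for the application's table; skip this batch if it already exists.
CREATE TABLE dbo.customers (
    cus_id   int           NOT NULL PRIMARY KEY,
    cus_name nvarchar(100) NOT NULL   -- hypothetical data column
);
GO

CREATE SCHEMA import;
GO

-- Second table: one row per insert or update on dbo.customers.
CREATE TABLE import.customers (
    queue_id  bigint IDENTITY(1,1) NOT NULL PRIMARY KEY,
    cus_id    int           NOT NULL,
    cus_name  nvarchar(100) NOT NULL,
    queued_at datetime2     NOT NULL DEFAULT SYSUTCDATETIME()
);
GO

-- Mirror every inserted or updated row into the import table.
CREATE TRIGGER dbo.trg_customers_mirror
ON dbo.customers
AFTER INSERT, UPDATE
AS
BEGIN
    SET NOCOUNT ON;
    -- "inserted" holds the new version of each affected row.
    INSERT INTO import.customers (cus_id, cus_name)
    SELECT i.cus_id, i.cus_name
    FROM   inserted AS i;
END;
GO

-- Consume: remove up to 100 queued rows and return them to the importer.
-- READPAST lets parallel importers skip rows another session has locked.
-- TOP without ORDER BY gives no ordering guarantee between batches.
DELETE TOP (100) q
OUTPUT deleted.queue_id, deleted.cus_id, deleted.cus_name, deleted.queued_at
FROM   import.customers AS q WITH (ROWLOCK, READPAST);
```

Because each update produces its own row in import.customers, intermediate changes are preserved. If you want an audit period instead of deleting on consume, you could add a consumed-at column to the import table and purge old rows on a schedule.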