Wednesday, September 25, 2013

GoldenGate (GG) replicat abended process troubleshooting.

GoldenGate (GG) replicat abended process troubleshooting.

1. Try to start the replicate
ggsci > start replicate TEST

2. check the replicate
ggsci > info all

It should be abended

3. check the process report
cd $GG_DIR/bin/dirrpt
less test.rpt

4. find the trail file and RBA from report

5. get sql statement using logdump
/*
configure below setting on logdump
fileheader detail
ghdr on
detail on
usertoken detail
reclen 128
*/

cd $GG_DIR/bin
logdump
logdump> open app/oracle/gg/trail/test/dirdat/pt0001
logdump> pos 100000
logdump> fileheader detail
logdump> ghdr on
logdump> detail on
logdump> usertoken detail
logdump> reclen 128
logdump> n

___________________________________________________________________
Hdr-Ind    :     E  (x45)     Partition  :     .  (x00)
UndoFlag   :     .  (x00)     BeforeAfter:     A  (x41)
RecLength  :  1577  (x0629)   IO Time    : 2013/09/24 23:15:09.334.111
IOType     :   160  (xa0)     OrigNode   :     0  (x00)
TransInd   :     .  (x03)     FormatType :     R  (x52)
SyskeyLen  :     0  (x00)     Incomplete :     .  (x00)
AuditRBA   :          0       AuditPos   : 0
Continued  :     N  (x00)     RecCount   :     1  (x01)

2013/09/24 23:15:09.334.111 DDLOP                Len  1000 RBA 100000
Name:
After  Image:                                             Partition 0   G  s
 2c43 353d 2733 3735 3139 272c 2c42 373d 2733 3735 | ,C5='37519',,B7='375
 3139 272c 2c42 323d 2727 2c2c 4233 3d27 4d4f 4e41 | 19',,B2='',,B3='TTT
 444d 494e 272c 2c42 343d 2744 4841 5f54 4d50 5f23 | TTT',,B4='TTT_TMP_#
 5447 5439 5f33 3138 3237 3133 3227 2c2c 4331 323d | TGT9_000001',,C12=
 2727 2c2c 4331 333d 2727 2c2c 4235 3d27 5441 424c | '',,C13='',,B5='TABL
 4527 2c2c 4236 3d27 4352 4541 5445 272c 2c42 383d | E',,B6='CREATE',,B8=
 2747 4755 5345 522e 4747 535f 4444 4c5f 4849 5354 | 'GGUSER.GGS_DDL_HIST
 272c 2c42 393d 274d 4f4e 4144 4d49 4e27 2c2c 4337 | ',,B9='TTTTTTT',,C7
 3d27 3130 2e32 2e30 2e34 2e30 272c 2c43 383d 2731 | ='10.2.0.4.0',,C8='1
 302e 322e 302e 332e 3027 2c2c 4339 3d27 272c 2c43 | 0.2.0.3.0',,C9='',,C
 3130 3d27 3127 2c2c 4331 313d 2764 7264 6976 6572 | 10='1',,C11='test
 7327 2c2c 4733 3d27 4e4f 4e55 4e49 5155 4527 2c2c | t',,G3='NONUNIQUE',,
 4331 343d 274e 4f27 2c2c 4331 353d 274e 4f27 2c2c | C14='NO',,C15='NO',,
 4331 393d 2731 3727 2c2c 4331 3728 2731 2729 3d27 | C19='17',,C17('1')='
 4e4c 535f 4c41 4e47 5541 4745 272c 2c43 3138 2827 | NLS_LANGUAGE',,C18('
 3127 293d 2741 4d45 5249 4341 4e27 2c2c 4331 3728 | 1')='AMERICAN',,C17(
 2732 2729 3d27 4e4c 535f 5445 5252 4954 4f52 5927 | '2')='NLS_TERRITORY'


/*
reset reclen to record length according to output
*/
logdump> reclen 1000
/*
position back to 100000
*/
logdump> pos 100000
logdump> n

/*
you will have output for whole record. The left Panel C1 variable is the SQL statement.
you can try it to find out the real error
*/

6. fixed the issue and restart the replicate
how to fix depends on errors and database settings and even data.

7. skip the transaction and restart the replicate
if you find out that the transaction can be skip.
find RBA position for next record using logdump

logdump> pos 100000
logdump> n
logdump> n

2013/09/24 23:15:09.334.111 DDLOP                Len  115 RBA 101000
Name:
After  Image:                                             Partition 0   G  s
 2c43 353d 2733 3735 3139 272c 2c42 373d 2733 3735 | ,C5='37519',,B7='375
 3139 272c 2c42 323d 2727 2c2c 4233 3d27 4d4f 4e41 | 19',,B2='',,B3='TTT
 444d 494e 272c 2c42 343d 2744 4841 5f54 4d50 5f23 | TTT',,B4='TTT_TMP_#
 5447 5439 5f33 3138 3237 3133 3227 2c2c 4331 323d | TGT9_000001',,C12=
 2727 2c2c 4331 333d 2727 2c2c 4235 3d27 5441 424c | '',,C13='',,B5='TABL
 4527 2c2c 4236 3d27 4352 4541 5445 272c 2c42 383d | E',,B6='CREATE',,B8=
 2747 4755 5345 522e 4747 535f 4444 4c5f 4849 5354 | 'GGUSER.GGS_DDL_HIST
 272c 2c42 393d 274d 4f4e 4144 4d49 4e27 2c2c 4337 | ',,B9='TTTTTTT',,C7
 3d27 3130 2e32 2e30 2e34 2e30 272c 2c43 383d 2731 | ='10.2.0.4.0',,C8='1
 302e 322e 302e 332e 3027 2c2c 4339 3d27 272c 2c43 | 0.2.0.3.0',,C9='',,C
 3130 3d27 3127 2c2c 4331 313d 2764 7264 6976 6572 | 10='1',,C11='test
 7327 2c2c 4733 3d27 4e4f 4e55 4e49 5155 4527 2c2c | t',,G3='NONUNIQUE',,
 4331 343d 274e 4f27 2c2c 4331 353d 274e 4f27 2c2c | C14='NO',,C15='NO',,
 4331 393d 2731 3727 2c2c 4331 3728 2731 2729 3d27 | C19='17',,C17('1')='
 4e4c 535f 4c41 4e47 5541 4745 272c 2c43 3138 2827 | NLS_LANGUAGE',,C18('
 3127 293d 2741 4d45 5249 4341 4e27 2c2c 4331 3728 | 1')='AMERICAN',,C17(
 2732 2729 3d27 4e4c 535f 5445 5252 4954 4f52 5927 | '2')='NLS_TERRITORY'


ggsci> alter replicat test, extrba 101000
ggsci>info all
Program     Status      Group       Lag           Time Since Chkpt

MANAGER     RUNNING
REPLICAT    STOPPED     TEST       16:34:27      00:45:14

ggsci> start replicat test
ggsci> info all
Program     Status      Group       Lag           Time Since Chkpt

MANAGER     RUNNING
REPLICAT    RUNNING     TEST       16:34:27      00:45:14


reference:
http://blog.flimatech.com/2012/03/24/how-to-find-the-transaction-that-abended-oracle-goldengate/
http://satya-dba.blogspot.com/2012/02/ggsci-goldengate-command-interpreter.html