Newbie - How To ...
I know that this is truly a newbie question, but here goes...
We have created several files that insert a large amount of data into Oracle DB 12c tables. During the load of one table we are getting many error messages about duplicate rows. I need a method that will let me identify these duplicate rows and delete them, in case there are other errors. Is there a way to do this with a bash script? Sandy |
I don’t know how you perform the insertion, but if you can list all rows, a pipeline consisting of the sort and uniq commands is the easiest solution. For example, to find the duplicates:
Code:
sort FILE-WITH-ROWS | uniq -d > FILE-WITH-DUPS
And to produce a copy of the file with the duplicates removed:
Code:
sort -u FILE-WITH-ROWS > FILE-WITHOUT-DUPS |
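To make that concrete, here is a minimal end-to-end sketch of those two commands; the file names (rows.txt, dup-rows.txt, clean-rows.txt) and the sample data are invented for illustration:

```shell
#!/bin/sh
# Build a small sample file with one duplicated row (contents are made up).
cat > rows.txt <<'EOF'
101|Smith
102|Jones
101|Smith
103|Brown
EOF

# Show only the duplicated rows (uniq needs sorted input to spot repeats).
sort rows.txt | uniq -d > dup-rows.txt

# Write a deduplicated copy, safe to load into the database.
sort -u rows.txt > clean-rows.txt

cat dup-rows.txt          # the rows that appeared more than once
wc -l < clean-rows.txt    # the cleaned file is one line shorter
```

Note that `uniq` only compares adjacent lines, which is why both pipelines sort first.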
If you're getting errors about duplicate rows, the duplicates won't be inserted into the database. BUT "duplicate" to the database means duplicate keys, so the rejected rows may carry new information that is not being updated but should be.
|
Quote:
Things get more complicated if you're inserting multiple values and only one value turns out to be duplicated in some rows. Example: you have two records, each consisting of a person's social security number and their name.
Code:
nnn-nn-nnnn James Smith |
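Continuing that example: two rows can share the SSN key yet differ in the name, so a byte-for-byte `uniq -d` will not catch them. A small awk sketch (file names and sample data are hypothetical) that deduplicates on the key field alone and lists the conflicting rows:

```shell
#!/bin/sh
# Hypothetical sample: the same SSN appears twice with different names.
cat > people.txt <<'EOF'
111-22-3333 James Smith
444-55-6666 Ann Lee
111-22-3333 Jim Smith
EOF

# Keep only the first row seen for each SSN (field 1)...
awk '!seen[$1]++' people.txt > people-dedup.txt

# ...and list every later row whose SSN was already used,
# so the differences can be reviewed by hand.
awk 'seen[$1]++' people.txt > people-conflicts.txt

cat people-conflicts.txt
```

Which of the conflicting rows should win is a judgment call the script cannot make; this sketch simply keeps the first and flags the rest.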
Outstanding!
I took the results of the two sort suggestions and they provided exactly what I needed. I'm a contractor, and even though the database would have loaded correctly, having those errors would not have looked good. I will keep this information and these suggestions on my cheat sheet for future problems. Thank you! Sandy |