Improved cursor comparison - only showing different lines #527
Conversation
@@ -0,0 +1,4 @@
create global temporary table ut_cursor_data_diff(
Comments in table scripts are allowed, but no empty lines.
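As a hedged illustration of that style point (the table name and columns below are placeholders, not the actual ut_cursor_data_diff definition): inline comments inside the statement are fine, empty lines are not.

create global temporary table demo_diff_gtt (
  diff_id  raw(32),   -- identifier of the comparison run
  row_no   integer    -- position of the differing row
) on commit delete rows;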
Except for a small refactor suggestion, everything is OK.
where nvl(dbms_lob.compare(xmlserialize( content exp.row_data no indent), xmlserialize( content act.row_data no indent)),1) != 0
and rownum <= 1;
where nvl(dbms_lob.compare(xmlserialize( content exp.row_data no indent), xmlserialize( content act.row_data no indent)),1) != 0;
select count(1) into l_result from ut_cursor_data_comp where rownum <= 1;
You can use SQL%ROWCOUNT instead of selecting the row count in a separate query afterwards.
Can you add tests, so we are covered for regression?
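A minimal sketch of the SQL%ROWCOUNT suggestion, assuming a hypothetical demo table (the insert below stands in for the real statement that populates the comparison table):

create global temporary table ut_compare_demo ( row_no integer ) on commit delete rows;  -- hypothetical demo table

declare
  l_result integer;
begin
  insert into ut_compare_demo (row_no)
  select level from dual connect by level <= 3;   -- stands in for the real diff query
  -- sql%rowcount reports how many rows the last DML statement affected,
  -- so no extra "select count(1) ..." is needed afterwards
  l_result := sql%rowcount;
  dbms_output.put_line('rows inserted: ' || l_result);
end;
/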
any others?
My concern is that there is no limit on the data returned.
Good point, Jacek. We still need a limit...
from (select case when l_xpath is not null then deletexml( ucd.row_data, l_xpath ) else ucd.row_data end as row_data,
      ucd.row_no
      from ut_cursor_data ucd where ucd.cursor_data_guid = self.data_value) exp
full outer join (select case when l_xpath is not null then deletexml( ucd.row_data, l_xpath ) else ucd.row_data end as row_data,
      ucd.row_no
      from ut_cursor_data ucd where ucd.cursor_data_guid = l_other.data_value) act
on (exp.row_no = act.row_no)
where nvl(dbms_lob.compare(xmlserialize( content exp.row_data no indent), xmlserialize( content act.row_data no indent)),1) != 0
and rownum <= 1;
where nvl(dbms_lob.compare(xmlserialize( content exp.row_data no indent), xmlserialize( content act.row_data no indent)),1) != 0;
I would add a limit here (in the insert statement), so that we only process the first 50 - 100 differences.
Why 50 - 100?
10 seems not enough, but anything more than 50 - 100 doesn't bring much value anyway.
If you have so many differences, you're in serious trouble anyway.
If the first 100 rows don't help you identify what the problem is, your test is probably not distilled properly and you're testing everything in one go.
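A minimal sketch of what that cap might look like (dummy data via DUAL; the threshold of 100 is illustrative, not agreed): the subquery stands in for the full-outer-join difference query, and rownum stops collecting rows once the limit is hit.

select d.row_no
  from (select level as row_no from dual connect by level <= 1000) d  -- stands in for the diff query
 where rownum <= 100;   -- keep only the first 100 differences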
Second thought on this idea.
Compare all rows.
Remember the count of:
- rows in A
- rows in B
- differences
Report the first x differences (with x being 50 .. 100).
Report also the number of rows in A, the number of rows in B and the total number of differences.
That will give much more information about failed expectations.
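A hedged sketch of that reporting idea (sample data built in a WITH clause; the table and column names are placeholders): one pass over the full outer join yields the row counts of both sets and the total number of differences, while only the first x difference rows would be kept for display.

with exp_rows as (select level as row_no, 'v'||level as val from dual connect by level <= 5),
     act_rows as (select level as row_no, 'v'||(level+3) as val from dual connect by level <= 4)
select count(e.row_no) as rows_in_a,
       count(a.row_no) as rows_in_b,
       count(case when e.row_no is null or a.row_no is null
                    or e.val != a.val then 1 end) as total_differences
  from exp_rows e
  full outer join act_rows a on (e.row_no = a.row_no);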
Had something similar in mind myself - it would definitely help to output the exact number of mismatches, because we get them anyway...
I said a long time ago that I'd look at an alternative implementation. The best I've done is to put together a simple SQL implementation of a compare. The below does two full table scans to identify differences. It's an AskTom special - I forget the name of the person on the thread who worked through the problem to come up with this.
select c1, c2, c3,
My idea was to use this and dynamically produce the query, with testing to see if special handling is needed for all data types. Laptop times for comparing two 1M-row tables were 0.8 seconds, with low resource usage. Leaving this here, as the above forms the basis of a POC. We said that we wouldn't change the implementation unless there were clear benefits or new requirements. Cheers.
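The full query is not included above; as a hedged, self-contained sketch of the classic AskTom-style GROUP BY compare it appears to describe (tab_a / tab_b and columns c1..c3 are placeholders, sample rows come from DUAL): a row is reported whenever it occurs a different number of times in the two sets.

with tab_a as (select 1 c1, 'x' c2, 10 c3 from dual union all
               select 2 c1, 'y' c2, 20 c3 from dual),
     tab_b as (select 1 c1, 'x' c2, 10 c3 from dual union all
               select 2 c1, 'y' c2, 21 c3 from dual)
select c1, c2, c3,
       count(src1) as cnt_in_a,   -- occurrences of this row in tab_a
       count(src2) as cnt_in_b    -- occurrences of this row in tab_b
  from (select a.c1, a.c2, a.c3, 1 as src1, to_number(null) as src2 from tab_a a
        union all
        select b.c1, b.c2, b.c3, to_number(null) as src1, 2 as src2 from tab_b b)
 group by c1, c2, c3
having count(src1) != count(src2);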
@mathewbutler Agreed. That is, if the columns are not CLOB, BLOB, XMLType etc., use a plain SQL compare; otherwise convert to XML and compare CLOBs. For now, we definitely want to improve the reporting side, so that we show the rows that differ instead of the first 'x' rows from the data set.
Still having to download and install ojdbc.jar
...because of owner issues
Seems the changes broke the ability to exclude columns. Might take me some time to get it sorted out.
# Conflicts:
#   development/refresh_sources.sh
Fixed failing tests for 11g (record types in SQL) as well as old_test. Refactored the comparison a little bit.
…success. Added test for empty vs null cursor.
Added some additional tests.
At least partly fixes #454