-
Notifications
You must be signed in to change notification settings - Fork 874
Fix #217 double NaN regression #2 #306
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
After commit 854862a Npgsql wasn't adding quotes around float4 and float8 values. But according to documentation, NaN values need to be quoted. Separated handling of float4 and float8 conversion routines.
…heads up. Add tests for Float array.
…nciscojunior/Npgsql into quote_FP_NaN_values
Add non-quoted and double quoted NaN ASCII string values. Move QuoteASCIIString() from NpgsqlNativeTypeInfo to BasicNativeToBackendTypeConverter. Add support in QuoteASCIIString() for predetermined quoted values. Use QuoteASCIIString() in the two native float to backend float text converters.
|
Haven't had the time to look at the work, but right off the bat I think we abandoned non-binary encoding since we dropped support for protocol 2, right? So should we still be worried about this? |
|
Since the code knows how to do non-binary, I think it should ALL know how to do non-binary. Part of the reason is that we could decide at any time to drop binary support for some particular data type. in which case the text encoder would need to be working, or we get the "pain" I mentioned. The encoder logic is too complicated to be trifling with. It's the kind of code you work really hard to get perfect, and then hope nobody ever has to debug it again. Why would we drop support for binary float encoding? Start supporting a platform (either us, or PG itself) that uses "some other" floating point binary format, and we pretty much have to go back to text only. |
|
I'm out of my depth here - I don't know this part of Npgsql and Postgresql very well. I was under the impression that for several Postgresql versions now, since the introduction of protocol 3, the best practice is always to encode in binary if possible, with text encoding being kept mostly for backwards compatibility. Can you confirm if this isn't right? If my understanding is right, then since we no longer need to support pre-9.0 Postgresql versions, isn't it better to gradually drop text encoding altogether, yielding tighter and less complicated codebase? And even if the Postgresql backend (hypothetically) introduces a new floating point binary format, does that mean we have to maintain support for text encoding now? Again, I really don't know much about this so I'd really appreciate being corrected. |
|
Hey @roji, On Sat, Aug 2, 2014 at 2:16 PM, Shay Rojansky notifications@github.com
Actually Tom Lane commented at some point in the past that binary encoding I took a run at a codec for decimal types, and I was never able to achieve So, don't get too cozy the binary encodings :) Oh, and did you forget that in version 3 simple query mode, there is no
Well, it obviously means we have to do something. We could go with text, -Glen Again, I really don't know much about this so I'd really appreciate being
|
|
If I recall correctly, binary encoding is only done when on extended query and if there is a binary converter for the type, right? And this is done only when NpgsqlCommand.Prepare() method is called. As we can set the format each column of the result set will be returned in: So, I think text encoding is needed as we have a lot of types which doesn't have binary encodings handlers.
I think I need more clarification on this part. Because it seems the I think that even with this change, the float coverters still need to know about the single quoted and double quoted values (as they need to pass them to the method). Also, I think the converters should be responsible to know when to quote and how. On a side note, we also have a little memory usage problem with the Sorry if I said a lot of nonsense things. And please, correct me if I'm wrong. |
|
On Sat, Aug 2, 2014 at 10:10 PM, Francisco Figueiredo Jr. <
Yes, exactly.
Arrays are reference types. It makes no difference how large the array
It's OK to understand the difference between no quotes, single quotes, and
I have thought about that some. There is an opportunity to tighten the Doing the quoting as we write to the stream would be a great way to
Not at all! I think we're on the same page here. But, what we have works, -Glen
|
Ah, ok. Thanks for keeping my idea :) Regarding the array case, I was thinking about the case where you would get a big src array and then you would need to supply this same big array with single quotes and/or with double quotes to the method call. You would need to have copies of the array: one without quotes as the src and another one or two as single and/or double quoted.
Agreed! In the past it was way more simple indeed. Now, there are even some exceptions like NaN which complicate things. As a comment regarding the double quote of the NaN, I think it is not needed. From the tests I did, only the single quoted value would be needed. Maybe we could drop this parameter from the method or at least from the method call? This would simplify the code. Sorry, Glen, for being annoying with the idea of simplification (or less complexity) of the code.
Agreed! And I thought about that too! :)
Great! I'm glad to know we are on the same page. And I agree with you that we don't need to rush about that quote handling. On the other side, I think that optimizing the byte[]'s we use throughout Npgsql is something we need to have a look. #308 is open so we can discuss about it there. |
|
On Sun, Aug 3, 2014 at 1:09 PM, Francisco Figueiredo Jr. <
Actually, I just realized that the simple query path should double quote In the process of verifying the bug, I discovered that for arrays, PG -Glen
|
|
And with that last commit, we're full circle. This PR no longer has value. Closing, and will merge @franciscojunior's PR. |
This is a continuation of PR #257,.@franciscojunior's fix for bug #217.
I want to limit the areas that are effected by the rather substantial complexity of properly quoting string values for all the combinations of PG backend, query mechanism, and whether or not the value is an array element. So, I moved QuoteASCIIString() over right next to the massive raw string formatting/quoting function, spiffed it up a little, and marked it internal. Then I changed the float converters to use it, so that they don't have to know anything about how to quote the value.
Note that we don't strictly need the double-quoted NaN value because floating point uses the binary encoding for extended queries. However, if a test is ever put in to test NaN array elements with binary encoding suppressed, they would fail, causing some pain.
Thoughts?
-Glen