Skip to content

Navigation Menu

Sign in
Appearance settings

Search code, repositories, users, issues, pull requests...

Provide feedback

We read every piece of feedback, and take your input very seriously.

Saved searches

Use saved searches to filter your results more quickly

Appearance settings

Querying table and selecting on a few of the columns included in a primary key is generating a DISTINCT result. #1484

Unanswered
mithrandyr asked this question in Q&A
Discussion options

All - could use some help trying to figure out why this is happening. I've written a PowerShell module that wraps various database providers to make it easier to interact with databases through PowerShell (the module is called SimplySql). Anyways, I did a major rewrite last year and in the process changed my MySql provider to MySqlConnector. One of my users has reported this issue.

Basically, if a table is created with a primary key with multiple columns, but a select statement with only primary key columns included (but not all of them), then a DISTINCT result is happening.

-- Here is the SQL to create the two tables
-- Primary key
CREATE TABLE test3 (colA VARCHAR(25), colB VARCHAR(25), colC VARCHAR(25), colD VARCHAR(25), PRIMARY KEY (colA, colB, colC));

INSERT INTO test3 (colA, colB, colC, colD) VALUES ('A', 'A', 'A', 'A');
INSERT INTO test3 (colA, colB, colC, colD) VALUES ('A', 'B', 'A', 'A');
INSERT INTO test3 (colA, colB, colC, colD) VALUES ('B', 'A', 'A', 'A');
INSERT INTO test3 (colA, colB, colC, colD) VALUES ('C', 'A', 'A', 'A');

-- No primary Key
CREATE TABLE test4 (colA VARCHAR(25), colB VARCHAR(25), colC VARCHAR(25), colD VARCHAR(25));

INSERT INTO test4 (colA, colB, colC, colD) VALUES ('A', 'A', 'A', 'A');
INSERT INTO test4 (colA, colB, colC, colD) VALUES ('A', 'B', 'A', 'A');
INSERT INTO test4 (colA, colB, colC, colD) VALUES ('B', 'A', 'A', 'A');
INSERT INTO test4 (colA, colB, colC, colD) VALUES ('C', 'A', 'A', 'A');

Ok, now here is the PowerShell and the interesting results

<#
Install-Module SimplySql #use this to install the module, make sure its version 2.0 or greater
Open-MySqlConnection -Server <servername> -Database <dbname> -Credential (Get-Credential
#>
# Notice how the following statement will only return 3 rows
Invoke-SqlQuery -Query "SELECT colA FROM test3"

# This one returns 4 rows... the difference? test4 does not have a primary key
Invoke-SqlQuery -Query "SELECT colA FROM test4"

image

You must be logged in to vote

Replies: 2 comments · 5 replies

Comment options

I can't reproduce this problem using standard C# to use the ADO.NET interfaces

var connectionString = "server=localhost;database=mysqltest;userid=root;password=pass";

using var mscConnection = new MySqlConnector.MySqlConnection(connectionString);
mscConnection.Open();
using var msdConnection = new MySql.Data.MySqlClient.MySqlConnection(connectionString);
msdConnection.Open();

using (var mscCommand = mscConnection.CreateCommand())
{
	mscCommand.CommandText = """
		DROP TABLE IF EXISTS test3;
		DROP TABLE IF EXISTS test4;
		-- Here is the SQL to create the two tables
		-- Primary key
		CREATE TABLE test3 (colA VARCHAR(25), colB VARCHAR(25), colC VARCHAR(25), colD VARCHAR(25), PRIMARY KEY (colA, colB, colC));

		INSERT INTO test3 (colA, colB, colC, colD) VALUES ('A', 'A', 'A', 'A');
		INSERT INTO test3 (colA, colB, colC, colD) VALUES ('A', 'B', 'A', 'A');
		INSERT INTO test3 (colA, colB, colC, colD) VALUES ('B', 'A', 'A', 'A');
		INSERT INTO test3 (colA, colB, colC, colD) VALUES ('C', 'A', 'A', 'A');

		-- No primary Key
		CREATE TABLE test4 (colA VARCHAR(25), colB VARCHAR(25), colC VARCHAR(25), colD VARCHAR(25));

		INSERT INTO test4 (colA, colB, colC, colD) VALUES ('A', 'A', 'A', 'A');
		INSERT INTO test4 (colA, colB, colC, colD) VALUES ('A', 'B', 'A', 'A');
		INSERT INTO test4 (colA, colB, colC, colD) VALUES ('B', 'A', 'A', 'A');
		INSERT INTO test4 (colA, colB, colC, colD) VALUES ('C', 'A', 'A', 'A');
		""";
	mscCommand.ExecuteNonQuery();
}

foreach (var connection in new DbConnection[] { mscConnection, msdConnection })
{
	foreach (var sql in new[] { "SELECT colA FROM test3", "SELECT colA FROM test4" })
	{
		using var command = connection.CreateCommand();
		command.CommandText = sql;
		Console.WriteLine("{0} --- {1}", command.CommandText, command.GetType().FullName);
		using var reader = command.ExecuteReader();
		{
			while (reader.Read())
			{
				Console.WriteLine(reader.GetValue(0));
			}
		}
		Console.WriteLine();
	}
}

Output:

SELECT colA FROM test3 --- MySqlConnector.MySqlCommand
A
A
B
C

SELECT colA FROM test4 --- MySqlConnector.MySqlCommand
A
A
B
C

SELECT colA FROM test3 --- MySql.Data.MySqlClient.MySqlCommand
A
A
B
C

SELECT colA FROM test4 --- MySql.Data.MySqlClient.MySqlCommand
A
A
B
C
You must be logged in to vote
1 reply
@bgrainger
Comment options

Are you sure the problem is in what MySqlConnector returns and not in the PowerShell that's consuming it?

Comment options

It looks like you might be using MySqlDataAdapter: https://github.com/mithrandyr/SimplySql/blob/master/source/SimplySql.Engine/MySQL/MySqlProvider.vb#L33-L45

I rewrote the code to test that instead:

		using var da = command is MySqlConnector.MySqlCommand ?
			(DbDataAdapter) new MySqlConnector.MySqlDataAdapter((MySqlConnector.MySqlCommand) command) :
			new MySql.Data.MySqlClient.MySqlDataAdapter((MySql.Data.MySqlClient.MySqlCommand) command);
		Console.WriteLine("{0} --- {1}", command.CommandText, da.GetType().FullName);
		var ds = new DataSet();
		da.ReturnProviderSpecificTypes = true;
		da.Fill(ds);
		foreach (DataRow row in ds.Tables[0].Rows)
			Console.WriteLine(row[0]);
		Console.WriteLine();

Same output:

SELECT colA FROM test3 --- MySqlConnector.MySqlDataAdapter
A
A
B
C

SELECT colA FROM test4 --- MySqlConnector.MySqlDataAdapter
A
A
B
C

SELECT colA FROM test3 --- MySql.Data.MySqlClient.MySqlDataAdapter
A
A
B
C

SELECT colA FROM test4 --- MySql.Data.MySqlClient.MySqlDataAdapter
A
A
B
C
You must be logged in to vote
4 replies
@mithrandyr
Comment options

@bgrainger -- thanks for taking a look. I know my code is a bit confusing (there are multiple branches, supporting a base implementation that is provider agnostic that can be overridden if a provider needs special handling. Anyways, here is the code that is actually being executed.

 Using cmd As IDbCommand = GetCommand(query, cmdTimeout, params)
     Try
         Dim ds As New DataSet
         Using dr As IDataReader = cmd.ExecuteReader
             Do
                 Dim dt As New DataTable
                 dt.Load(dr)
                 If dt.Rows.Count > 0 Then ds.Tables.Add(dt)
             Loop While Not dr.IsClosed AndAlso dr.NextResult()
         End Using
         Return ds
     Catch ex As Exception
         ex.AddQueryDetails(query, params)
         Throw
     End Try
 End Using

When I debug and step through it, I noticed that after dt.Load(dr) -- the DataTable only holds 3 rows.

So now I wrote a sample console application to try and reproduce and i was able to. The key is, DataReader by itself seems to be find, but DataReader with DataTable.Load() seems to give an issue.

 Using conn As New MySqlConnection(connSB.ToString)
     conn.Open()
     Console.WriteLine("==[select cola from test3]==")
     Using cmd As New MySqlCommand("select cola from test3", conn)
         Using dr = cmd.ExecuteReader
             While dr.Read
                 Console.WriteLine(dr.GetString("cola"))
             End While
         End Using

         Dim dt As New Data.DataTable
         Using dr = cmd.ExecuteReader
             dt.Load(dr)
             Console.WriteLine($"DT Rows: {dt.Rows.Count}")
         End Using
     End Using
     Console.WriteLine()

     Console.WriteLine("==[select cola from test4]==")
     Using cmd As New MySqlCommand("select cola from test4", conn)
         Using dr = cmd.ExecuteReader
             While dr.Read
                 Console.WriteLine(dr.GetString("cola"))
             End While
         End Using

         Dim dt As New Data.DataTable
         Using dr = cmd.ExecuteReader
             dt.Load(dr)
             Console.WriteLine($"DT Rows: {dt.Rows.Count}")
         End Using
     End Using
 End Using

Results
image

@mithrandyr
Comment options

as an aside -- I've having the same results with SQLite databases.. but MSSQL, Oracle and Postgres are working fine.

@bgrainger
Comment options

One difference I can think of (between test3 and test4) is that MySqlDataReader.GetColumnSchema() returns true for DbColumn.IsKey. Perhaps this causes DataTable to de-duplicate rows when it loads?

I'm not sure if this is a bug or not. The column is certainly part of a key, but not every column in the PK is returned in the DbDataReader.

@mithrandyr
Comment options

@bgrainger -- I think it might be a bug. When DataTable.Load() runs, it looks like for any column that has DbColumn.IsKey set to true then it sets the Unique property of DataColumn to true as well -- thus causing the deduplication.

I did a quick check with MSSQL, when query a table with a primary key, it does not set Unique to true for any DataColumn in the DataTable.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Category
🙏
Q&A
Labels
None yet
2 participants
Morty Proxy This is a proxified and sanitized view of the page, visit original site.