@achawla83

Summary

This PR extends the ElasticsearchFilterTranslator to handle additional LINQ expression patterns — specifically:

string.Contains() for text wildcard matching

Enumerable.Any() for array or list membership checks

This improves query translation for Semantic Kernel filters mapped to Elasticsearch queries.

@cla-checker-service

cla-checker-service bot commented Nov 10, 2025

💚 CLA has been signed

@flobernd
Member

flobernd commented Nov 10, 2025

Hi @achawla83 , thank you for the PR!

The string.Contains() part looks mostly fine 🙂 Could you please use MatchPhrase instead of Wildcard for slightly improved performance and also add a test here?

For the Any() part: Same as above - could you please add some tests? I'm not sure TermsQuery behaves the way you want it to behave in this case. If I remember correctly, this will also return documents for which none of the terms are matching, unless you specify MinimumShouldMatch = 1 (which I remember is deprecated). I think we have to do some research about the best practice here.

Edit: Should be fine without MinimumShouldMatch 🙂
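
For reference, the membership semantics that the Any() translation is aiming for can be sketched in plain C# without touching Elasticsearch. A terms query matches a document when the field holds at least one of the supplied terms, which corresponds to the LINQ below. The Doc record, the Tags field and the MatchAny helper are hypothetical names used only for illustration; this is a minimal sketch, not the connector's code.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical record for illustration only (not the connector's test record).
public sealed record Doc(string Id, string[] Tags);

public static class AnySemantics
{
    // Returns the ids of documents whose Tags contain at least one of the given terms.
    // This mirrors the "at least one term matches" behaviour expected from a terms query.
    public static List<string> MatchAny(IEnumerable<Doc> docs, IReadOnlyCollection<string> terms)
    {
        var lookup = new HashSet<string>(terms, StringComparer.Ordinal);
        return docs
            .Where(d => d.Tags.Any(t => lookup.Contains(t)))
            .Select(d => d.Id)
            .ToList();
    }
}
```

A document with an empty Tags array, or with tags that share nothing with the supplied terms, is not returned by this sketch.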

Used the MatchPhraseQuery instead of the WildcardQuery.
@achawla83
Author

Hi @flobernd,

Thank you for the feedback! I’ve updated the code to use MatchPhraseQuery instead of WildcardQuery, as suggested. However, I’m not very familiar with writing test cases since my primary focus is on development. Could you please guide me or point me to an example so I can create the appropriate test?

Thanks for your understanding!

@flobernd
Member

Hi @achawla83 , thanks for updating the PR.

The best place to add a test would be this file which contains filter tests specific to the Elasticsearch connector. You can find some examples in the generic test suite that is provided by Microsoft and used as a common ground for all connector implementations. The related Fixture class contains the definition of the test records and the seed functionality. There already is at least a string and a string[] field so modifying the test data record (or adding a custom one) should not be required.

…and added the test cases

MatchQuery was producing zero results for substring searches, causing the StringContains_with_substring test to fail.
Implemented WildcardQuery (*{substring}*) to align behavior with .Contains() semantics.
@achawla83
Author

Hi @flobernd,

I made an update related to the failing test StringContains_with_substring.
This test was expecting 2 results but was returning 0. The issue was caused by the use of MatchQuery, which doesn’t support true substring matching.

Since the .Contains("fo") logic requires a substring match, I replaced the MatchQuery with a WildcardQuery that uses the pattern *{substring}*. This aligns the Elasticsearch behavior with what the test expects.

Updated code:

return new WildcardQuery(property.StorageName)
{
    Wildcard = $"*{substring.ToLower(System.Globalization.CultureInfo.InvariantCulture)}*"
};

After this change, the test passes as expected.

I also added the necessary test cases to ensure this scenario is covered.

Let me know if you want to adjust the behavior or discuss alternative approaches.

@flobernd
Member

Hi @achawla83 , thank you for adding the tests. I'll have a look soon 🙂 Regarding MatchPhraseQuery vs. Wildcard: You are correct that match-phrase cannot be used to look for a substring within a single word. Wildcard query, on the other hand, is unfortunately quite slow (if the searched field is not mapped as wildcard), but there is not much we can do in this context.
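
The difference between the two lookups can be illustrated with a small in-memory sketch. It assumes a heavily simplified analyzer that only lowercases and splits on spaces (the real standard analyzer does more), and all names in it are hypothetical. It is meant to show why "fo" finds nothing as a whole token but does match as *fo*, not to reproduce how Elasticsearch evaluates either query.

```csharp
using System;
using System.Linq;

public static class SubstringMatchSketch
{
    // Simplified stand-in for an analyzer: lowercase, then split on spaces.
    public static string[] Analyze(string text) =>
        text.ToLowerInvariant().Split(' ', StringSplitOptions.RemoveEmptyEntries);

    // Match-style lookup: the query term must equal a whole indexed token.
    public static bool MatchesWholeToken(string fieldValue, string query)
    {
        var term = query.ToLowerInvariant();
        return Analyze(fieldValue).Any(token => token == term);
    }

    // Wildcard-style lookup (*query*): the query may appear anywhere inside a token.
    public static bool MatchesWildcard(string fieldValue, string query)
    {
        var term = query.ToLowerInvariant();
        return Analyze(fieldValue).Any(token => token.Contains(term, StringComparison.Ordinal));
    }
}
```

The wildcard-style check has to look inside every token rather than look one up directly, which gives a rough intuition for why the wildcard query is the slower option on a regular text field.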

@flobernd flobernd self-assigned this Nov 18, 2025