The Wayback Machine - https://web.archive.org/web/20160513175445/https://www.mediawiki.org/wiki/Help_talk:CirrusSearch

Help talk:CirrusSearch

Cpiral (talkcontribs)

Within a few days, I'll be adding information to the help page, and restructuring the sections.

Concerning other CirrusSearch documentation: besides Phabricator, Wikipedia, and the sites already linked from the help page, are there any other resources describing the behavior of the search box and its parameters that I should know about?

Cpiral (talkcontribs)

The wikitext is given in pithy chunks to help ease translations. Translators: please copy this to a subpage and mark that up. Please do not translate this page directly, as it still needs further editing.

Reply to "Help documentation"

Semantic and structured search: Wikidata + Wiki(pedia)

197.218.90.141 (talkcontribs)

One of the reasons that the search engine tends to give bad results is that, much like Google, it simply searches based on the content of the articles or PageRank.

Something that I haven't seen here, and that would be an obvious win, would be to do some data mining of data from Wikipedia to discover obvious connections and how related the content is.

One simple use case is searching for authors related to Shakespeare, or authors born in Shakespeare's era. Using the Wikidata query service it is possible to obtain this information and use it to enhance search, or even to present other useful searches that the user may want to do later based on subject similarity.

Actually, even without using Wikidata, it is possible to extract information from infoboxes to determine all authors in the encyclopedia born in a certain year. So to an extent this is possible with some tweaks. One approach taken by Wikia (yes, evil Wikia :) was to develop portable infoboxes, which make these queries easier by using, for example, categories. These little tools store their structured data in the page props, making it possible to do all sorts of nifty queries; see Portable infoboxes.

197.218.81.127 (talkcontribs)

Oops wrong link, right one: Portable infoboxes.  Preceding unsigned comment added by 197.218.81.127 (talkcontribs)


Regex language

What language's regex implementation applies for insource? I thought it would be PHP, but I see that e.g. \p{Greek} is not supported. --Base (talk) 20:51, 28 April 2016 (UTC)

Reply to "Semantic and structured search: Wikidata + Wiki(pedia)"

Bug: Search result points to the wrong section of page

Elvey (talkcontribs)

https://commons.wikimedia.org/w/index.php?search=You+seem+to+be+unaware+that%2C+IIRC%2C+there%27s+an+informed%2C+consensus+view&title=Special%3ASearch&go=Go

gives the search result

User talk:Ellin Beltz/Archive 3 (section Why you delete my files that I translate from English to Khmer)

The page is correct but the section is wrong. Is this a known bug?

CKoerner (WMF) (talkcontribs)

Elvey, I'm seeing something different than you. I get 7 results when I click that link, all of which are image files.

If I change the search to include the User_talk: namespace then I see the link you are describing: https://commons.wikimedia.org/wiki/User_talk:Ellin_Beltz/Archive_3#Why_you_delete_my_files_that_I_translate_from_English_to_Khmer

Which again works for me. :p

ArchiverBot has been active on that user's talk page since you posted, which may have affected these links.

Elvey (talkcontribs)

@CKoerner (WMF), it seems you're seeing the same bug but not realizing it. Let me detail:

My searches include User talk by default.

The issue is that

https://commons.wikimedia.org/wiki/User_talk:Ellin_Beltz/Archive_3#Why_you_delete_my_files_that_I_translate_from_English_to_Khmer is NOT the link you should be seeing. You should be seeing :

https://commons.wikimedia.org/wiki/User_talk:Ellin_Beltz/Archive_3#Understanding_when_PD-.2AGov.2A_applies_-_e.g._to_unpaid_work.

This is because the phrase I searched for appears in that section of the page. Section 128 vs 139, 11 sections away.

Elvey (talkcontribs)

I just noticed another search bug, by the way. See https://commons.wikimedia.org/wiki/MediaWiki_talk:Gadget-advanced-search.js#Bug .

Elvey (talkcontribs)

@CKoerner (WMF) or anyone: Hello?

Ckoerner (talkcontribs)

Ah, now I see what's going on. That is not what I would expect either.

I spoke with one of the engineers. The (section: Section heading) suggestion in the first result is not a literal match. To quote, "it will highlight a section if one of the search terms is present in the section name." So the words "you", "that" and "to" are the first close-enough (as the search is not exact) results in a section heading, so it suggests that section.
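The behavior quoted above can be illustrated with a toy sketch. This is not CirrusSearch's actual code; `suggest_section` is a hypothetical helper that simply returns the first heading sharing any word with the query, and the second heading is decoded from the anchor in the link given earlier in this thread.

```python
import re

def suggest_section(query, headings):
    """Return the first heading that shares any word with the query, else None."""
    terms = set(re.findall(r"\w+", query.lower()))
    for heading in headings:
        # A single shared word ("you", "that", "to") is enough to match.
        if terms & set(re.findall(r"\w+", heading.lower())):
            return heading
    return None

# Headings listed in page order: the earlier section wins.
headings = [
    "Why you delete my files that I translate from English to Khmer",
    "Understanding when PD-*Gov* applies - e.g. to unpaid work.",
]
query = "You seem to be unaware that, IIRC, there's an informed, consensus view"
print(suggest_section(query, headings))
```

Under this simplified model the earlier heading is suggested because it shares common words with the query, even though the phrase itself sits in a later section.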

Putting your search in quotes changes the results to be literal, and removes any section suggestion.

This appears to be a more complex task to solve. I created a phabricator task (T131950) to track progress on this.

Also, sorry for the delay. I have no excuse but being clumsy with email notifications. :(

CKoerner (WMF) (talkcontribs)

Argh, I was logged in with my volunteer account when I replied. Sorry for the confusion.

Elvey (talkcontribs)

You're awesome! I've oft wondered why the section search was acting so oddly. (I now see the bolding that was a hint as to what was going on that I missed.)

Reply to "Bug: Search result points to the wrong section of page"

How to search for a typo without search defaulting to correct spelling when there are no hits for the typo?

Jason Quinn (talkcontribs)

The template en:Template:Adopt-a-typo at the English Wikipedia searches for a particular typo that a user specifies. But it would be good to return zero results when no pages have the typo. Instead, the search defaults over to the correct spelling most of the time, which can be very confusing to the user of this template. Is there a way to prevent the search from doing this? Jason Quinn (talk) 13:17, 26 March 2016 (UTC)

CKoerner (WMF) (talkcontribs)

It looks like the template uses the Search link template to create a query like this:

https://en.wikipedia.org/w/index.php?title=Special:Search&search=abbriviated&ns0=1&fulltext=Search

If you append "&runsuggestion=0" to the end of that, it will skip the "Did you mean" spelling and just show direct matches.
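A minimal sketch of building such a URL programmatically, using only the parameters shown in the link above plus runsuggestion=0. The function name `typo_search_url` is made up for illustration.

```python
from urllib.parse import urlencode

def typo_search_url(typo, base="https://en.wikipedia.org/w/index.php"):
    """Build a full-text search URL that skips the "Did you mean" rewrite."""
    params = {
        "title": "Special:Search",
        "search": typo,
        "ns0": 1,              # article namespace only, as in the link above
        "fulltext": "Search",
        "runsuggestion": 0,    # do not fall back to the suggested spelling
    }
    return base + "?" + urlencode(params)

url = typo_search_url("abbriviated")
print(url)
```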

Reply to "How to search for a typo without search defaulting to correct spelling when there are no hits for the typo?"
Colonies Chris (talkcontribs)

I searched Wikipedia for "Key Personnel" in the Template space. There were 48 results, but not included among them was [[Template:Clearwater Threshers]], which has contained this exact string since it was created in 2009.

Cpiral (talkcontribs)

Since "key personnel" does show up on the page, but not in search results, it is either a bug, a common misunderstanding about how the collapse state relates to being indexed or not, or a code-base compromise of sorts.

Template:Clearwater Threshers uses | state = {{{state|autocollapse}}}. There are about 230 other navbox templates with "key personnel" that set the collapse state like that, and none of them show up in the 45 or so searchable ones.

See what the 45 searchable ones use for their collapse state that do show up in search results. Most use {{{state<includeonly>|autocollapse</includeonly>}}} and a few use {{{state|collapsed}}}. It's not what I'd guess, since collapsed does not show the terms "key personnel" on the page, yet it is collapsed that seems to work for search.

Just like the HTML semantics determine the difference between a word search and an insource word search, so template semantics should determine the difference between the collapse state (invisible words) and not. It's a good question for investigation.

Elvey (talkcontribs)

I'm guessing this is an undocumented feature rather than a bug.

If I'm right, it should be documented, and IMO probably shouldn't be the default behavior of state/autocollapse.

Reply to "Search text not found"
Equinox (talkcontribs)

On Wiktionary, ''intitle:dasher'' finds ''dasher'' but not ''haberdashery'' (since it's part of a longer word there). The regex ''intitle:/.*dasher.*/'' also fails to find ''haberdashery''. How can I find partial words within a title?

FriedhelmW (talkcontribs)

Use the grep utility https://tools.wmflabs.org/grep/

Cpiral (talkcontribs)

It's possible only for the endings of words. Intitle does stemming, fuzzy matching, and wildcards on the ends of the partial word you give. Nothing affects a search towards the beginning parts except regex, but intitle doesn't have regex. Only insource has regex, and it doesn't search titles.
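If a list of page titles is already available offline (an assumption; for example, one taken from a title dump), a substring match can be done locally. A minimal sketch, where `titles_containing` and the sample titles are purely illustrative:

```python
import re

def titles_containing(fragment, titles):
    """Return the titles that contain `fragment` anywhere, ignoring case."""
    pattern = re.compile(re.escape(fragment), re.IGNORECASE)
    return [t for t in titles if pattern.search(t)]

titles = ["dasher", "haberdashery", "dash", "Dasher"]
print(titles_containing("dasher", titles))
# ['dasher', 'haberdashery', 'Dasher']
```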

Reply to "How to use intitle with partial words"
Cpiral (talkcontribs)

Currently there is yet another translation issue on this Help page. Would somebody please fix it?

I propose we remove translations from this page for a while.

Cpiral (talkcontribs)

I've removed translations tags and markers, and even the <languages /> tags. Not sure about all those. (See task T113907 for why.)

The intent is extensive editing. It needs it.

My criteria for removing translations were these priorities: 1) Editing ability 2) Searching ability 3) Translations.

Kaganer (talkcontribs)

Is this ready for translation now, after 4 months? Please explain the current status of this page.

Cpiral (talkcontribs)

This page reached a milestone recently -- "fully informed" -- and so I'm glad you asked. This page is also too rough. Very soon I will smooth out the accessibility, readability, usability problems, and then call for translations.

Cpiral (talkcontribs)

Nemo, by reverting all my edits and telling me to add them while translations tags are on the page, you seem to have overlooked MediaWiki's first general principle of translations "Avoid changes".

Cpiral (talkcontribs)

Just for the record, here is what has happened.

  1. Translate administration translated the page, in my opinion, prematurely.
  2. No translations admin responded to direct appeals and clearly stated intentions on the talk page.
  3. Consequently, a significantly updated version was contributed over five months, which one translation administrator, @Nemo bis, has only just now got around to reverting for the sake of translations alone (not because of its content), and which another translation administrator, @Shirayuki, has (understandably) declined to re-mark up for translations.

Assuming we are all just trying to do our job, I would just like to point out two principles at stake here: 1) the spirit of collaboration 2) providing for and enabling "customers" with documentation.

Reply to "Translations issues on this help page"
Ftclausen (talkcontribs)

Is there a way to recursively search within a particular category i.e. go into the sub-categories? Or at least specify how many levels deep a search should be?

Cpiral (talkcontribs)

See task T37402.

To get the search parameter "deepcat:", for Wikimedia sites go to your user preferences.

In your global.js (or local, or skin) add

mw.loader.load( "//de.wikipedia.org/w/index.php?title=User:Christoph Fischer (WMDE)/Gadgets/DeepCat.js&action=raw&ctype=text/javascript" );

In the same global.js (mw.loader.load is a JavaScript call, so it will not run from global.css) add

mw.loader.load( "//de.wikipedia.org/w/index.php?title=User:Christoph Fischer (WMDE)/Gadgets/DeepCat.css&action=raw&ctype=text/css" , "text/css" );
Ftclausen (talkcontribs)

Thanks for your response. I'm querying Wikimedia Commons from an app under development, and thus via the MediaWiki API; this sounds like it only applies to front-end searches. Is it possible to get this functionality at the API level, i.e. under "/w/api.php"?

Cpiral (talkcontribs)

I think it's just a gadget at this point. See Api for what might be available, for example API:Categoryinfo.

Ftclausen (talkcontribs)

Thanks for your prompt reply. I'm currently already using "categorymembers" on an individual subcategory.

Since the category I want to search within only has about 200 subcategories (and those subcategories contain the actual content) I might just construct an "incategory:" filter and explicitly list all those categories as part of that filter, e.g. "incategory:cat1|cat2|cat3 pizza" would, as best I understand, search for "pizza" in cat1,2 and 3.

I don't know if the request URL will then be too long (and hopefully not generate undue server load) but worth a try.

Cpiral (talkcontribs)

The help page should say OR doesn't work for parameters, just for words and phrases. (The syntax you're trying is used for the morelike search parameter, where it means parameter 1 parameter 2, etc., not OR.) For this reason, using the incategory parameter, the search for pizza in cat1 or cat2 takes two searches.

But pizza AND cat1 OR pizza AND cat2 should work loosely (except where cat1 does not match a page by matching the category in the categories box at the bottom of every page). However, when I tried to verify this, I found two instances where the search "one two" AND "three four" OR "five six" AND "seven eight" picked pages that did not match those criteria.

Ftclausen (talkcontribs)

Ah, just saw your reply here, that's good info. I'll construct my search thusly :

pizza AND cat1 OR pizza AND cat2 .... pizza AND catN

I'll do some tests and ask follow-up questions if I'm still unclear.

Ftclausen (talkcontribs)

I did some testing and it does seem to behave as I initially expected in that "incategory:cat1|cat2|cat3 pizza" *does* "OR" together those categories and returns all matches of "pizza" in those particular categories. The help page also says

  • incategory:Felis_silvestris_catus|Dogs

* Find articles that are either in category:Felis_silvestris_catus or in Category:Dogs

which implies this behaviour. I could be misunderstanding and it just "looks" correct but so far it is giving the desired result.
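A minimal sketch of building such a pipe-separated incategory: filter for the API, split into batches in case one URL with about 200 categories turns out to be too long. The `batch_size` value and the function name `incategory_queries` are assumptions for illustration, not documented limits.

```python
from urllib.parse import urlencode

API = "https://commons.wikimedia.org/w/api.php"

def incategory_queries(term, categories, batch_size=20):
    """Yield search API URLs, each covering one batch of categories."""
    for start in range(0, len(categories), batch_size):
        batch = categories[start:start + batch_size]
        # Spaces become underscores, as in the help-page example.
        cat_filter = "|".join(c.replace(" ", "_") for c in batch)
        params = {
            "action": "query",
            "list": "search",
            "srsearch": f"incategory:{cat_filter} {term}",
            "format": "json",
        }
        yield API + "?" + urlencode(params)

urls = list(incategory_queries("pizza", ["cat1", "cat2", "cat3"], batch_size=2))
for u in urls:
    print(u)
```

Results from the separate batches would then need to be merged on the client side.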

Reply to "Recursive "incategory:""
Ftclausen (talkcontribs)

I'm trying to start with a simple search via the API in a category that seems to have lots of content. For this example I'm using "Mathematics". When I try a search with the following curl command

curl -s 'https://commons.wikimedia.org/w/api.php?action=query&list=search&srsearch=incategory:Mathematics&format=json'

then it returns (formatted using jsonlint) -

{
  "batchcomplete": "",
  "query": {
    "searchinfo": {
      "totalhits": 0
    },
    "search": []
  }
}

Do I need to specify what type of items within the category I'd like returned?

Ftclausen (talkcontribs)

I also asked on #mediawiki on IRC and "Leah" helpfully pointed out that only namespace 0 is searched by default. So an effective query URL would be

https://commons.wikimedia.org/w/api.php?action=query&list=search&srsearch=incategory:Mathematics&srnamespace=0|1|2|3|4|5|6|7|8|9|10|11|12|13|14|15|100|101|102|103|104|105|106|107|460|461|490|491|828|829|1198|1199|2300|2301|2302|2303|2600
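A minimal sketch of assembling the same kind of query with an explicit namespace list, so the list can be kept short. The helper name `search_url` is made up, and the choice of namespaces 0, 6 (File) and 14 (Category) is only an example.

```python
from urllib.parse import urlencode

API = "https://commons.wikimedia.org/w/api.php"

def search_url(query, namespaces):
    """Build a list=search URL restricted to the given namespace numbers."""
    params = {
        "action": "query",
        "list": "search",
        "srsearch": query,
        # srnamespace takes namespace numbers separated by "|"
        "srnamespace": "|".join(str(ns) for ns in namespaces),
        "format": "json",
    }
    return API + "?" + urlencode(params)

url = search_url("incategory:Mathematics", [0, 6, 14])
print(url)
```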
Reply to "incategory: not returning any results"
