Ignore accents/diacritics etc. in search fields [Topic is solved]

back2themoon
Moon Magic practitioner
Posts: 2411
Joined: 2012-08-19, 20:32

Ignore accents/diacritics etc. in search fields

Unread post by back2themoon » 2020-06-16, 15:51

Let's say we've bookmarked Bärenreiter. If we later search our bookmarks with 'baren', we are not going to find it ('reiter' will work though, which is nice).

So, I think it'd be helpful for users dealing with these languages to have an option to ignore all such characters and make 'baren' work, too. It would also help in History search, the Address bar (perhaps this one is trickier?), and anywhere else I might be forgetting.
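
For illustration only, the behaviour requested here can be sketched in a few lines of Python. This is not Pale Moon code and says nothing about how its bookmark search is implemented; the helper names `strip_diacritics` and `matches` are made up for the example. The sketch assumes "ignoring accents" means decomposing each character (Unicode NFD) and dropping the combining marks before comparing.

```python
import unicodedata


def strip_diacritics(text: str) -> str:
    """Drop combining marks after canonical decomposition (NFD)."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def matches(query: str, title: str) -> bool:
    """Case- and diacritic-insensitive substring match (hypothetical helper)."""
    folded_query = strip_diacritics(query).casefold()
    folded_title = strip_diacritics(title).casefold()
    return folded_query in folded_title


title = "Bärenreiter"
print("baren" in title.casefold())  # False: plain substring search misses it
print(matches("baren", title))      # True: marks are ignored on both sides
print(matches("reiter", title))     # True: unaffected by the change
```

Folding both the query and the stored title means that typing the accented form still matches as well.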

Moonchild
Pale Moon guru
Posts: 35647
Joined: 2011-08-28, 17:27
Location: Motala, SE

Re: Ignore accents/diacritics etc. in search fields

Unread post by Moonchild » 2020-06-16, 16:22

From a language point of view this is absolutely wrong. Words with accents aren't the same words as words without accents. If you're going to make that fuzzy then you'd also have to start fuzzing other misspellings, include phonetically similar words and segments, etc.
Additionally, if someone bookmarks something in their language they would logically also search for it in that same language with the correct spelling.
"Sometimes, the best way to get what you want is to be a good person." -- Louis Rossmann
"Seek wisdom, not knowledge. Knowledge is of the past; wisdom is of the future." -- Native American proverb
"Linux makes everything difficult." -- Lyceus Anubite

back2themoon

Re: Ignore accents/diacritics etc. in search fields

Unread post by back2themoon » 2020-06-16, 19:23

I'll strangely have to disagree with your whole point this time. This isn't meant as a grammar-improvement feature, but a simple usability enhancement.

Having accented bookmarks does not necessarily mean that they are in the user's native language. Whether someone bookmarks stuff in their language or not, typing without accents will always be faster, especially in the second case. It is not correct, but it is more efficient.

Take my example. I bookmarked a German website in English, but the bookmark is titled in German. Guess what? I could not type that in German even if I knew the language; I don't have German installed. So, saying the user has to correctly spell and type all of their international bookmarks is not realistic. Yes, I could search for the appropriate symbol/character in Windows or online, but that's exactly what we are trying to avoid.

About "fuzzing other misspellings, too", that's just arbitrary. I believe many (most?) programs involving databases (e.g. audio players) also ignore accents. Anyway, not a major issue.

adesh
Board Warrior
Posts: 1277
Joined: 2017-06-06, 07:38

Re: Ignore accents/diacritics etc. in search fields

Unread post by adesh » 2020-06-16, 19:44

First, the solution / workaround: save the bookmark with the title you want to remember and can type easily, or add some tags so it is easier to find.

You are saying that accents and diacritics should be ignored because it makes searching bookmarks easier. Tomorrow someone, going by similar logic, will make the case that similar-looking and similar-sounding characters should be treated as equal. For example, one could say "beta" (how to type, damn!) should come up in search when someone searches for "B". Then what about characters in other languages, which may look and sound like Latin letters but are in no way related? All of this makes the feature questionable and also very difficult to implement.
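
As a rough check of where one possible implementation would draw that line, the Python sketch below strips only Unicode combining marks. It is an assumption about how such a feature could be built, not a description of any browser's code, and `strip_marks` is a made-up name. Under this rule, look-alike letters without a canonical decomposition are left untouched; whether that boundary is the right one is a separate question.

```python
import unicodedata


def strip_marks(text: str) -> str:
    """Remove combining marks only; other characters pass through unchanged."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


# Letters with a canonical decomposition lose their mark; Greek beta,
# o-with-stroke and sharp s have none, so they are not folded to B, o or ss.
for ch in ["ä", "é", "β", "ø", "ß"]:
    print(ch, "->", strip_marks(ch))
# ä -> a
# é -> e
# β -> β
# ø -> ø
# ß -> ß
```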

Moonchild

Re: Ignore accents/diacritics etc. in search fields

Unread post by Moonchild » 2020-06-16, 20:37

back2themoon wrote:
2020-06-16, 19:23
About "fuzzing other misspellings, too", that's just arbitrary.
Ignoring accents is arbitrary too.

Take ä for example, to stay close to your OP. Ignore the accent and it'll be "a", but the common spelling for that when not using accents is "ae" (check your own link and see the ae in the domain name). So your suggestion to use "a" and not "ae" would be arbitrary. But for different languages, different rules apply as well. So you either fuzz properly, or you don't fuzz at all, is my opinion. And full fuzzing is not straightforward.

P.S.: you'd also get a hit if you would search for "baeren" using the "proper" de-accentized version, because it'd match the domain -- HOWEVER, with the advent of IDNs not even that is a guarantee.
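
The difference between the two foldings can be made concrete with a small Python sketch. It is illustrative only: the function names and the `GERMAN_RULES` table are assumptions made for the example. Only the ä to "ae" mapping comes from the post above; the other entries are the analogous German substitutions, added to round out the table.

```python
import unicodedata

# Assumed rule table: ä -> ae is from the post, the rest are analogous.
GERMAN_RULES = {"ä": "ae", "ö": "oe", "ü": "ue", "ß": "ss"}


def strip_marks(text: str) -> str:
    """Remove combining marks after canonical decomposition (NFD)."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def fold_generic(text: str) -> str:
    """Language-agnostic folding: ä simply becomes a."""
    return strip_marks(text).lower()


def fold_german(text: str) -> str:
    """German-style folding: apply the rule table first, then strip the rest."""
    lowered = unicodedata.normalize("NFC", text).lower()
    for src, dst in GERMAN_RULES.items():
        lowered = lowered.replace(src, dst)
    return strip_marks(lowered)


title = "Bärenreiter"
print(fold_generic(title))  # barenreiter
print(fold_german(title))   # baerenreiter

# Each folding accepts one spelling of the query and rejects the other.
print("baren" in fold_generic(title), "baeren" in fold_generic(title))  # True False
print("baren" in fold_german(title), "baeren" in fold_german(title))    # False True
```

Neither folding finds both spellings, so a matcher would have to pick one rule or index several folded forms of each title.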

back2themoon

Re: Ignore accents/diacritics etc. in search fields

Unread post by back2themoon » 2020-06-16, 20:58

adesh wrote:
2020-06-16, 19:44
Tomorrow, someone, going by similar logic, will make a case that similar looking and sounding characters should be treated equally.
No, because it is irrelevant. You are both still making it a grammar issue while it is not. It's just about a faster search, and bypassing the age-old computing convention of requiring two or even three key presses for accented characters instead of the usual one.

Moonchild says words with accents aren't the same words. In class, yes. On the keyboard, no. You'll still have to press A to get À.

I now realize this would be more useful to non-native users, since native ones might already have special keys on their keyboard, to allow single key-presses.

To be honest, I was kind of hoping there was already some sort of solution devised for this, in Windows, ISO standards or whatever. For the European languages/Latin alphabet at least.

A very quick search led me here, don't know if it's relevant, helpful or interesting: https://bugzilla.mozilla.org/show_bug.cgi?id=202251

back2themoon

Re: Ignore accents/diacritics etc. in search fields

Unread post by back2themoon » 2020-06-16, 21:19

Moonchild wrote:
2020-06-16, 20:37
So you either fuzz properly, or you don't fuzz at all
That's a contradiction in terms though. Something like this can never be perfect, and such a workaround/patch/fuzzing could never work for all human languages. For the most common scenarios, though, and without sacrificing the browser's logic, it could still be helpful - and not just for me of course.

Moonchild

Re: Ignore accents/diacritics etc. in search fields

Unread post by Moonchild » 2020-06-16, 23:09

I didn't say "perfect", I said "properly". Please don't twist my words into extremes when none are implied.

Going by "properly" it's not a contradiction in terms at all, but it does make it complex. Which languages should be focused on? How do you determine which language is typed in the search box, if you can do so at all, to begin with? It would require a number of variables to be used to make a fuzzy match, and with that it would require the option for the user to control those variables, too.
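
To give a feel for the "number of variables" involved, here is a hedged Python sketch of a matcher whose behaviour is driven by user-controlled settings. Everything in it (`MatchOptions`, its fields, `fold`, `matches`) is hypothetical and invented for the example; it does not correspond to any existing Pale Moon preference or API, and it leaves the language-detection question raised above unsolved by making the user choose the rules.

```python
import unicodedata
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class MatchOptions:
    """Hypothetical user-facing settings for fuzzy matching."""
    ignore_case: bool = True
    strip_marks: bool = False
    # Per-language substitutions chosen by the user, e.g. {"ä": "ae"}.
    substitutions: Dict[str, str] = field(default_factory=dict)


def fold(text: str, opts: MatchOptions) -> str:
    """Reduce text to the form that is compared under the given options."""
    text = unicodedata.normalize("NFC", text)
    if opts.ignore_case:
        text = text.lower()
    for src, dst in opts.substitutions.items():
        text = text.replace(src, dst)
    if opts.strip_marks:
        decomposed = unicodedata.normalize("NFD", text)
        text = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return text


def matches(query: str, title: str, opts: MatchOptions) -> bool:
    """Substring match after folding both sides with the same options."""
    return fold(query, opts) in fold(title, opts)


title = "Bärenreiter"
print(matches("baren", title, MatchOptions()))                             # False
print(matches("baren", title, MatchOptions(strip_marks=True)))             # True
print(matches("baeren", title, MatchOptions(substitutions={"ä": "ae"})))   # True
```

Even this small sketch needs three settings, and each added language rule is another entry the user would have to understand and maintain.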

New Tobin Paradigm

Re: Ignore accents/diacritics etc. in search fields

Unread post by New Tobin Paradigm » 2020-06-16, 23:14

And moar megahertz and the random access memorys.

Aren't Pale Moon et al. en-US products?

Moonchild

Re: Ignore accents/diacritics etc. in search fields

Unread post by Moonchild » 2020-06-17, 01:57

back2themoon wrote:
2020-06-16, 20:58
No, because it is irrelevant. You are both still making it a grammar issue while it is not. It's just about a faster search, and bypassing the age-old computing convention of requiring two or even three key presses for accented characters instead of the usual one.

Moonchild says words with accents aren't the same words. In class, yes. On the keyboard, no. You'll still have to press A to get À.
I guess you've never seen non-English keyboards, then.
Image
Ö != O
Å != Ä != A
All different keys.
I rest my case.

back2themoon

Re: Ignore accents/diacritics etc. in search fields

Unread post by back2themoon » 2020-06-17, 09:25

Oh come on, I already mentioned those keyboards, which are clearly not the default situation; the default, as Tobin said, is en-US. As mentioned earlier, this option would mostly benefit en-US users. Even so, users without en-US keyboards would also benefit, since they only have their few particular special keys at hand. Anyway, it's an interesting topic and I think we can close it.

http://kbd-intl.narod.ru/english/layouts