
Accented Characters

This feature is not merely desirable: it is mandatory for everyone who uses accented characters.



The reason is that accents are applied almost at random in files. This is due to several factors:

- old text (ASCII) files were not accented

- accented characters are more difficult to type on the keyboard

- uppercase letters are generally not accented

- everybody understands the text even if it is not accented

- very often, people do not know the correct spelling or which accent to use

Taking "e" as an example, in French you can find 10 characters: e é è ê ë E É È Ê Ë

The same problem applies to "a", "i", "o" and "u", and to c ç C Ç.



This means that it is IMPOSSIBLE to find accented words reliably if the search engine distinguishes between all these characters.



Let us take an example: suppose that you are searching for the word "fenêtrées".

It contains 4 "e". Even if you assume that only the second and the third can carry an accent, and that the person who wrote the word could reasonably have used only one of (e é è ê) in each place, you have 4x4=16 searches to do!

And what if your search contains several accented words?
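To make the count concrete, here is a small sketch in Python that enumerates the spellings you would have to try. The function name `spelling_variants` and the choice of positions are only illustrative assumptions; nothing here comes from X1 itself.

```python
from itertools import product
import unicodedata

# The four forms the post assumes a writer might have typed for an "e".
E_VARIANTS = "eéèê"

def spelling_variants(word, positions, variants=E_VARIANTS):
    """Return every spelling obtained by putting each character of
    `variants` at each of the given positions (hypothetical helper)."""
    # Use precomposed characters so that one letter is one index.
    chars = list(unicodedata.normalize("NFC", word))
    results = []
    for combo in product(variants, repeat=len(positions)):
        candidate = chars[:]
        for pos, ch in zip(positions, combo):
            candidate[pos] = ch
        results.append("".join(candidate))
    return results

# In "fenêtrées" the second and third "e" are at indexes 3 and 6.
variants = spelling_variants("fenêtrées", [3, 6])
print(len(variants))  # one search per spelling
```

With several such words in one query the counts multiply: two words with 16 spellings each would already mean 16x16=256 combinations.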



At present, search engines do not distinguish between lowercase and uppercase characters. Why isn't it possible to define equivalences between several characters instead of just 2? In the same way that (e E) are equivalent, it should be possible to decide that (e é è ê ë E É È Ê Ë) are equivalent in a search.



Of course, for some purposes, it would also be useful to deactivate this feature and distinguish between all the characters.
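As a rough illustration of what such an equivalence could look like, with a switch to turn it off, here is a minimal sketch in Python. It assumes that stripping Unicode combining marks is an acceptable definition of "equivalent"; the names `fold` and `contains` are hypothetical, and this is not a description of how X1 or any other search engine works internally.

```python
import unicodedata

def fold(text, ignore_accents=True, ignore_case=True):
    """Reduce text to a comparison key (illustrative helper)."""
    if ignore_accents:
        # NFD splits "é" into "e" plus a combining acute accent;
        # dropping the combining marks leaves the base letter.
        decomposed = unicodedata.normalize("NFD", text)
        text = "".join(ch for ch in decomposed
                       if not unicodedata.combining(ch))
    else:
        # Keep accents, but use one canonical form so that the same
        # letter always compares equal to itself.
        text = unicodedata.normalize("NFC", text)
    if ignore_case:
        text = text.casefold()
    return text

def contains(haystack, needle, ignore_accents=True, ignore_case=True):
    """Substring search on the folded forms of both strings."""
    return (fold(needle, ignore_accents, ignore_case)
            in fold(haystack, ignore_accents, ignore_case))

# All ten forms of "e" collapse to the same key.
keys = {fold(ch) for ch in "e é è ê ë E É È Ê Ë".split()}
print(keys)

# One search finds the unaccented spelling; the strict mode does not.
print(contains("des baies fenetrees", "fenêtrées"))
print(contains("des baies fenetrees", "fenêtrées", ignore_accents=False))
```

This approach only covers letters that Unicode decomposes into a base letter plus a mark. Characters such as "œ" or "ø" have no such decomposition and would need an explicit equivalence table.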



This is the point of view of the user.

OK, it is perhaps not easy to implement, but it ought to be implemented.

And as long as it is not, this software cannot be used for French, Spanish, German, the languages of northern Europe, and many other languages.



This is the reason why many of us stay with X1 version 5.2.3 (Build 1852bz-bs), released Friday, August 26, 2005 (more than 2 years ago!), which works fine from this point of view.

LATER EDIT (Feb. 24th, 2008): in fact, this claim does not seem to be accurate: later posts in this thread show that some diacritics cause problems.



And this is not the first time that this problem has been pointed out: see, for instance, point ii) in http://forums.x1.com/viewtopic.php?p=5762



Gerard
