In UTF-8 collation, why is 11- less than 1-?

Mar 15, 2026 · 2045 views

Problem

I found that sorting the same source file gives a different result under ASCII collation than under UTF-8 collation. The UTF-8 result feels counter-intuitive to me, and it is not dictionary order. Isn't the character '-' always less than …? What is the general rule in UTF-8 collation? And how can I bypass it, so that … is less than … whil…

Error Output

1-
11-
1-a
11-a

1 Fix

Fix for: In UTF-8 collation, why is 11- less than 1-?

The minus sign is ignored in the first pass, so the first pass sorts 1, 11, 1a, 11a. Since 1 < a, you get 11 < 1a and thus 11- < 1-a. The character - is a variable collation element, meaning that you (or the implementor) can choose to ignore it. The glibc implementation apparently does so. In …
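The effect of ignoring the minus sign in the first pass can be sketched in a few lines of Python. This is only a simplified model, not glibc's actual collation algorithm: real locale collation uses several weight levels, while this sketch assumes just two (compare with '-' removed, then break ties on the original string). The function names `bytewise_key` and `hyphen_ignoring_key` are hypothetical and chosen for illustration.

```python
def bytewise_key(s: str) -> bytes:
    # Byte-wise comparison: '-' (0x2D) sorts before '1' (0x31) and 'a' (0x61).
    return s.encode("utf-8")


def hyphen_ignoring_key(s: str) -> tuple[str, str]:
    # Assumed two-pass model: the first element compares the string with
    # every '-' removed; the second breaks ties using the original string.
    return (s.replace("-", ""), s)


lines = ["1-", "11-", "1-a", "11-a"]

bytewise_order = sorted(lines, key=bytewise_key)
hyphen_ignoring_order = sorted(lines, key=hyphen_ignoring_key)

print(bytewise_order)         # ['1-', '1-a', '11-', '11-a']
print(hyphen_ignoring_order)  # ['1-', '11-', '11-a', '1-a']
```

In the second ordering, 11- lands before 1-a because the keys being compared are 11 and 1a, and the digit 1 sorts before the letter a. The byte-wise ordering corresponds to what `sort` produces in the C locale (for example with `LC_ALL=C` set), which is one common way to avoid locale-aware collation.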
