regexp/unicode-property
🔧 This rule is automatically fixable by the --fix CLI option.
enforce consistent naming of unicode properties
📖 Rule Details
This rule helps to enforce a consistent style and naming of Unicode properties.
There are many ways a single Unicode property can be expressed. E.g. \p{L}, \p{Letter}, \p{gc=L}, \p{gc=Letter}, \p{General_Category=L}, and \p{General_Category=Letter} are all equivalent. This rule can be configured in a variety of ways to control exactly which of those variants are allowed. The default configuration is intended to be a good starting point for most users.
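The equivalence of these spellings can be checked directly in JavaScript. The following is a minimal sketch; the names `variants`, `allAgree`, `sample`, and `agreement` are illustrative and not part of the plugin.

```javascript
// Six spellings of the same property escape (General_Category = Letter).
const variants = [
  /\p{L}/u,
  /\p{Letter}/u,
  /\p{gc=L}/u,
  /\p{gc=Letter}/u,
  /\p{General_Category=L}/u,
  /\p{General_Category=Letter}/u,
];

// Returns true when every variant gives the same answer for `ch`.
function allAgree(ch) {
  const results = variants.map((re) => re.test(ch));
  return results.every((r) => r === results[0]);
}

// A few letters and non-letters to compare the variants on.
const sample = ["a", "Ω", "1", " ", "_"];
const agreement = sample.map(allAgree);
```

Because all six forms match the same set of characters, the choice between them is purely stylistic, which is what this rule standardizes.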
🔧 Options
{
  "regexp/unicode-property": ["error", {
    "generalCategory": "never",
    "key": "ignore",
    "property": {
      "binary": "ignore",
      "generalCategory": "ignore",
      "script": "long"
    }
  }]
}
generalCategory: "never" | "always" | "ignore"
Values from the General_Category property can be expressed in two ways: either without or with the gc= (or General_Category=) prefix. E.g. \p{Letter} or \p{gc=Letter}.
This option controls whether the gc= prefix is required or forbidden.
"never" (default): The gc= (or General_Category=) prefix is forbidden.
"always": The gc= (or General_Category=) prefix is required.
"ignore": Both forms, with and without the prefix, are allowed.
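The three modes can be sketched as a small predicate. This is an illustration of the behaviour described above, not the plugin's implementation. `isGcPrefixAllowed` and `GC_PREFIX` are hypothetical names, and `body` is assumed to be the text between `\p{` and `}` of an escape already known to be a General_Category value.

```javascript
// Hypothetical helper, not part of eslint-plugin-regexp.
// Matches the short or long General_Category key followed by "=".
const GC_PREFIX = /^(?:gc|General_Category)=/;

// `body` is the text inside \p{...}; `mode` is "never", "always", or "ignore".
function isGcPrefixAllowed(body, mode) {
  const hasPrefix = GC_PREFIX.test(body);
  if (mode === "never") return !hasPrefix; // prefix forbidden
  if (mode === "always") return hasPrefix; // prefix required
  return true; // "ignore": both forms are fine
}
```

For example, `isGcPrefixAllowed("gc=Letter", "never")` is false, while `isGcPrefixAllowed("Letter", "never")` is true.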
key: "short" | "long" | "ignore"
Unicode properties in key-value form (e.g. \p{gc=Letter}, \P{scx=Greek}) have two variants for the key: a short and a long form. E.g. \p{gc=Letter} and \p{General_Category=Letter}.
This option controls whether the short or long form is required.
"short": The key must be in short form.
"long": The key must be in long form.
"ignore" (default): The key can be in either form.
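As a rough sketch of what enforcing a key form involves, the helper below rewrites the key of a key-value escape to the requested form. It is an assumption-laden illustration rather than the rule's actual fixer: `KEY_FORMS` and `rewriteKey` are hypothetical names, and only the three keys mentioned in this document (General_Category, Script, Script_Extensions) are covered.

```javascript
// Hypothetical table of key aliases; not taken from the plugin's source.
const KEY_FORMS = [
  { short: "gc", long: "General_Category" },
  { short: "sc", long: "Script" },
  { short: "scx", long: "Script_Extensions" },
];

// `body` is the text inside \p{...}; `mode` is "short", "long", or "ignore".
function rewriteKey(body, mode) {
  const eq = body.indexOf("=");
  // No key present, or nothing to enforce: leave the body untouched.
  if (eq === -1 || mode === "ignore") return body;
  const key = body.slice(0, eq);
  const entry = KEY_FORMS.find((k) => k.short === key || k.long === key);
  if (!entry) return body; // unknown key: leave as-is
  return entry[mode] + body.slice(eq); // swap the key, keep "=value"
}
```

With this sketch, `rewriteKey("gc=Letter", "long")` yields `"General_Category=Letter"`, and `rewriteKey("Script_Extensions=Greek", "short")` yields `"scx=Greek"`.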
property: "short" | "long" | "ignore" | object
Similar to key, most property names also have long and short forms. E.g. \p{Letter} and \p{L}.
This option controls whether the short or long form is required. Which form is required can be configured for each property type via an object. The object has to be of the type:
{
  binary?: "short" | "long" | "ignore",
  generalCategory?: "short" | "long" | "ignore",
  script?: "short" | "long" | "ignore",
}
binary controls the form of Binary Unicode properties. E.g. ASCII, Any, Hex.
generalCategory controls the form of values from the General_Category property. E.g. Letter, Ll, P.
script controls the form of values from the Script and Script_Extensions properties. E.g. Greek.
If the option is set to a string instead of an object, it will be used for all property types.
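The string shorthand can be pictured as expanding to an object with the same value for every property type. The sketch below shows only that expansion; `expandPropertyOption` is a hypothetical name, and an object option is passed through unchanged because the defaults for omitted keys are not stated here.

```javascript
// Hypothetical helper illustrating the string shorthand; not the plugin's code.
function expandPropertyOption(option) {
  if (typeof option === "string") {
    // One string applies to all three property types.
    return { binary: option, generalCategory: option, script: option };
  }
  // Object form: returned as given (defaults for missing keys are not modelled).
  return option;
}
```

Under this reading, `"property": "short"` behaves like `"property": { "binary": "short", "generalCategory": "short", "script": "short" }`.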
NOTE: The "short" and "long" options follow the Unicode standard for short and long names. However, short names aren't always shorter than long names. E.g. the short name for \p{sc=Han} is \p{sc=Hani}.
There are also some properties that don't have a short name, such as \p{sc=Thai}, and some that have additional aliases that can be longer than the long name, such as \p{Mark} (long) with its short name \p{M} and alias \p{Combining_Mark}.
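The aliases mentioned in this note can be compared directly in JavaScript. This is a small sketch with illustrative variable names; the sample characters (a Han ideograph, a Thai letter, and U+0301 COMBINING ACUTE ACCENT) were chosen for the demonstration.

```javascript
// "Han" is the long name and "Hani" the short name of the same script.
const han = "漢";
const hanLong = /\p{sc=Han}/u.test(han);
const hanShort = /\p{sc=Hani}/u.test(han);

// "Thai" has no separate short name.
const thai = /\p{sc=Thai}/u.test("ก");

// Mark (long), M (short), and Combining_Mark (alias) name the same category.
const acute = "\u0301"; // COMBINING ACUTE ACCENT
const markForms = [/\p{Mark}/u, /\p{M}/u, /\p{Combining_Mark}/u].map((re) =>
  re.test(acute)
);
```

All of these tests agree with one another, so "short" and "long" here refer to the names assigned by the Unicode standard rather than to their length in characters.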
Examples
All set to "long":
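As an illustration, assuming both key and property are set to "long", patterns might be classified as follows. The accepted/reported labels reflect a reading of the option descriptions above rather than verified rule output, and the variable names are arbitrary.

```javascript
// Expected to be accepted: long property names and long keys.
const okLetter = /\p{Letter}/u;
const okScript = /\p{Script=Greek}/u;
const okBinary = /\p{Hex_Digit}/u;

// Expected to be reported: short property names or short keys.
const badLetter = /\p{L}/u;
const badScript = /\p{sc=Grek}/u;
const badBinary = /\p{Hex}/u;
```

Each reported pattern matches exactly the same characters as its accepted counterpart; only the spelling differs.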
All set to "short":
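As an illustration, assuming both key and property are set to "short", the classification would be expected to flip. The labels are again a reading of the option descriptions, not verified rule output, and the variable names are arbitrary.

```javascript
// Expected to be accepted: short property names and short keys.
const shortLetter = /\p{L}/u;
const shortScript = /\p{sc=Grek}/u;
const shortBinary = /\p{Hex}/u;

// Expected to be reported: long property names or long keys.
const longLetter = /\p{Letter}/u;
const longScript = /\p{Script=Greek}/u;
const longBinary = /\p{Hex_Digit}/u;
```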
Binary properties and values of the General_Category property set to "short" and values of the Script property set to "long":
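As an illustration, assuming property is { binary: "short", generalCategory: "short", script: "long" } and key is left at "ignore", patterns might be classified as follows. The labels reflect a reading of the option descriptions rather than verified rule output, and the variable names are arbitrary.

```javascript
// Expected to be accepted.
const mixedBinary = /\p{Hex}/u; // binary property, short name
const mixedCategory = /\p{Lu}/u; // General_Category value, short name
const mixedScript = /\p{sc=Greek}/u; // Script value, long name (key form ignored)

// Expected to be reported.
const wrongBinary = /\p{Hex_Digit}/u; // long binary name
const wrongCategory = /\p{Uppercase_Letter}/u; // long General_Category value
const wrongScript = /\p{sc=Grek}/u; // short Script value
```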
📚 Further reading
🚀 Version
This rule was introduced in eslint-plugin-regexp v2.5.0