structa.chars

The structa.chars module provides classes and constants for defining and manipulating character classes (in the sense of regular expressions). The primary class of interest is CharClass, but most uses can likely be covered by the set of constants defined in the module.

class structa.chars.CharClass(chars)

A descendant of frozenset intended to represent a character class in a regular expression. Can be instantiated from any iterable of single characters (including a str).

All operations of frozenset are supported, but those that produce a set return a CharClass rather than a frozenset (and are therefore only valid when the result contains only single characters). For example:

>>> abc = CharClass('abc')
>>> abc
CharClass('abc')
>>> ghi = CharClass('ghi')
>>> abc == ghi
False
>>> abc < ghi
False
>>> abc | ghi
CharClass('abcghi')
>>> abc < abc | ghi
True

difference(*others)

Return the difference of two or more sets as a new set.

(i.e. all elements that are in this set but not the others.)

intersection(*others)

Return the intersection of two or more sets as a new set.

(i.e. all elements that are in all of the sets.)

symmetric_difference(*others)

Return the symmetric difference of two sets as a new set.

(i.e. all elements that are in exactly one of the sets.)

union(*others)

Return the union of sets as a new set.

(i.e. all elements that are in either set.)
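The type-preserving behaviour of these methods can be illustrated with a minimal sketch. CharSet below is a hypothetical stand-in written for this example, not structa's actual implementation; it assumes only that the methods re-wrap the result of the inherited frozenset operation.

```python
class CharSet(frozenset):
    """Hypothetical stand-in for CharClass; not structa's actual code."""

    def __repr__(self):
        # Show members in sorted order, e.g. CharSet('abc')
        return f"{type(self).__name__}({''.join(sorted(self))!r})"

    def union(self, *others):
        # The inherited method returns a plain frozenset, so re-wrap it
        return type(self)(super().union(*others))

    def intersection(self, *others):
        return type(self)(super().intersection(*others))

    def difference(self, *others):
        return type(self)(super().difference(*others))


abc = CharSet('abc')
ghi = CharSet('ghi')
both = abc.union(ghi)
print(both)                        # CharSet('abcghi')
print(abc.intersection('bcd'))     # CharSet('bc')
print(both.difference(ghi, 'a'))   # CharSet('bc')
```

Each method accepts any number of iterables, as the *others signature suggests, and the result is always an instance of the subclass.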

class structa.chars.AnyChar

A singleton class (all instances are the same) which represents any possible character. Instances can be compared with, and combined in set operations with, instances of CharClass. For instance:

>>> abc = CharClass('abc')
>>> any_ = AnyChar()
>>> any_
AnyChar()
>>> abc < any_
True
>>> abc > any_
False
>>> abc | any_
AnyChar()
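One way such a "universal set" singleton could interoperate with ordinary sets is sketched below. Everything is a hypothetical class invented for this illustration, and a plain frozenset stands in for CharClass; the sketch relies only on Python's reflected-operator protocol and makes no claim about how AnyChar is actually implemented.

```python
class Everything:
    """Hypothetical stand-in for AnyChar; not structa's actual code."""
    _instance = None

    def __new__(cls):
        # Singleton: every call returns the same object
        if cls._instance is None:
            cls._instance = super().__new__(cls)
        return cls._instance

    def __repr__(self):
        return 'Everything()'

    def __or__(self, other):
        # The union of anything with the universal set is the universal set
        return self
    __ror__ = __or__

    def __and__(self, other):
        # Intersecting with the universal set leaves the other operand as-is
        return other
    __rand__ = __and__

    def __gt__(self, other):
        # A proper superset of everything except itself
        return other is not self

    def __ge__(self, other):
        return True

    def __lt__(self, other):
        # Nothing is a proper superset of the universal set
        return False

    def __le__(self, other):
        return other is self


abc = frozenset('abc')
everything = Everything()
print(everything is Everything())   # True
print(abc < everything)             # True
print(abc > everything)             # False
print(abc | everything)             # Everything()
print(abc & everything == abc)      # True
```

When a frozenset does not recognise the other operand it returns NotImplemented, so Python falls back to the reflected method on Everything, which is why abc < everything and abc | everything work without any change to frozenset.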

structa.chars.char_range(start, stop)

Returns a CharClass containing all the characters from start to stop inclusive (in Unicode code point order). For example:

>>> char_range('a', 'c')
CharClass('abc')
>>> char_range('0', '9')
CharClass('0123456789')

Parameters
  • start (str) – The inclusive start point of the range

  • stop (str) – The inclusive stop point of the range
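The inclusive, code-point-ordered behaviour described above can be approximated with built-ins alone. char_range_sketch is a hypothetical helper written for this example and returns a plain frozenset rather than a CharClass.

```python
def char_range_sketch(start, stop):
    """Hypothetical re-creation of char_range using only built-ins."""
    # range() excludes its end point, so add 1 to make stop inclusive
    return frozenset(chr(cp) for cp in range(ord(start), ord(stop) + 1))


print(''.join(sorted(char_range_sketch('a', 'c'))))   # abc
print(len(char_range_sketch('0', '9')))               # 10
# Code point order means a range can span punctuation between letters
print(''.join(sorted(char_range_sketch('Z', 'a'))))   # Z[\]^_`a
```

The last call shows why code point order matters: the range from 'Z' to 'a' includes the six punctuation characters that sit between the upper-case and lower-case letters.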

Constants

structa.chars.oct_digit

Represents any valid digit in base 8 (octal).

structa.chars.dec_digit

Represents any valid digit in base 10 (decimal).

structa.chars.hex_digit

Represents any valid digit in base 16 (hexadecimal).
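As a rough illustration of how the three digit classes relate, the sketch below builds comparable sets from the standard library's string module. Whether structa's hex_digit contains both upper-case and lower-case letters is not stated here, so treating it like string.hexdigits is an assumption.

```python
import string

# Plain frozensets standing in for the structa constants (an assumption)
oct_digits = frozenset(string.octdigits)   # '01234567'
dec_digits = frozenset(string.digits)      # '0123456789'
hex_digits = frozenset(string.hexdigits)   # '0123456789abcdefABCDEF'

# Each class is a proper subset of the next
print(oct_digits < dec_digits < hex_digits)        # True
# The characters that only hexadecimal adds
print(''.join(sorted(hex_digits - dec_digits)))    # ABCDEFabcdef
```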

structa.chars.ident_first

Represents any character which is valid as the first character of a Python identifier.

structa.chars.ident_char

Represents any character which is valid within a Python identifier.
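The distinction between the two identifier classes can be checked with str.isidentifier from the standard library. The helper names below are hypothetical, and the sketch does not show how structa builds its constants (for instance, whether they cover the full Unicode range accepted by Python or only ASCII).

```python
def is_ident_first(ch):
    """True if ch can start a Python identifier (hypothetical helper)."""
    return ch.isidentifier()


def is_ident_char(ch):
    """True if ch can appear after the first character (hypothetical helper)."""
    # Prefix a known-valid start character, then test the pair
    return ('_' + ch).isidentifier()


print(is_ident_first('a'), is_ident_first('_'), is_ident_first('1'))   # True True False
print(is_ident_char('1'), is_ident_char('-'))                          # True False
```

Digits are the familiar example of characters that are valid within an identifier but not as its first character.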

structa.chars.any_char

Represents any valid character (an instance of AnyChar).