A finite state machine specialized for regular-expression-based text filters, this module defines the following classes:

- StateMachine, a state machine
- State, a state superclass
- StateMachineWS, a whitespace-sensitive version of StateMachine
- StateWS, a state superclass for use with StateMachineWS
- SearchStateMachine, uses re.search() instead of re.match()
- SearchStateMachineWS, uses re.search() instead of re.match()
- ViewList, extends standard Python lists.
- StringList, string-specific ViewList.

Exception classes:

- StateMachineError
- UnknownStateError
- DuplicateStateError
- UnknownTransitionError
- DuplicateTransitionError
- TransitionPatternNotFound
- TransitionMethodNotFound
- UnexpectedIndentationError
- TransitionCorrection: Raised to switch to another transition.
- StateCorrection: Raised to switch to another state & transition.

Functions:

- string2lines(): split a multi-line string into a list of one-line strings

(See the individual classes, methods, and attributes for details.)
1. Import it: import statemachine or from statemachine import .... You will also need to import re.

2. Derive a subclass of State (or StateWS) for each state in your state machine:

       class MyState(statemachine.State):

   Within the state's class definition:

   a) Include a pattern for each transition, in State.patterns:

          patterns = {'atransition': r'pattern', ...}

   b) Include a list of initial transitions to be set up automatically, in State.initial_transitions:

          initial_transitions = ['atransition', ...]

   c) Define a method for each transition, with the same name as the transition pattern:

          def atransition(self, match, context, next_state):
              # do something
              result = [...]  # a list
              # context and next_state may be altered before returning
              return context, next_state, result

      Transition methods may raise an EOFError to cut processing short.

   d) You may wish to override the State.bof() and/or State.eof() implicit transition methods, which handle the beginning- and end-of-file.

   e) In order to handle nested processing, you may wish to override the attributes State.nested_sm and/or State.nested_sm_kwargs.

      If you are using StateWS as a base class, in order to handle nested indented blocks, you may wish to:

      - override the attributes StateWS.indent_sm, StateWS.indent_sm_kwargs, StateWS.known_indent_sm, and/or StateWS.known_indent_sm_kwargs;
      - override the StateWS.blank() method; and/or
      - override or extend the StateWS.indent(), StateWS.known_indent(), and/or StateWS.firstknown_indent() methods.

3. Create a state machine object:

       sm = StateMachine(state_classes=[MyState, ...], initial_state='MyState')

4. Obtain the input text, which needs to be converted into a tab-free list of one-line strings. For example, to read text from a file called 'inputfile':

       with open('inputfile') as f:
           input_string = f.read()
       input_lines = statemachine.string2lines(input_string)

5. Run the state machine on the input text and collect the results, a list:

       results = sm.run(input_lines)

6. Remove any lingering circular references:

       sm.unlink()
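The conventions in the steps above can be illustrated with a small self-contained sketch. It deliberately does not import the statemachine module: TinyState, TinyMachine, Body, and Literal are hypothetical stand-ins written for this illustration, and they mimic only what the steps describe (a patterns dictionary, an initial_transitions list, same-named transition methods returning context, next_state, result, and EOFError to stop early). The return shapes of bof() and eof() are assumptions, and the real classes do considerably more.

```python
import re


class TinyState:
    """Hypothetical stand-in for a state superclass (illustration only)."""

    patterns = {}             # transition name -> regular expression source
    initial_transitions = []  # transition names, tried in this order

    def __init__(self):
        # Pair each listed transition with its compiled pattern and the
        # method of the same name.
        self.transitions = [
            (re.compile(self.patterns[name]), getattr(self, name))
            for name in self.initial_transitions
        ]

    def bof(self, context):
        """Beginning-of-file hook; the return shape is an assumption."""
        return context, []

    def eof(self, context):
        """End-of-file hook; the return shape is an assumption."""
        return []


class TinyMachine:
    """Hypothetical stand-in for a state machine (illustration only)."""

    def __init__(self, state_classes, initial_state):
        self.states = {cls.__name__: cls() for cls in state_classes}
        self.initial_state = initial_state

    def run(self, input_lines):
        current = self.initial_state
        context, first = self.states[current].bof(None)
        results = list(first)
        try:
            for line in input_lines:
                # The first transition whose pattern matches handles the line.
                for pattern, method in self.states[current].transitions:
                    match = pattern.match(line)
                    if match:
                        context, current, result = method(match, context, current)
                        results.extend(result)
                        break
        except EOFError:
            pass  # a transition method asked to cut processing short
        results.extend(self.states[current].eof(context))
        return results


class Body(TinyState):
    patterns = {'stop': r'__END__$', 'fence': r'---$', 'text': r''}
    initial_transitions = ['stop', 'fence', 'text']

    def stop(self, match, context, next_state):
        raise EOFError  # ignore everything after the marker line

    def fence(self, match, context, next_state):
        return context, 'Literal', []  # switch state, emit nothing

    def text(self, match, context, next_state):
        return context, next_state, [match.string.upper()]


class Literal(TinyState):
    patterns = {'fence': r'---$', 'text': r''}
    initial_transitions = ['fence', 'text']

    def fence(self, match, context, next_state):
        return context, 'Body', []  # closing fence: back to Body

    def text(self, match, context, next_state):
        return context, next_state, [match.string]  # copy verbatim


input_lines = ['alpha', '---', 'beta', '---', 'gamma', '__END__', 'delta']
sm = TinyMachine(state_classes=[Body, Literal], initial_state='Body')
results = sm.run(input_lines)
print(results)
```

Listing 'stop' and 'fence' before the catch-all 'text' pattern matters in this sketch, because transitions are tried in the order given by initial_transitions and the first match wins.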
Class | DuplicateStateError | Undocumented |
Class | DuplicateTransitionError | Undocumented |
Class | SearchStateMachine | StateMachine which uses re.search() instead of re.match(). |
Class | SearchStateMachineWS | StateMachineWS which uses re.search() instead of re.match(). |
Class | State | State superclass. Contains a list of transitions, and transition methods. |
Class | StateCorrection | Raise from within a transition method to switch to another state. |
Class | StateMachine | A finite state machine for text filters using regular expressions. |
Class | StateMachineError | Undocumented |
Class | StateMachineWS | StateMachine subclass specialized for whitespace recognition. |
Class | StateWS | State superclass specialized for whitespace (blank lines & indents). |
Class | StringList | A ViewList with string-specific methods. |
Class | TransitionCorrection | Raise from within a transition method to switch to another transition. |
Class | TransitionMethodNotFound | Undocumented |
Class | TransitionPatternNotFound | Undocumented |
Class | UnexpectedIndentationError | Undocumented |
Class | UnknownStateError | Undocumented |
Class | UnknownTransitionError | Undocumented |
Class | ViewList | No summary |
Function | string2lines | Return a list of one-line strings with tabs expanded, no newlines, and trailing whitespace stripped. |
Class | _SearchOverride | Mix-in class to override StateMachine regular expression behavior. |
Function | _exception_data | Return exception information: |
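The SearchStateMachine and SearchStateMachineWS entries hinge on the difference between re.match() and re.search(). As a reminder, shown here with the standard re module only (the pattern and sample line are made up for illustration), match() succeeds only at the start of the string, while search() scans for the first match anywhere in it:

```python
import re

pattern = re.compile(r'\d+')
line = 'order 66'

# match() anchors at position 0, where this line has a letter, not a digit.
anchored = pattern.match(line)

# search() scans forward and finds the digits later in the line.
scanned = pattern.search(line)

print(anchored)         # no match object is returned
print(scanned.group())  # the digits found further along the line
```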
Return a list of one-line strings with tabs expanded, no newlines, and trailing whitespace stripped.
Each tab is expanded with between 1 and tab_width spaces, so that the next character's index becomes a multiple of tab_width (8 by default).
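The tab-stop rule can be checked with Python's built-in str.expandtabs(), which applies the same arithmetic. Whether string2lines() calls this method internally is not stated here, so treat the snippet only as an illustration of the rule; the sample strings are made up.

```python
tab_width = 8

# A tab at column 1 needs 7 spaces to reach the next tab stop (column 8).
short = 'a\tx'.expandtabs(tab_width)

# A tab at column 7 needs only 1 space to reach column 8.
almost = 'abcdefg\tx'.expandtabs(tab_width)

# A tab already on a tab stop (column 8) expands to a full 8 spaces.
full = 'abcdefgh\tx'.expandtabs(tab_width)

for expanded in (short, almost, full):
    # In every case the character after the tab lands on a multiple of 8.
    print(expanded.index('x'), expanded.index('x') % tab_width)
```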
Parameters:

- astring: a multi-line string.
- tab_width: the number of columns between tab stops.
- convert_whitespace: convert form feeds and vertical tabs to spaces?
- whitespace: pattern object with the to-be-converted whitespace characters (default [\v\f]).
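Taken together, the description and parameters suggest roughly the following behaviour. This is a sketch reconstructed from the documentation above, not the module's actual source: string2lines_sketch is a hypothetical name, and the False default for convert_whitespace and the order of operations are assumptions.

```python
import re


def string2lines_sketch(astring, tab_width=8, convert_whitespace=False,
                        whitespace=re.compile('[\v\f]')):
    """Approximate the documented behaviour of string2lines() (assumed)."""
    if convert_whitespace:
        # Replace form feeds and vertical tabs with spaces before splitting.
        astring = whitespace.sub(' ', astring)
    # Split into lines (dropping newlines), expand tabs, strip trailing space.
    return [line.expandtabs(tab_width).rstrip()
            for line in astring.splitlines()]


lines = string2lines_sketch('a\tb  \nc\fd\n', convert_whitespace=True)
print(lines)
```

Converting the whitespace before splitting is a deliberate choice in this sketch: str.splitlines() treats form feeds and vertical tabs as line boundaries, so converting afterwards would come too late to keep such a line in one piece.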