module documentation

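A finite state machine specialized for regular-expression-based text filters. This module defines the following classes:

  • StateMachine: a finite state machine for text filters using regular expressions
  • State: state superclass; contains a list of transitions and transition methods
  • StateMachineWS: StateMachine subclass specialized for whitespace recognition
  • StateWS: State superclass specialized for whitespace (blank lines and indents)
  • SearchStateMachine: StateMachine which uses re.search() instead of re.match()
  • SearchStateMachineWS: StateMachineWS which uses re.search() instead of re.match()
  • ViewList
  • StringList: a ViewList with string-specific methods

Exception classes:

  • StateMachineError
  • UnknownStateError
  • DuplicateStateError
  • UnknownTransitionError
  • DuplicateTransitionError
  • TransitionPatternNotFound
  • TransitionMethodNotFound
  • UnexpectedIndentationError
  • TransitionCorrection: raise from within a transition method to switch to another transition
  • StateCorrection: raise from within a transition method to switch to another state

Functions: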

  • string2lines(): split a multi-line string into a list of one-line strings

How To Use This Module

(See the individual classes, methods, and attributes for details.)

  1. Import it: import statemachine or from statemachine import .... You will also need to import re.

  2. Derive a subclass of State (or StateWS) for each state in your state machine:

    class MyState(statemachine.State):
    

    Within the state's class definition:

    1. Include a pattern for each transition, in State.patterns:

      patterns = {'atransition': r'pattern', ...}
      
    2. Include a list of initial transitions to be set up automatically, in State.initial_transitions:

      initial_transitions = ['atransition', ...]
      
    3. Define a method for each transition, with the same name as the transition pattern:

      def atransition(self, match, context, next_state):
          # do something
          result = [...]  # a list
          return context, next_state, result
          # context, next_state may be altered
      

      Transition methods may raise an EOFError to cut processing short.

    4. You may wish to override the State.bof() and/or State.eof() implicit transition methods, which handle the beginning- and end-of-file.

    5. In order to handle nested processing, you may wish to override the attributes State.nested_sm and/or State.nested_sm_kwargs.

      If you are using StateWS as a base class, you may also wish to customize how nested indented blocks are handled; see the StateWS class for the relevant attributes and methods.

  3. Create a state machine object:

    sm = StateMachine(state_classes=[MyState, ...],
                      initial_state='MyState')
    
  4. Obtain the input text, which needs to be converted into a tab-free list of one-line strings. For example, to read text from a file called 'inputfile':

    # Read the whole file; the with-block closes it afterwards.
    with open('inputfile') as f:
        input_string = f.read()
    input_lines = statemachine.string2lines(input_string)
    
  5. Run the state machine on the input text and collect the results, a list:

    results = sm.run(input_lines)
    
  6. Remove any lingering circular references:

    sm.unlink()
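
The steps above can be combined into one short script. The following is a minimal sketch, not a canonical example: it assumes the module is importable as docutils.statemachine (the form in which it ships with Docutils), and the state name Shout, its two transitions, and the sample input are all hypothetical.

```python
from docutils.statemachine import State, StateMachine, string2lines


class Shout(State):
    """Hypothetical single state: skip blank lines, upper-case the rest."""

    # One regular expression per transition, keyed by transition name.
    patterns = {'blank': r'\s*$', 'text': r'.'}
    # Transitions set up automatically, tried in this order on each line.
    initial_transitions = ['blank', 'text']

    def blank(self, match, context, next_state):
        # Blank lines contribute nothing to the results.
        return context, next_state, []

    def text(self, match, context, next_state):
        # match.string is the full input line that was matched.
        return context, next_state, [match.string.upper()]


sm = StateMachine(state_classes=[Shout], initial_state='Shout')
input_lines = string2lines("hello\n\n\tworld  \n")
results = sm.run(input_lines)
sm.unlink()  # remove lingering circular references
print(results)
```

Because string2lines() expands the tab and strips trailing whitespace before the state machine sees the text, the second result keeps its eight-space indent but loses the trailing spaces.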
    
Class DuplicateStateError Undocumented
Class DuplicateTransitionError Undocumented
Class SearchStateMachine StateMachine which uses re.search() instead of re.match().
Class SearchStateMachineWS StateMachineWS which uses re.search() instead of re.match().
Class State State superclass. Contains a list of transitions, and transition methods.
Class StateCorrection Raise from within a transition method to switch to another state.
Class StateMachine A finite state machine for text filters using regular expressions.
Class StateMachineError Undocumented
Class StateMachineWS StateMachine subclass specialized for whitespace recognition.
Class StateWS State superclass specialized for whitespace (blank lines & indents).
Class StringList A ViewList with string-specific methods.
Class TransitionCorrection Raise from within a transition method to switch to another transition.
Class TransitionMethodNotFound Undocumented
Class TransitionPatternNotFound Undocumented
Class UnexpectedIndentationError Undocumented
Class UnknownStateError Undocumented
Class UnknownTransitionError Undocumented
Class ViewList No summary
Function string2lines Return a list of one-line strings with tabs expanded, no newlines, and trailing whitespace stripped.
Class _SearchOverride Mix-in class to override StateMachine regular expression behavior.
Function _exception_data Return exception information (class name, exception object, file name, line number, function name).
def string2lines(astring, tab_width=8, convert_whitespace=False, whitespace=re.compile(r'[\v\f]')):

Return a list of one-line strings with tabs expanded, no newlines, and trailing whitespace stripped.

Each tab is expanded with between 1 and tab_width spaces, so that the next character's index becomes a multiple of tab_width (8 by default).

Parameters:

  • astring: a multi-line string.
  • tab_width: the number of columns between tab stops.
  • convert_whitespace: if true, convert form feeds and vertical tabs to spaces.
  • whitespace: compiled pattern object matching the whitespace characters to convert (default [\v\f]).
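
The tab-stop rule and the whitespace conversion are easiest to see on a small input. The sketch below is an illustrative re-implementation built only from the behaviour described above; the name string2lines_sketch is hypothetical and the module's actual source may differ.

```python
import re


def string2lines_sketch(astring, tab_width=8, convert_whitespace=False,
                        whitespace=re.compile(r'[\v\f]')):
    """Illustrative re-implementation; not the module's own source."""
    if convert_whitespace:
        # Replace form feeds and vertical tabs before splitting, because
        # str.splitlines() would otherwise treat them as line boundaries.
        astring = whitespace.sub(' ', astring)
    # Expand tabs to the next tab stop, then drop trailing whitespace.
    return [line.expandtabs(tab_width).rstrip()
            for line in astring.splitlines()]


lines = string2lines_sketch("ab\tc\n\tindented   \nlast")
# "ab" is followed by 6 spaces so that "c" lands at index 8;
# the leading tab on the second line becomes 8 spaces.
print(lines)

# With conversion enabled, a form feed becomes a space instead of a line break.
converted = string2lines_sketch("one\ftwo", convert_whitespace=True)
print(converted)
```

With tab_width=4, the same rule pads "a\tb" with three spaces, so that "b" lands at index 4.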
def _exception_data():

Return exception information:

  • the exception's class name;
  • the exception object;
  • the name of the file containing the offending code;
  • the line number of the offending code;
  • the function name of the offending code.
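
One way to gather these five values is through sys.exc_info() and the traceback object it returns. The following is a sketch under that assumption, with the hypothetical names exception_data_sketch and fail; it is not necessarily how _exception_data() itself is written. Like any helper built on sys.exc_info(), it must be called while an exception is being handled.

```python
import sys


def exception_data_sketch():
    """Illustrative only: describe the exception currently being handled."""
    exc_type, exc_value, tb = sys.exc_info()
    # Walk to the innermost traceback entry, where the exception was raised.
    while tb.tb_next is not None:
        tb = tb.tb_next
    code = tb.tb_frame.f_code
    return (exc_type.__name__, exc_value, code.co_filename,
            tb.tb_lineno, code.co_name)


def fail():
    raise ValueError("boom")


try:
    fail()
except ValueError:
    info = exception_data_sketch()

print(info[0], info[4])  # ValueError fail
```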