A formalizable model of Pay's theory of phrasing:

A computer model 
by A. Pay

The following is no more than a sketch of a possibility. The notion is
based on the idea that once we have specified a dynamic and timbral
envelope that is to govern the progression of a phrase, it would be
possible to apply it at several levels in a given passage to each
independent line. Thus the substructure would be represented, and also
the way in which the phrases in a polyphonic texture yielded to each
other. (It might then be possible to add different envelopes at local
points, much in the way that a style can be used to control the overall
appearance of a page of text in a word processor, and local effects,
like italics, applied where needed.) The score could then be played,
and the result of the default shape evaluated. 
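The word-processor analogy can be illustrated with a minimal sketch: a default envelope shapes the whole line, and local overrides adjust individual notes. Everything here is an assumption for illustration only: the sine-arch shape, the names `default_envelope` and `shape_line`, and the idea of expressing the result as a per-note gain are not taken from the text.

```python
import math

def default_envelope(t):
    """Hypothetical phrase shape on t in [0, 1]: rise to a peak, then fall away."""
    return 0.6 + 0.4 * math.sin(math.pi * t)

def shape_line(n_notes, envelope=default_envelope, overrides=None):
    """Return one gain per note: the default envelope with local overrides on top."""
    overrides = overrides or {}
    gains = []
    for i in range(n_notes):
        # Position of the note within the phrase, normalised to [0, 1].
        t = i / (n_notes - 1) if n_notes > 1 else 0.5
        gain = envelope(t)
        # Local effect, analogous to italics on one word (assumed multiplicative).
        gain *= overrides.get(i, 1.0)
        gains.append(round(gain, 3))
    return gains

# Five notes shaped by the default envelope, with the fourth note emphasised.
gains = shape_line(5, overrides={3: 1.2})
print(gains)
```

Playing the line with and without the override would be one way to evaluate the default shape by ear before adding local effects.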

At the moment, this is not a natural way to represent a score in MIDI
terms. The MIDI system deals with notes, which are captured by their
duration, dynamic and envelope. The possibility of applying a suitable
function at one level to grouped notes, and then being able to reapply
it at another level to the group consisting of the union of those
modulated groups, and so on, to whatever degree of complexity seems
appropriate, is, as far as I know, as yet unrealized. Probably only about
three levels are necessary, though it would be unwise to underestimate
the difficulty of designing an envelope adequate to make a variety of
even simple one-level phrases sound natural to our ears. 
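As a rough sketch of what applying one function at several levels might look like, the following represents a passage as nested lists of MIDI note numbers and multiplies the same envelope in at each level of nesting. The envelope shape, the multiplicative combination, the base velocity of 80, and all function names are assumptions made for the example; designing an envelope that actually sounds natural is the hard part and is not attempted here.

```python
import math

def arch(t):
    """Hypothetical envelope on t in [0, 1]: swell to the middle, then relax."""
    return 0.7 + 0.3 * math.sin(math.pi * t)

def count_notes(group):
    """Number of notes in a nested list (a note is any non-list item)."""
    if not isinstance(group, list):
        return 1
    return sum(count_notes(g) for g in group)

def apply_levels(group, envelope=arch):
    """Return one gain per note, multiplying in the envelope at every nesting level."""
    if not isinstance(group, list):
        return [1.0]
    total = count_notes(group)
    gains = []
    for child in group:
        gains.extend(apply_levels(child, envelope))
    # Reapply the same envelope across the union of the already-shaped children.
    return [g * envelope(i / (total - 1) if total > 1 else 0.5)
            for i, g in enumerate(gains)]

def to_velocity(gain, base=80):
    """Map a gain to a MIDI velocity, clamped to the valid range 1..127."""
    return max(1, min(127, round(base * gain)))

# Three levels: passage -> two phrases -> two sub-phrases each -> three notes.
passage = [[[60, 62, 64], [65, 67, 69]], [[71, 72, 74], [76, 77, 79]]]
velocities = [to_velocity(g) for g in apply_levels(passage)]
print(velocities)
```

Only the dynamic dimension is modelled, via note velocity; a timbral envelope would need some other control, which this sketch leaves out.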

It may be that the size of the intervals in a phrase changes what is
required to achieve a particular perceptual shape, and perhaps even the
note density is involved: we know, for example, that a passage of
semiquavers demands a different (brighter) basic timbre to be effective
in any one acoustic, compared with the optimum for a slower-moving
passage of minims and crotchets. Perhaps this means that the bipolarity
weight/lightness (a dynamic envelope) dominates the shaping of faster
music, and the bipolarity bright/dark (a timbral envelope) dominates the
shaping of slower music. It is also possible that we need variety in
some other dimension. (One that springs to mind is the degree of
transient attack on each semiquaver in a running passage, which is
something that we often do vary as we play.) It is notoriously difficult
to characterize precisely the physical parameters that allow our
perceptual systems to segment a continuous chunk of speech into
recognizable syllables. It would not be surprising if a similar subtlety
were required to represent fully the analogous situation in music. 

This all adds up to the realization that there is work to be done, and
several layers of subtlety to contend with. All the same, it may be that
even the partial development of a language that allowed the application
of multi-level envelopes to a stored representation of a score would
extend the possibilities of creating more easily and naturally a
particular sort of electronic music. The idea that speech is implicated
in our organization of musical sounds suggests that electronic music
conforming to some speech-like properties may in some sense be richer. I
mentioned before that the phenomenon of scaling may lie at the heart of
our appreciation of an object as natural. Musical masterpieces sometimes
seem to have the quality of always having existed, as though they too
are natural objects. Perhaps one reason is that the composer has set in
motion a structure that is scaled multidimensionally. 

Any analysis of how that is done, despite much effort by intelligent and
sensitive minds, is a long way from completion, nor do I presume to
have added to it here. My only thesis is that composers work with the
abilities of performers as their raw material, and that as performers we
add to or subtract from their genius. This is a small attempt to suggest
one way in which we may begin to do better in our endeavour to achieve
the former rather than the latter.
For more details on Pay's theory of phrasing: http://www.sneezy.org/clarinet/Study/Phrasing.html