answer is that while a FSG is limited to classifying strings according to their role as prefixes (i.e. according to what can come after them), a CFG is able to classify strings according to their role as infixes (i.e. according to what can come around them). The CFG in (2) does not work by specifying what can come after aa, and what can come after aaa, etc.; as we have seen, any finite device that adopts this strategy is doomed. Instead, this grammar works by specifying what can appear around certain strings, and the crucial point is that what can appear around, say, aabb, is the same as what can appear around aaabbb – so while the pattern does not create any interesting equivalences between prefixes, it does create interesting equivalences between infixes. The grammar in (2) takes advantage of this by making a two‐way distinction between (i) strings of the form , which can be combined with any surroundings of the form , and (ii) all other strings. It uses the nonterminal symbol S as a book‐keeping symbol to identify strings belonging to the first class, just as FSGs use states.
To see these concepts in a more familiar form, consider the CFG in (4). A few example derivations are illustrated with tree diagrams in (5). The stringset that this grammar generates also cannot be generated by any FSG: among strings of the form , only those where there is exactly one occurrence of chase for each occurrence of dogs after the first one will be generated (i.e. those of the form ), so there are unboundedly many nonequivalent prefixes of the form . The early rejection of FSGs as models of natural language was based on the claim that a grammar of English would need to generate patterns of essentially this sort (Chomsky, 1956, §2.3).
Subsets of the set of nonterminals characterize classes of strings, grouped according to the ways they can appear as infixes. I will call these sets inside sets: for example, the inside set of the string chase big dogs is , and the inside set of chased dogs is ; see Table 5.5. This partitioning of strings is the CFG's analog of an FSG's partitioning of strings by forward sets, and corresponds to the familiar notion of categorized constituents. A typical motivation for writing a grammar where a single nonterminal can derive, say, both sleep and chase dogs, is to express the idea that they are intersubstitutable infixes, i.e. for any two strings and , if is generated then is also generated. This is the sense in which CFGs carry over earlier ideas from immediate constituent analysis (e.g. Harris, 1946, 1954; Wells, 1947). For more recent discussions of this important notion, see Clark (2013), Clark and Yoshinaka (2016) and Stabler (2019, Section 2).
Table 5.5 Equivalence classes induced by the grammar in (4)
Inside set
Example strings from the corresponding equivalence class