In the technology of harvesting, there is a continuum from a totally manual review of code at the low end to a hoped-for future tool in which all code in an application will be automatically converted to UML, ready for generating a new application. The current project proposes to create a tool which lies between these extremes by using a rule-based structure to assist in determining what is wheat and what is chaff within any given body of code, so that the analyst can review the probable wheat and extract it for re-use as appropriate.
It is expected that this goal will be approached in several stages, starting with fairly simple rules which depend only on the code being examined, and later extending to rules which interact with previously extracted models. In the initial stages, the concept is that we will “gray out” and possibly collapse sections of code which the rules have evaluated as chaff, so that the analyst can more easily review and evaluate the code that remains. If the analyst judges some of that remaining code to be chaff as well, then the analyst should be able to mark it and collapse it further in order to focus on what is left. In later versions, more sophisticated rules will interact with previously harvested code to determine, for example, whether a particular fragment has already been harvested in another context. This matters because some code fragments, e.g., validation rules, are likely to occur repeatedly in legacy code.
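The first stage described above, rules that look only at the code being examined and collapse what they classify as chaff, might be sketched as follows. This is a minimal illustration and not the proposed tool: the `Rule` type, the three sample rules, and the `classify` and `collapse` functions are all hypothetical names, and the patterns used to recognize chaff are assumptions chosen only to make the example concrete.

```python
import re
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass
class Rule:
    """A hypothetical chaff rule: a name plus a test applied to one line."""
    name: str
    is_chaff: Callable[[str], bool]


# Illustrative rules only (assumptions, not the project's actual rule set).
RULES: List[Rule] = [
    Rule("blank line", lambda line: line.strip() == ""),
    Rule("single-line comment",
         lambda line: re.fullmatch(r"\s*/\*.*\*/\s*", line) is not None),
    Rule("screen layout",
         lambda line: re.match(r"\s*(FORM|DISPLAY|PAUSE)\b", line,
                               re.IGNORECASE) is not None),
]


def classify(lines: List[str]) -> List[Tuple[str, Optional[str]]]:
    """Pair each line with the name of the first rule that calls it chaff.

    A rule name of None means no rule fired, i.e. the line is probable wheat.
    """
    result = []
    for line in lines:
        matched = next((r.name for r in RULES if r.is_chaff(line)), None)
        result.append((line, matched))
    return result


def collapse(classified: List[Tuple[str, Optional[str]]]) -> List[str]:
    """Replace each run of consecutive chaff lines with a one-line placeholder."""
    view: List[str] = []
    run = 0
    for line, rule in classified:
        if rule is not None:
            run += 1
            continue
        if run:
            view.append(f"... {run} chaff line(s) collapsed ...")
            run = 0
        view.append(line)
    if run:
        view.append(f"... {run} chaff line(s) collapsed ...")
    return view
```

Because each line keeps the name of the rule that fired, an editor built along these lines could show the analyst why a section was grayed out, and a manual "mark as chaff" action could be modeled as one more rule applied to the lines the analyst selects.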
In initial versions, we expect that the actual harvesting will simply consist of cutting and pasting from the editor into whatever vehicle is used to store prospective logic fragments. In later versions, we hope to have a more automated process which will transfer selected code directly to a new form, either as a separate code unit or as a component of a model.
While it would be desirable to support all forms of harvesting, including cutting and pasting to new .i and .p files, there is a special attraction in supporting harvesting to UML because, in addition to the potential of UML modeling itself, it would allow predictable relationships between harvested code and components of the UML model. For example, a field validation would normally be stored as a constraint on an object property, so a field validation found in the code could be checked against that object property to determine whether the constraint had already been harvested. To do the same with .i and .p files would require a very artificial naming convention.
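The check described above, looking up a field validation against the constraints already attached to the corresponding object property, might be sketched as follows. This is only an illustration under stated assumptions: the `Model` class, its methods, and the class and property names in the usage are hypothetical, and comparing constraints by normalized text is a simplification, since a real tool would likely need some notion of semantic equivalence.

```python
from dataclasses import dataclass, field
from typing import Dict, Set, Tuple


def normalize(expr: str) -> str:
    """Lowercase and collapse whitespace so trivial spelling differences match."""
    return " ".join(expr.lower().split())


@dataclass
class Model:
    """Hypothetical stand-in for a UML model: constraints keyed by property."""
    # (class name, property name) -> set of normalized constraint expressions
    constraints: Dict[Tuple[str, str], Set[str]] = field(default_factory=dict)

    def add_constraint(self, cls: str, prop: str, expr: str) -> None:
        """Record a harvested validation as a constraint on cls.prop."""
        self.constraints.setdefault((cls, prop), set()).add(normalize(expr))

    def already_harvested(self, cls: str, prop: str, expr: str) -> bool:
        """Return True if an equivalent (by text) constraint is on cls.prop."""
        return normalize(expr) in self.constraints.get((cls, prop), set())


# Usage: harvest a validation once, then recognize it in another program.
model = Model()
model.add_constraint("Customer", "creditLimit", "credit-limit >= 0")
seen_before = model.already_harvested("Customer", "creditLimit",
                                      "Credit-Limit  >=  0")
```

The lookup key is what the model supplies for free: the property itself identifies where to look, which is the predictable relationship that .i and .p files could only imitate through a naming convention.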