Theory evaluation can be approached from both empirical and analytical directions. Empirical evaluation uses experience with the theory to discover its strength. Analytical evaluation examines the formal structure of the theory and draws an assessment of the theory from it.
The first step in theory evaluation is to identify theories that explain or predict the behaviour of interest in a natural system. If no such theory exists, one must be developed. Once a set of theories has been identified for the problem, the criteria of prediction, assumptions, consilience and simplicity can be used to select the most suitable one. This is a complicated evaluation, because most of the criteria conflict with one another; a theory can never score best in all categories.
One example is unambiguous encoding versus consilience: concrete entities are easier to measure than abstract entities, but a theory expressed in concrete entities is for that reason more specific, and so less consilient.
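The trade-off between criteria can be sketched as a search for theories that are not outperformed on every criterion at once. The following is a minimal illustration only: the theory names T1 to T3 and all scores are hypothetical, and scoring each criterion on a scale from 0 to 1 is an assumption made for the example, not something the criteria themselves prescribe.

```python
from typing import Dict, List

# Criteria named in the text; the numeric scale is an assumption.
CRITERIA = ("prediction", "assumptions", "consilience", "simplicity")

# Hypothetical scores in [0, 1], higher is better. Illustrative only.
theories: Dict[str, Dict[str, float]] = {
    "T1": {"prediction": 0.9, "assumptions": 0.4, "consilience": 0.3, "simplicity": 0.6},
    "T2": {"prediction": 0.6, "assumptions": 0.7, "consilience": 0.8, "simplicity": 0.5},
    "T3": {"prediction": 0.5, "assumptions": 0.6, "consilience": 0.7, "simplicity": 0.4},
}


def dominates(a: Dict[str, float], b: Dict[str, float]) -> bool:
    """True if a is at least as good as b on every criterion and better on one."""
    return all(a[c] >= b[c] for c in CRITERIA) and any(a[c] > b[c] for c in CRITERIA)


def non_dominated(candidates: Dict[str, Dict[str, float]]) -> List[str]:
    """Names of the theories that no other candidate dominates."""
    return sorted(
        name
        for name, scores in candidates.items()
        if not any(
            dominates(other, scores)
            for other_name, other in candidates.items()
            if other_name != name
        )
    )


front = non_dominated(theories)
print(front)
```

With these made-up scores T3 drops out because T2 is at least as good on every criterion, while T1 and T2 both remain: each is better than the other on some criterion, so the final choice between them still depends on how the criteria are weighed for the problem at hand.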
Analytical evaluation involves checking the a priori structural correctness of syntactical operations on statements of the theory. The checks are a priori because the statements concerned are true or false as a matter of logic alone. The more that is known about the structure of the theory, the more can be expressed as logical statements and checked analytically. Consistency and completeness are examples of structural criteria.
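As a small illustration of a structural check, consistency of a propositional theory can be tested by searching for a truth assignment that satisfies all of its statements. This is a sketch under strong assumptions: the theory is reduced to a handful of propositional statements, the variables `rain` and `wet` are hypothetical, and brute-force enumeration is only feasible for very small theories.

```python
from itertools import product
from typing import Callable, Dict, List, Sequence

World = Dict[str, bool]
Statement = Callable[[World], bool]


def implies(p: bool, q: bool) -> bool:
    """Material implication: p -> q."""
    return (not p) or q


def is_consistent(statements: Sequence[Statement], variables: Sequence[str]) -> bool:
    """True if at least one truth assignment satisfies every statement."""
    for values in product([False, True], repeat=len(variables)):
        world = dict(zip(variables, values))
        if all(stmt(world) for stmt in statements):
            return True
    return False


variables = ["rain", "wet"]  # hypothetical propositions

consistent_theory: List[Statement] = [
    lambda w: implies(w["rain"], w["wet"]),  # law: rain implies wet
    lambda w: w["rain"],                     # observation: it rains
]
# Adding "not wet" contradicts what the first two statements entail.
inconsistent_theory: List[Statement] = consistent_theory + [lambda w: not w["wet"]]

print(is_consistent(consistent_theory, variables))
print(is_consistent(inconsistent_theory, variables))
```

The check needs no observation of the natural system: it depends only on the logical form of the statements, which is what makes it an analytical rather than an empirical criterion.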
Reasoning with universal laws represented by implications in predicate logic results in 'tautologies' that are valid if the axioms are valid. In this case the evaluation process can concentrate on empirical validation of the applied laws and axioms.
If the applied inference over implications is not strictly deductive, but applies inductive, abductive or analogical reasoning steps, then inferences from valid propositions will not necessarily lead to valid conclusions; the theory could be making incorrect predictions. In this situation, in addition to the validation steps required for the previous case, every abductive, inductive or analogical reasoning step needs to be checked a posteriori for its empirical validity.
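The difference between a deductive and an abductive step can be made concrete with a truth table. The sketch below compares modus ponens (from "p implies q" and p, conclude q) with the abductive pattern (from "p implies q" and q, conclude p); the helper names are illustrative and the example is restricted to two propositional variables.

```python
from itertools import product
from typing import Callable, Sequence

Formula = Callable[[bool, bool], bool]


def implies(p: bool, q: bool) -> bool:
    """Material implication: p -> q."""
    return (not p) or q


def is_valid(premises: Sequence[Formula], conclusion: Formula) -> bool:
    """An inference form is valid if no assignment makes all premises
    true while the conclusion is false."""
    for p, q in product([False, True], repeat=2):
        if all(premise(p, q) for premise in premises) and not conclusion(p, q):
            return False
    return True


# Deduction (modus ponens): p -> q, p, therefore q.
modus_ponens_valid = is_valid(
    [lambda p, q: implies(p, q), lambda p, q: p],
    lambda p, q: q,
)

# Abduction: p -> q, q, therefore p.
abduction_valid = is_valid(
    [lambda p, q: implies(p, q), lambda p, q: q],
    lambda p, q: p,
)

print(modus_ponens_valid)  # True
print(abduction_valid)     # False
```

The abductive pattern fails on the assignment where p is false and q is true: both premises hold, yet the conclusion does not. That is why such a step cannot be certified by logic alone and has to be checked against experience.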
Statistical laws come with a probability for a defined type of problem set. Deductive inference over statistical laws will preserve this probability. Abductive, inductive and analogical reasoning with statistical laws leads to conclusions whose probability is not preserved. In this case the validity of the analogy should be checked empirically for every analogical reasoning step.
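One way to see why the probability is not preserved is to read a statistical law as a conditional probability P(E | H) and apply it in the reverse, abductive direction using Bayes' theorem. The numbers below are hypothetical and chosen only for illustration; modelling a statistical law as a single conditional probability is itself a simplifying assumption.

```python
def posterior(p_e_given_h: float, p_h: float, p_e_given_not_h: float) -> float:
    """P(H | E) by Bayes' theorem.

    p_e_given_h     -- the statistical law, P(E | H)
    p_h             -- prior probability of H (base rate)
    p_e_given_not_h -- probability of E when H does not hold
    """
    evidence = p_e_given_h * p_h + p_e_given_not_h * (1.0 - p_h)
    return p_e_given_h * p_h / evidence


LAW = 0.9  # hypothetical statistical law: P(E | H) = 0.9

# Forward use: given H, E is predicted with the probability of the law itself.
forward = LAW

# Abductive use: given E, conclude H. The result depends on the base rate.
rare = posterior(LAW, p_h=0.01, p_e_given_not_h=0.1)
common = posterior(LAW, p_h=0.5, p_e_given_not_h=0.1)

print(f"forward: {forward:.3f}")
print(f"abductive, rare H: {rare:.3f}")
print(f"abductive, common H: {common:.3f}")
```

In the forward direction the conclusion carries the 0.9 of the law. In the abductive direction the same law yields a probability of about 0.08 when H is rare and about 0.9 when H and its negation are equally likely, so the figure attached to the law says little about the reverse inference without additional empirical information.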