Language Vulnerabilities base document

Collated by: Brian Wichmann

1 Foreword
2 Scope
    2.1 In Scope
    2.2 Not in scope
    2.3 Cautious approach
3 Vulnerability issues
    3.1 Human factors
    3.2 Predictable execution
        3.2.1 Language definition
        3.2.2 Precision of Language
    3.3 Portability
4 Guideline selection process
    4.1 Cost/benefit analysis
    4.2 Documenting of the selection process
5 Conformance
    5.1 Claiming conformance to requirements in this document
    5.2 Claiming conformance to a language specific guideline document
    5.3 Deviations
        5.3.1 Deviation approval requirements
6 Producing language specific guidelines
7 Evaluating a guidelines document
    7.1 Objectives of guideline document
    7.2 Coverage of guidelines relative to objectives
A Bibliography
B Factors that need to be covered in a proposed guideline recommendation
        B.0.1 Expected cost of following a guideline
        B.0.2 Expected benefit from following a guideline
    B.1 Language definition
    B.2 Measurements of language usage
    B.3 Level of expertise
    B.4 Intended purpose of guidelines
    B.5 Constructs whose behavior can vary
    B.6 Example guideline proposal template
        B.6.1 Coding Guideline
C Language-specific guidelines
    C.1 Guidelines document for language XYZ
D Recommendations considered and rejected
E Glossary
F Document details

(Please email comments/corrections to me.)

This version has headings and structure following the ISO requirements. However, it is not a draft standards document.

Title of proposed Technical Report: Guidance to Avoiding Vulnerabilities in Programming Languages through Language Selection and Use.

1 Foreword

2 Scope

2.1 In Scope

Applicable to any computer programming language.

Applicable to software written, reviewed and maintained for any application.

Applicable in any context where assured behaviour is required, e.g. security, safety, mission/business criticality etc.

2.2 Not in scope

This technical report does not address software engineering and management issues such as how to design and implement programs, the use of configuration management, managerial processes, etc.

The specification of the application is not within the scope.

2.3 Cautious approach

The impact of the guidelines in this technical report is likely to be highly leveraged, in that they are likely to affect many times more people than the number that worked on them. This leverage means that these guidelines have the potential to make large savings for a small cost, or to generate large unnecessary costs for little benefit.

Some of the reasons why a guideline might generate unnecessary costs include:

Little hard information is available on which guideline recommendations might be cost effective.
It is likely to be difficult to withdraw a guideline recommendation once it has been published.
Premature creation of a guideline recommendation can result in:
- Unnecessary enforcement costs (i.e., if a given recommendation is later found to be not worthwhile).
- Potentially unnecessary program development costs through having to specify and use alternative constructs during software development.
- A reduction in developer confidence in the worth of these guidelines.

For these reasons, this technical report has taken a cautious approach to creating guideline recommendations. New guideline recommendations can be added over time, as practical experience and experimental evidence are accumulated.

References

The following references are the primary ones, which are likely to be in the final version of the guidelines. See the bibliography for working references which may not be in the final Technical Report.

[2]: ISO/IEC TR 15942:2000, "Information technology - Programming languages - Guide for the use of the Ada programming language in high integrity systems"
[3]: Motor Industry Software Reliability Association. Guidelines for the Use of the C Language in Vehicle Based Software, 2004 (second edition). NB: the first edition should not be used/quoted in this work.
[4]: Joint Fighter Air Vehicle: C++ Coding Standards for the System Development and Demonstration Program. Lockheed Martin Corporation. December 2005.
[5]: J Barnes. High Integrity Software - the SPARK Approach to Safety and Security. Addison-Wesley. 2002.
[6]: ISO/IEC 15291:1999, Information technology - Programming languages - Ada Semantic Interface Specification (ASIS)
[7]: Ada 95 Quality and Style Guide: Guidelines for Professional Programmers. http://www.adaic.com/docs/95style/html/cover.html
[8]: Software Considerations in Airborne Systems and Equipment Certification. Issued in the USA by the Requirements and Technical Concepts for Aviation (document RTCA SC167/DO-178B) and in Europe by the European Organization for Civil Aviation Electronics (EUROCAE document ED-12B). December 1992.
[9]: IEC 61508: Parts 1-7, Functional safety: safety-related systems. 1998. (Part 3 is concerned with software).
[10]: ISO/IEC 15408: 1999 Information technology. Security techniques. Evaluation criteria for IT security.

3 Vulnerability issues

For the definition of vulnerability, see Annex E.

Vulnerabilities might be targeted by external threats such as worms and viruses, or might be faults that can occur during the expected normal execution of the software.

The economic impact of a vulnerability will depend on how it changes the behavior of a program and on the real world events that are affected by that program. For instance, the impact of an uninitialised variable can range from the failure of a coffee machine to deliver hot water to people dying in an aircraft accident.

The following subsections cover some of the sources of vulnerabilities.

3.1 Human factors

Possible human factors include the following:

Cognitive failure: external pressures on readers and writers result in them failing to invest the time and effort needed to fully comprehend the code,
Knowledge failure:
- people reading source code having incomplete or incorrect knowledge of the appropriate language semantics,
- people reading source code having incomplete or incorrect knowledge of how it will be executed by a particular implementation,
- people reading source code having incomplete or incorrect knowledge of the interaction between its various components,
competence...

3.2 Predictable execution

Given sufficient time and information (including the behavior of a particular translator) the behavior of a program can always be predicted. In practice, sufficient time and information are rarely available to perform the analysis needed to correctly predict the complete behavior of a program. These practical issues include the following:

It is intended that this technical report provide guidelines that will enable a greater level of predictability to be achieved for the same level of investment of time and money. The following are some of the mechanisms used to achieve this goal:

reducing the amount of cognitive effort that needs to be invested by readers of the source code,
reducing the amount of knowledge needed by readers of the source code,
reducing the probability that incorrect developer knowledge will result in incorrect prediction of behavior,
recommending against the use of constructs that are costly or impractical to check automatically using tools,
recommending against the use of constructs that are costly or impractical to check during testing,
suggesting annotations which provide information against which additional consistency checks can be made,
creating a widely adopted set of guidelines that makes it economically worthwhile to produce checking tools, which in turn reduce the cost of achieving a desired level of confidence in predicted program behavior.

Verifying that the predicted behavior of a program is as intended (i.e., that it meets its specification) is outside the scope of this technical report.

3.2.1 Language definition

Languages frequently support constructs whose behavior is undefined, implementation defined, or unspecified. If the output from a program has a dependency on these constructs having a particular behavior, then the people and tools that read the code need to be aware of, and take account of, this particular behavior. In some cases the undefined and unspecified behaviors are likely to change frequently, and it can be costly and time consuming to continually track these changes and the impact they have on overall program behavior.

Those language constructs that are undefined, implementation defined, or unspecified need to be documented, and the cost effectiveness of recommending against their use needs to be evaluated.
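
As an illustration of documenting such constructs, the following is a minimal sketch in Python (chosen here only as an example language; this technical report does not prescribe one). The function name implementation_report and the choice of characteristics are assumptions made for this example. It records a few values that are left to the implementation or platform, so that readers and tools can see what a program's output might depend on.

```python
import struct
import sys


def implementation_report():
    """Collect characteristics that may differ between implementations or platforms."""
    return {
        "implementation": sys.implementation.name,   # name of the translator in use
        "byte_order": sys.byteorder,                 # "little" or "big"
        "native_long_bytes": struct.calcsize("l"),   # size of a native C long
        "max_ssize": sys.maxsize,                    # largest Py_ssize_t value
        "float_decimal_digits": sys.float_info.dig,  # decimal digits a float preserves
    }


if __name__ == "__main__":
    for name, value in implementation_report().items():
        print(f"{name}: {value}")
```

A report of this kind could be kept alongside the build records, so that a change of translator or platform that alters one of these values is visible when the program's behavior is reviewed.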

3.2.2 Precision of Language

Some key aspects are:

A requirement for the translator to reject programs which are statically incorrect.
A requirement to apply dynamic checks to ensure predictable execution.
A requirement to document all cases in which the execution of a program is unpredictable.
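
The first two aspects can be sketched in Python, used here only as an example language. The function names is_statically_accepted and element_or_default are hypothetical. The sketch shows a translator rejecting malformed source text before execution, and a run-time range check turning an out-of-range access into a predictable outcome.

```python
def is_statically_accepted(source):
    """Return True if the translator accepts the source text, False if it rejects it."""
    try:
        compile(source, "<example>", "exec")  # translation only; nothing is executed
    except SyntaxError:
        return False
    return True


def element_or_default(values, index, default):
    """Return values[index], or default when the run-time range check fails."""
    try:
        return values[index]
    except IndexError:  # raised by the language's dynamic check
        return default


if __name__ == "__main__":
    print(is_statically_accepted("x = 1"))
    print(is_statically_accepted("x = = 1"))
    print(element_or_default([10, 20, 30], 7, -1))
```

Note that in this particular language a negative index counts from the end of the sequence rather than failing the check, which is itself an example of behavior a reader needs to know in order to predict the result.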

3.3 Portability

Portability can refer to people or to tools. The skills people learn on one platform are likely to be the ones they apply, at least initially, to a different platform. The behavior of source code can change when it is built using different language translators and libraries (generating code for the same/different processor or same/different operating system).

Restricting the use of language constructs to those whose behavior does not vary between different translators and libraries increases the likelihood that a program's behavior will not change across platforms and that different people will correctly predict this behavior.
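
A minimal sketch of such a restriction, written in Python purely as an example language, is given below. The function names are hypothetical. The first function depends on the byte order of the platform on which it runs; the second uses an explicitly specified layout, so its result is the same on every platform.

```python
import struct


def encode_u32_native(value):
    """Encode a 32-bit unsigned value in the platform's native byte order."""
    return struct.pack("=I", value)  # "=" means native order, standard size


def encode_u32_portable(value):
    """Encode a 32-bit unsigned value in an explicitly little-endian layout."""
    return struct.pack("<I", value)  # "<" means little-endian on every platform


if __name__ == "__main__":
    print(encode_u32_native(1))    # varies with the platform's byte order
    print(encode_u32_portable(1))  # does not vary
```

A guideline recommending the second form removes the need for a reader to know which platform the code will run on in order to predict the bytes produced.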

4 Guideline selection process

It is possible to claim that any language construct can be misunderstood by a developer and lead to a failure to predict program behavior. A cost/benefit analysis of each proposed guideline is the solution adopted by this technical report.

The selection process has been based on evidence that the use of a language construct leads to unpredictable behavior (i.e., a cost) and that the proposed guideline increases the likelihood of a correct prediction of behavior (i.e., a benefit). The following is a list of the major sources of evidence on the use of a language construct and the faults resulting from that use:

a list of language constructs having undefined, implementation defined, or unspecified behaviors,
measurements of existing source code. This usage information has included the number of occurrences of uses of the construct and the contexts in which it occurs,
measurement of faults experienced in existing code,
measurements of developer knowledge and performance behavior.
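
As an illustration of the second source of evidence, measurements of existing source code, the following is a minimal sketch in Python (an example language only). It counts how often each kind of construct occurs in a piece of source text. The function name count_constructs and the sample text are assumptions made for this example; a real measurement would cover a representative body of code and would also record the contexts in which each construct occurs.

```python
import ast
from collections import Counter


def count_constructs(source):
    """Count how often each kind of syntax node occurs in the given source text."""
    tree = ast.parse(source)
    return Counter(type(node).__name__ for node in ast.walk(tree))


# Hypothetical sample standing in for measured source code.
SAMPLE = """
def f(items):
    total = 0
    for item in items:
        if item > 0:
            total += item
    return total
"""

if __name__ == "__main__":
    counts = count_constructs(SAMPLE)
    for construct in ("For", "If", "AugAssign", "While"):
        print(construct, counts[construct])
```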

The following are some of the issues that were considered when framing guidelines:

An attempt was made to be generic to particular kinds of language constructs (i.e., language independent), rather than being language specific.
Preference was given to wording that is capable of being checked by automated tools.
Account was taken of known algorithms for performing various kinds of source code analysis and of the properties of those algorithms (i.e., their complexity and running time).

4.1 Cost/benefit analysis

The fact that a coding construct is known to be a source of failure to predict correct behavior is not in itself a reason to recommend against its use. Unless the desired algorithmic functionality can be implemented using an alternative construct whose use has more predictable behavior, there is no benefit in recommending against the use of the original construct.

While the cost/benefit of some guidelines may always come down in favor of them being adhered to (e.g., don't access a variable before it is given a value), the situation may be less clear cut for other guidelines. Providing a summary of the background analysis for each guideline will enable development groups...

Annex B (see B.6) provides a template for the information that should be supplied with each guideline....

It is unlikely that all of the guidelines given in this technical report will be applicable to all application domains. Different development projects may ... is likely to have its own requirements.

4.2 Documenting of the selection process

The intended purpose of this documentation is to enable third parties to evaluate:

the effectiveness of the process that created each guideline,
the applicability of individual guidelines to a particular project.

5 Conformance

5.1 Claiming conformance to requirements in this document

Examples of methods that might be used, by the authors of a language specific guidelines document, to build a claim of conformance to the requirements given in this technical report.

5.2 Claiming conformance to a language specific guideline document

List of conformance issues that authors of language specific guidelines documents need to consider.

Possible methods a user of a language specific guidelines document might use to build a claim of conformance to that document.

5.3 Deviations

While the cost/benefit analysis for a particular guideline may have come down in favor of it being generally adhered to, it is possible that there are situations where the cost is significantly greater than the benefit. The mechanism for handling these situations is to allow a deviation from a guideline to be made in a particular situation.

5.3.1 Deviation approval requirements

One of the most important requirements when a deviation is made is that the reasons for the deviation be documented. This documentation should include at least the following information:

List of alternative constructs considered.
Summary of rationales for each kind of deviation made.
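
One way such documentation might be captured so that tools can check it for completeness is sketched below in Python (an example language only). The class name DeviationRecord and the fields guideline and location are assumptions made for this example; only the list of alternatives and the rationale come from the items above.

```python
from dataclasses import dataclass, field


@dataclass
class DeviationRecord:
    """Hypothetical record of one deviation from a guideline."""
    guideline: str   # identifier of the guideline deviated from (assumed field)
    location: str    # where in the source the deviation occurs (assumed field)
    rationale: str   # summary of the reason for the deviation
    alternatives_considered: list = field(default_factory=list)

    def missing_information(self):
        """Return the names of required items that have not been supplied."""
        missing = []
        if not self.alternatives_considered:
            missing.append("alternatives_considered")
        if not self.rationale.strip():
            missing.append("rationale")
        return missing
```

A deviation would then be approved only when missing_information() returns an empty list, for example:

```python
record = DeviationRecord("G-1", "parser module", "No alternative meets timing needs",
                         ["lookup table"])
print(record.missing_information())
```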

6 Producing language specific guidelines

A specification of the requirements that must be met when adapting either a set of generic guidelines to a particular language or specialising language specific guidelines to meet the requirements of a particular project.

Issues to consider.

Template or boiler-plate wording that can be used, for instance:

ISO standard + vendor's specification of extensions
List of implementations of language taken into consideration when writing guidelines, including any specific versions of an implementation.
Amount of knowledge expected of software developers who are writing and reading source code...

7 Evaluating a guidelines document

Discussion of the requirements a third party might consider when evaluating the merits of a particular set of coding guidelines.

7.1 Objectives of guideline document

Assessing what the specified objectives of a guidelines document are and how they have been addressed.

7.2 Coverage of guidelines relative to objectives

Assessing the extent to which the stated objectives of a guidelines document have been met.

Evaluating the evidence that all applicable issues have been considered.

A Bibliography

ISO 9126. Information technology - Software evaluation - Quality characteristics and guidelines for their use. (Agreed to use this as a check list, rather than as a key standard.)
B A Wichmann. Predictable execution. Working paper. Dated March 2006.
The Certification of Systems containing Software developed using RTCA DO-178B. ASSC. Draft Issue 2. March 2006.
Mitre's Common Weakness Enumeration, CWE - http://www.cve.mitre.org/cwe/about/index.html
Shepperd, M. J., A Critique of Cyclomatic Complexity as a Software Metric, in Shepperd, M., Software Engineering Metrics 1: Measures and Validations, McGraw-Hill, 1993, ISBN: 0-07-707410-6

B Factors that need to be covered in a proposed guideline recommendation

These are needed because circumstances might change, for instance:

Changes to language definition.
Changes to translator behavior.
Developer training.
More effective recommendation discovered.

B.0.1 Expected cost of following a guideline

How to evaluate likely costs.

B.0.2 Expected benefit from following a guideline

How to evaluate likely benefits.

B.1 Language definition

Which one to use. For instance, an ISO standard, an industry standard, or a particular implementation.

Position on use of extensions.

B.2 Measurements of language usage

Occurrences of applicable language constructs in software written for the target market.

How often do the constructs addressed by each guideline recommendation occur?

B.3 Level of expertise

How much expertise, and in what areas, are the people using the language assumed to have?

Is use of the alternative constructs less likely to result in faults?

B.4 Intended purpose of guidelines

For instance: How the listed guidelines cover the requirements specified in a safety related standard.

B.5 Constructs whose behavior can vary

The different ways in which language definitions specify behavior that is allowed to vary between implementations and how to go about documenting these cases.

B.6 Example guideline proposal template

B.6.1 Coding Guideline

Anticipated benefit of adhering to guideline

Cost of moving to a new translator reduced.
Probability of a fault introduced when new version of translator used reduced.
Probability of developer making a mistake is reduced.
Developer mistakes more likely to be detected during development.
Reduction of future maintenance costs.

C Language-specific guidelines

C.1 Guidelines document for language XYZ

An actual guidelines document for XYZ (enough people to produce a C document?).

D Recommendations considered and rejected

Issues considered for inclusion in this document but rejected through lack of evidence of a worthwhile benefit. Where to find this information.

E Glossary

Vulnerability (in a programming language): Add a definition of this term here.

F Document details

Merged comments received from vulnerability panel members after February meeting. First written 1st April 2006.
Substantially revised, 16th May 2006. References split between formal references and bibliography - the bibliographic ones listed here are not for consideration as references, but merely as a source of information. Sections put in order likely for a standard.
New document produced from DJ's OWGbase 1.2 (which reflected discussion of Vulnerabilities panel meeting of June 6), 13th June.
Revised to reflect comments of DJ, 14th June 2006.
Revised to reflect August panel discussion, 28th August 2006.