Cyber & IT Supervisory Forum - Additional Resources

Measure 2.5: The AI system to be deployed is demonstrated to be valid and reliable. Limitations of generalizability beyond the conditions under which the technology was developed are documented.

About

An AI system that is not validated, or that fails validation, may be inaccurate or unreliable, or may generalize poorly to data and settings beyond its training, creating or increasing AI risks and reducing trustworthiness. AI Actors can improve system validity by creating processes for exploring and documenting system limitations. This includes broad consideration of purposes and uses for which the system was not designed.

Validation risks include the use of proxies or other indicators, often constructed by AI development teams to operationalize phenomena that are not directly observable or measurable (e.g., fairness, hireability, honesty, propensity to commit a crime). Teams can mitigate these risks by demonstrating that the indicator measures the concept it claims to measure (also known as construct validity). Without this and other types of validation, various negative properties or impacts may go undetected, including the presence of confounding variables, potential spurious correlations, or error propagation and its potential impact on other interconnected systems.

Suggested Actions

- Define the operating conditions and socio-technical context under which the AI system will be validated.
- Define and document processes to establish the system's operational conditions and limits.
- Establish or identify, and document, approaches to measure forms of validity, including:
  - construct validity (the test measures the concept it claims to measure)
  - internal validity (the relationship being tested is not influenced by other factors or variables)
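One way to operationalize the construct-validity check described above is to correlate a proxy indicator against an independent reference measurement of the same concept, and to document a minimum acceptable correlation before deployment. The sketch below is illustrative only: the function names, the Pearson-correlation choice, and the 0.7 threshold are assumptions for demonstration, not a prescribed method from this measure.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation between two equal-length samples."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

def construct_validity_check(proxy_scores, reference_scores, threshold=0.7):
    """Flag a proxy indicator whose correlation with an independent
    reference measurement falls below a documented threshold.

    `threshold` is a hypothetical acceptance criterion that a team
    would set and justify in its validation documentation.
    """
    r = pearson_r(proxy_scores, reference_scores)
    return {"correlation": r, "valid": abs(r) >= threshold}

# Illustrative usage with synthetic data: a proxy that tracks the
# reference measurement passes; an unrelated proxy does not.
import random
random.seed(0)
reference = [random.gauss(0, 1) for _ in range(200)]
good_proxy = [v + random.gauss(0, 0.3) for v in reference]
bad_proxy = [random.gauss(0, 1) for _ in range(200)]

print(construct_validity_check(good_proxy, reference)["valid"])  # True
print(construct_validity_check(bad_proxy, reference)["valid"])   # False
```

A single correlation is not sufficient evidence of construct validity on its own; teams would typically combine it with the other checks this measure names, such as probing for confounding variables and testing behavior outside the development conditions.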

