Cyber & IT Supervisory Forum - Additional Resources
external validity (results are generalizable beyond the training condition) through the use of experimental design principles and statistical analyses and modeling.
- Assess and document system variance. Standard approaches include confidence intervals, standard deviation, standard error, bootstrapping, and cross-validation.
- Establish or identify, and document, robustness measures.
- Establish or identify, and document, reliability measures.
- Establish practices to specify and document the assumptions underlying measurement models, to ensure proxies accurately reflect the concept being measured.
- Utilize standard software testing approaches (e.g., unit, integration, functional, and chaos testing; computer-generated test cases).
- Utilize standard statistical methods to test bias, inferential associations, correlation, and covariance in adopted measurement models.
- Utilize standard statistical methods to test variance and reliability of system outcomes.
- Monitor operating conditions for system performance outside of defined limits.
- Identify TEVV approaches for exploring AI system limitations, including testing scenarios that differ from the operational environment. Consult experts with knowledge of the specific context of use.
- Define post-alert actions. Possible actions may include:
  - alerting other relevant AI actors before action,
  - requesting subsequent human review of the action,
  - alerting downstream users and stakeholders that the system is operating outside its defined validity limits,
  - tracking and mitigating possible error propagation,
  - action logging.
- Log input data and relevant system configuration information whenever there is an attempt to use the system beyond its well-defined range of validity.
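The variance-assessment action above can be illustrated with a short sketch. This is a minimal, hypothetical example (the metric values and function names are illustrative, not from the source) of a percentile bootstrap confidence interval for a performance metric, one of the standard approaches the text names:

```python
import random
import statistics

def bootstrap_ci(data, stat=statistics.mean, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for a sample statistic.

    Resamples the data with replacement, computes the statistic on each
    resample, and returns the empirical (alpha/2, 1 - alpha/2) quantiles.
    """
    rng = random.Random(seed)
    n = len(data)
    estimates = sorted(
        stat([data[rng.randrange(n)] for _ in range(n)])
        for _ in range(n_resamples)
    )
    lo = estimates[int((alpha / 2) * n_resamples)]
    hi = estimates[int((1 - alpha / 2) * n_resamples) - 1]
    return lo, hi

# Hypothetical accuracy scores from repeated evaluation runs of a model.
accuracies = [0.91, 0.89, 0.93, 0.90, 0.92, 0.88, 0.94, 0.90]
low, high = bootstrap_ci(accuracies)
```

Documenting the resulting interval alongside the point estimate gives later reviewers a concrete record of system variance rather than a single number.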
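The final action, logging input data and system configuration on any attempt to use the system beyond its validity range, can be sketched as follows. This is an illustrative example under assumed conventions: the validity limit, feature, and function names are hypothetical, and a real deployment would define limits per input and route alerts to the relevant AI actors:

```python
import json
import logging
from datetime import datetime, timezone

logging.basicConfig(level=logging.WARNING)
log = logging.getLogger("validity-monitor")

# Hypothetical validity range for a single numeric input feature.
VALID_RANGE = (0.0, 100.0)

def check_and_log(value, system_config):
    """Return True if the input falls inside the defined validity range.

    Otherwise, log the input data and relevant system configuration as a
    structured record (supporting later review and error-propagation
    tracking) and return False so callers can trigger post-alert actions.
    """
    low, high = VALID_RANGE
    if low <= value <= high:
        return True
    record = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "event": "out_of_validity_range",
        "input": value,
        "limits": [low, high],
        "config": system_config,
    }
    log.warning(json.dumps(record))
    return False
```

A caller that receives `False` would then perform the documented post-alert actions, such as requesting human review or notifying downstream users.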