qqubb
committed on
Update public_dataset.yaml
examples/compliant_project/public_dataset.yaml
CHANGED
@@ -64,135 +64,135 @@ high_risk_ai_system_requirements:
   data_and_data_governance_data_governance:
     article: 'Art. 10(1)-(2)'
     verbose: 'The dataset was subject to data governance and management practices appropriate to the intended use case'
-    value:
+    value: true
   data_and_data_governance_design_choices:
     article: 'Art. 10(2)(a)'
     verbose: 'The dataset has been subject to data governance and management practices as regards its relevant design choices'
-    value:
+    value: true
   data_and_data_governance_data_origin:
     article: 'Art. 10(2)(b)'
     verbose: 'The dataset has been subject to data governance and management practices as regards its data collection processes and the origin of data, and in the case of personal data, the original purpose of the data collection'
-    value:
+    value: true
   data_and_data_governance_data_preparation:
     article: 'Art. 10(2)(c)'
     verbose: 'The dataset has been subject to data governance and management practices as regards its data-preparation processing operations, such as annotation, labelling, cleaning, updating, enrichment and aggregation'
-    value:
+    value: true
   data_and_data_governance_data_assumptions:
     article: 'Art. 10(2)(d)'
     verbose: 'The dataset has been subject to data governance and management practices as regards its formulation of assumptions, in particular with respect to the information that the data are supposed to measure and represent'
-    value:
+    value: true
   data_and_data_governance_data_quantity:
     article: 'Art. 10(2)(e)'
     verbose: 'The dataset has been subject to data governance and management practices that include an assessment of the availability, quantity and suitability of the data sets that are needed'
-    value:
+    value: true
   data_and_data_governance_ata_bias_examination:
     article: 'Art. 10(2)(f)'
     verbose: 'The dataset has been subject to data governance and management practices that include an examination of possible biases that are likely to affect the health and safety of persons, have a negative impact on fundamental rights or lead to discrimination prohibited under Union law, especially where data outputs influence inputs for future operations'
-    value:
+    value: true
   data_and_data_governance_data_and_data_governance_data_bias_mitigation:
     article: 'Art. 10(2)(g)'
     verbose: 'The dataset has been subject to data governance and management practices that include appropriate measures to detect, prevent and mitigate possible biases that are likely to affect the health and safety of persons, have a negative impact on fundamental rights or lead to discrimination prohibited under Union law, especially where data outputs influence inputs for future operations'
-    value:
+    value: true
   data_and_data_governance_data_compliance:
     article: 'Art. 10(2)(h)'
     verbose: 'The dataset has been subject to data governance and management practices that include identification of relevant data gaps or shortcomings that prevent compliance with this Regulation, and how those gaps and shortcomings can be addressed'
-    value:
+    value: true
   data_and_data_governance_data_relevance:
     article: 'Art. 10(3); Rec. 67'
     verbose: 'Training data is relevant'
-    value:
+    value: true
   data_and_data_governance_data_representativity:
     article: 'Art. 10(3); Rec. 67'
     verbose: 'Training data is sufficiently representative'
-    value:
+    value: true
   data_and_data_governance_data_errors:
     article: 'Art. 10(3); Rec. 67'
     verbose: 'Training data is, to the best extent possible, free of errors'
-    value:
+    value: true
   data_and_data_governance_data_completeness:
     article: 'Art. 10(3); Rec. 67'
     verbose: 'Training data is complete in view of the intended purpose'
-    value:
+    value: true
   data_and_data_governance_statistical_properties:
     article: 'Art. 10(3)'
     verbose: 'Training data possesses the appropriate statistical properties, including, where applicable, as regards the people in relation to whom it is intended to be used'
-    value:
+    value: true
   data_and_data_governance_contextual:
     article: 'Art. 10(4)'
     verbose: 'Training data takes into account, to the extent required by the intended purpose, the characteristics or elements that are particular to the specific geographical, contextual, behavioural or functional setting within which it is intended to be used'
-    value:
+    value: true
   data_and_data_governance_personal_data_necessary:
     article: 'Art. 10(5)'
     verbose: 'Where special categories of personal data have been used to ensure the detection and correction of possible biases that are likely to affect the health and safety of persons, have a negative impact on fundamental rights or lead to discrimination prohibited under Union law, especially where data outputs influence inputs for future operations, the use of this data was strictly necessary'
-    value:
+    value: true
   data_and_data_governance_personal_data_safeguards:
     article: 'Art. 10(5)'
     verbose: 'Where special categories of personal data have been used to ensure the detection and correction of possible biases that are likely to affect the health and safety of persons, have a negative impact on fundamental rights or lead to discrimination prohibited under Union law, especially where data outputs influence inputs for future operations, the use complied with appropriate safeguards for the fundamental rights and freedoms of natural persons'
-    value:
+    value: true
   data_and_data_governance_personal_data_gdpr:
     article: 'Art. 10(5)'
     verbose: 'Where special categories of personal data have been used to ensure the detection and correction of possible biases that are likely to affect the health and safety of persons, have a negative impact on fundamental rights or lead to discrimination prohibited under Union law, especially where data outputs influence inputs for future operations, the use of this data satisfied the provisions set out in Regulations (EU) 2016/679 and (EU) 2018/1725 and Directive (EU) 2016/680'
-    value:
+    value: true
   data_and_data_governance_personal_data_other_options:
     article: 'Art. 10(5)(a)'
     verbose: 'Where special categories of personal data have been used to ensure the detection and correction of possible biases that are likely to affect the health and safety of persons, have a negative impact on fundamental rights or lead to discrimination prohibited under Union law, especially where data outputs influence inputs for future operations, the bias detection and correction was not effectively fulfilled by processing other data, including synthetic or anonymised data'
-    value:
+    value: true
   data_and_data_governance_personal_data_limitations:
     article: 'Art. 10(5)(b)'
     verbose: 'Where special categories of personal data have been used to ensure the detection and correction of possible biases that are likely to affect the health and safety of persons, have a negative impact on fundamental rights or lead to discrimination prohibited under Union law, especially where data outputs influence inputs for future operations, the special categories of personal data were not subject to technical limitations on the re-use of the personal data, and state-of-the-art security and privacy-preserving measures, including pseudonymisation'
-    value:
+    value: true
   data_and_data_governance_personal_data_controls:
     article: 'Art. 10(5)(c)'
     verbose: 'Where special categories of personal data have been used to ensure the detection and correction of possible biases that are likely to affect the health and safety of persons, have a negative impact on fundamental rights or lead to discrimination prohibited under Union law, especially where data outputs influence inputs for future operations, the special categories of personal data were subject to measures to ensure that the personal data processed are secured, protected, subject to suitable safeguards, including strict controls and documentation of the access, to avoid misuse and ensure that only authorised persons have access to those personal data with appropriate confidentiality obligations'
-    value:
+    value: true
   data_and_data_governance_personal_data_access:
     article: 'Art. 10(5)(d)'
     verbose: 'Where special categories of personal data have been used to ensure the detection and correction of possible biases that are likely to affect the health and safety of persons, have a negative impact on fundamental rights or lead to discrimination prohibited under Union law, especially where data outputs influence inputs for future operations, the special categories of personal data were not to be transmitted, transferred or otherwise accessed by other parties'
-    value:
+    value: true
   data_and_data_governance_personal_data_deletion:
     article: 'Art. 10(5)(e)'
     verbose: 'Where special categories of personal data have been used to ensure the detection and correction of possible biases that are likely to affect the health and safety of persons, have a negative impact on fundamental rights or lead to discrimination prohibited under Union law, especially where data outputs influence inputs for future operations, the special categories of personal data were deleted once the bias was corrected or the personal data reached the end of its retention period (whichever came first)'
-    value:
+    value: true
   data_and_data_governance_personal_data_necessary_105f:
     article: 'Art. 10(5)(f)'
     verbose: 'Where special categories of personal data have been used to ensure the detection and correction of possible biases that are likely to affect the health and safety of persons, have a negative impact on fundamental rights or lead to discrimination prohibited under Union law, especially where data outputs influence inputs for future operations, the records of processing activities pursuant to Regulations (EU) 2016/679 and (EU) 2018/1725 and Directive (EU) 2016/680 include the reasons why the processing of special categories of personal data was strictly necessary to detect and correct biases, and why that objective could not be achieved by processing other data'
-    value:
+    value: true
   technical_documentation_general_description:
     article: 'Art. 11; Annex IV(2)(d)'
     verbose: 'Dataset carries technical documention, such as a dataseet, including a general description of the dataset.'
-    value:
+    value: true
   technical_documentation_provenance:
     article: 'Art. 11; Annex IV(2)(d)'
     verbose: 'Dataset carries technical documention, such as a dataseet, including information about its provenance'
-    value:
+    value: true
   technical_documentation_scope:
     article: 'Art. 11; Annex IV(2)(d)'
     verbose: 'Dataset carries technical documention, such as a dataseet, including information about scope and main characteristics'
-    value:
+    value: true
   technical_documentation_origins:
     article: 'Art. 11; Annex IV(2)(d)'
     verbose: 'Dataset carries technical documention, such as a dataseet, including information about how the data was obtained and selected'
-    value:
+    value: true
   technical_documentation_labelling:
     article: 'Art. 11; Annex IV(2)(d)'
     verbose: 'Dataset carries technical documention, such as a dataseet, including information about labelling procedures (e.g. for supervised learning)'
-    value:
+    value: true
   technical_documentation_cleaning:
     article: 'Art. 11; Annex IV(2)(d)'
     verbose: 'Dataset carries technical documention, such as a dataseet, including information about data cleaning methodologies (e.g. outliers detection)'
-    value:
+    value: true
   technical_documentation_cybersecurity:
     article: 'Art. 11; Annex IV(2)(h)'
     verbose: 'Cybersecurity measures were put in place as regards the data (e.g., scanning for data poisoning)'
-    value:
+    value: true
   transparency_and_provision_of_information_to_deployers:
     article: 'Art. 13(3)(b)(vi)'
     verbose: 'Dataset is accompanied by instructions for use that convery relevant information about it, taking into account its intended purpose'
-    value:
+    value: true
   quality_management_system:
     article: 'Art. 17(1)(f)'
     verbose: 'Datset was subject to a quality management system that is documented in a systematic and orderly manner in the form of written policies, procedures and instructions, and includes a description of the systems and procedures for data management, including data acquisition, data collection, data analysis, data labelling, data storage, data filtration, data mining, data aggregation, data retention and any other operation regarding the data'
-    value:
+    value: true
 
 # Metadata related to data-related requirements when AI project is a GPAI model
 
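This change fills in every previously empty `value:` field under `high_risk_ai_system_requirements`. A quick way to confirm that no entry was missed is to scan the file for requirement keys whose `value:` is still blank. The sketch below is only an illustration under stated assumptions: the helper name `unfilled_requirements` and the `SAMPLE` text are hypothetical, the nesting (each requirement key followed by `article`, `verbose` and `value` children) is inferred from the diff, and the line-based scan stands in for a real YAML parser.

```python
def unfilled_requirements(yaml_text: str) -> list[str]:
    """Return the keys whose `value:` field is empty (hypothetical helper).

    Assumes the layout seen in the diff: a key line ending in ':' followed by
    `article:`, `verbose:` and `value:` lines. This is a plain line scan, not
    a YAML parser.
    """
    missing = []
    current_key = None
    for line in yaml_text.splitlines():
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue  # skip blank lines and comments
        key, sep, rest = stripped.partition(":")
        if not sep:
            continue  # not a "key: value" line
        rest = rest.strip()
        if key == "value":
            if rest == "" and current_key is not None:
                missing.append(current_key)
        elif rest == "":
            # A key with nothing after the colon opens a new mapping.
            current_key = key
    return missing


# Hypothetical sample: two entries copied from the diff, one left unfilled.
SAMPLE = """\
high_risk_ai_system_requirements:
  data_and_data_governance_data_relevance:
    article: 'Art. 10(3); Rec. 67'
    verbose: 'Training data is relevant'
    value: true
  technical_documentation_cybersecurity:
    article: 'Art. 11; Annex IV(2)(h)'
    verbose: 'Cybersecurity measures were put in place as regards the data (e.g., scanning for data poisoning)'
    value:
"""

print(unfilled_requirements(SAMPLE))
```

Run against the updated `public_dataset.yaml`, a check along these lines would be expected to report no keys in this section if every field was filled as the diff shows.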