Commit c8b7731 · qqubb · unverified · 1 parent: 67a786b

Update public_dataset.yaml

examples/compliant_project/public_dataset.yaml CHANGED
@@ -64,135 +64,135 @@ high_risk_ai_system_requirements:
   data_and_data_governance_data_governance:
     article: 'Art. 10(1)-(2)'
     verbose: 'The dataset was subject to data governance and management practices appropriate to the intended use case'
-    value: false
+    value: true
   data_and_data_governance_design_choices:
     article: 'Art. 10(2)(a)'
     verbose: 'The dataset has been subject to data governance and management practices as regards its relevant design choices'
-    value: false
+    value: true
   data_and_data_governance_data_origin:
     article: 'Art. 10(2)(b)'
     verbose: 'The dataset has been subject to data governance and management practices as regards its data collection processes and the origin of data, and in the case of personal data, the original purpose of the data collection'
-    value: false
+    value: true
   data_and_data_governance_data_preparation:
     article: 'Art. 10(2)(c)'
     verbose: 'The dataset has been subject to data governance and management practices as regards its data-preparation processing operations, such as annotation, labelling, cleaning, updating, enrichment and aggregation'
-    value: false
+    value: true
   data_and_data_governance_data_assumptions:
     article: 'Art. 10(2)(d)'
     verbose: 'The dataset has been subject to data governance and management practices as regards its formulation of assumptions, in particular with respect to the information that the data are supposed to measure and represent'
-    value: false
+    value: true
   data_and_data_governance_data_quantity:
     article: 'Art. 10(2)(e)'
     verbose: 'The dataset has been subject to data governance and management practices that include an assessment of the availability, quantity and suitability of the data sets that are needed'
-    value: false
+    value: true
   data_and_data_governance_ata_bias_examination:
     article: 'Art. 10(2)(f)'
     verbose: 'The dataset has been subject to data governance and management practices that include an examination of possible biases that are likely to affect the health and safety of persons, have a negative impact on fundamental rights or lead to discrimination prohibited under Union law, especially where data outputs influence inputs for future operations'
-    value: false
+    value: true
   data_and_data_governance_data_and_data_governance_data_bias_mitigation:
     article: 'Art. 10(2)(g)'
     verbose: 'The dataset has been subject to data governance and management practices that include appropriate measures to detect, prevent and mitigate possible biases that are likely to affect the health and safety of persons, have a negative impact on fundamental rights or lead to discrimination prohibited under Union law, especially where data outputs influence inputs for future operations'
-    value: false
+    value: true
   data_and_data_governance_data_compliance:
     article: 'Art. 10(2)(h)'
     verbose: 'The dataset has been subject to data governance and management practices that include identification of relevant data gaps or shortcomings that prevent compliance with this Regulation, and how those gaps and shortcomings can be addressed'
-    value: false
+    value: true
   data_and_data_governance_data_relevance:
     article: 'Art. 10(3); Rec. 67'
     verbose: 'Training data is relevant'
-    value: false
+    value: true
   data_and_data_governance_data_representativity:
     article: 'Art. 10(3); Rec. 67'
     verbose: 'Training data is sufficiently representative'
-    value: false
+    value: true
   data_and_data_governance_data_errors:
     article: 'Art. 10(3); Rec. 67'
     verbose: 'Training data is, to the best extent possible, free of errors'
-    value: false
+    value: true
   data_and_data_governance_data_completeness:
     article: 'Art. 10(3); Rec. 67'
     verbose: 'Training data is complete in view of the intended purpose'
-    value: false
+    value: true
   data_and_data_governance_statistical_properties:
     article: 'Art. 10(3)'
     verbose: 'Training data possesses the appropriate statistical properties, including, where applicable, as regards the people in relation to whom it is intended to be used'
-    value: false
+    value: true
   data_and_data_governance_contextual:
     article: 'Art. 10(4)'
     verbose: 'Training data takes into account, to the extent required by the intended purpose, the characteristics or elements that are particular to the specific geographical, contextual, behavioural or functional setting within which it is intended to be used'
-    value: false
+    value: true
   data_and_data_governance_personal_data_necessary:
     article: 'Art. 10(5)'
     verbose: 'Where special categories of personal data have been used to ensure the detection and correction of possible biases that are likely to affect the health and safety of persons, have a negative impact on fundamental rights or lead to discrimination prohibited under Union law, especially where data outputs influence inputs for future operations, the use of this data was strictly necessary'
-    value: false
+    value: true
   data_and_data_governance_personal_data_safeguards:
     article: 'Art. 10(5)'
     verbose: 'Where special categories of personal data have been used to ensure the detection and correction of possible biases that are likely to affect the health and safety of persons, have a negative impact on fundamental rights or lead to discrimination prohibited under Union law, especially where data outputs influence inputs for future operations, the use complied with appropriate safeguards for the fundamental rights and freedoms of natural persons'
-    value: false
+    value: true
   data_and_data_governance_personal_data_gdpr:
     article: 'Art. 10(5)'
     verbose: 'Where special categories of personal data have been used to ensure the detection and correction of possible biases that are likely to affect the health and safety of persons, have a negative impact on fundamental rights or lead to discrimination prohibited under Union law, especially where data outputs influence inputs for future operations, the use of this data satisfied the provisions set out in Regulations (EU) 2016/679 and (EU) 2018/1725 and Directive (EU) 2016/680'
-    value: false
+    value: true
   data_and_data_governance_personal_data_other_options:
     article: 'Art. 10(5)(a)'
     verbose: 'Where special categories of personal data have been used to ensure the detection and correction of possible biases that are likely to affect the health and safety of persons, have a negative impact on fundamental rights or lead to discrimination prohibited under Union law, especially where data outputs influence inputs for future operations, the bias detection and correction was not effectively fulfilled by processing other data, including synthetic or anonymised data'
-    value: false
+    value: true
   data_and_data_governance_personal_data_limitations:
     article: 'Art. 10(5)(b)'
     verbose: 'Where special categories of personal data have been used to ensure the detection and correction of possible biases that are likely to affect the health and safety of persons, have a negative impact on fundamental rights or lead to discrimination prohibited under Union law, especially where data outputs influence inputs for future operations, the special categories of personal data were not subject to technical limitations on the re-use of the personal data, and state-of-the-art security and privacy-preserving measures, including pseudonymisation'
-    value: false
+    value: true
   data_and_data_governance_personal_data_controls:
     article: 'Art. 10(5)(c)'
     verbose: 'Where special categories of personal data have been used to ensure the detection and correction of possible biases that are likely to affect the health and safety of persons, have a negative impact on fundamental rights or lead to discrimination prohibited under Union law, especially where data outputs influence inputs for future operations, the special categories of personal data were subject to measures to ensure that the personal data processed are secured, protected, subject to suitable safeguards, including strict controls and documentation of the access, to avoid misuse and ensure that only authorised persons have access to those personal data with appropriate confidentiality obligations'
-    value: false
+    value: true
   data_and_data_governance_personal_data_access:
     article: 'Art. 10(5)(d)'
     verbose: 'Where special categories of personal data have been used to ensure the detection and correction of possible biases that are likely to affect the health and safety of persons, have a negative impact on fundamental rights or lead to discrimination prohibited under Union law, especially where data outputs influence inputs for future operations, the special categories of personal data were not to be transmitted, transferred or otherwise accessed by other parties'
-    value: false
+    value: true
   data_and_data_governance_personal_data_deletion:
     article: 'Art. 10(5)(e)'
     verbose: 'Where special categories of personal data have been used to ensure the detection and correction of possible biases that are likely to affect the health and safety of persons, have a negative impact on fundamental rights or lead to discrimination prohibited under Union law, especially where data outputs influence inputs for future operations, the special categories of personal data were deleted once the bias was corrected or the personal data reached the end of its retention period (whichever came first)'
-    value: false
+    value: true
   data_and_data_governance_personal_data_necessary_105f:
     article: 'Art. 10(5)(f)'
     verbose: 'Where special categories of personal data have been used to ensure the detection and correction of possible biases that are likely to affect the health and safety of persons, have a negative impact on fundamental rights or lead to discrimination prohibited under Union law, especially where data outputs influence inputs for future operations, the records of processing activities pursuant to Regulations (EU) 2016/679 and (EU) 2018/1725 and Directive (EU) 2016/680 include the reasons why the processing of special categories of personal data was strictly necessary to detect and correct biases, and why that objective could not be achieved by processing other data'
-    value: false
+    value: true
   technical_documentation_general_description:
     article: 'Art. 11; Annex IV(2)(d)'
     verbose: 'Dataset carries technical documention, such as a dataseet, including a general description of the dataset.'
-    value: false
+    value: true
   technical_documentation_provenance:
     article: 'Art. 11; Annex IV(2)(d)'
     verbose: 'Dataset carries technical documention, such as a dataseet, including information about its provenance'
-    value: false
+    value: true
   technical_documentation_scope:
     article: 'Art. 11; Annex IV(2)(d)'
     verbose: 'Dataset carries technical documention, such as a dataseet, including information about scope and main characteristics'
-    value: false
+    value: true
   technical_documentation_origins:
     article: 'Art. 11; Annex IV(2)(d)'
     verbose: 'Dataset carries technical documention, such as a dataseet, including information about how the data was obtained and selected'
-    value: false
+    value: true
   technical_documentation_labelling:
     article: 'Art. 11; Annex IV(2)(d)'
     verbose: 'Dataset carries technical documention, such as a dataseet, including information about labelling procedures (e.g. for supervised learning)'
-    value: false
+    value: true
   technical_documentation_cleaning:
     article: 'Art. 11; Annex IV(2)(d)'
     verbose: 'Dataset carries technical documention, such as a dataseet, including information about data cleaning methodologies (e.g. outliers detection)'
-    value: false
+    value: true
   technical_documentation_cybersecurity:
     article: 'Art. 11; Annex IV(2)(h)'
     verbose: 'Cybersecurity measures were put in place as regards the data (e.g., scanning for data poisoning)'
-    value: false
+    value: true
   transparency_and_provision_of_information_to_deployers:
     article: 'Art. 13(3)(b)(vi)'
     verbose: 'Dataset is accompanied by instructions for use that convery relevant information about it, taking into account its intended purpose'
-    value: false
+    value: true
   quality_management_system:
     article: 'Art. 17(1)(f)'
     verbose: 'Datset was subject to a quality management system that is documented in a systematic and orderly manner in the form of written policies, procedures and instructions, and includes a description of the systems and procedures for data management, including data acquisition, data collection, data analysis, data labelling, data storage, data filtration, data mining, data aggregation, data retention and any other operation regarding the data'
-    value: false
+    value: true
 
   # Metadata related to data-related requirements when AI project is a GPAI model
 
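Review note: every entry in this hunk flips from `value: false` to `value: true`. One way a consumer of this file might confirm that no requirement is left unmet is sketched below. This is a minimal illustration, not code from the repository: it assumes the `high_risk_ai_system_requirements` mapping has already been parsed into a Python dict by some YAML loader, and the function name `unmet_requirements` is hypothetical.

```python
from typing import Any, Dict, List


def unmet_requirements(requirements: Dict[str, Any]) -> List[str]:
    """Return the keys of requirement entries whose `value` is not True."""
    unmet = []
    for key, entry in requirements.items():
        # Only consider mappings that actually carry a `value` field.
        if isinstance(entry, dict) and "value" in entry:
            if entry["value"] is not True:
                unmet.append(key)
    return unmet


# Two keys taken from the diff; the second is set to False purely for illustration.
example = {
    "quality_management_system": {"article": "Art. 17(1)(f)", "value": True},
    "technical_documentation_cybersecurity": {
        "article": "Art. 11; Annex IV(2)(h)",
        "value": False,
    },
}
print(unmet_requirements(example))  # ['technical_documentation_cybersecurity']
```

With the values as committed here, such a check would return an empty list for this section.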