The terms “Artificial Intelligence” and “Machine Learning” have been buzzwords for the past few years; for instance, Google executives including CEO Sundar Pichai used the word “AI” at least 150 times during the 2-hour Google I/O 2023 keynote announcement. To substantiate the trend, PwC’s 2017 report titled “PwC’s Global Artificial Intelligence Study: Sizing the Prize” predicts that AI could contribute up to $15.7 trillion to the global economy. To keep pace with this rapid development, preserve consumer privacy, and ensure robust data protection, various regulatory authorities have released whitepapers and regulations articulating compliance mandates for organizations that aim to use artificial intelligence and machine learning technologies efficiently without violating core data protection and privacy rights.
To keep up with this compliance trend, the Indian government, after five years and several rounds of consultation, finally passed comprehensive data protection legislation titled the “Digital Personal Data Protection Act, 2023” (DPDP Act), which imposes several mandatory compliance requirements on companies operating and providing services in India. In addition to the compliance requirements under the DPDP Act, Hon’ble Indian IT Minister Rajeev Chandrasekhar mentioned during a press conference that several data protection rules (as most provisions of the Act leave the compliance procedures to be specified by delegated legislation) would be released by mid-October 2023. He also noted that the maximum grace period for compliance will not exceed 12 months.
The impact of the Digital Personal Data Protection Act 2023 on Artificial Intelligence and Machine Learning:
A study conducted by Boston Consulting Group in collaboration with IIT-A, titled “AI in India – A Strategic Necessity”, finds that incorporating and adopting Artificial Intelligence could add up to 1.4% to India’s annual real GDP growth. The report also quantifies the growth of AI research in the country, finding that private investment in AI-related research and development in India stood at approximately USD 642 million, which reflects the acceleration of AI investment in the country.
The newly enacted DPDP Act does not explicitly mention Artificial Intelligence, but the core principle and preamble of the Act is to recognize the rights of individuals and to protect their data by permitting the processing of such personal data for lawful purposes only. The core functioning of Artificial Intelligence and Machine Learning models is based on the collection of vast amounts of data. To be precise, the training of AI systems depends entirely on data collection, and the availability of a large data set is essential in determining the success or failure of a machine learning algorithm. Even IBM’s definition of the term ‘Machine Learning’ states that “ML is a branch of AI and computer science that focuses on the use of data and algorithms to imitate the way humans learn, gradually improving its accuracy.”
To regulate this, Section 4 of the DPDP Act mandates that valid consent or a legitimate use is required for the processing of the Personal Data of a Data Principal. The term ‘legitimate use’ is distinct from the ground of ‘legitimate interest’ under Article 6(1)(f) of the GDPR.
Section 7 of the DPDP Act tabulates the instances that amount to legitimate uses under nine headings, including:
- Specified purpose with voluntary disclosure
- Processing by the State and its instrumentalities for issuing licenses, benefits, subsidies, etc.
- Performing State functions or serving national interest
- Performing legal obligations in India
- Complying with a judgment or order in India
- Responding to a life-threatening medical emergency
- Providing medical treatment during an epidemic or any threat to public health
- Assisting with public safety and disaster relief
- Employment-related purposes
So, to train AI models, the model owners require either consent or justification under one of the legitimate uses. Apart from these two conditions, there is an exception provided under Section 3 of the DPDP Act.
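This two-ground gate can be sketched as a simple pre-processing filter over candidate training records. The field names and the paraphrased grounds below are assumptions of this illustration, not text from the Act:

```python
# The nine legitimate-use grounds of Section 7, loosely paraphrased
# (labels are this sketch's own shorthand, not statutory terms):
LEGITIMATE_USES = {
    "voluntary_disclosure", "state_benefits", "state_functions",
    "legal_obligation", "court_order", "medical_emergency",
    "epidemic_response", "disaster_relief", "employment",
}

def may_process(record: dict) -> bool:
    """A record may enter the training set only with consent (Section 4)
    or under one of the legitimate uses (Section 7)."""
    if record.get("consent") is True:
        return True
    return record.get("legitimate_use") in LEGITIMATE_USES

raw_records = [
    {"id": 1, "consent": True},
    {"id": 2, "legitimate_use": "employment"},
    {"id": 3},  # no lawful basis: must be excluded
]
training_data = [r for r in raw_records if may_process(r)]
# training_data retains records 1 and 2 only
```

In practice the hard part is not this gate itself but populating the `consent` and `legitimate_use` fields reliably at collection time, which is precisely what large-scale scraping cannot guarantee.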
Section 3(c)(ii) of the DPDP Act exempts from the Act’s provisions any personal data made, or caused to be made, publicly available by the Data Principal or by any other person who is legally obliged to make such Personal Data publicly available. Corporations may exploit this exception to process vast data sets, but the Act lacks clear provisions on instances where such publicly available data is subsequently made private, and on whether it would then qualify as personal data protected under the provisions of the DPDP Act.
Further, according to Section 9 of the DPDP Act, processing the personal data of children (individuals below 18 years of age) and of persons with disabilities can only be done after obtaining the verifiable consent of the parent or legal guardian. Sub-clause (2) of Section 9 prohibits processing (even where verifiable consent is obtained) if it can cause any detrimental effect on the well-being of the child. In such cases, it is practically possible to design the data collection and assess the compliance requirements under sub-clause (2) of Section 9.
But if the parent or legal guardian makes the personal data of the child or disabled person publicly available, it is unclear whether the same protection extends to such data: for instance, if a parent discloses personal information about their child (irrespective of intent), does that data fall within, or is it exempted from, the application of the DPDP Act? Apart from this applicability dilemma, it is practically impossible to decipher and identify such doubtful data sets during large-scale data collection for training AI models. The situation may be avoidable in manual or purpose-specific data collection by a small business, but where training data is gathered through data scraping, no feasible technological solution yet exists to address the issue.
Apart from the technical downsides, another question arises: whether the data shared as prompts by users of generative AI models should be protected under the compliance requirements of the DPDP Act. Even though the Data Principal voluntarily provides personal information to the generative AI model, in most instances it is presumed that the personal data was shared to obtain personalized, quality responses; yet, as a default setting, this personal data is used to train the underlying LLMs. This practice is a blatant violation of Section 6(1) of the DPDP Act, which specifies that consent to the processing of personal data for a specified purpose is limited to that purpose. If the LLM provider uses the data for any other purpose, it must inform the Data Principal through a proper notice, as mandated under Section 5(1)(i) of the DPDP Act, which requires the notice to state the purpose of processing.
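The purpose-limitation rule of Section 6(1) can be illustrated with a minimal check: consent attaches to specific purposes, and reuse for training requires its own purpose to be consented. The field and purpose names here are assumptions of this sketch:

```python
def usable_for(record: dict, purpose: str) -> bool:
    """Section 6(1) in miniature: consent covers only the purposes
    for which it was actually given."""
    return purpose in record.get("consented_purposes", set())

# A user prompt, consented only for generating a response:
prompt = {"text": "...", "consented_purposes": {"answer_generation"}}

usable_for(prompt, "answer_generation")  # → True: the stated purpose
usable_for(prompt, "model_training")     # → False: needs fresh notice and consent
```

Defaulting `model_training` to "off" unless it appears in the consented purposes is the opposite of the default-on training practice the paragraph above criticizes.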
Inability to Execute Deletion and Other DSR Compliance Requests:
Even after the notice requirements under Section 5 of the DPDP Act are fulfilled, and assuming consent was given by the Data Principal, it is impossible for Data Fiduciaries to comply with some of the general obligations under Section 8 of the DPDP Act. To be precise, according to Section 8(3)(a), when a Data Fiduciary processes personal data to make a decision that affects the Data Principal, it must ensure the completeness, accuracy, and consistency of that data. Since these mandates are abstract and broad, it is difficult to implement them and to verify that they are being followed at all times. For example, another user of an LLM could trick it with a prompt into generating inaccurate information about a certain Data Principal. Whether such non-compliance can be rectified in that case is a million-dollar question.
Another important obligation is implementing a proper DSR framework. According to Chapter III of the DPDP Act, four rights are conferred on the Data Principal:
- Right to Access information about Personal Data
- Right to Correction and Erasure of Personal Data
- Right to Grievance Redressal
- Right to Nominate
To exercise the Right to Access and the Right to Correction and Erasure, it is necessary to identify the particular data set that stores the personal data of the specific Data Principal. Making an ML model ‘forget’ a data set on which it has been trained is, at present, technically infeasible, and research into such features is still at a nascent stage: for instance, the Machine Unlearning via Neuro-Masking model proposed by researchers based at Columbia University. Since these concepts are in a nascent stage, it is presently a herculean task to comply with such mandatory Data Principal requests.
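The asymmetry described above, trivial erasure from a data store versus costly "unlearning" from model weights, can be sketched as follows. The helper names are illustrative assumptions, and full retraining is shown as the only guaranteed baseline today, not as the Neuro-Masking method itself:

```python
# Erasing a record from a conventional data store is trivial:
records = {101: {"name": "Asha"}, 102: {"name": "Ravi"}}
records.pop(101)  # erasure request honored immediately
# records now holds only id 102

# A trained model, however, encodes the erased record's influence in
# its learned weights. The only guaranteed baseline is retraining
# from scratch on the filtered data, which for large models is often
# prohibitively expensive:
def retrain_without(dataset, erased_ids, train_fn):
    filtered = [r for r in dataset if r["id"] not in erased_ids]
    return train_fn(filtered)  # full retraining cost incurred

# Dummy trainer to make the sketch runnable:
model = retrain_without(
    [{"id": 101}, {"id": 102}, {"id": 103}],
    erased_ids={101},
    train_fn=lambda rows: {"trained_on": [r["id"] for r in rows]},
)
# model == {"trained_on": [102, 103]}
```

Approximate unlearning techniques aim to avoid this retraining cost, but as the paragraph above notes, they remain research prototypes rather than compliance-ready tooling.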
Beyond the compliance burdens for AI and ML models, there is a proposal suggesting that AI can serve as an ally in the journey towards DPDP Act compliance. It is recommended that an AI model be used as a Consent Manager (as defined in Section 2(g)), a role in which it manages consent on behalf of the Data Principal. The AI model can be trained to comply with the mandatory requirements for a Consent Manager.
It is important to note that the regulation of Artificial Intelligence is being prioritized by the major economic powers. This is evident from the G20 summit in September 2023, at which the “G20 New Delhi Leaders’ Declaration” was adopted, advocating that, to ensure responsible AI development, deployment, and use, the protection of human rights, transparency, fairness, accountability, regulation, safety, ethics, bias, privacy, and data protection must be seriously addressed. This intention may result in additional compliance requirements and should ultimately yield more mature and ethical AI models without clipping the wings of innovation and digitalization.