India, like the rest of the world, is set to generate unprecedented amounts of data. A few companies have already extracted value from this gigantic resource, while others are just taking their first steps. Very few countries have any regulatory mechanism in place for governing data. India is one of the few pioneering countries proactively seeking to bring in regulation in this area. While the Personal Data Protection Bill is being considered by a Joint Parliamentary Committee, steps are being taken to study how non-personal data could be regulated. In July, the Kris Gopalakrishnan committee submitted a draft governance framework for non-personal data.
What is the background of the NPD Committee report?
- According to a Nasscom-McKinsey report, India is set to house a USD 500 billion data market in the next 5 years. This calls for a properly functioning regulatory mechanism.
- The Ministry of Electronics and Information Technology had constituted an expert committee headed by Kris Gopalakrishnan, the co-founder of Infosys, to frame a draft non-personal data governance framework.
- This committee, called the Committee of Experts on Non-personal Data Governance Framework or the NPD Committee, submitted its report in July.
- The committee was constituted following queries submitted to the ministry on the Personal Data Protection Bill, asking whether access to non-personal data will be made free.
What is non-personal data?
- Non-personal data is basically any data set that is devoid of personally identifiable information. This means that the data cannot be used to identify an individual.
- The draft framework defines non-personal data as the data that is not ‘personal data’ as defined in the PDP Bill or data that is ‘without any Personally Identifiable Information (PII)’.
- It also includes data that
- Never related to an identifiable natural person. Eg: data from public infrastructure, weather conditions, etc.
- Was initially personal but later anonymised using data transformation techniques. These include privacy models like k-anonymity and l-diversity, and tools like Amnesia and Anonimatron.
- Such data is classified into 3 categories (by the committee), based on the data source and whether re-identification is possible or not:
- Public non-personal data: data collected by government and its agencies (eg: census) and by municipal corporations (eg: data on total tax receipts in a given time period, data collected during implementation of publicly funded works, etc.)
- Community non-personal data: data about a set of people with a common factor such as the same geographical location, occupation, religion or common social interests. Eg: metadata gathered by telecom companies, electricity distribution companies (discoms), ride-hailing apps, etc.
- Private non-personal data: data produced by individuals and derived from the application of proprietary software or knowledge.
Is non-personal data sensitive like personal data?
- Personal data contains explicit information that can be used to identify a person, such as name, gender, age, biometrics, sexual orientation, genetic details, etc. Such data is highly sensitive.
- In contrast, non-personal data is in anonymised form.
- However, this does not mean that such data is not sensitive or does not require regulation. Some categories of non-personal data, even though anonymised, can be dangerous when unregulated.
- According to the Kris Gopalakrishnan committee, non-personal data arising from sensitive personal data can be sensitive non-personal data. For instance, non-personal data about the health of a community can be misused even if it is in an anonymised form.
- Specifically, the committee highlighted the sensitivity from the following perspectives:
- The NPD relates to national strategic or security interests.
- The NPD bears a risk of collective harm. eg: collective privacy concerns.
- The NPD constitutes trade secrets and is sensitive for businesses.
- The NPD has a risk of re-identification. This is because no anonymisation method has perfect irreversibility.
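The re-identification risk can be made concrete with k-anonymity, one of the techniques named earlier. The following is a minimal sketch, not the committee's method: the records, the field names (`age_band`, `pin_prefix`, `condition`) and the choice of quasi-identifiers are all hypothetical. It computes k, the size of the smallest group of records that share the same quasi-identifier values. A low k means some individuals remain easy to single out even after names are removed.

```python
from collections import Counter


def k_anonymity(records, quasi_identifiers):
    """Return the size of the smallest group of records that share the same
    quasi-identifier values (the 'k' of the dataset). Empty input gives 0."""
    if not records:
        return 0
    groups = Counter(
        tuple(record[field] for field in quasi_identifiers)
        for record in records
    )
    return min(groups.values())


# Hypothetical anonymised health records: names are removed, but the age band
# and PIN code prefix are retained and act as quasi-identifiers.
records = [
    {"age_band": "30-39", "pin_prefix": "560", "condition": "diabetes"},
    {"age_band": "30-39", "pin_prefix": "560", "condition": "asthma"},
    {"age_band": "40-49", "pin_prefix": "560", "condition": "diabetes"},
    {"age_band": "40-49", "pin_prefix": "560", "condition": "hypertension"},
    {"age_band": "60-69", "pin_prefix": "110", "condition": "diabetes"},
]

k = k_anonymity(records, ["age_band", "pin_prefix"])
print(k)  # 1: the lone 60-69/110 record is unique on its quasi-identifiers
```

Here the dataset is only 1-anonymous: anyone who knows that a particular person is in the 60-69 age band and lives in a 110 PIN area can learn that person's health condition, although no name appears in the data.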
How are other countries regulating non-personal data?
- The European Union put a regulatory framework in place in May 2019. Under this framework, non-personal data is to be shared freely among the EU countries.
- The EU countries are to cooperate for data sharing. They are required to inform the European Commission of any draft acts that bring in new data localisation requirements or change existing data localisation requirements.
- It recognises all data that isn’t personal as non-personal data but hasn’t clearly defined its constituents.
- In 2016, the EU adopted a landmark framework to govern the flow of personal data on the internet: the General Data Protection Regulation (GDPR).
- In many countries, there isn’t a national level data protection law, even for personal data.
What are the recommendations of the recently released draft framework?
- The committee recommends the classification of NPD in the same manner as the personal data classification under the PDP Bill:
- General NPD. Eg: NPD about mobile penetration in a city, pollution levels in a city, etc.
- Sensitive NPD. Eg: NPD about the health of a community.
- Critical NPD. Eg: NPD obtained by anonymising critical PD.
- Like in the case of storing personal data, the committee recommends storage restrictions based on the NPD sensitivity:
- General NPD to be stored anywhere
- Sensitive NPD to be stored in India but can be transferred outside the country
- Critical NPD to be stored in India alone
- The consent of the data principal (individual to whom the data pertains) is required for anonymising the personal data and for use of the resulting NPD.
- It proposes different roles in the NPD ecosystem:
- Data principal: entity/ individual to whom the data pertains. Eg: in case of census, the citizens are the data principals.
- Data custodian: the entity involved in collection, storage and processing of the data. This is similar to the role of ‘data fiduciary’ under the PDP Bill. The data custodian has a ‘duty of care’ (a general set of obligations) to the community from which the NPD is collected.
- Data trustee: the entity through which the data principal will exercise its rights. A government entity or a community body may fulfil this role. Eg: the health ministry could be the trustee for data on diabetes, a local NGO can be the trustee of data regarding solid waste management, etc.
- Data trusts: institutional organisations for sharing a specific dataset in accordance with specific rules and protocols. This can contain data from multiple sources and from multiple data custodians.
- The data trustee can recommend to the data regulator that obligations (like transparency, regulation practices, etc.) be enforced on the data custodians.
- The data regulator and the data trustee will collaborate to enforce data sharing. Eg: collaboration of the transport department with the data regulator with regard to data on modes of transport.
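The storage restrictions recommended above for the three NPD classes can be sketched as a small lookup table. This is only an illustration under one reading of the recommendations: the names `Sensitivity`, `STORAGE_RULES`, `may_store` and `may_transfer_abroad`, the use of the country code "IN" for India, and the assumption that critical NPD cannot be transferred abroad at all are not taken from the draft framework.

```python
from enum import Enum


class Sensitivity(Enum):
    GENERAL = "general"
    SENSITIVE = "sensitive"
    CRITICAL = "critical"


# Assumed reading of the recommendations, as
# (may be stored outside India, may be transferred outside India).
STORAGE_RULES = {
    Sensitivity.GENERAL: (True, True),     # can be stored anywhere
    Sensitivity.SENSITIVE: (False, True),  # stored in India, transferable
    Sensitivity.CRITICAL: (False, False),  # stored in India alone
}


def may_store(sensitivity, country):
    """True if NPD of this class may be stored in the given country."""
    stored_abroad_ok, _ = STORAGE_RULES[sensitivity]
    return country == "IN" or stored_abroad_ok


def may_transfer_abroad(sensitivity):
    """True if NPD of this class may be transferred outside India."""
    _, transfer_ok = STORAGE_RULES[sensitivity]
    return transfer_ok


print(may_store(Sensitivity.SENSITIVE, "US"))      # False
print(may_transfer_abroad(Sensitivity.SENSITIVE))  # True
print(may_transfer_abroad(Sensitivity.CRITICAL))   # False
```

Keeping the rules in one table rather than scattering them across conditionals makes it easy to revise them if the final legislation differs from the draft.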
Ownership of data
- The committee has adopted a ‘beneficial ownership/ interest’ notion because many actors may have overlapping rights and privileges over the data. Hence:
- The public NPD is to be treated as ‘national resource’.
- In case of private NPD, the individual from whom it is derived will be treated as the data principal.
- In case of community NPD, the community will have the right to determine how it is used through a data trustee, who is the closest and most appropriate representative of the community.
Category of data businesses
- The committee proposed the creation of a new business category called ‘data businesses’. This is not an independent sector, but rather a horizontal classification. Businesses (including already existing ones) that gather data beyond a certain threshold are to be classified under this category. Businesses that fall below the threshold can voluntarily register as a data business.
- These data businesses will have to furnish information such as how the collected data is processed, used and sold, the services developed based on it, etc. If the collection is beyond a threshold, the business will also have to submit metadata.
Sharing of NPD
- The committee recommended various purposes for which the data may be shared:
- Sovereign purposes: for national security, legal purposes, etc.
- Core public interest purposes: for community benefits, policy making, better delivery of public services, etc.
- Economic purposes: to encourage competition, level playing field for start-ups, etc.
- The committee also recommended specifying a new class of ‘special public interest’ or ‘high value’ datasets. This can include transportation data, health data, etc.
- Private organisations have to share only factual data in its raw form.
- The remuneration for the requested data (NPD) will be based on the value addition. Eg: data sharing will be on FRAND basis (fair, reasonable and non-discriminatory) in case of NPD with low value addition.
- The committee recommended checks and balances in the form of an ‘expert probing’ measure involving academicians, experts, other organisations, etc. who are registered through a self-serve peer review.
Research and innovation
- Creation of data spaces to bring together universities, NGOs, research labs, citizens etc. to encourage ‘intensive data-based research’. These can have several sectoral spaces with dedicated clouds.
- Creation of ‘data and cloud innovation labs and research centres’ to function as field validation centres to test out digital
NPD Regulatory Authority
- The committee recommended the establishment of a regulatory authority with both an enforcing role (compliance with rules and regulations) and an enabling role (addressing root causes of market failures, ensuring effective competition, level playing field, etc.).
- Data businesses are to link their ‘raw data pipes’ with the Authority for submitting data on request. This is to be done within a specified time.
- The Authority will enforce compliance irrespective of whether or not the data businesses are already being regulated by their sectoral regulator.
- The committee has given some guiding principles for a technology architecture to enable digital implementation of the data sharing rules:
- Mechanism for accessing data: shareable NPD and datasets created/maintained by the government, its agencies, universities, research labs, NGOs, etc. should have a REST API (Representational State Transfer Application Programming Interface) for accessing data. Data sandboxes (a development platform) can be utilised for experimentation.
- Distributed storage for security: to ensure that there is no single leakage point. If all the sharing is done through API, all the data requests can be accounted for.
- Systematized data exchange approach: the data collected should be made accessible via a data exchange which can accept data in any form but gives the output in a standardized format so that it is usable by all stakeholders.
- Prevent de-anonymisation: different techniques can be used to prevent re-identification.
- These principles have been constituted into a 3-tiered architectural system covering legal safeguards, technology and compliance.
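Two of the principles above, a data exchange that accepts data in any form but returns a standardised format, and accounting for every data request, can be sketched together. This is a toy in-memory stand-in, not a REST service and not the architecture proposed by the committee: the class `DataExchange`, its methods, the dataset names and the sample air-quality values are all hypothetical.

```python
import csv
import io
import json


class DataExchange:
    """Toy exchange: accepts CSV or JSON text, serves one standard JSON form."""

    def __init__(self):
        self._datasets = {}    # dataset name -> list of row dicts
        self.request_log = []  # one (requester, dataset name) entry per request

    def submit(self, name, payload, fmt):
        """Parse a payload in the given format and store it as row dicts."""
        if fmt == "csv":
            rows = list(csv.DictReader(io.StringIO(payload)))
        elif fmt == "json":
            rows = json.loads(payload)
        else:
            raise ValueError(f"unsupported format: {fmt}")
        self._datasets[name] = [dict(row) for row in rows]

    def request(self, requester, name):
        """Log the request first, then return the dataset as JSON text."""
        self.request_log.append((requester, name))
        # Keys are sorted so the output layout does not depend on the input.
        # Value types are not normalised in this sketch.
        return json.dumps(self._datasets[name], sort_keys=True)


exchange = DataExchange()
exchange.submit("air_csv", "city,pm25\nPune,41\n", "csv")
exchange.submit("air_json", '[{"pm25": "41", "city": "Pune"}]', "json")

out = exchange.request("research_lab", "air_csv")
print(out)
print(exchange.request_log)
```

Because every read goes through `request`, the log gives a complete account of who asked for which dataset, which is the property the committee expects from routing all sharing through an API.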
What are the concerns?
- Legal experts have objected to the ‘excessive focus’ on economic interests rather than public interests.
- Absence of appropriate mechanisms to regulate government’s access to data.
- Some have questioned why such data access requests need not comply with the principles of necessity, legality and proportionality when the committee report recognizes the possibility of privacy violations in enabling such access.
- The wisdom of forcing large businesses to share data indicating consumer behaviour has been questioned too, as such data is the intellectual property of these firms.
- The report assumes that raw data is least valuable for firms. This may not always be true: some datasets involve a degree of creativity and need to be protected under copyright laws.
- The ‘modicum of creativity’ rule adopted under the 2007 Eastern Book Company case clarified that copyrights can be assigned to works that have a minimum level of intellectual creativity.
- Some datasets could reveal proprietary processes used by organisations to generate/collect data. Disclosing them may violate legal obligations, as such processes may be trade secrets.
- Experts opine that the draft could have been clearer with regard to certain aspects like community non-personal data. The overarching classification of community NPD ignores the nuances of assorted online businesses.
- The draft hasn’t adequately provided for community rights. As the boundary of who constitutes a community is fluid, anyone can lay claim to it.
- In addition to possible operational difficulties, there is lack of clarity regarding how the mandatory sharing of community NPD will translate into shared benefits.
- The parameters based on which the regulator is to consider data sharing requests are being criticised as vague. This gives the regulator wide discretionary powers.
- Possibility of over-regulation in the tech sector: at least 4 regulators may come up in the near future: the Non-Personal Data Authority (NPDA), the Data Protection Authority (DPA), the E-commerce Regulator and the Central Consumer Protection Authority. This would give rise to jurisdictional overlaps and affect the regulation quality. Eg: the DPA is to regulate NPD along with the NPDA, and there is potential conflict between the NPDA and the CCI (Competition Commission of India) with regard to addressing market failures.
What is the way forward?
- Non-personal data is going to fuel the economy. But privacy concerns are genuine too and must be factored into the final legislation.
- Regulations designed based on this draft should put more focus on citizens’ rights rather than pure economic interests. Experts say that the data belongs to individuals and not these data gathering companies.
- Some experts have suggested using the data authority provided for under the PDP Bill rather than establishing a new authority.
- To address anti-trust conduct and market failures, the CCI could be strengthened instead of establishing a new NPDA. The CCI is already functioning and has technical and economic expertise to tackle such competition related issues.
- Need to bring in provisions for compensation for personal data in case of non-compliance.
- Clearly lay down the liabilities of data custodians, trustees and businesses.
- There is a need to clearly define the roles of various participants including data principal, data trustees and data custodians.
- The regulation needs to be more concise and clearer with regard to the participants’ responsibilities, to give certainty and hence build the confidence of market participants.
The draft presented by the NPD committee is pioneering in the sense that it identifies the power wielded by NPD and its roles and uses in society. It is true that data is a vast untapped resource for boosting the economy. However, the challenge for policy makers is to strike the right balance between protecting privacy and intellectual property rights on the one hand and making data available for a level playing field in innovation and policy-making on the other.
Practice question for mains
Is there a need to regulate non-personal data, given it is less sensitive than personal data? Discuss the recommendations of the Kris Gopalakrishnan committee report. (250 words)