How can individuals be protected when their personal data is constantly being collected for uses that may not be apparent until some future date? And when it may not be obvious who is collecting that data?

As giants like Google, Facebook, WeChat and Alibaba track their users every minute of the day, these questions are rising high on government agendas around the world. In little more than a decade, it has become normal for most people to share personal information in order to gain access to services – whether socialising, shopping, seeking entertainment, or checking up on their health. Even our whereabouts can be tracked at every moment if the location service on our phones is turned on.

That goldmine of information is being used by both businesses and governments to make decisions about individuals and groups, such as how much to charge certain users for services, whether to deny them access and what trends are revealed by their data. And therein lie several problems.

First, the story told by big data may not be an accurate one. Professor John Bacon-Shone of the Faculty of Social Sciences, a statistician with an interest in big data and privacy who also advises the Hong Kong Government on the issues, cites the example of the Google Flu Trends web service which aggregated search queries about flu to predict outbreaks. “The problem is, it’s just an association, not causation, and it doesn’t work well at prediction. If you have a different type of flu, the whole thing falls apart,” he said.

Big data may also contain coding mistakes or built-in biases. Another example cited by Professor Bacon-Shone concerns decisions in the US on who should be granted bail. When African Americans were shown to be less likely to get bail after controlling for other factors, the decision was computerised. But the data fed into the computer came from past decisions. “The inputs already had bias in them. So you end up replicating the bias,” he said.
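The mechanism Professor Bacon-Shone describes can be shown with a toy sketch in Python. Everything here is an assumption made for illustration: the groups "A" and "B", the risk scores, the past decisions and the deliberately crude rule are all invented, and the rule merely stands in for whatever system was actually used. The point is only that a rule learned from skewed past decisions hands the same skew back.

```python
from collections import defaultdict

# Hypothetical past decisions: (group, risk_score, bail_granted).
# Both groups have the same risk scores, but past decision-makers
# granted bail to group "B" less often.
history = [
    ("A", 1, True), ("A", 2, True), ("A", 3, True), ("A", 4, False),
    ("B", 1, True), ("B", 2, False), ("B", 3, False), ("B", 4, False),
]


def fit_thresholds(rows):
    """For each group, learn the highest risk score that was ever granted bail."""
    thresholds = defaultdict(int)
    for group, risk, granted in rows:
        if granted:
            thresholds[group] = max(thresholds[group], risk)
    return dict(thresholds)


def predict(thresholds, group, risk):
    """Grant bail if the risk is at or below the threshold learned for the group."""
    return risk <= thresholds.get(group, 0)


model = fit_thresholds(history)
print(model)
# Two people with the same risk score receive different decisions.
print(predict(model, "A", 2), predict(model, "B", 2))
```

Nothing in the code mentions bias, yet two people with an identical risk score are treated differently, because the only thing the rule has learned is what was decided before.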

A third problem is that even when data is anonymised for the sake of privacy, it may be possible to re-identify a person because the data retains telling details. For example, hospital data about accident casualties will include the date, time of admission and condition, and inferences could be drawn about the identity of a patient. More worryingly, with big data crunching DNA information, it is becoming possible to predict a person’s hair colour, eye colour and even surname based on a sample of their DNA. “There are people who have been foolish enough to put their full DNA profiles in the public domain. DNA has the potential for massive health benefits but also for massive risks,” he said.
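One rough way to gauge the re-identification risk is to count how many records are unique on the details that remain after names are removed. The Python sketch below assumes a handful of invented admission records and a hypothetical helper, `unique_records`; it is an illustration of the idea, not a method described by Professor Bacon-Shone.

```python
from collections import Counter

# Hypothetical "anonymised" admission records: names are removed, but the
# date, hour of admission and condition are retained.
records = [
    {"date": "2017-03-01", "hour": 2, "condition": "fracture"},
    {"date": "2017-03-01", "hour": 2, "condition": "burn"},
    {"date": "2017-03-01", "hour": 14, "condition": "fracture"},
    {"date": "2017-03-01", "hour": 14, "condition": "fracture"},
    {"date": "2017-03-02", "hour": 9, "condition": "concussion"},
]

QUASI_IDENTIFIERS = ("date", "hour", "condition")


def unique_records(rows, keys=QUASI_IDENTIFIERS):
    """Return the rows whose combination of `keys` appears exactly once."""
    counts = Counter(tuple(row[k] for k in keys) for row in rows)
    return [row for row in rows if counts[tuple(row[k] for k in keys)] == 1]


at_risk = unique_records(records)
print(f"{len(at_risk)} of {len(records)} records are unique on {QUASI_IDENTIFIERS}")
```

Anyone who knows from another source, such as a news report, when and how a person was injured could match that person to a unique record. The more details are retained, the more records become unique.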

All of this seems to cry out for regulation. But this, too, is problematic.

Traditional regulation out of step

Personal data protection laws typically require banks and other institutions to keep accurate, up-to-date information and to disclose how it will be used. But when the technology is changing rapidly, with new and unanticipated uses becoming possible, this may no longer be sufficient.

Professor Anne SY Cheung of the Faculty of Law has been studying privacy and personal data protection and is co-editor of the 2015 book Privacy and Legal Issues in Cloud Computing. “Recent legal reforms and position papers from the European Union (EU), the UK and the US have raised concerns about the problem of profiling, predictive decisions and discrimination, and the harm that may result from that. This is because the use of big data is very different from our traditional understanding of how to regulate personal data.

“The traditional approach is essentially one of notice and consent: the collection of personal data is allowed only for a specific and limited purpose. But in the age of big data, the more data one has, the more accurate and arguably useful one’s conclusions will be. So the collector tries to collect as much data as possible and only after they have it and have done their analysis, will they find correlations and identify the purpose,” she said.

It can be difficult to control the use of personal data in these circumstances. The EU will implement a new regulation in May, 2018 on profiling and the use of anonymous data. Among other things, for decisions made about EU citizens using data collected through automated processes, the individual will need to be notified and will have the right to correct or object.

However, this only applies to EU citizens (including those working abroad) and in specific areas such as employment and credit scoring. There are still grey areas.

Professor Cheung cites the example of ‘well-being’ apps that track physical activity and other data on individuals. In the United States, one employer encouraged employees to use such apps and tied this to the health insurance premium it offered its staff. Health data is regulated in the US but not well-being data. The case has ended up in court. “If I’m very concerned about my privacy and don’t want to join such a scheme, would I be punished by having to pay more for my insurance?” she said.

Ethical, risk-based approach needed

Professor Bacon-Shone sees two key issues at play with big data use: transparency and fairness. Transparency means allowing people to correct incorrect information about themselves and making it clear how decisions are made, while fairness means avoiding situations like the US bail example. Importantly, it is not only individuals who face unfair decisions – groups can be targeted, too. For example, Mac computer users will be shown more expensive hotel options when they visit the travel website Orbitz than will PC users.

“When you’re basically saying to a computer, ‘here is all the data, make the best decision for me’ without understanding how that decision is reached, whether it is fair, whether it has unintended consequences, then you have really very challenging questions. These are ethical questions, not just technological,” he said.

Professor Cheung concurs. “We should be talking about the ethical use of big data and artificial intelligence because the law is always behind the technology,” she said.

“I’ve been arguing we should move to a risk-based or harm-based regime instead of just focussing on notice and consent – looking at the use of such data, the risk level, how likely it will be shared with a third party, who will be the downstream party. We should be targeting those uses rather than just seeking broad consent, and we should be getting more specific in terms of the possible usage and the context of that usage.”

Individuals also need to learn to protect themselves. Professor Cheung herself does not use her real name or put much personal information on Facebook (although Facebook managed to correctly guess her secondary school from the data on her friends). She does not use WeChat, electronic wallets, well-being apps or, for the most part, location services. Her friends have told her that she is being too cautious and that privacy is dead. “I don’t know how long I can resist this trend,” she admitted.

“Most people embrace technology and the conveniences and advantages it brings. Of course, big data and artificial intelligence do have advantages that we cannot deny. But in terms of potential risk to the individual, it depends on awareness and education, a discussion about the issues and the approaches of leaders in both government and industry,” she said. For, as the sidebar on China shows, big data raises challenges related not only to privacy but also to governance.


The aggregation of big data can be put to some amazing uses. But there are also risks for the individual.

Professor John Bacon-Shone speaking at the Symposium on ‘Data Protection Law Development in the Information Age’ at the City University of Hong Kong in September, 2016.

China: Big data, big brother?

The use of big data in China raises concerns of an altogether different order from those raised by commercial uses of personal information. The central government is in the process of rolling out a social credit system that draws on big data to rate each individual’s reputation based on their political leanings, purchase history, social interactions and other factors.

“China is like a big data laboratory,” said Professor Cheung, who has been studying the situation there with colleague Dr Clement Chen. “Arguably, there is 360-degree surveillance watching individuals and gathering data. They have real-name registration [for mobile and internet services] and close connections between the government and the banking system and internet companies.”

The social credit system was announced in 2014 and although it will not be fully implemented until 2020, Professor Cheung and Dr Chen have already found that individuals suffer consequences for a low score.

On about five million occasions (as of August, 2016), ‘judgment defaulters’ who defied unspecified court orders were blocked from buying airline tickets. Such individuals were also stopped from travelling on high-speed trains. Low-scorers have also been barred from employment in the civil service and public institutions, and even their children can suffer by being disqualified from studying in private schools.

China does not have a law to protect personal data. Provinces and cities are also introducing their own scoring systems in addition to the national one and it has been suggested that people even be scored for filial piety – how well they take care of their parents. “How would they know? We don’t know. It could be from neighbours, your parents or travel tickets you purchase,” Professor Cheung said.

“This is more than a privacy issue, it is a governance issue, too, because it concerns the relationship between the citizens and the State. Some scholars agree with the government rhetoric that this is to restore trust and sincerity in China after corruption and dishonesty got out of hand. Some say China is the real Orwellian state, with big brother and small brother watching together, which one cannot escape because people use their phones and the internet and there is real-name registration. It’s unresolved, which makes it interesting and challenging to study.”


(Courtesy of Tai Ngai Lung)