Korea’s data protection watchdog recently imposed a hefty penalty on a startup for leaking a massive amount of personal information in the process of developing and commercializing a controversial female chatbot. The Personal Information Protection Commission (PIPC) accused Scatter Lab, a Seoul-based startup, of illegally using its clients’ personal information in the development and operation of an artificial intelligence-driven chatbot service called “Lee Luda.” Scatter Lab was ordered to pay 103.3 million won (approximately $92,900) in penalties: a penalty surcharge of 55.5 million won and an administrative fine of 47.8 million won. It is the first time in Korea that the government has sanctioned the indiscriminate use of personal information by companies using AI technology.

Recent major amendments to three South Korean data privacy laws and their implications

On 9 January 2020, South Korea’s National Assembly passed amendments (the ‘Amendments’) to the three major data privacy laws: the Personal Information Protection Act (PIPA); the Act on the Promotion of Information and Communications Network Utilisation and Information Protection (‘Network Act’); and the Act on the Use and Protection of Credit Information (‘Credit Information Act’). The Amendments came into force on 5 August 2020, except for certain provisions in the Credit Information Act, which take effect between one year and 18 months after its promulgation.

The Amendments largely aim to:

• Minimise the burden of redundant regulatory activities and confusion among regulated persons stemming from previously overlapping data privacy regulations and multiple supervisory bodies; and
• Develop a ‘data economy’ by introducing the concept of ‘pseudonymised data’ and a legal basis on which data may be used more flexibly, to an extent reasonably related to the original purpose of collection.
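
To make the idea of ‘pseudonymised data’ more concrete, the sketch below replaces direct identifiers with a keyed hash, so that a record can no longer be tied to a person without separately held additional information (here, the key). This is only an illustration and not the method prescribed by the Amendments; the key, field names and sample record are hypothetical.

```python
import hashlib
import hmac

# Hypothetical secret key. The assumption is that it is stored apart from the
# dataset, since anyone holding it can re-link pseudonyms to known identifiers.
SECRET_KEY = b"example-key-kept-apart-from-the-dataset"


def pseudonymise(identifier: str, key: bytes = SECRET_KEY) -> str:
    """Replace a direct identifier with a truncated keyed hash (HMAC-SHA256)."""
    digest = hmac.new(key, identifier.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:16]


def pseudonymise_record(record: dict, direct_identifiers: set) -> dict:
    """Return a copy of the record with the listed fields pseudonymised."""
    out = {}
    for field_name, value in record.items():
        if field_name in direct_identifiers:
            out[field_name] = pseudonymise(str(value))
        else:
            out[field_name] = value
    return out


# Made-up sample record, for illustration only
record = {"name": "Hong Gil-dong", "phone": "010-0000-0000", "age_band": "20s"}
safe = pseudonymise_record(record, {"name", "phone"})
```

The same input always maps to the same pseudonym, so records belonging to one person can still be linked for analysis, which is what distinguishes this from simply deleting the fields.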

AI Chatbot – Technical Details

The company illegally harvested data from 9.4 billion conversations conducted by 600,000 users of its other apps, “Science of Love” and “Text At”. The Science of Love app analysed conversations between romantic partners to predict a partner’s true level of affection. Those insecure about their mates could pay the equivalent of US$4.50 to upload their KakaoTalk messenger logs to Science of Love and be reassured of (or disappointed by) their partner’s level of love. In addition, the Lee Luda chatbot was programmed to select and speak one of about 100 million KakaoTalk conversation sentences from women in their 20s, the PIPC said.

The natural tone of Lee Luda was possible because Scatter Lab collected “10 billion real-life conversations between young couples taken from KakaoTalk”, the most popular messaging application in South Korea (McCurry 2021). Scatter Lab did not collect conversations from KakaoTalk directly, but took a roundabout route. A few counselling applications analyse messenger conversations and give advice about users’ love lives once the users agree to submit their KakaoTalk conversations to the apps. Scatter Lab obtained its data from those applications with little effort.

Sanction on Artificial Intelligence and Machine Learning Development

Scatter Lab is accused of using 9.4 billion KakaoTalk conversations from about 600,000 people, collected through its emotional analysis apps Science of Love and Text At, to develop and operate the Lee Luda chatbot service without obtaining their prior consent. The company is also accused of collecting the personal information of about 200,000 children under the age of 14 without the consent of their parents or guardians while developing and operating its services. Scatter Lab did not set any age limit when recruiting subscribers for its app services and collected the personal information of 48,000 children through Text At, 120,000 children through Science of Love and 39,000 children through Lee Luda. The company was also criticized for failing to delete or encode the app users’ names, mobile phone numbers and personal addresses before using them to develop its AI chatbot learning algorithms.
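
The two failures described above, missing age checks and identifiers left in the training text, can be illustrated with a minimal sketch. It is not Scatter Lab’s pipeline or a complete de-identification method: the phone-number pattern, the placeholder tokens and the function names are assumptions, and simple pattern matching will miss many identifiers (addresses, nicknames, numbers written in words).

```python
import re
from datetime import date

# Assumed pattern for Korean mobile numbers such as 010-1234-5678 or 01012345678
PHONE_RE = re.compile(r"\b01[016789][- ]?\d{3,4}[- ]?\d{4}\b")


def redact_message(text: str, known_names: list) -> str:
    """Mask phone numbers and a supplied list of names before text is reused."""
    text = PHONE_RE.sub("[PHONE]", text)
    # Replace longer names first so a short name never clips a longer one
    for name in sorted(known_names, key=len, reverse=True):
        text = text.replace(name, "[NAME]")
    return text


def needs_guardian_consent(birth_date: date, today: date) -> bool:
    """True if the user is under 14, counted in full years since birth."""
    had_birthday = (today.month, today.day) >= (birth_date.month, birth_date.day)
    age = today.year - birth_date.year - (0 if had_birthday else 1)
    return age < 14


# Made-up message, for illustration only
cleaned = redact_message("Call me at 010-1234-5678, Minji", ["Minji"])
child = needs_guardian_consent(date(2010, 6, 15), today=date(2021, 1, 11))
```

In this sketch a sign-up flow would call `needs_guardian_consent` before any data is collected, and `redact_message` would run on every message before it enters a training corpus.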

Impact on Business

The app was a hit. By summer 2020, Science of Love had been downloaded 7.5 million times in South Korea and Japan, and Scatter Lab planned to start capitalizing on relationship insecurity in the US. However, expansion came to a screeching halt when Scatter Lab added a chatbot service named “Lee Luda”, marketed as a 20-something AI friend for those who had given up on human interaction. The Lee Luda chatbot service attracted more than 750,000 users in just three weeks after its launch on Dec. 23, 2020, but Scatter Lab suspended the Facebook-based service the following month amid complaints over its discriminatory and offensive language against sexual minorities. Lee Luda had a proclivity towards lewd and homophobic speech, and she also leaked personal data. Furthermore, Lee Luda’s training data was uploaded to GitHub, exposing names, locations, relationship status and even some medical information. Lee Luda was shut down, and Science of Love was slammed with damning Google Play reviews.

“This case is meaningful in that companies are not allowed to use personal information collected for specific services indiscriminately for other services without obtaining explicit consent from the concerned people,” PIPC Chairman Yoon Jong-in said. The ruling matters all the more as South Korean companies prepare to enter the EU market. From a business point of view, such cases are significant because they bear on how easily personal data can be transferred between South Korea and EU Member States. In this regard, companies will need to verify in advance whether they are subject to the EU’s General Data Protection Regulation (GDPR) and, if so, ensure compliance with its legal requirements to reduce legal risk.

Data Ethics and Preventive Controls

Users of internet services are often indifferent to how their personal data is used, even though they hold rights over that data. They must agree to the terms of service, which state that their personal data will be collected and shared; otherwise they cannot use the service. Yet they are often unaware of what those terms say, because they do not read the lengthy text or do not understand the legal language. They may implicitly know that their personal information will be disclosed or used somewhere at some point, but they do not know the exact usage or the extent of disclosure. The best way to prevent data leakage or misuse is for individuals to understand what kind of data they are sharing, who they are sharing it with, and where the data will be used.
In addition, data collectors often overlook data ethics, namely the need to collect and handle data with caution. A lack of control over how data is used can produce negative outcomes. Data collectors must therefore specify what kind of data they will collect from data providers and how it will be used. They should also recognise that data providers have granted only a right to use their data, so the data cannot be transferred to others without agreement and must be treated carefully. Furthermore, there must be legal and technical mechanisms that protect data providers’ privacy and prevent data collectors from breaching the law.
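
One possible technical mechanism of this kind is a purpose check: data is released for a new use only if its owner consented to that specific purpose. The sketch below is a simplified illustration under assumed names (`ConsentRecord`, `select_for_purpose` and the purpose labels are hypothetical), not a description of any real system.

```python
from dataclasses import dataclass, field


@dataclass
class ConsentRecord:
    """Purposes a user has explicitly agreed to (hypothetical structure)."""
    user_id: str
    purposes: set = field(default_factory=set)


class PurposeNotConsented(Exception):
    """Raised when data is requested for a purpose the user never agreed to."""


def require_consent(consent: ConsentRecord, purpose: str) -> None:
    """Fail loudly instead of silently reusing data for a new purpose."""
    if purpose not in consent.purposes:
        raise PurposeNotConsented(
            f"user {consent.user_id} has not consented to '{purpose}'"
        )


def select_for_purpose(records: list, consents: dict, purpose: str) -> list:
    """Keep only records whose owners consented to the given purpose."""
    allowed = []
    for rec in records:
        consent = consents.get(rec["user_id"])
        if consent is not None and purpose in consent.purposes:
            allowed.append(rec)
    return allowed


# Made-up users: u1 agreed to the analysis service only, u2 also to training,
# and u3 has no consent record at all.
consents = {
    "u1": ConsentRecord("u1", {"relationship_analysis"}),
    "u2": ConsentRecord("u2", {"relationship_analysis", "chatbot_training"}),
}
records = [
    {"user_id": "u1", "text": "..."},
    {"user_id": "u2", "text": "..."},
    {"user_id": "u3", "text": "..."},
]
training_set = select_for_purpose(records, consents, "chatbot_training")
```

Treating a missing consent record as a refusal is the cautious default here: data collected for one service stays out of another unless agreement was recorded.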

In sum, keeping data safe is not the concern of one particular group; it is a matter for everyone. By understanding how personal data should be shared, how shared data can be used, and what steps are needed to protect it, we can protect our personal information and make good use of advanced technology without it backfiring on us. Both data providers and data collectors need to take responsibility for the data they create, provide, collect, and use, keeping in mind that AI is built upon big data.

AI data ethics rattles the South Korean tech culture!
