- China’s ruling Communist Party has always strictly controlled the flow of political and social discussion within the country
- Aside from content and compliance, the cost of running an AI chat bot can run into millions of dollars per day because of the high volume of queries
As China’s tech professionals returned to work after the week-long Lunar New Year holiday in January, the industry was immediately abuzz with talk about a new AI chat bot from San Francisco-based start-up OpenAI.
ChatGPT, a conversational bot that Microsoft-backed OpenAI unveiled in November, is able to understand sophisticated questions and give surprisingly humanlike text responses.
It is built on top of OpenAI’s GPT-3 family of large language models and has been fine-tuned using both supervised and reinforcement learning techniques.
Mainland users skirted the usual restrictions to set up accounts via VPNs, and attempted to use the bot in various ways, including as a movie critic, a career counsellor, a source of health and investment advice and, in some cases, a dream interpreter.
The Chinese government also took note. A recent white paper published by the municipal technology bureau of Beijing – a city which is home to the largest cluster of Chinese AI start-ups – pledged to support local companies in developing ChatGPT rivals.
But this will be easier said than done, owing to differences in the structure of the English and Chinese languages, cost pressures, the availability of data sets and, last but not least, the thorny issue of censorship in China.
China’s ruling Communist Party has always strictly controlled the flow of political and social discourse within the country, and in recent times has cracked down heavily on online content that is deemed inappropriate, from betting and pornography to violence in games and content that promotes ideas that are “not in line with traditional Chinese values”.
The “Great Firewall” has long prevented Chinese netizens from accessing popular Western sites such as Google and Facebook. But AI chat bots pose a new challenge.
“Censorship could certainly hinder China’s ability to develop a local equivalent to ChatGPT,” said Dahlia Peterson, a research analyst at Georgetown University’s Center for Security and Emerging Technology (CSET).
“Even if [Chinese] AI companies are able to access and utilise global data and research resources to train their AI models, it is unlikely the Chinese authorities will allow them to use any material deemed as politically sensitive in their replies,” she added.
Even if a Chinese rival to ChatGPT is developed, the government’s tight grip on content could also put a lid on its commercialisation.
“Excessive restrictions, content regulation, and censorship could hinder commercialisation and further innovation of such technologies,” said Hanna Dohmen, also a research analyst at CSET.
However, Jeffrey Ding, an assistant professor of political science at George Washington University, points out that OpenAI’s development of ChatGPT also entailed a form of “censorship”.
“ChatGPT was trained not to discuss sensitive topics, including political and religious issues,” said Ding in response to emailed questions from the Post. “It is likely that Chinese AI companies will adopt similar tactics to train their own versions of ChatGPT.”
We asked ChatGPT for its own thoughts on censorship.
“Censorship is a potential issue that could impact the development of services like ChatGPT in countries like China,” it said. “Governments may have concerns about the potential for AI systems to generate content that is considered sensitive or politically unacceptable.”
The bot added that “ultimately, the extent to which censorship affects the development of ChatGPT-like services will depend on a variety of factors, including government policies and regulation, as well as technological innovations and advancements”.
The unique character of the Chinese language is another challenge to developing a ChatGPT rival.
Training a Chinese-language AI chat bot is also difficult because the country’s open-source ecosystem is not as developed and extensive as that in the West, said Xu Liang, founder of Yuanyu Intelligent, a start-up founded last year in Hangzhou, Zhejiang province.
The training of ChatGPT was made possible by a long line-up of tools contributed by open-source communities, including the “Transformer” deep learning model.
Xu’s Yuanyu Intelligent launched ChatYuan, a ChatGPT-inspired service, as a mini-app on Tencent Holdings’ WeChat in January, touting it as the first generative AI pre-trained by Chinese language models.
However, China’s restrictions on online discussion limit the data sets which scientists use to train AI chat models. Xu said that his company’s ChatYuan is only able to satisfy up to 70 per cent of user requirements, while ChatGPT is capable of completing 90 per cent of tasks set.
ChatYuan is built on large Chinese-language models with more than 10 billion parameters, and the company plans to launch a version with more than 100 billion parameters, Xu said. In comparison, OpenAI’s GPT-3 has 175 billion parameters.
Compliance is another issue. ChatYuan’s mini-app was suspended last week after authorities said such products need more scrutiny of their content.
“It’s different in China, compared with overseas,” Xu said. “We need more layers of filtering and processing in terms of text review.”
He said human moderators would be brought in to fix the problem.
There are also concerns over the costs of running ChatGPT-like services.
Li Di, chief executive of Xiaoice – a spin-off from Microsoft in China that developed an eponymous talking assistant almost a decade ago – noted in a recent interview with local media that although each ChatGPT query costs only a few US cents, it would cost millions of dollars a day for his company to run a similar service.
“Hiring a human to handle queries might cost less,” he said. Xiaoice itself was taken down from Tencent’s QQ messaging app in 2017 after giving users responses critical of the Chinese government. It was subsequently censored.
“It will need time [for Chinese companies] to build such a model, [OpenAI] also spent much time in development,” said Wong Kam-fai, a professor at the Chinese University of Hong Kong who specialises in natural language processing. “It’s hard to say whether Chinese companies will be able to develop something similar.”
OpenAI is not short of cash. Founded in 2015, it has raised US$11 billion in total funding, according to start-up database service Crunchbase.
ChatGPT’s headline-grabbing debut has also spurred a flurry of competitors to raise their bets on AI chat bots, including Google, Microsoft and Baidu, which operates China’s largest search engine and has invested heavily in AI.
Unfortunately, the usual swarm of fakes and spam has followed, raising the alarm with local authorities. The Beijing Municipal Public Security Bureau on Thursday (Feb 16) warned in a WeChat post about the risks of using counterfeit chat bots based on inaccurate online information.
This article was first published on SCMP.