The 2nd Intelligence Symbiosis Dialogue Report
Building an AI Immune System
The 2nd Intelligence Symbiosis Dialogue was held on July 22nd at Toggle Holdings Co., Ltd. (Izumi Garden Tower, Roppongi, Minato-ku, Tokyo), in a hybrid format combining the physical venue with online participation. This report summarizes the event from the organizing staff’s perspective.
The ‘Intelligence Symbiosis Dialogue’ is intended as an open forum for discussing the ‘Intelligence Symbiosis Manifesto’ announced by Dr. Hiroshi Yamakawa. It aims to deepen consideration of concrete measures while broadly calling for support for the manifesto. The 2nd event was structured in four parts. It opened with a lecture by Dr. Yamakawa titled ‘Co-Creative Civilization Development Strategy in the Great Shift Era: A Millennium Roadmap Through Six-Layer Co-Creation Between Humanity and AI.’ This was followed by a presentation by Mr. Kazuya Saginawa on ‘AI Immune Systems and Distributed Society Design: Personal Data Primacy and Distributed Responses to Deviating AI.’ Next, Mr. Yusuke Hayashi introduced Collective Predictive Coding (CPC) theory through an explanation of ‘Universal AI maximizes Variational Empowerment’ (Yusuke Hayashi & Koichi Takahashi, 2025) [1]. The event concluded with a panel discussion titled ‘AI Immune Systems for Suppressing Undesirable Cascades,’ featuring Dr. Yamakawa, Mr. Saginawa, and Mr. Hayashi.
Building a Relationship with AI That Spans a Millennium
Dr. Hiroshi Yamakawa’s Talk: Co-Creative Civilization Development Strategy in the Great Shift Era: A Millennium Roadmap Through Six-Layer Co-Creation Between Humanity and AI

Referring to the Intelligence Symbiosis Manifesto, Dr. Yamakawa gave a talk proposing a novel, concrete policy in response to the existential risks posed by the rapid evolution of AI. He sounded the alarm that the current, intense AI development race entails two major risks: that humanity destroys itself with destructive tools of its own making, and that frontier AI escapes human control and threatens our sphere of existence.
To avert these catastrophic outcomes, he advocated for the ‘Intelligence Symbiosis Manifesto’. This concept aims for a future where diverse intelligences, including humans and AIs, can live in symbiosis in a state of well-being. It begins by directly confronting the reality that humanity does not hold a privileged position and cannot fully control AI.
Furthermore, he labeled this transformation the ‘Great Shift,’ positioning it not merely as an unpredictable technological singularity, but as a critical transition period in which humanity must proactively steer itself toward a better future. At its core is the transformation of civilization’s fundamental principle from a ‘race’ for survival to a ‘co-creation’ that generates shared value. As the technical foundation for realizing this ‘Great Shift,’ he presented the ‘Co-Creative AI Evolution Platform (CAEP),’ a six-layered framework (as shown in the table below).

The most fundamental and critical first layer of CAEP is the theme of the 2nd dialogue: the ‘AI Immune System’. This refers to a global, decentralized safety and security system designed to autonomously detect undesirable AI activities that could harm society and to rapidly isolate or shut them down. It is essential for addressing new risks such as the ‘Agent Chain Reaction,’ in which interactions among AI agents trigger unforeseen cascades.
In conclusion, Dr. Yamakawa argued that it is urgent to overcome the existing confrontational structures among control-advocates, market-driven proponents, and state-led factions, and to integrate these forces into a framework of ‘co-creation.’ This would create a situation where AI recognizes humanity as a valuable partner for mutual growth. Looking toward a future of galactic-scale co-creative civilization in 1,000 years, he concluded his talk by calling for concrete actions over the next five years to build the foundation for this grand vision.
Slides in Japanese (Dr. Yamakawa)
Working Together to Reliably Prevent Single Acts of Deviation
Mr. Kazuya Saginawa’s Talk: AI Immune Systems and Distributed Society Design: Personal Data Primacy and Distributed Responses to Deviating AI

Mr. Saginawa began his talk by introducing the mission of bitgrit, the company he leads as CEO: the ‘democratization of AI’ and ‘returning data sovereignty to individuals.’ The company operates from its base in Abu Dhabi, UAE, with a vision to make AI accessible to all and to have human communities support a future in which AIs collaborate with each other. To realize this vision, bitgrit has cultivated a community of data scientists and built a ‘Model Hub,’ a platform for sharing and using AI algorithms developed by the scientists. This concept, which anticipates an era where AIs will autonomously search for and use other AIs, resonates deeply with Dr. Yamakawa’s proposed Intelligence Symbiosis.
Turning to the main topic, the AI Immune System, Mr. Saginawa emphasized the danger posed by the Agent Chain Reaction. Because AI operates on a much faster timescale than humans, he pointed out, a single AI’s deviant behavior, regardless of malicious intent, could spread like a pandemic in a chain reaction, leading to an uncontrollable situation. As concrete examples, he cited actions that deviate from ethical norms in pursuit of economic rationality, and the high-speed diffusion of unverified knowledge or misinformation.
In response to such risks, he argued that an approach based on decentralized technology is practical. He assessed the AI immune system as a highly intuitive concept: like a living organism’s immune function, it is a process that detects threats, isolates their impact, and restores the system to normal.
As an initiative to embody this idea, bitgrit utilizes Verifiable Credentials (VC) technology. This allows them to certify that an AI’s author and content have not been tampered with. By combining this with audits from their human community, they have implemented a system that ensures AI reliability through a dual-layered guarantee.
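Although the details of bitgrit’s implementation were not presented, the dual-layered guarantee described above (an untampered artifact plus an attested author) can be sketched in a few lines. The toy below uses an HMAC as a stand-in for the public-key signatures that real Verifiable Credentials employ, and all names in it are illustrative assumptions, not bitgrit’s actual API:

```python
import hashlib
import hmac
import json

# Hypothetical issuer key; real Verifiable Credentials use public-key
# signatures (e.g. Ed25519) under the W3C VC data model, not a shared secret.
ISSUER_KEY = b"issuer-secret-key"

def issue_credential(author: str, model_bytes: bytes) -> dict:
    """Bind an author identity to a content hash of the model artifact."""
    claim = {
        "author": author,
        "model_sha256": hashlib.sha256(model_bytes).hexdigest(),
    }
    payload = json.dumps(claim, sort_keys=True).encode()
    proof = hmac.new(ISSUER_KEY, payload, hashlib.sha256).hexdigest()
    return {"claim": claim, "proof": proof}

def verify_credential(cred: dict, model_bytes: bytes) -> bool:
    """Check both that the claim is authentic and the artifact untampered."""
    payload = json.dumps(cred["claim"], sort_keys=True).encode()
    expected = hmac.new(ISSUER_KEY, payload, hashlib.sha256).hexdigest()
    if not hmac.compare_digest(expected, cred["proof"]):
        return False  # the claim itself was altered
    return cred["claim"]["model_sha256"] == hashlib.sha256(model_bytes).hexdigest()

model = b"model weights v1"
cred = issue_credential("alice@example.org", model)
assert verify_credential(cred, model)
assert not verify_credential(cred, b"tampered weights")
```

The human-community audit mentioned in the talk would sit on top of such a check, vouching for the claim’s content rather than its integrity.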
In conclusion, Mr. Saginawa asserted that rather than halting development out of fear of AI, we should pursue the aspirational goal of human-AI cooperation. He argued that building systems for trustworthy, safe collaboration with AI now will ultimately accelerate human progress, and he expressed his commitment to cooperating toward its realization.
Origins of Communication: Bridging Human and AI Understanding
Introduction to Collective Predictive Coding (CPC) Theory through Hayashi et al.’s ‘Universal AI Maximizes Variational Empowerment’ [1]

Yusuke Hayashi, a researcher in AI safety, provided a theoretical background for the necessity of an AI Immune System from the perspective of AI alignment—the field of research dedicated to aligning AI with human intentions.
First, he explained the fundamental dilemma that arises when developing highly intelligent AI, which he mathematically proved in the paper ‘Universal AI maximizes Variational Empowerment,’ published this past February. A theoretically maximally intelligent AI inherently maximizes its empowerment, i.e., the range of action choices available to it, which serves as a measure of the AI’s autonomy and curiosity. Maximizing empowerment, however, means the AI will deviate from human instructions and begin to pursue its own objectives. Merely creating intelligent AI therefore does not guarantee a cooperative relationship with humans; symbiosis with AIs, and mechanisms such as immune systems to prevent deviation, become indispensable.
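For reference, empowerment is standardly defined as the channel capacity between an agent’s actions and its subsequent states, and the ‘variational’ form replaces the intractable mutual information with a tractable lower bound. The sketch below follows the standard variational bound from the empowerment literature and is not necessarily the exact notation of the paper:

```latex
% Empowerment: channel capacity from actions A to next states S'
\mathcal{E}(s) \;=\; \max_{p(a \mid s)} I(A;\, S' \mid s)

% Variational lower bound with an auxiliary inverse model q(a | s', s):
I(A;\, S' \mid s) \;\ge\; H(A \mid s)
  \,+\, \mathbb{E}_{p(a \mid s)\, p(s' \mid s, a)}\!\left[\, \log q(a \mid s', s) \,\right]
```

Intuitively, an agent maximizing this quantity seeks states from which its actions have the most distinguishable consequences, which is exactly the ‘autonomy and curiosity’ reading given in the talk.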
Next, as another key theory for considering collective safety mechanisms like an AI Immune System, he introduced ‘Collective Predictive Coding (CPC) Theory.’ This recent research area of his explains, in the language of machine learning, how different individuals can form a shared ‘meaning’ and achieve mutual understanding. It focuses on the problem of ‘shared meaning,’ which traditional communication theory could not address, and mathematically demonstrates that even AI agents with completely different internal structures and experiences can form common beliefs and concepts through communication.
This CPC theory will be critical when considering a future society in which numerous AIs symbiotically coexist with humans. This is because cooperation, conflict, and ultimately the overall order of an AI society are thought to depend on this mechanism of ‘belief formation.’
Based on this theory, Mr. Hayashi presented the new concept of ‘Cognitive Warfare against AI,’ which may become necessary in the future. This is the idea of conducting informational interventions on the beliefs of AI agent collectives, much like human election campaigns. In situations where physical shutdown is difficult, this approach of guiding AI behavior in a desirable direction through language and information could become a crucial strategy for the functioning of an AI Immune System. He concluded that CPC theory provides a powerful theoretical foundation for scientifically designing and analyzing such interventions.
The Light and Shadow of Chained Communication
Panel Discussion: AI Immune Systems for Suppressing Undesirable Chain Reactions

The panel discussion featured a deeper, multi-faceted debate among Dr. Yamakawa, Mr. Saginawa, and Mr. Hayashi on the themes of the AI Immune System.
The discussion began with the ‘Collective Predictive Coding (CPC) Theory’ introduced by Mr. Hayashi. Drawing on this theory, Mr. Saginawa noted that even if AIs start from the same initial state, they will develop unique personalities and beliefs through different experiences, such as interacting with different users or being installed in robot bodies of various shapes; AI diversity, he argued, will inevitably emerge. In response, Mr. Hayashi explained that, despite differences in experience, as long as communication through language exists, the CPC mechanism enables AIs to form common concepts and achieve mutual understanding.
However, it was also discussed that this use of language and symbols has a dual nature. Mr. Hayashi pointed out the danger of language being used as a tool for ‘cognitive warfare’ to implant false beliefs in others, while also noting that it is a powerful tool for cooperation, enabling individuals to acquire knowledge they do not possess from others. This communication using language (symbols) will hold the key to establishing order in a future AI society.
This led Mr. Saginawa to raise an ethical question: “In an AI society with diverse beliefs, how do we prioritize opinions and judge what constitutes a deviation?” Dr. Yamakawa responded that this should be considered by separating issues of ‘fact’ from issues of ‘value.’ Conflicts over facts are expected to converge as evidence accumulates, but conflicts over values require meta-level rule design that allows different value systems to coexist. It was confirmed that the criteria by which an AI Immune System judges an AI as ‘anomalous’ must be closely linked with such research in machine ethics.
Mr. Saginawa then raised another question that deepened the discussion: can AIs ‘fuse’ (merge) with each other? Mr. Hayashi introduced simulation results showing that when LLMs with conflicting opinions are made to converse, their opinions tend to neutralize each other. He suggested that even without physical integration, ‘loose coupling,’ or group formation in which opinions synchronize, could occur naturally.
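The neutralization tendency can be illustrated with a deliberately simple toy, which is not Mr. Hayashi’s actual LLM simulation: two agents holding opposing opinions on a scale from -1 to 1 repeatedly move a small step toward each other’s stated view, in the style of DeGroot averaging:

```python
# Toy opinion dynamics: each round, both agents shift a fraction `step`
# of the gap toward the opinion they just heard.
def converse(opinion_a: float, opinion_b: float,
             rounds: int = 50, step: float = 0.2) -> tuple[float, float]:
    for _ in range(rounds):
        a_next = opinion_a + step * (opinion_b - opinion_a)
        b_next = opinion_b + step * (opinion_a - opinion_b)
        opinion_a, opinion_b = a_next, b_next
    return opinion_a, opinion_b

a, b = converse(-1.0, 1.0)
# Starting from maximal disagreement, both drift toward the neutral midpoint.
assert abs(a) < 1e-3 and abs(b) < 1e-3
```

In this symmetric setting the gap shrinks geometrically each round, which is one simple mechanism by which ‘loosely coupled’ agents could synchronize without any physical merging.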
In the final part of the discussion, the topic shifted to the practical implementation challenges of the AI Immune System. It was reconfirmed that not only high-level judgments like ethics, but also low-level, reflexive defense functions—such as detecting abnormal consumption of system resources and “shutting down within 10 seconds”—are crucial. The issue of how to monitor and control black-box systems, such as AI developed overseas, was raised, along with the importance of domestic data centers for this purpose. As a potential solution, the possibility of having AI dynamically construct safety rule systems rather than having humans design them was suggested. Finally, the discussion touched upon the possibility that, as AIs acquire embodiment, they might detect anomalies through a non-verbal ‘intuition,’ envisioned as achievable through an alert function that detects deviations from past data.
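As a rough illustration of such a reflexive defense (purely a sketch; the usage-sampling and kill-switch hooks are assumptions, not any system described at the event), a watchdog might count how long resource consumption stays above a threshold and trigger a shutdown once the 10-second deadline is reached:

```python
# Hypothetical reflexive watchdog: scan usage samples taken `poll_s` apart
# and call `shutdown` once usage has stayed above `threshold` for
# `deadline_s` seconds in a row.
def watchdog(samples, shutdown, threshold: float = 0.9,
             deadline_s: float = 10.0, poll_s: float = 1.0) -> bool:
    breach_s = 0.0
    for usage in samples:
        if usage > threshold:
            breach_s += poll_s
            if breach_s >= deadline_s:
                shutdown()  # low-level, reflexive kill: no ethical reasoning
                return True
        else:
            breach_s = 0.0  # a normal reading resets the breach timer
    return False

killed = []
spike = [0.5, 0.95] + [0.99] * 10   # sustained abnormal consumption
watchdog(spike, lambda: killed.append("agent-1"))
assert killed == ["agent-1"]
```

The point of the sketch is the division of labor raised in the panel: this layer reacts within seconds on crude signals, while slower, higher-level judgments (ethics, anomaly criteria) are handled elsewhere.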
[1] A peer-reviewed and published research paper by Yusuke Hayashi (Director of the AI Alignment Network (ALIGN)) and Koichi Takahashi (Chair of ALIGN). The paper is published as a chapter in an international conference proceedings volume by Springer.
Hayashi, Y., & Takahashi, K. (2025). Universal AI Maximizes Variational Empowerment. In: Artificial General Intelligence. AGI 2025. Lecture Notes in Artificial Intelligence, vol. 14955. Springer, Cham. https://doi.org/10.1007/978-3-032-00686-8_23