Autonomously Generated Knowledge
We envision the coming cognitive systems era in which, on a daily basis, hundreds of millions of people, if not billions of people, across the world will be collaborating with cogs able to perform expert-level cognition. Furthermore, these cogs will be able to communicate with each other and interact with each other at computer speeds far in excess of human/cog interaction speed. With billions of cogs out there thinking billions of times faster than us humans, 24 hours a day, 7 days a week and processing more information in a few seconds than a human can in a lifetime, we might naturally expect cognitive systems to create original ideas, draw new conclusions, construct new theories, synthesize new solutions, or come to new realizations. This represents newly created knowledge and new intellectual property. What are the implications of autonomously-generated knowledge and automated knowledge discovery? An interesting question the future will have to answer is who owns autonomously generated knowledge? Does a company or organization own it? Does a person own it? Is it communal property of the people at large? Can it be bought, sold, traded, and bequeathed to heirs?
As described in Chapter 4, in the 1950s and 1960s, early systems in artificial intelligence required knowledge to be laboriously captured by human researchers and encoded into symbolic expressions. In the 1970s and 1980s, expert systems attempted to capture expertise in banks of production rules, which also required tremendous human knowledge engineering effort.
In the 1980s and early 1990s, mining of extremely large databases became possible, spawning the field of knowledge discovery from databases (KDD). Frawley defined knowledge discovery as "the nontrivial extraction of implicit, previously unknown, and potentially useful information from data" (Frawley, 1992). Silberschatz and Tuzhilin (1995) introduce a framework of knowledge discovery describing how systems define vital information and apply measures of interestingness to discover useful patterns. Grobelnik and Mladenic (2005) define the goal of automated knowledge discovery as finding "useful pieces of knowledge within the data with none or little human involvement" and identify the following among the areas in which it can be used with great utility: document categorization, document clustering and similarity, document visualization, user profiling, ontology learning, dealing with unlabeled data, information retrieval, and text mining.
KDD represents a transition from engineered data to unstructured data. At the outset of KDD, researchers labored over the quality of the data in databases used for input to machine learning systems. An early workgroup stated "databases were an integral part of knowledge discovery but could be inefficient depending on the quantity and quality of data used" (Shapiro, 1990). Researchers argued over the inclusion of commonsense and general domain knowledge not expressly contained in the database. However, what has evolved since then is data mining and machine learning from unstructured data—not involving a database at all.
An Association for Computing Machinery (ACM) working group defines data mining as the process of discovering patterns in large data sets involving methods at the intersection of machine learning, statistics, and database systems (Chakrabarti et al., 2006). Data mining involves six common classes of tasks (Fayyad et al., 1996), including:
• Anomaly detection: identification of unusual data requiring further investigation
• Regression: modeling data with the least error for estimating relationships among the data
• Summarization: a more compact representation of the data set
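To make these task classes concrete, the following is a minimal sketch of three of them (anomaly detection, regression, and summarization) using only the Python standard library. The function names, the z-score threshold, and the sample data are illustrative assumptions, not part of any cited framework; production data mining systems use far more robust methods.

```python
from statistics import mean, median, pstdev


def find_anomalies(values, threshold=2.0):
    """Anomaly detection: flag values whose z-score magnitude exceeds threshold.

    The threshold of 2.0 is an arbitrary illustrative choice.
    """
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return []  # all values identical, nothing stands out
    return [v for v in values if abs(v - mu) / sigma > threshold]


def fit_line(xs, ys):
    """Regression: ordinary least squares fit of y = slope * x + intercept."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept


def summarize(values):
    """Summarization: a compact description of the data set."""
    return {
        "count": len(values),
        "min": min(values),
        "max": max(values),
        "mean": mean(values),
        "median": median(values),
    }


if __name__ == "__main__":
    readings = [10, 11, 9, 10, 12, 10, 11, 50]  # hypothetical sample data
    print(find_anomalies(readings))
    print(fit_line([1, 2, 3, 4], [3, 5, 7, 9]))
    print(summarize(readings))
```

In this sketch, the single reading far from the rest is the only value flagged, the fitted line recovers the slope and intercept of the perfectly linear sample, and the summary reduces eight readings to five descriptive numbers.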
Kurgan developed the Knowledge Discovery and Data Mining Models (KDDM). Kurgan believes these models "would help organizations to understand the knowledge discovery processes and provide a roadmap to follow while planning and carrying out the projects" (Kurgan, 2006). The traditional method of turning data into knowledge relies on manual analysis and interpretation. Fayyad et al. believe this process is slow, expensive, and highly subjective, and could be made more effective by using machine learning and data mining to automate discovery in databases.
Deep learning is a class of machine learning algorithms using multiple layers to progressively extract higher-level features from the raw data. Recent successes in artificial intelligence have centered on deep-learning systems capable of unsupervised learning from unstructured data. Many scientists have concluded that analyzing data and databases could lead to new discoveries, and that machines could be programmed to analyze and learn from the patterns in that data. Using machine learning, computers could follow the framework of knowledge discovery to solve common problems and make new discoveries.
Early machine learning systems required carefully engineered training sets prepared by humans—supervised learning. However, recently, systems have been developed requiring little human intervention—semi-supervised learning. The most recent systems can learn from unstructured data on their own without human intervention—unsupervised learning.
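The difference between learning with and without human-prepared labels can be illustrated with a small sketch. The first two functions below form a supervised nearest-centroid classifier that needs a label for every training point; the third is an unsupervised two-cluster k-means that receives no labels at all. All names and data are hypothetical, the data is one-dimensional for brevity, and the deterministic min/max initialization is a simplifying assumption.

```python
from statistics import mean


def train_nearest_centroid(points, labels):
    """Supervised: compute one centroid per human-provided label."""
    groups = {}
    for point, label in zip(points, labels):
        groups.setdefault(label, []).append(point)
    return {label: mean(members) for label, members in groups.items()}


def predict(centroids, point):
    """Assign the label whose centroid is closest to the point."""
    return min(centroids, key=lambda label: abs(point - centroids[label]))


def two_means(points, iterations=10):
    """Unsupervised: split unlabeled 1-D points into two clusters (k-means, k=2).

    Centers start at the minimum and maximum values (an illustrative choice).
    """
    low, high = min(points), max(points)
    for _ in range(iterations):
        low_group = [p for p in points if abs(p - low) <= abs(p - high)]
        high_group = [p for p in points if abs(p - low) > abs(p - high)]
        if not high_group:
            break  # all points collapsed into one cluster
        low, high = mean(low_group), mean(high_group)
    return low, high


if __name__ == "__main__":
    data = [1.0, 1.2, 0.8, 5.0, 5.2, 4.8]  # hypothetical measurements
    tags = ["small", "small", "small", "large", "large", "large"]
    model = train_nearest_centroid(data, tags)
    print(predict(model, 1.1), predict(model, 4.0))
    print(two_means(data))
```

Both approaches end up with roughly the same two group centers on this data, but only the supervised version can name the groups, because the names came from a human. The unsupervised version finds the structure on its own and leaves interpretation open.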
If systems can learn on their own, can they discover or create new knowledge on their own? Advancements in automated knowledge discovery have led to improvements in the fields of science and healthcare. Varun Chandola states that "knowledge discovery could be used to analyze data to identify fraud, waste, and abuse in the healthcare system" (Chandola et al., 2013). Chandola believes that, through analytics, machines could discover new ways to improve the healthcare system.
Lu Zhang used machine learning to automate the drug discovery process (Zhang, 2017). Using machine learning, Zhang built models able to identify potential biologically active molecules from millions of candidates quickly and cheaply. Zack Ulissi and his team are using machine learning to search for new intermetallics that make good electrocatalysts for carbon dioxide reduction and hydrogen evolution (Ulissi et al., 2017; Tran and Ulissi, 2018). Automated knowledge discovery has sped up Ulissi's research significantly.
Today, we have cogs able to automate the most time-consuming tasks of knowledge discovery done in collaboration with humans—representing Level 3 or Level 4 cognitive augmentation as explained in Chapter 3. Therefore, the next step is Level 5, knowledge discovery without human involvement.
We can already see the beginnings of cogs being able to discover new knowledge themselves. Ornes (2019) describes how programmers at OpenAI recently taught a collection of intelligent artificial agents (bots) to play hide-and-seek. The goal was to observe how competition between hiders and seekers would evolve. Even though the bots had not received explicit instructions about how to play, they soon learned to run away and chase. After hundreds of millions of games, they learned to manipulate their environment to give themselves an advantage. The hiders, for example, learned to build miniature forts and barricade themselves inside; the seekers, in response, learned how to use ramps to scale the walls and find the hiders. None of these strategies were built into the system; the bots discovered them on their own.
In 2016, AlphaGo defeated the world human Go champion Lee Sedol. Analysis of AlphaGo's moves showed AlphaGo followed a process of learning to play like humans do but then abandoned those methods in favor of its own strategy. At first, like a human beginner, AlphaGo attempted to quickly capture as many of its opponent's stones as possible. But as training continued, the program improved by discovering successful new maneuvers. It learned to lay the groundwork early for long-term strategies like "life and death," which involves positioning stones in ways that prevent their capture (Baker and Hui, 2017). AlphaGo also developed a "win by just enough to win" strategy whereby it would not seek to capture large numbers of the opponent's stones if it did not need them to win. Most human players would capture as many as possible even though they did not need that many to win. These strategies represent new ways of approaching the game.
Autonomous Knowledge Discovery in Healthcare
Nura Esfandiari believes automated knowledge discovery will affect the healthcare industry, stating that automated knowledge discovery "could explore patterns from Alzheimer's data by using visualization techniques to gain a better understanding about the causes and potential solutions of the disease" (Esfandiari, 2014). It is possible cogs will discover new drugs and cures that scientists would not otherwise have discovered for hundreds of years.
Automated knowledge discovery can displace some professionals in the healthcare field. Obermeyer (2016) states "much of the work of radiologists and pathologists will be displaced by automated knowledge discovery." Cogs have the ability to compare large quantities of patient data and images, diagnose, and recommend treatment for these patients. Already, cogs have exceeded human performance in some areas as discussed in Chapter 7.
However, instead of simply putting radiologists out of work, we foresee the democratization of healthcare services. Using expert-level cogs to perform the analytical work of specially trained personnel means such services can be offered anywhere, including by small offices each employing only a few people. Thus, radiological and pathological services could be distributed across the nation throughout suburban areas and through major pharmaceutical retailers such as CVS, Walmart, and Walgreens. When this happens, the cost of these services will drop dramatically. Indeed, we have already seen this kind of distribution of services over recent years with prescriptions and eyecare.
Autonomous Knowledge Discovery in Business and Military
Another area in which automated knowledge discovery stands to make significant contributions is the business world. Businesses will create cognitive systems able to analyze tremendous volumes of information, answer complex questions, create solutions, discover new associations and relationships, and identify ways to break into new markets. Cogs will help businesses create a competitive edge, improve existing products, vet investments, and analyze mergers and acquisitions.
Automated knowledge discovery figures to affect the military as well. Advancements are already being made in this area, but in the future automated knowledge discovery will transform warfare by analyzing strategies and discovering new plans not thought of before. Computers able to devise plans that create a tactical advantage will be extremely valuable.
Automated knowledge discovery applied to intelligence could yield cogs able to discover new threats and decipher the intentions of the enemy long before human analysts can. Also, these types of cogs could synthesize novel plans no human has ever thought of, similar to systems developing new gaming strategies. When the enemy begins using automated knowledge discovery to conceive of attack plans, we will have to use automated knowledge discovery to counteract them. Are we looking at a cognitive systems arms race?