# Uncertainty and probability theory

There is a long-established custom in science of dealing with uncertainty, whatever its form and nature, through the use of probability theory. The successes of this tradition are undeniable. But as we move further into the age of machine intelligence and automated decision making, a basic limitation of probability theory becomes a serious problem. More specifically, standard probability theory (PT) is, in large measure, unable to deal with information described in natural language; to put it simply, PT does not have NL-capability. Here are a few relatively simple examples:

*Trip planning:* I am planning to drive from Berkeley to Santa Barbara, with a stopover for lunch in Monterey. Usually it takes about two hours to get to Monterey. Usually it takes about one hour to have lunch. It is likely that it will take about five hours to get from Monterey to Santa Barbara. At what time should I leave Berkeley to get to Santa Barbara, with high probability, before about 6 p.m.?

*Balls-in-box:* A box contains about 20 balls of various sizes. Most are large. What is the number of small balls? What is the probability that a ball drawn at random is neither small nor large?

*Temperature:* Usually, the temperature is not very low and not very high. What is the average temperature?

*Tall Swedes:* Most Swedes are tall. How many are short? What is the average height of Swedes?

*Flight delay:* Usually, most United Airlines flights from San Francisco leave on time. What is the probability that my flight will be delayed?

*Maximization:* *f* is a function from reals to reals described as: If *X* is small then *Y* is small; if *X* is medium then *Y* is large; if *X* is large then *Y* is small. What is the maximum of *f*?

*Expected value:* *X* is a real-valued random variable. Usually, *X* is much larger than approximately *a* and much smaller than approximately *b*, where *a* and *b* are real numbers, with *a* < *b*. What is the expected value of *X*?

*Vera's age:* Vera has a son who is in his mid-twenties, and a daughter who is in her mid-thirties. What is Vera's age? This example differs from the others in that answering the question requires information drawn from world knowledge. More specifically: (a) childbearing age ranges from about 16 to about 42; and (b) the age of the mother is the sum of the age of the child and the age of the mother when the child was born.
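The role of world knowledge in the Vera example can be illustrated with a rough sketch that replaces the fuzzy quantities by crisp intervals. This is a simplification for illustration only, not the GTU treatment: the readings of "mid-twenties" as [24, 26] and "mid-thirties" as [34, 36] are assumptions, as are the helper names `add` and `intersect`, and the hedges "about 16" and "about 42" are taken as exact bounds.

```python
from typing import Tuple

Interval = Tuple[float, float]

def add(a: Interval, b: Interval) -> Interval:
    """Interval sum: every value of a plus every value of b."""
    return (a[0] + b[0], a[1] + b[1])

def intersect(a: Interval, b: Interval) -> Interval:
    """Values that satisfy both interval constraints."""
    lo, hi = max(a[0], b[0]), min(a[1], b[1])
    if lo > hi:
        raise ValueError("constraints are inconsistent")
    return (lo, hi)

# World knowledge (a): childbearing age, with "about" dropped (assumption).
CHILDBEARING: Interval = (16, 42)

# Assumed crisp readings of the linguistic descriptions.
SON_AGE: Interval = (24, 26)       # "mid-twenties"
DAUGHTER_AGE: Interval = (34, 36)  # "mid-thirties"

# World knowledge (b): mother's age = child's age + mother's age at birth.
from_son = add(SON_AGE, CHILDBEARING)            # (40, 68)
from_daughter = add(DAUGHTER_AGE, CHILDBEARING)  # (50, 78)

# Both constraints hold for the same person, so intersect them.
vera_age = intersect(from_son, from_daughter)
print(vera_age)  # (50, 68)
```

Under these assumptions Vera is between 50 and 68. The crisp bounds are sharper than the original statement warrants; a graded treatment would let the edges of the range fade rather than cut off abruptly.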

In recent years, important contributions have been made to enhancing the capabilities of PT (Bouchon-Meunier, Yager and Zadeh, 2000; Colubi et al., 2001; Dubois and Prade, 1992; 1994; Nguyen, 1993; Nguyen, Kreinovich and Di Nola, 2003; Puri and Ralescu, 1993; Smets, 1996; Singpurwalla and Booker, 2004; Yager, 2002). Particularly worthy of note are random set-based theories (Orlov, 1980; Wang and Sanchez, 1982; Goodman and Nguyen, 1985), among them the Dempster-Shafer theory (Dempster, 1967; Shafer, 1976); Klir's generalized information theory (Klir, 2004; 2006); and theories of imprecise probabilities (Walley, 1991; de Cooman, 2005). The generalized theory of uncertainty (GTU) differs from other theories in three important respects. First, the thesis that information is statistical in nature is replaced by a much more general thesis that information is a generalized constraint (Zadeh, 1986), with statistical uncertainty being a special, albeit important case. Equating information to a generalized constraint is the fundamental thesis of GTU. In symbolic form, the thesis may be expressed as

$$I(X) = GC(X),$$

where *X* is a variable taking values in *U*; *I(X)* is information about *X*; and *GC(X)* is a generalized constraint on *X*.

Second, bivalence is abandoned throughout GTU, and the foundation of GTU is shifted from bivalent logic to fuzzy logic (Zadeh, 1975a; 1975b; Novak, Perfilieva and Mockor, 1999). As a consequence, in GTU everything is, or is allowed to be, a matter of degree or, equivalently, fuzzy. Concomitantly, all variables are, or are allowed to be, granular, a granule being a clump of values defined by a generalized constraint (Zadeh, 1979a; 1979b; 1997; 1999).

And third, one of the principal objectives of GTU is to achieve NL-capability. Why is NL-capability an important capability? Principally because much of human knowledge and real-world information is expressed in natural language. Basically, a natural language is a system for describing perceptions. Perceptions are intrinsically imprecise, reflecting the bounded ability of human sensory organs, and ultimately the brain, to resolve detail and store information. Imprecision of perception is passed on to natural languages. It is this imprecision that severely limits the ability of PT to deal with information described in natural language. NL-capability of GTU is the focus of attention in the present chapter.

A concomitant of GTU's NL-capability is its ability to deal with perception-based information (see Figure 6.1). Much information about subjective probabilities is perception-based. An earlier paper (Zadeh, 2002) describes a generalization of PT that leads to a perception-based theory, PTp, of probabilistic reasoning with imprecise probabilities. PTp is subsumed by GTU.

What follows is a precis of GTU. An exposition of an earlier but more detailed version of GTU may be found in Bouchon-Meunier, Yager and Zadeh (2000), which forms the basis for the present chapter.

The centerpiece of GTU is the concept of a generalized constraint, a concept drawn from fuzzy logic. The principal distinguishing features of fuzzy logic are (a) graduation and (b) granulation. More specifically, in fuzzy logic everything is, or is allowed to be, graduated, that is, a matter of degree or, more or less equivalently, fuzzy. Furthermore, in fuzzy logic all variables are allowed to be granulated, a granule being a clump of values drawn together by indistinguishability, similarity, proximity, or functionality (see Figure 6.2). Graduation and granulation underlie the concept of a linguistic variable (Zadeh, 1973), a concept that plays a key role in almost all applications of fuzzy logic (Yen, Langari and Zadeh, 1995). More fundamentally, graduation and granulation have a position of centrality in human cognition. This is one of the basic reasons why fuzzy logic may be viewed as a model of human reasoning.

*Figure 6.1* Measurement-based vs. perception-based information

*Figure 6.2* Logical systems
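Graduation and granulation, as they combine in a linguistic variable, can be made concrete with a small sketch. It models a hypothetical linguistic variable for height whose granules *short*, *medium*, and *tall* are trapezoidal fuzzy sets. The trapezoidal shape, the breakpoints in centimetres, and the names `trapezoid`, `HEIGHT_TERMS`, and `grades` are all illustrative assumptions, not definitions taken from the cited works.

```python
import math
from typing import Dict, Tuple

def trapezoid(x: float, a: float, b: float, c: float, d: float) -> float:
    """Grade of membership: 0 outside (a, d), 1 on [b, c], linear in between."""
    if x <= a or x >= d:
        return 0.0
    if b <= x <= c:
        return 1.0
    if x < b:
        return (x - a) / (b - a)   # rising edge
    return (d - x) / (d - c)       # falling edge

INF = math.inf

# Granules of the linguistic variable "height" (breakpoints are assumptions).
HEIGHT_TERMS: Dict[str, Tuple[float, float, float, float]] = {
    "short":  (-INF, -INF, 160.0, 170.0),
    "medium": (160.0, 170.0, 175.0, 185.0),
    "tall":   (175.0, 185.0, INF, INF),
}

def grades(height_cm: float) -> Dict[str, float]:
    """Degree to which a height belongs to each granule."""
    return {term: trapezoid(height_cm, *params)
            for term, params in HEIGHT_TERMS.items()}

print(grades(180.0))  # {'short': 0.0, 'medium': 0.5, 'tall': 0.5}
```

A height of 180 cm is medium to degree 0.5 and tall to degree 0.5: membership is a matter of degree (graduation), and the three overlapping clumps of values play the role of granules (granulation).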

NL-computation is the core of precisiated natural language (PNL) (Zadeh, 2004a; 2004b). Basically, PNL is a fuzzy logic-based system for computation and deduction with information described in natural language. A forerunner of PNL is PRUF (Zadeh, 1984). We begin with a brief exposition of the basics of NL-computation in the context of GTU.