Computation and Representation


  • Distinguish between specification and implementation , or equivalently between mathematical functions and algorithms/programs .
  • Representing an object as a string (often of zeroes and ones).
  • Examples of representations for common objects such as numbers, vectors, lists, and graphs.
  • Prefix-free representations.
  • Cantor’s Theorem: The real numbers cannot be represented exactly as finite strings.
“The alphabet (sic) was a great invention, which enabled men (sic) to store and to learn with little effort what others had learned the hard way – that is, to learn from books rather than from direct, possibly painful, contact with the real world.” , B.F. Skinner
“The name of the song is called `HADDOCK’S EYES.’” [said the Knight] “Oh, that’s the name of the song, is it?” Alice said, trying to feel interested. “No, you don’t understand,” the Knight said, looking a little vexed. “That’s what the name is CALLED. The name really is `THE AGED AGED MAN.’” “Then I ought to have said `That’s what the SONG is called’?” Alice corrected herself. “No, you oughtn’t: that’s quite another thing! The SONG is called `WAYS AND MEANS’: but that’s only what it’s CALLED, you know!” “Well, what IS the song, then?” said Alice, who was by this time completely bewildered. “I was coming to that,” the Knight said. “The song really IS `A-SITTING ON A GATE’: and the tune’s my own invention.” Lewis Carroll, Through the Looking-Glass

To a first approximation, computation is a process that maps an input to an output .


When discussing computation, it is essential to separate the question of what task we need to perform (i.e., the specification) from the question of how we achieve that task (i.e., the implementation). For example, as we’ve seen, there is more than one way to achieve the computational task of computing the product of two integers.

In this chapter we focus on the what part, namely defining computational tasks. For starters, we need to define the inputs and outputs. Capturing all the potential inputs and outputs that we might ever want to compute seems challenging, since computation today is applied to a wide variety of objects. We do not compute merely on numbers, but also on texts, images, videos, connection graphs of social networks, MRI scans, gene data, and even other programs. We will represent all these objects as strings of zeroes and ones , that is objects such as \(0011101\) or \(1011\) or any other finite list of \(1\) ’s and \(0\) ’s. (This choice is for convenience: there is nothing “holy” about zeroes and ones, and we could have used any other finite collection of symbols.)


Today, we are so used to the notion of digital representation that we are not surprised by the existence of such an encoding. But it is actually a deep insight with significant implications. Many animals can convey a particular fear or desire, but what is unique about humans is language : we use a finite collection of basic symbols to describe a potentially unlimited range of experiences. Language allows transmission of information over both time and space and enables societies that span a great many people and accumulate a body of shared knowledge over time.

Over the last several decades, we have seen a revolution in what we can represent and convey in digital form. We can capture experiences with almost perfect fidelity, and disseminate them essentially instantaneously to an unlimited audience. Moreover, once information is in digital form, we can compute over it, and gain insights from data that were not accessible in prior times. At the heart of this revolution is the simple but profound observation that we can represent an unbounded variety of objects using a finite set of symbols (and in fact using only the two symbols 0 and 1 ).

In later chapters, we will typically take such representations for granted, and hence use expressions such as “program \(P\) takes \(x\) as input” when \(x\) might be a number, a vector, a graph, or any other object, when we really mean that \(P\) takes as input the representation of \(x\) as a binary string. However, in this chapter we will dwell a bit more on how we can construct such representations.

The main takeaways from this chapter are:

We can represent all kinds of objects we want to use as inputs and outputs using binary strings . For example, we can use the binary basis to represent integers and rational numbers as binary strings (see Section 2.1.1 and Section 2.2 ).

We can compose the representations of simple objects to represent more complex objects. In this way, we can represent lists of integers or rational numbers, and use that to represent objects such as matrices, images, and graphs. Prefix-free encoding is one way to achieve such a composition (see Section 2.5.2 ).

A computational task specifies a map from an input to an output— a function . It is crucially important to distinguish between the “what” and the “how”, or the specification and implementation (see Section 2.6.1 ). A function simply defines which output corresponds to which input. It does not specify how to compute the output from the input, and as we’ve seen in the context of multiplication, there can be more than one way to compute the same function.

While the set of all possible binary strings is infinite, it still cannot represent everything . In particular, there is no representation of the real numbers (with absolute accuracy) as binary strings. This result is also known as “Cantor’s Theorem” (see Section 2.4 ) and is typically referred to as the result that the “reals are uncountable.” It also implies that there are different levels of infinity, though we will not get into this topic in this book (see Remark 2.10 ).

The two “big ideas” we discuss are Big Idea 1 (we can compose representations for simple objects to represent more complex objects) and Big Idea 2 (it is crucial to distinguish between functions, the “what”, and programs, the “how”). The latter will be a theme we come back to time and again in this book.

Defining representations

Every time we store numbers, images, sounds, databases, or other objects on a computer, what we actually store in the computer’s memory is the representation of these objects. Moreover, the idea of representation is not restricted to digital computers. When we write down text or make a drawing we are representing ideas or experiences as sequences of symbols (which might as well be strings of zeroes and ones). Even our brain does not store the actual sensory inputs we experience, but rather only a representation of them.

To use objects such as numbers, images, graphs, or others as inputs for computation, we need to define precisely how to represent these objects as binary strings. A representation scheme is a way to map an object \(x\) to a binary string \(E(x) \in \{0,1\}^*\) . For example, a representation scheme for natural numbers is a function \(E:\N \rightarrow \{0,1\}^*\) . Of course, we cannot merely represent all numbers as the string “ \(0011\) ” (for example). A minimal requirement is that if two numbers \(x\) and \(x'\) are different then they would be represented by different strings. Another way to say this is that we require the encoding function \(E\) to be one to one .

Representing natural numbers

We now show how we can represent natural numbers as binary strings. Over the years people have represented numbers in a variety of ways, including Roman numerals, tally marks, our own Hindu-Arabic decimal system, and many others. We can use any one of those as well as many others to represent a number as a string (see Figure 2.3 ). However, for the sake of concreteness, we use the binary basis as our default representation of natural numbers as strings. For example, we represent the number six as the string \(110\) since \(1\cdot 2^{2} + 1 \cdot 2^1 + 0 \cdot 2^0 = 6\) , and similarly we represent the number thirty-five as the string \(y = 100011\) which satisfies \(\sum_{i=0}^5 y_i \cdot 2^{|y|-i-1} = 35\) . Some more examples are given in the table below.

(Table: more examples of binary representations: \(0 \mapsto 0\), \(1 \mapsto 1\), \(2 \mapsto 10\), \(5 \mapsto 101\), \(16 \mapsto 10000\), \(40 \mapsto 101000\), \(53 \mapsto 110101\).)

If \(n\) is even, then the least significant digit of \(n\) ’s binary representation is \(0\) , while if \(n\) is odd then this digit equals \(1\) . Just like the number \(\floor{n/10}\) corresponds to “chopping off” the least significant decimal digit (e.g., \(\floor{457/10}=\floor{45.7}=45\) ), the number \(\floor{n/2}\) corresponds to “chopping off” the least significant binary digit. Hence the binary representation can be formally defined as the following function \(NtS:\N \rightarrow \{0,1\}^*\) ( \(NtS\) stands for “natural numbers to strings”):

\[NtS(n) = \begin{cases} 0 & n=0 \\ 1 & n=1 \\ NtS(\floor{n/2})\; parity(n) & n>1 \end{cases} \;\;(2.1)\] where \(parity:\N \rightarrow \{0,1\}\) is the function defined as \(parity(n)=0\) if \(n\) is even and \(parity(n)=1\) if \(n\) is odd, and as usual, for strings \(x,y \in \{0,1\}^*\) , \(xy\) denotes the concatenation of \(x\) and \(y\) . The function \(NtS\) is defined recursively : for every \(n>1\) we define \(NtS(n)\) in terms of the representation of the smaller number \(\floor{n/2}\) . It is also possible to define \(NtS\) non-recursively, see Exercise 2.2 .

Throughout most of this book, the particular choices of representation of numbers as binary strings would not matter much: we just need to know that such a representation exists. In fact, for many of our purposes we can even use the simpler representation of mapping a natural number \(n\) to the length- \(n\) all-zero string \(0^n\) .

We can implement the binary representation in Python as follows:
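Here is one minimal implementation of this recursion (a sketch following Equation 2.1; the book’s own code may differ in details):

```python
def NtS(n):
    """Binary representation of a natural number n as a string of 0s and 1s."""
    if n <= 1:
        return str(n)                  # base cases: NtS(0) = "0", NtS(1) = "1"
    return NtS(n // 2) + str(n % 2)    # NtS(floor(n/2)) followed by parity(n)
```

For example, NtS(35) returns "100011" , matching the example above.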

We can also use Python to implement the inverse transformation, mapping a string back to the natural number it represents.
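A matching sketch of the inverse function (named StN , for “strings to natural numbers”, the name used later in this chapter):

```python
def StN(x):
    """Map a binary string back to the natural number it represents."""
    n = 0
    for digit in x:
        n = 2 * n + int(digit)   # shift left by one binary digit and add the next one
    return n
```

For example, StN("100011") returns 35 , so StN(NtS(n)) == n for every natural number n .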

In this book, we sometimes use code examples as in Remark 2.1 . The point is always to emphasize that certain computations can be achieved concretely, rather than illustrating the features of Python or any other programming language. Indeed, one of the messages of this book is that all programming languages are in a certain precise sense equivalent to one another, and hence we could have just as well used JavaScript, C, COBOL, Visual Basic or even BrainF*ck . This book is not about programming, and it is absolutely OK if you are not familiar with Python or do not follow code examples such as those in Remark 2.1 .

Meaning of representations (discussion)

It is natural for us to think of \(236\) as the “actual” number, and of \(11101100\) as “merely” its representation. However, for most Europeans in the Middle Ages, CCXXXVI would be the “actual” number and \(236\) (if they had even heard of it) would be the weird Hindu-Arabic positional representation. 1 When our AI robot overlords materialize, they will probably think of \(11101100\) as the “actual” number and of \(236\) as “merely” a representation that they need to use when they give commands to humans.

So what is the “actual” number? This is a question that philosophers of mathematics have pondered throughout history. Plato argued that mathematical objects exist in some ideal sphere of existence (that to a certain extent is more “real” than the world we perceive via our senses, as this latter world is merely the shadow of this ideal sphere). In Plato’s vision, the symbols \(236\) are merely notation for some ideal object, that, in homage to the late musician , we can refer to as “the number commonly represented by \(236\) ”.

The Austrian philosopher Ludwig Wittgenstein, on the other hand, argued that mathematical objects do not exist at all, and the only things that exist are the actual marks on paper that make up \(236\) , \(11101100\) or CCXXXVI . In Wittgenstein’s view, mathematics is merely about formal manipulation of symbols that do not have any inherent meaning. You can think of the “actual” number as (somewhat recursively) “that thing which is common to \(236\) , \(11101100\) and CCXXXVI and all other past and future representations that are meant to capture the same object”.

While reading this book, you are free to choose your own philosophy of mathematics, as long as you maintain the distinction between the mathematical objects themselves and the various particular choices of representing them, whether as splotches of ink, pixels on a screen, zeroes and ones, or any other form.

Representations beyond natural numbers

We have seen that natural numbers can be represented as binary strings. We now show that the same is true for other types of objects, including (potentially negative) integers, rational numbers, vectors, lists, graphs and many others. In many instances, choosing the “right” string representation for a piece of data is highly non-trivial, and finding the “best” one (e.g., most compact, best fidelity, most efficiently manipulable, robust to errors, most informative features, etc.) is the object of intense research. But for now, we focus on presenting some simple representations for various objects that we would like to use as inputs and outputs for computation.

Representing (potentially negative) integers

Since we can represent natural numbers as strings, we can represent the full set of integers (i.e., members of the set \(\Z=\{ \ldots, -3 , -2 , -1 , 0 , +1, +2, +3,\ldots \}\) ) by adding one more bit that represents the sign. To represent a (potentially negative) number \(m\) , we prepend to the representation of the natural number \(|m|\) a bit \(\sigma\) that equals \(0\) if \(m \geq 0\) and equals \(1\) if \(m<0\) . Formally, we define the function \(ZtS:\Z \rightarrow \{0,1\}^*\) as follows \[ZtS(m) = \begin{cases} 0\;NtS(m) & m \geq 0 \\ 1\;NtS(-m) & m < 0 \end{cases}\] where \(NtS\) is defined as in Equation 2.1 .
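A minimal Python sketch of this encoding (building on the NtS sketch above):

```python
def ZtS(m):
    """Signed magnitude representation: a sign bit followed by the binary representation of |m|."""
    return ("0" if m >= 0 else "1") + NtS(abs(m))

# For example, ZtS(8) == "01000" and ZtS(-5) == "1101".
```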

While the encoding function of a representation needs to be one to one, it does not have to be onto . For example, in the representation above there is no number that is represented by the empty string but it is still a fine representation, since every integer is represented uniquely by some string.

Given a string \(y\in \{0,1\}^*\) , how do we know if it’s “supposed” to represent a (non-negative) natural number or a (potentially negative) integer? For that matter, even if we know \(y\) is “supposed” to be an integer, how do we know what representation scheme it uses? The short answer is that we do not necessarily know this information, unless it is supplied from the context. (In programming languages, the compiler or interpreter determines the representation of the sequence of bits corresponding to a variable based on the variable’s type .) We can treat the same string \(y\) as representing a natural number, an integer, a piece of text, an image, or a green gremlin. Whenever we say a sentence such as “let \(n\) be the number represented by the string \(y\) ,” we will assume that we are fixing some canonical representation scheme such as the ones above. The choice of the particular representation scheme will rarely matter, except that we want to make sure to stick with the same one for consistency.

Two’s complement representation (optional)

Section 2.2.1 ’s approach of representing an integer using a specific “sign bit” is known as the Signed Magnitude Representation and was used in some early computers. However, the two’s complement representation is much more common in practice. The two’s complement representation of an integer \(k\) in the set \(\{ -2^n , -2^n+1, \ldots, 2^n-1 \}\) is the string \(ZtS_n(k)\) of length \(n+1\) defined as follows: \[ ZtS_n(k) = \begin{cases} NtS_{n+1}(k) & 0 \leq k \leq 2^n-1 \\ NtS_{n+1}(2^{n+1}+k) & -2^n \leq k \leq -1 \end{cases} \;, \] where \(NtS_\ell(m)\) denotes the standard binary representation of a number \(m \in \{0,\ldots, 2^{\ell}-1\}\) as a string of length \(\ell\) , padded with leading zeros as needed. For example, if \(n=3\) then \(ZtS_3(1)=NtS_4(1)=0001\) , \(ZtS_3(2)=NtS_4(2)=0010\) , \(ZtS_3(-1)=NtS_4(16-1)=1111\) , and \(ZtS_3(-8)=NtS_4(16-8)=1000\) . If \(k\) is a negative number larger than or equal to \(-2^n\) then \(2^{n+1}+k\) is a number between \(2^n\) and \(2^{n+1}-1\) . Hence the two’s complement representation of such a number \(k\) is a string of length \(n+1\) with its first digit equal to \(1\) .

Another way to say this is that we represent a potentially negative number \(k \in \{ -2^n,\ldots, 2^n-1 \}\) as the non-negative number \(k \mod 2^{n+1}\) (see also Figure 2.4 ). This means that if two (potentially negative) numbers \(k\) and \(k'\) are not too large (i.e., \(k + k' \in \{ -2^n,\ldots, 2^n-1 \}\) ), then we can compute the representation of \(k+k'\) by adding modulo \(2^{n+1}\) the representations of \(k\) and \(k'\) as if they were non-negative integers. This property of the two’s complement representation is its main attraction since, depending on their architectures, microprocessors can often perform arithmetic operations modulo \(2^w\) very efficiently (for certain values of \(w\) such as \(32\) and \(64\) ). Many systems leave it to the programmer to check that values are not too large and will carry out this modular arithmetic regardless of the size of the numbers involved. For this reason, in some systems adding two large positive numbers can result in a negative number (e.g., adding \(2^n-100\) and \(2^n-200\) might result in \(-300\) since \((-300) \mod 2^{n+1} = 2^{n+1}-300\) , see also Figure 2.4 ).
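A small Python sketch of the two’s complement representation (the function name, and the use of the NtS sketch above, are ours, just for illustration):

```python
def ZtS_twos(k, n):
    """Two's complement: represent k in {-2^n, ..., 2^n - 1} as a string of length n+1,
    namely the binary representation of k mod 2^(n+1), padded with leading zeros."""
    assert -2 ** n <= k <= 2 ** n - 1
    return NtS(k % (2 ** (n + 1))).zfill(n + 1)

# For example, ZtS_twos(2, 3) == "0010", ZtS_twos(-1, 3) == "1111", ZtS_twos(-8, 3) == "1000".
```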


Rational numbers and representing pairs of strings

We can represent a rational number of the form \(a/b\) by representing the two numbers \(a\) and \(b\) . However, merely concatenating the representations of \(a\) and \(b\) will not work. For example, the binary representation of \(4\) is \(100\) and the binary representation of \(43\) is \(101011\) , but the concatenation \(100101011\) of these strings is also the concatenation of the representation \(10010\) of \(18\) and the representation \(1011\) of \(11\) . Hence, if we used such simple concatenation then we would not be able to tell if the string \(100101011\) is supposed to represent \(4/43\) or \(18/11\) .

We tackle this by giving a general representation for pairs of strings . If we were using a pen and paper, we would just use a separator symbol such as \(\|\) to represent, for example, the pair consisting of the numbers represented by \(10\) and \(110001\) as the length- \(9\) string “ \(10\|110001\) ”. In other words, there is a one to one map \(F\) from pairs of strings \(x,y \in \{0,1\}^*\) into a single string \(z\) over the alphabet \(\Sigma = \{0,1,\| \}\) (in other words, \(z\in \Sigma^*\) ). Using such separators is similar to the way we use spaces and punctuation to separate words in English. By adding a little redundancy, we achieve the same effect in the digital domain. We can map the three-element set \(\Sigma\) to the three-element set \(\{00,11,01 \} \subset \{0,1\}^2\) in a one-to-one fashion, and hence encode a length \(n\) string \(z\in \Sigma^*\) as a length \(2n\) string \(w\in \{0,1\}^*\) .

Our final representation for rational numbers is obtained by composing the following steps:

1. Representing a (potentially negative) rational number \(r\) as a pair of integers \(a,b\) such that \(r=a/b\).

2. Representing an integer by a string via the binary representation.

3. Combining 1 and 2 to obtain a representation of a rational number as a pair of strings.

4. Representing a pair of strings over \(\{0,1\}\) as a single string over \(\Sigma = \{0,1,\|\}\).

5. Representing a string over \(\Sigma\) as a longer string over \(\{0,1\}\).

Consider the rational number \(r=-5/8\) . We represent \(-5\) as \(1101\) and \(+8\) as \(01000\) , and so we can represent \(r\) as the pair of strings \((1101,01000)\) and represent this pair as the length \(10\) string \(1101\|01000\) over the alphabet \(\{0,1,\|\}\) . Now, applying the map \(0 \mapsto 00\) , \(1\mapsto 11\) , \(\| \mapsto 01\) , we can represent the latter string as the length \(20\) string \(s=11110011010011000000\) over the alphabet \(\{0,1\}\) .
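A small Python sketch of this pipeline (using the hypothetical ZtS above; the dictionary below implements the map \(0 \mapsto 00\) , \(1 \mapsto 11\) , \(\| \mapsto 01\) ):

```python
SYMBOL_TO_BITS = {"0": "00", "1": "11", "|": "01"}   # 0 -> 00, 1 -> 11, separator -> 01

def represent_pair(x, y):
    """Encode a pair of binary strings as one binary string: join them with a separator,
    then encode each of the three symbols by two bits."""
    return "".join(SYMBOL_TO_BITS[c] for c in x + "|" + y)

def represent_rational(a, b):
    """Represent the rational number a/b as a single binary string."""
    return represent_pair(ZtS(a), ZtS(b))

# represent_rational(-5, 8) == "11110011010011000000", matching the example above.
```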

The same idea can be used to represent triples of strings, quadruples, and so on as a string. Indeed, this is one instance of a very general principle that we use time and again in both the theory and practice of computer science (for example, in Object Oriented programming):

If we can represent objects of type \(T\) as strings, then we can represent tuples of objects of type \(T\) as strings as well.

Repeating the same idea, once we can represent objects of type \(T\) , we can also represent lists of lists of such objects, and even lists of lists of lists and so on and so forth. We will come back to this point when we discuss prefix free encoding in Section 2.5.2 .

Representing real numbers

The set of real numbers \(\R\) contains all numbers including positive, negative, and fractional, as well as irrational numbers such as \(\pi\) or \(e\) . Every real number can be approximated by a rational number, and thus we can represent every real number \(x\) by a rational number \(a/b\) that is very close to \(x\) . For example, we can represent \(\pi\) by \(22/7\) within an error of about \(10^{-3}\) . If we want a smaller error (e.g., about \(10^{-4}\) ) then we can use \(311/99\) , and so on and so forth.


The above representation of real numbers via rational numbers that approximate them is a fine choice for a representation scheme. However, typically in computing applications, it is more common to use the floating-point representation scheme (see Figure 2.5 ) to represent real numbers. In the floating-point representation scheme we represent \(x\in \R\) by the pair \((b,e)\) of (positive or negative) integers of some prescribed sizes (determined by the desired accuracy) such that \(b \times 2^{e}\) is closest to \(x\) . Floating-point representation is the base-two version of scientific notation , where one represents a number \(y\in \R\) as its approximation of the form \(b \times 10^e\) for integers \(b,e\) . It is called “floating-point” because we can think of the number \(b\) as specifying a sequence of binary digits, and \(e\) as describing the location of the “binary point” within this sequence. The use of floating-point representation is the reason why in many programming systems, printing the expression 0.1+0.2 will result in 0.30000000000000004 and not 0.3 .
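This behavior is easy to check directly; here is a quick illustration in Python (any language using IEEE-754 double precision behaves similarly):

```python
print(0.1 + 0.2)         # prints 0.30000000000000004
print(0.1 + 0.2 == 0.3)  # False: none of 0.1, 0.2, 0.3 is exactly representable in binary
```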


The reader might be (rightly) worried about the fact that the floating-point representation (or the rational number one) can only approximately represent real numbers. In many (though not all) computational applications, one can make the accuracy tight enough so that this does not affect the final result, though sometimes we do need to be careful. Indeed, floating-point bugs can sometimes be no joking matter. For example, floating-point rounding errors have been implicated in the failure of a U.S. Patriot missile to intercept an Iraqi Scud missile, costing 28 lives, as well as a 100 million pound error in computing payouts to British pensioners .

Cantor’s Theorem, countable sets, and string representations of the real numbers

“For any collection of fruits, we can make more fruit salads than there are fruits. If not, we could label each salad with a different fruit, and consider the salad of all fruits not in their salad. The label of this salad is in it if and only if it is not.” , Martha Storey .

Given the issues with floating-point approximations for real numbers, a natural question is whether it is possible to represent real numbers exactly as strings. Unfortunately, the following theorem shows that this cannot be done:

There does not exist a one-to-one function \(RtS:\R \rightarrow \{0,1\}^*\) . 2

Countable sets. We say that a set \(S\) is countable if there is an onto map \(C:\N \rightarrow S\) , or in other words, we can write \(S\) as the sequence \(C(0),C(1),C(2),\ldots\) . Since the binary representation yields an onto map from \(\{0,1\}^*\) to \(\N\) , and the composition of two onto maps is onto, a set \(S\) is countable iff there is an onto map from \(\{0,1\}^*\) to \(S\) . Using the basic properties of functions (see Section 1.4.3 ), a set is countable if and only if there is a one-to-one function from \(S\) to \(\{0,1\}^*\) . Hence, we can rephrase Theorem 2.5 as follows:

The reals are uncountable. That is, there does not exist an onto function \(NtR:\N \rightarrow \R\) .

Theorem 2.6 was proven by Georg Cantor in 1874. This result (and the theory around it) was quite shocking to mathematicians at the time. By showing that there is no one-to-one map from \(\R\) to \(\{0,1\}^*\) (or \(\N\) ), Cantor showed that these two infinite sets have “different forms of infinity” and that the set of real numbers \(\R\) is in some sense “bigger” than the infinite set \(\{0,1\}^*\) . The notion that there are “ shades of infinity ” was deeply disturbing to mathematicians and philosophers at the time. The philosopher Ludwig Wittgenstein (whom we mentioned before) called Cantor’s results “utter nonsense” and “laughable.” Others thought they were even worse than that. Leopold Kronecker called Cantor a “corrupter of youth,” while Henri Poincaré said that Cantor’s ideas “should be banished from mathematics once and for all.” The tide eventually turned, and these days Cantor’s work is universally accepted as the cornerstone of set theory and the foundations of mathematics. As David Hilbert said in 1925, “No one shall expel us from the paradise which Cantor has created for us.” As we will see later in this book, Cantor’s ideas also play a huge role in the theory of computation.

Now that we have discussed Theorem 2.5 ’s importance, let us see the proof. It is achieved in two steps:

Define some infinite set \(\mathcal{X}\) for which it is easier for us to prove that \(\mathcal{X}\) is not countable (namely, it’s easier for us to prove that there is no one-to-one function from \(\mathcal{X}\) to \(\{0,1\}^*\) ).

Prove that there is a one-to-one function \(G\) mapping \(\mathcal{X}\) to \(\mathbb{R}\) .

We can use a proof by contradiction to show that these two facts together imply Theorem 2.5 . Specifically, if we assume (towards the sake of contradiction) that there exists some one-to-one \(F\) mapping \(\mathbb{R}\) to \(\{0,1\}^*\) , then the function \(x \mapsto F(G(x))\) obtained by composing \(F\) with the function \(G\) from Step 2 above would be a one-to-one function from \(\mathcal{X}\) to \(\{0,1\}^*\) , which contradicts what we proved in Step 1!

To turn this idea into a full proof of Theorem 2.5 we need to:

Define the set \(\mathcal{X}\) .

Prove that there is no one-to-one function from \(\mathcal{X}\) to \(\{0,1\}^*\) .

Prove that there is a one-to-one function from \(\mathcal{X}\) to \(\R\) .

We now proceed to do precisely that. That is, we will define the set \(\{0,1\}^\infty\) , which will play the role of \(\mathcal{X}\) , and then state and prove two lemmas that show that this set satisfies our two desired properties.

We denote by \(\{0,1\}^\infty\) the set \(\{ f \;|\; f:\N \rightarrow \{0,1\} \}\) .

That is, \(\{0,1\}^\infty\) is a set of functions , and a function \(f\) is in \(\{0,1\}^\infty\) iff its domain is \(\N\) and its codomain is \(\{0,1\}\) . We can also think of \(\{0,1\}^\infty\) as the set of all infinite sequences of bits, since a function \(f:\N \rightarrow \{0,1\}\) can be identified with the sequence \((f(0),f(1),f(2),\ldots )\) . The following two lemmas show that \(\{0,1\}^\infty\) can play the role of \(\mathcal{X}\) to establish Theorem 2.5 .

There does not exist a one-to-one map \(FtS:\{0,1\}^\infty \rightarrow \{0,1\}^*\) . 3

There does exist a one-to-one map \(FtR:\{0,1\}^\infty \rightarrow \R\) . 4

As we’ve seen above, Lemma 2.8 and Lemma 2.9 together imply Theorem 2.5 . To repeat the argument more formally, suppose, for the sake of contradiction, that there did exist a one-to-one function \(RtS:\R \rightarrow \{0,1\}^*\) . By Lemma 2.9 , there exists a one-to-one function \(FtR:\{0,1\}^\infty \rightarrow \R\) . Thus, under this assumption, since the composition of two one-to-one functions is one-to-one (see Exercise 2.12 ), the function \(FtS:\{0,1\}^\infty \rightarrow \{0,1\}^*\) defined as \(FtS(f)=RtS(FtR(f))\) will be one to one, contradicting Lemma 2.8 . See Figure 2.7 for a graphical illustration of this argument.


Now all that is left is to prove these two lemmas. We start by proving Lemma 2.8 which is really the heart of Theorem 2.5 .


Warm-up: “Baby Cantor”. The proof of Lemma 2.8 is rather subtle. One way to get intuition for it is to consider the following finite statement “there is no onto function \(f:\{0,\ldots,99\} \rightarrow \{0,1\}^{100}\) ”. Of course we know it’s true since the set \(\{0,1\}^{100}\) is bigger than the set \([100]\) , but let’s see a direct proof. For every \(f:\{0,\ldots,99\} \rightarrow \{0,1\}^{100}\) , we can define the string \(\overline{d} \in \{0,1\}^{100}\) as follows: \(\overline{d} = (1-f(0)_0, 1-f(1)_1 , \ldots, 1-f(99)_{99})\) . If \(f\) was onto, then there would exist some \(n\in [100]\) such that \(f(n) =\overline{d}\) , but we claim that no such \(n\) exists. Indeed, if there was such \(n\) , then the \(n\) -th coordinate of \(\overline{d}\) would equal \(f(n)_n\) but by definition this coordinate equals \(1-f(n)_n\) . See also a “proof by code” of this statement.
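One way such a “proof by code” could look (a sketch of the diagonal construction, with inputs given as lists of strings):

```python
def diagonal(f):
    """Given f: {0,...,99} -> {0,1}^100, presented as a list of 100 strings of length 100,
    return a string in {0,1}^100 that differs from f[n] in the n-th coordinate for every n."""
    return "".join("1" if f[n][n] == "0" else "0" for n in range(100))

# For every such f and every n, diagonal(f) differs from f[n] at position n,
# hence diagonal(f) is not in the image of f and f is not onto.
```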

We will prove that there does not exist an onto function \(StF:\{0,1\}^* \rightarrow \{0,1\}^\infty\) . This implies the lemma since for every two sets \(A\) and \(B\) , there exists an onto function from \(A\) to \(B\) if and only if there exists a one-to-one function from \(B\) to \(A\) (see Lemma 1.2 ).

The technique of this proof is known as the “diagonal argument” and is illustrated in Figure 2.8 . We assume, towards a contradiction, that there exists such a function \(StF:\{0,1\}^* \rightarrow \{0,1\}^\infty\) . We will show that \(StF\) is not onto by demonstrating a function \(\overline{d}\in \{0,1\}^\infty\) such that \(\overline{d} \neq StF(x)\) for every \(x\in \{0,1\}^*\) . Consider the lexicographic ordering of binary strings (i.e., \(\ensuremath{\text{\texttt{""}}}\) , \(0\) , \(1\) , \(00\) , \(01\) , \(\ldots\) ). For every \(n\in \N\) , we let \(x_n\) be the \(n\) -th string in this order. That is \(x_0 =\ensuremath{\text{\texttt{""}}}\) , \(x_1 = 0\) , \(x_2= 1\) and so on and so forth. We define the function \(\overline{d} \in \{0,1\}^\infty\) as follows: \[\overline{d}(n) = 1 - StF(x_n)(n)\] for every \(n\in \N\) . That is, to compute \(\overline{d}\) on input \(n\in\N\) , we first compute \(g= StF(x_n)\) , where \(x_n \in \{0,1\}^*\) is the \(n\) -th string in the lexicographical ordering. Since \(g \in \{0,1\}^\infty\) , it is a function mapping \(\N\) to \(\{0,1\}\) . The value \(\overline{d}(n)\) is defined to be the negation of \(g(n)\) .

The definition of the function \(\overline{d}\) is a bit subtle. One way to think about it is to imagine the function \(StF\) as being specified by an infinitely long table, in which every row corresponds to a string \(x\in \{0,1\}^*\) (with strings sorted in lexicographic order), and contains the sequence \(StF(x)(0), StF(x)(1), StF(x)(2),\ldots\) . The diagonal elements in this table are the values

\[StF(\ensuremath{\text{\texttt{""}}})(0),StF(0)(1),StF(1)(2),StF(00)(3), StF(01)(4),\ldots\]

which correspond to the elements \(StF(x_n)(n)\) in the \(n\) -th row and \(n\) -th column of this table for \(n=0,1,2,\ldots\) . The function \(\overline{d}\) we defined above maps every \(n\in \N\) to the negation of the \(n\) -th diagonal value.

To complete the proof that \(StF\) is not onto we need to show that \(\overline{d} \neq StF(x)\) for every \(x\in \{0,1\}^*\) . Indeed, let \(x\in \{0,1\}^*\) be some string and let \(g = StF(x)\) . If \(n\) is the position of \(x\) in the lexicographical order then by construction \(\overline{d}(n) = 1-g(n) \neq g(n)\) which means that \(g \neq \overline{d}\) which is what we wanted to prove.

Lemma 2.8 doesn’t really have much to do with the natural numbers or the strings. An examination of the proof shows that for every set \(S\) , there is no one-to-one map \(F:\{0,1\}^S \rightarrow S\) where \(\{0,1\}^S\) denotes the set \(\{ f \;|\; f:S \rightarrow \{0,1\} \}\) of all Boolean functions with domain \(S\) . Since we can identify a subset \(V \subseteq S\) with its characteristic function \(f=1_V\) (i.e., \(1_V(x)=1\) iff \(x\in V\) ), we can think of \(\{0,1\}^S\) also as the set of all subsets of \(S\) . This set is sometimes called the power set of \(S\) and denoted by \(\mathcal{P}(S)\) or \(2^S\) .

The proof of Lemma 2.8 can be generalized to show that there is no one-to-one map between a set and its power set. In particular, it means that the set \(\{0,1\}^\R\) is “even bigger” than \(\R\) . Cantor used these ideas to construct an infinite hierarchy of shades of infinity. The number of such shades turns out to be much larger than \(|\N|\) or even \(|\R|\) . He denoted the cardinality of \(\N\) by \(\aleph_0\) and denoted the next largest infinite number by \(\aleph_1\) . ( \(\aleph\) is the first letter in the Hebrew alphabet.) Cantor also made the continuum hypothesis that \(|\R|=\aleph_1\) . We will come back to the fascinating story of this hypothesis later on in this book. This lecture of Aaronson mentions some of these issues (see also this Berkeley CS 70 lecture ).

To complete the proof of Theorem 2.5 , we need to show Lemma 2.9 . This requires some calculus background but is otherwise straightforward. If you have not had much experience with limits of a real series before, then the formal proof below might be a little hard to follow. This part is not the core of Cantor’s argument, nor are such limits important to the remainder of this book, so you can feel free to take Lemma 2.9 on faith and skip the proof.

We define \(FtR(f)\) to be the number between \(0\) and \(2\) whose decimal expansion is \(f(0).f(1) f(2) \ldots\) , or in other words \(FtR(f) = \sum_{i=0}^{\infty} f(i) \cdot 10^{-i}\) . If \(f\) and \(g\) are two distinct functions in \(\{0,1\}^\infty\) , then there must be some input \(k\) in which they disagree. If we take the minimum such \(k\) , then the numbers \(f(0).f(1) f(2) \ldots f(k-1) f(k) \ldots\) and \(g(0) . g(1) g(2) \ldots g(k) \ldots\) agree with each other all the way up to the \(k-1\) -th digit after the decimal point, and disagree on the \(k\) -th digit. But then these numbers must be distinct. Concretely, if \(f(k)=1\) and \(g(k)=0\) then the first number is larger than the second, and otherwise ( \(f(k)=0\) and \(g(k)=1\) ) the first number is smaller than the second. In the proof we have to be a little careful since these are numbers with infinite expansions . For example, the number one half has two decimal expansions \(0.5\) and \(0.49999\cdots\) . However, this issue does not come up here, since we restrict attention only to numbers with decimal expansions that do not involve the digit \(9\) .

For every \(f \in \{0,1\}^\infty\) , we define \(FtR(f)\) to be the number whose decimal expansion is \(f(0).f(1)f(2)f(3)\ldots\) . Formally, \[ FtR(f) = \sum_{i=0}^\infty f(i) \cdot 10^{-i} \;\;(2.2) \] It is a known result in calculus (whose proof we will not repeat here) that the series on the right-hand side of Equation 2.2 converges to a definite limit in \(\mathbb{R}\) .

We now prove that \(FtR\) is one to one. Let \(f,g\) be two distinct functions in \(\{0,1\}^\infty\) . Since \(f\) and \(g\) are distinct, there must be some input on which they differ, and we define \(k\) to be the smallest such input and assume without loss of generality that \(f(k)=0\) and \(g(k)=1\) . (Otherwise, if \(f(k)=1\) and \(g(k)=0\) , then we can simply switch the roles of \(f\) and \(g\) .) The numbers \(FtR(f)\) and \(FtR(g)\) agree with each other up to the \(k-1\) -th digit after the decimal point. Since the \(k\) -th digit equals \(0\) for \(FtR(f)\) and equals \(1\) for \(FtR(g)\) , we claim that \(FtR(g)\) is bigger than \(FtR(f)\) by at least \(0.5 \cdot 10^{-k}\) . To see this note that the difference \(FtR(g)-FtR(f)\) will be minimized if \(g(\ell)=0\) for every \(\ell>k\) and \(f(\ell)=1\) for every \(\ell>k\) , in which case (since \(f\) and \(g\) agree up to the \(k-1\) -th digit)

\[ FtR(g)-FtR(f) = 10^{-k} - 10^{-k-1} - 10^{-k-2} - 10^{-k-3} - \cdots \;\;(2.3) \]

Since the infinite series \(\sum_{i=0}^{\infty} 10^{-i}\) converges to \(10/9\) , it follows that for every such \(f\) and \(g\) , \(FtR(g) - FtR(f) \geq 10^{-k} - 10^{-k-1}\cdot (10/9) > 0\) . In particular we see that for every distinct \(f,g \in \{0,1\}^\infty\) , \(FtR(f) \neq FtR(g)\) , implying that the function \(FtR\) is one to one.

In the proof above we used the fact that \(1 + 1/10 + 1/100 + \cdots\) converges to \(10/9\) , which, plugged into Equation 2.3 , yields that the difference between \(FtR(g)\) and \(FtR(f)\) is at least \(10^{-k} - 10^{-k-1}\cdot (10/9) > 0\) . While the choice of the decimal representation for \(FtR\) was arbitrary, we could not have used the binary representation in its place. Had we used the binary expansion instead of the decimal one, the corresponding series \(1 + 1/2 + 1/4 + \cdots\) would converge to \(2\) , and since \(2^{-k} = 2^{-k-1} \cdot 2\) , we could not have deduced that \(FtR\) is one to one. Indeed there do exist pairs of distinct sequences \(f,g\in \{0,1\}^\infty\) such that \(\sum_{i=0}^\infty f(i)2^{-i} = \sum_{i=0}^\infty g(i)2^{-i}\) . (For example, the sequence \(1,0,0,0,\ldots\) and the sequence \(0,1,1,1,\ldots\) have this property.)

Corollary: Boolean functions are uncountable

Cantor’s Theorem yields the following corollary that we will use several times in this book: the set of all Boolean functions (mapping \(\{0,1\}^*\) to \(\{0,1\}\) ) is not countable:

Let \(\ensuremath{\mathit{ALL}}\) be the set of all functions \(F:\{0,1\}^* \rightarrow \{0,1\}\) . Then \(\ensuremath{\mathit{ALL}}\) is uncountable. Equivalently, there does not exist an onto map \(StALL:\{0,1\}^* \rightarrow \ensuremath{\mathit{ALL}}\) .

This is a direct consequence of Lemma 2.8 , since we can use the binary representation to show a one-to-one map from \(\{0,1\}^\infty\) to \(\ensuremath{\mathit{ALL}}\) . Hence the uncountability of \(\{0,1\}^\infty\) implies the uncountability of \(\ensuremath{\mathit{ALL}}\) .

Since \(\{0,1\}^\infty\) is uncountable, the result will follow by showing a one-to-one map from \(\{0,1\}^\infty\) to \(\ensuremath{\mathit{ALL}}\) . The reason is that the existence of such a map implies that if \(\ensuremath{\mathit{ALL}}\) was countable, and hence there was a one-to-one map from \(\ensuremath{\mathit{ALL}}\) to \(\N\) , then there would have been a one-to-one map from \(\{0,1\}^\infty\) to \(\N\) , contradicting Lemma 2.8 .

We now show this one-to-one map. We simply map a function \(f \in \{0,1\}^\infty\) to the function \(F:\{0,1\}^* \rightarrow \{0,1\}\) as follows. We let \(F(0)=f(0)\) , \(F(1)=f(1)\) , \(F(10)=f(2)\) , \(F(11)=f(3)\) and so on and so forth. That is, for every \(x\in \{0,1\}^*\) that represents a natural number \(n\) in the binary basis, we define \(F(x)=f(n)\) . If \(x\) does not represent such a number (e.g., it has a leading zero), then we set \(F(x)=0\) .

This map is one-to-one since if \(f \neq g\) are two distinct elements in \(\{0,1\}^\infty\) , then there must be some input \(n\in \N\) on which \(f(n) \neq g(n)\) . But then if \(x\in \{0,1\}^*\) is the string representing \(n\) , we see that \(F(x) \neq G(x)\) where \(F\) is the function in \(\ensuremath{\mathit{ALL}}\) that \(f\) mapped to, and \(G\) is the function that \(g\) is mapped to.

Equivalent conditions for countability

The results above establish many equivalent ways to phrase the fact that a set is countable. Specifically, the following statements are all equivalent:

1. The set \(S\) is countable.

2. There exists an onto map from \(\N\) to \(S\) .

3. There exists an onto map from \(\{0,1\}^*\) to \(S\) .

4. There exists a one-to-one map from \(S\) to \(\N\) .

5. There exists a one-to-one map from \(S\) to \(\{0,1\}^*\) .

6. There exists an onto map from some countable set \(T\) to \(S\) .

7. There exists a one-to-one map from \(S\) to some countable set \(T\) .

Make sure you know how to prove the equivalence of all the results above.

Representing objects beyond numbers

Numbers are of course by no means the only objects that we can represent as binary strings. A representation scheme for representing objects from some set \(\mathcal{O}\) consists of an encoding function that maps an object in \(\mathcal{O}\) to a string, and a decoding function that decodes a string back to an object in \(\mathcal{O}\) . Formally, we make the following definition:

Let \(\mathcal{O}\) be any set. A representation scheme for \(\mathcal{O}\) is a pair of functions \(E,D\) where \(E:\mathcal{O} \rightarrow \{0,1\}^*\) is a total one-to-one function, \(D:\{0,1\}^* \rightarrow_p \mathcal{O}\) is a (possibly partial) function, and such that \(D\) and \(E\) satisfy that \(D(E(o))=o\) for every \(o\in \mathcal{O}\) . \(E\) is known as the encoding function and \(D\) is known as the decoding function.

Note that the condition \(D(E(o))=o\) for every \(o\in\mathcal{O}\) implies that \(D\) is onto (can you see why?). It turns out that to construct a representation scheme we only need to find an encoding function. That is, every one-to-one encoding function has a corresponding decoding function, as shown in the following lemma:

Suppose that \(E: \mathcal{O} \rightarrow \{0,1\}^*\) is one-to-one. Then there exists a function \(D:\{0,1\}^* \rightarrow \mathcal{O}\) such that \(D(E(o))=o\) for every \(o\in \mathcal{O}\) .

Let \(o_0\) be some arbitrary element of \(\mathcal{O}\) . For every \(x \in \{0,1\}^*\) , there exists either zero or a single \(o\in \mathcal{O}\) such that \(E(o)=x\) (otherwise \(E\) would not be one-to-one). We will define \(D(x)\) to equal \(o_0\) in the first case and this single object \(o\) in the second case. By definition \(D(E(o))=o\) for every \(o\in \mathcal{O}\) .

While the decoding function of a representation scheme can in general be a partial function, the proof of Lemma 2.14 implies that every representation scheme has a total decoding function. This observation can sometimes be useful.

Finite representations

If \(\mathcal{O}\) is finite , then we can represent every object in \(\mathcal{O}\) as a string of length at most some number \(n\) . What is the value of \(n\) ? Let us denote by \(\{0,1\}^{\leq n}\) the set \(\{ x\in \{0,1\}^* : |x| \leq n \}\) of strings of length at most \(n\) . The size of \(\{0,1\}^{\leq n}\) is equal to \[ |\{0,1\}^0| + |\{0,1\}^1| + |\{0,1\}^2| + \cdots + |\{0,1\}^n| = \sum_{i=0}^n 2^i = 2^{n+1}-1 \] using the standard formula for summing a geometric progression .

To obtain a representation of objects in \(\mathcal{O}\) as strings of length at most \(n\) we need to come up with a one-to-one function from \(\mathcal{O}\) to \(\{0,1\}^{\leq n}\) . We can do so, if and only if \(|\mathcal{O}| \leq 2^{n+1}-1\) as is implied by the following lemma:

For every two non-empty finite sets \(S,T\) , there exists a one-to-one \(E:S \rightarrow T\) if and only if \(|S| \leq |T|\) .

Let \(k=|S|\) and \(m=|T|\) and write the elements of \(S\) and \(T\) as \(S = \{ s_0 , s_1, \ldots, s_{k-1} \}\) and \(T= \{ t_0 , t_1, \ldots, t_{m-1} \}\) . We need to show that there is a one-to-one function \(E: S \rightarrow T\) iff \(k \leq m\) . For the “if” direction, if \(k \leq m\) we can simply define \(E(s_i)=t_i\) for every \(i\in [k]\) . Clearly for \(i \neq j\) , \(t_i = E(s_i) \neq E(s_j) = t_j\) , and hence this function is one-to-one. In the other direction, suppose that \(k>m\) and \(E: S \rightarrow T\) is some function. Then \(E\) cannot be one-to-one. Indeed, for \(i=0,1,\ldots,m-1\) let us “mark” the element \(t_j=E(s_i)\) in \(T\) . If \(t_j\) was marked before, then we have found two objects in \(S\) mapping to the same element \(t_j\) . Otherwise, since \(T\) has \(m\) elements, by the time we get to \(i=m-1\) we will have marked all the objects in \(T\) . Hence, in this case, \(s_m\) must be mapped by \(E\) to an element that was already marked before. (This observation is sometimes known as the “Pigeonhole Principle”: the principle that if you have a pigeon coop with \(m\) holes and \(k>m\) pigeons, then there must be two pigeons in the same hole.)

Prefix-free encoding

When showing a representation scheme for rational numbers, we used the “hack” of encoding the alphabet \(\{ 0,1, \|\}\) to represent tuples of strings as a single string. This is a special case of the general paradigm of prefix-free encoding. The idea is the following: if our representation has the property that no string \(x\) representing an object \(o\) is a prefix (i.e., an initial substring) of a string \(y\) representing a different object \(o'\) , then we can represent a list of objects by merely concatenating the representations of all the list members. For example, because in English every sentence ends with a punctuation mark such as a period, exclamation, or question mark, no sentence can be a prefix of another and so we can represent a list of sentences by merely concatenating the sentences one after the other. (English has some complications such as periods used for abbreviations (e.g., “e.g.”) or sentence quotes containing punctuation, but the high level point of a prefix-free representation for sentences still holds.)

It turns out that we can transform every representation to a prefix-free form. This justifies Big Idea 1 , and allows us to transform a representation scheme for objects of a type \(T\) to a representation scheme of lists of objects of the type \(T\) . By repeating the same technique, we can also represent lists of lists of objects of type \(T\) , and so on and so forth. But first let us formally define prefix-freeness:

For two strings \(y,y'\) , we say that \(y\) is a prefix of \(y'\) if \(|y| \leq |y'|\) and for every \(i<|y|\) , \(y'_i = y_i\) .

Let \(\mathcal{O}\) be a non-empty set and \(E:\mathcal{O} \rightarrow \{0,1\}^*\) be a function. We say that \(E\) is prefix-free if \(E(o)\) is non-empty for every \(o\in\mathcal{O}\) and there does not exist a distinct pair of objects \(o, o' \in \mathcal{O}\) such that \(E(o)\) is a prefix of \(E(o')\) .

Recall that for every set \(\mathcal{O}\) , the set \(\mathcal{O}^*\) consists of all finite length tuples (i.e., lists ) of elements in \(\mathcal{O}\) . The following theorem shows that if \(E\) is a prefix-free encoding of \(\mathcal{O}\) then by concatenating encodings we can obtain a valid (i.e., one-to-one) representation of \(\mathcal{O}^*\) :

Suppose that \(E:\mathcal{O} \rightarrow \{0,1\}^*\) is prefix-free. Then the following map \(\overline{E}:\mathcal{O}^* \rightarrow \{0,1\}^*\) is one to one: for every \((o_0,\ldots,o_{k-1}) \in \mathcal{O}^*\) , we define \[ \overline{E}(o_0,\ldots,o_{k-1}) = E(o_0)E(o_1) \cdots E(o_{k-1}) \;. \]

Theorem 2.18 is an example of a theorem that is a little hard to parse, but in fact is fairly straightforward to prove once you understand what it means. Therefore, I highly recommend that you pause here to make sure you understand the statement of this theorem. You should also try to prove it on your own before proceeding further.


The idea behind the proof is simple. Suppose that for example we want to decode a triple \((o_0,o_1,o_2)\) from its representation \(x= \overline{E}(o_0,o_1,o_2)=E(o_0)E(o_1)E(o_2)\) . We will do so by first finding the first prefix \(x_0\) of \(x\) that is a representation of some object. Then we will decode this object, remove \(x_0\) from \(x\) to obtain a new string \(x'\) , and continue onwards to find the first prefix \(x_1\) of \(x'\) and so on and so forth (see Exercise 2.9 ). The prefix-freeness property of \(E\) will ensure that \(x_0\) will in fact be \(E(o_0)\) , \(x_1\) will be \(E(o_1)\) , etc.

We now show the formal proof. Suppose, towards the sake of contradiction, that there exist two distinct tuples \((o_0,\ldots,o_{k-1})\) and \((o'_0,\ldots,o'_{k'-1})\) such that

\[ \overline{E}(o_0,\ldots,o_{k-1})= \overline{E}(o'_0,\ldots,o'_{k'-1}) \;. \;\;(2.4) \] We will denote the string \(\overline{E}(o_0,\ldots,o_{k-1})\) by \(\overline{x}\) .

Let \(i\) be the first index such that \(o_i \neq o'_i\) . (If \(o_i=o'_i\) for all \(i\) then, since we assume the two tuples are distinct, one of them must be larger than the other. In this case we assume without loss of generality that \(k'>k\) and let \(i=k\) .) In the case that \(i<k\) , we see that the string \(\overline{x}\) can be written in two different ways:

\[ \overline{x} = \overline{E}(o_0,\ldots,o_{k-1}) = x_0\cdots x_{i-1} E(o_i) E(o_{i+1}) \cdots E(o_{k-1}) \]

\[ \overline{x} = \overline{E}(o'_0,\ldots,o'_{k'-1}) = x_0\cdots x_{i-1} E(o'_i) E(o'_{i+1}) \cdots E(o'_{k'-1}) \]

where \(x_j = E(o_j) = E(o'_j)\) for all \(j<i\) . Let \(\overline{y}\) be the string obtained after removing the prefix \(x_0 \cdots x_{i-1}\) from \(\overline{x}\) . We see that \(\overline{y}\) can be written as both \(\overline{y}= E(o_i)s\) for some string \(s\in \{0,1\}^*\) and as \(\overline{y} = E(o'_i)s'\) for some \(s'\in \{0,1\}^*\) . But this means that one of \(E(o_i)\) and \(E(o'_i)\) must be a prefix of the other, contradicting the prefix-freeness of \(E\) .

In the case that \(i=k\) and \(k'>k\) , we get a contradiction in the following way. In this case

\[\overline{x} = E(o_0)\cdots E(o_{k-1}) = E(o_0) \cdots E(o_{k-1}) E(o'_k) \cdots E(o'_{k'-1})\]

which means that \(E(o'_k) \cdots E(o'_{k'-1})\) must correspond to the empty string \(\text{\ensuremath{\text{\texttt{""}}}}\) . But in such a case \(E(o'_k)\) must be the empty string, which in particular is the prefix of any other string, contradicting the prefix-freeness of \(E\) .

Even if the representation \(E\) of objects in \(\mathcal{O}\) is prefix free, this does not mean that our representation \(\overline{E}\) of lists of such objects will be prefix free as well. In fact, it won’t be: for every three objects \(o,o',o''\) the representation of the list \((o,o')\) will be a prefix of the representation of the list \((o,o',o'')\) . However, as we see in Lemma 2.20 below, we can transform every representation into prefix-free form, and so will be able to use that transformation if needed to represent lists of lists, lists of lists of lists, and so on and so forth.

Making representations prefix-free

Some natural representations are prefix-free. For example, every fixed output length representation (i.e., one-to-one function \(E:\mathcal{O} \rightarrow \{0,1\}^n\) ) is automatically prefix-free, since a string \(x\) can only be a prefix of an equal-length \(x'\) if \(x\) and \(x'\) are identical. Moreover, the approach we used for representing rational numbers can be used to show the following:

Let \(E:\mathcal{O} \rightarrow \{0,1\}^*\) be a one-to-one function. Then there is a one-to-one prefix-free encoding \(\overline{E}\) such that \(|\overline{E}(o)| \leq 2|E(o)|+2\) for every \(o\in \mathcal{O}\) .

For the sake of completeness, we will include the proof below, but it is a good idea for you to pause here and try to prove it on your own, using the same technique we used for representing rational numbers.

The idea behind the proof is to use the map \(0 \mapsto 00\) , \(1 \mapsto 11\) to “double” every bit in the string \(x\) and then mark the end of the string by concatenating to it the pair \(01\) . If we encode a string \(x\) in this way, it ensures that the encoding of \(x\) is never a prefix of the encoding of a distinct string \(x'\) . Formally, we define the function \(\ensuremath{\mathit{PF}}:\{0,1\}^* \rightarrow \{0,1\}^*\) as follows: \[\ensuremath{\mathit{PF}}(x)=x_0 x_0 x_1 x_1 \ldots x_{n-1}x_{n-1}01\] for every \(x\in \{0,1\}^*\) , where \(n=|x|\) . If \(E:\mathcal{O} \rightarrow \{0,1\}^*\) is the (potentially not prefix-free) representation for \(\mathcal{O}\) , then we transform it into a prefix-free representation \(\overline{E}:\mathcal{O} \rightarrow \{0,1\}^*\) by defining \(\overline{E}(o)=\ensuremath{\mathit{PF}}(E(o))\) .

To prove the lemma we need to show that (1) \(\overline{E}\) is one-to-one and (2) \(\overline{E}\) is prefix-free. In fact, prefix freeness is a stronger condition than one-to-one (if two strings are equal then in particular one of them is a prefix of the other) and hence it suffices to prove (2) , which we now do.

Let \(o \neq o'\) in \(\mathcal{O}\) be two distinct objects. We will prove that \(\overline{E}(o)\) is not a prefix of \(\overline{E}(o')\) , or in other words \(\ensuremath{\mathit{PF}}(x)\) is not a prefix of \(\ensuremath{\mathit{PF}}(x')\) where \(x = E(o)\) and \(x'=E(o')\) . Since \(E\) is one-to-one, \(x \neq x'\) . We will split into three cases, depending on whether \(|x|<|x'|\) , \(|x|=|x'|\) , or \(|x|>|x'|\) . If \(|x|<|x'|\) then the two bits in positions \(2|x|,2|x|+1\) in \(\ensuremath{\mathit{PF}}(x)\) have the value \(01\) but the corresponding bits in \(\ensuremath{\mathit{PF}}(x')\) will equal either \(00\) or \(11\) (depending on the \(|x|\) -th bit of \(x'\) ) and hence \(\ensuremath{\mathit{PF}}(x)\) cannot be a prefix of \(\ensuremath{\mathit{PF}}(x')\) . If \(|x|=|x'|\) then, since \(x \neq x'\) , there must be a coordinate \(i\) in which they differ, meaning that the strings \(\ensuremath{\mathit{PF}}(x)\) and \(\ensuremath{\mathit{PF}}(x')\) differ in the coordinates \(2i,2i+1\) , which again means that \(\ensuremath{\mathit{PF}}(x)\) cannot be a prefix of \(\ensuremath{\mathit{PF}}(x')\) . If \(|x|>|x'|\) then \(|\ensuremath{\mathit{PF}}(x)|=2|x|+2>|\ensuremath{\mathit{PF}}(x')|=2|x'|+2\) and hence \(\ensuremath{\mathit{PF}}(x)\) is longer than (and cannot be a prefix of) \(\ensuremath{\mathit{PF}}(x')\) . In all cases we see that \(\ensuremath{\mathit{PF}}(x)=\overline{E}(o)\) is not a prefix of \(\ensuremath{\mathit{PF}}(x')=\overline{E}(o')\) , hence completing the proof.

The proof of Lemma 2.20 is not the only or even the best way to transform an arbitrary representation into prefix-free form. Exercise 2.10 asks you to construct a more efficient prefix-free transformation satisfying \(|\overline{E}(o)| \leq |E(o)| + O(\log |E(o)|)\) .

“Proof by Python” (optional)

The proofs of Theorem 2.18 and Lemma 2.20 are constructive in the sense that they give us:

A way to transform the encoding and decoding functions of any representation of objects in a set \(\mathcal{O}\) into encoding and decoding functions that are prefix-free, and

A way to extend prefix-free encoding and decoding of single objects to encoding and decoding of lists of objects by concatenation.

Specifically, we could transform any pair of Python functions encode and decode to functions pfencode and pfdecode that correspond to a prefix-free encoding and decoding. Similarly, given pfencode and pfdecode for single objects, we can extend them to encoding of lists. Let us show how this works for the case of the NtS and StN functions we defined above.

We start with the “Python proof” of Lemma 2.20 : a way to transform an arbitrary representation into one that is prefix free . The function prefixfree below takes as input a pair of encoding and decoding functions, and returns a triple of functions containing prefix-free encoding and decoding functions, as well as a function that checks whether a string is a valid encoding of an object.
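A sketch of what such a prefixfree function might look like, following the “double every bit and append \(01\) ” construction of Lemma 2.20 (the book’s actual code may differ):

```python
def prefixfree(encode, decode):
    """Given encoding/decoding functions, return prefix-free versions of both,
    plus a function checking whether a string is a valid prefix-free encoding."""
    def pfencode(o):
        x = encode(o)
        return "".join(c + c for c in x) + "01"          # double every bit, end with 01

    def pfdecode(y):
        return decode("".join(y[i] for i in range(0, len(y) - 2, 2)))  # drop the 01 marker, undouble

    def pfvalid(y):
        # structurally valid iff y has even length, ends with 01, and all earlier pairs are 00 or 11
        return (len(y) >= 2 and len(y) % 2 == 0 and y[-2:] == "01"
                and all(y[i] == y[i + 1] for i in range(0, len(y) - 2, 2)))

    return pfencode, pfdecode, pfvalid

# For example: pfNtS, pfStN, pfvalid = prefixfree(NtS, StN); then pfNtS(5) == "11001101".
```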

Note that the Python function prefixfree above takes two Python functions as input and outputs three Python functions as output. (When it’s not too awkward, we use the term “Python function” or “subroutine” to distinguish between such snippets of Python programs and mathematical functions.) You don’t have to know Python in this course, but you do need to get comfortable with the idea of functions as mathematical objects in their own right, that can be used as inputs and outputs of other functions.

We now show a “Python proof” of Theorem 2.18 . Namely, we show a function represlists that takes as input a prefix-free representation scheme (implemented via encoding, decoding, and validity testing functions) and outputs a representation scheme for lists of such objects. If we want to make this representation prefix-free then we could fit it into the function prefixfree above.
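One possible sketch of represlists (again our own illustration of the idea in the proof of Theorem 2.18 rather than the book’s exact code):

```python
def represlists(pfencode, pfdecode, pfvalid):
    """Given a prefix-free representation scheme for single objects,
    return encoding and decoding functions for lists of such objects."""
    def encodelist(L):
        return "".join(pfencode(o) for o in L)    # concatenation suffices by prefix-freeness

    def decodelist(y):
        result, current = [], ""
        for bit in y:
            current += bit
            if pfvalid(current):                  # the first valid prefix must be the next object
                result.append(pfdecode(current))
                current = ""
        return result

    return encodelist, decodelist

# For example, with pfNtS, pfStN, pfvalid as above:
# encodelist, decodelist = represlists(pfNtS, pfStN, pfvalid)
# decodelist(encodelist([5, 2, 12])) == [5, 2, 12]
```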

Representing letters and text

We can represent a letter or symbol by a string, and then if this representation is prefix-free, we can represent a sequence of symbols by merely concatenating the representation of each symbol. One such representation is ASCII , which represents \(128\) letters and symbols as strings of \(7\) bits. Since the ASCII representation is fixed-length, it is automatically prefix-free (can you see why?). Unicode is the representation of (at the time of this writing) about 128,000 symbols as numbers (known as code points ) between \(0\) and \(1,114,111\) . There are several types of prefix-free representations of the code points, a popular one being UTF-8 , which encodes every code point into a string of between \(8\) and \(32\) bits.
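We can peek at these encodings directly; here is a small Python illustration (the particular symbols are just examples):

```python
# Unicode code points and UTF-8 byte lengths of a few symbols (1 byte = 8 bits, 4 bytes = 32 bits).
for ch in ["a", "é", "€", "🐍"]:
    print(ch, ord(ch), ch.encode("utf-8"), len(ch.encode("utf-8")), "bytes")
```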


The Braille system is another way to encode letters and other symbols as binary strings. Specifically, in Braille, every letter is encoded as a string in \(\{0,1\}^6\) , which is written using indented dots arranged in two columns and three rows, see Figure 2.10 . (Some symbols require more than one six-bit string to encode, and so Braille uses a more general prefix-free encoding.)

The Braille system was invented in 1821 by Louis Braille when he was just 12 years old (though he continued working on it and improving it throughout his life). Braille was a French boy who lost his eyesight at the age of 5 as the result of an accident.

We can use programming languages to probe how our computing environment represents various values. This is easiest to do in “unsafe” programming languages such as C that allow direct access to the memory.

Using a simple C program, we produced representations of values such as integers, floating-point numbers, and strings. One can see that for integers, multiplying by 2 corresponds to a “left shift” inside each byte. In contrast, for floating-point numbers, multiplying by two corresponds to adding one to the exponent part of the representation. In the architecture we used, a negative number is represented using the two’s complement approach. C represents strings in a prefix-free form by ensuring that a zero byte is placed at their end.
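As a rough stand-in for that C program, here is a short Python sketch (ours) that uses the standard struct module to peek at the byte-level representation of integers and floating-point numbers; the big-endian formats are chosen only for readability and need not match the architecture discussed above.

```python
# A Python substitute (not the C program referred to above) for inspecting
# how values are laid out in memory, via the standard struct module.
import struct

def bits(fmt, value):
    """Return the bits of `value` when packed with the given struct format."""
    return "".join(format(byte, "08b") for byte in struct.pack(fmt, value))

print(bits(">i", 100))    # 32-bit signed integer
print(bits(">i", 200))    # doubling shifts the bits to the left
print(bits(">i", -100))   # negative numbers use two's complement
print(bits(">d", 100.0))  # 64-bit float: doubling adds one to the exponent
print(bits(">d", 200.0))
```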

Representing vectors, matrices, images

Once we can represent numbers and lists of numbers, then we can also represent vectors (which are just lists of numbers). Similarly, we can represent lists of lists, and thus, in particular, can represent matrices . To represent an image, we can represent the color at each pixel by a list of three numbers corresponding to the intensity of Red, Green and Blue. (We can restrict to three primary colors since most humans only have three types of cones in their retinas; we would have needed 16 primary colors to represent colors visible to the Mantis Shrimp .) Thus an image of \(n\) pixels would be represented by a list of \(n\) such length-three lists. A video can be represented as a list of images. Of course these representations are rather wasteful and much more compact representations are typically used for images and videos, though this will not be our concern in this book.
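As a tiny concrete example (ours, not from the text), a four-pixel image could be written as a list of four length-three lists of intensities between \(0\) and \(255\) :

```python
# Tiny example (ours): a four-pixel image as a list of [red, green, blue]
# triples, one per pixel; a video would simply be a list of such images.
image = [
    [255, 0, 0],    # pixel 0: red
    [0, 255, 0],    # pixel 1: green
    [0, 0, 255],    # pixel 2: blue
    [255, 255, 0],  # pixel 3: yellow
]
```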

Representing graphs

A graph on \(n\) vertices can be represented as an \(n\times n\) adjacency matrix whose \((i,j)^{th}\) entry is equal to \(1\) if the edge \((i,j)\) is present and is equal to \(0\) otherwise. That is, we can represent an \(n\) -vertex directed graph \(G=(V,E)\) as a string \(A\in \{0,1\}^{n^2}\) such that \(A_{i,j}=1\) iff the edge \(\overrightarrow{i\;j}\in E\) . We can transform an undirected graph into a directed graph by replacing every edge \(\{i,j\}\) with both edges \(\overrightarrow{i\; j}\) and \(\overleftarrow{i\;j}\) .

Another representation for graphs is the adjacency list representation. That is, we identify the vertex set \(V\) of a graph with the set \([n]\) where \(n=|V|\) , and represent the graph \(G=(V,E)\) as a list of \(n\) lists, where the \(i\) -th list consists of the out-neighbors of vertex \(i\) . The difference between these representations can be significant for some applications, though for our purposes it will typically be immaterial.
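The following short sketch (ours) builds both representations for a small directed graph given by its edge set over the vertex set \([n]\) ; the function names are only illustrative.

```python
# Sketch (ours) of the two graph representations discussed above, for a
# directed graph on the vertex set [n] = {0, 1, ..., n-1}.

def adjacency_matrix(n, edges):
    """Return the n-by-n adjacency matrix as a string A in {0,1}^(n^2)."""
    A = ["0"] * (n * n)
    for (i, j) in edges:
        A[i * n + j] = "1"
    return "".join(A)

def adjacency_lists(n, edges):
    """Return a list of n lists, the i-th holding the out-neighbors of i."""
    L = [[] for _ in range(n)]
    for (i, j) in edges:
        L[i].append(j)
    return L

edges = [(0, 1), (1, 2), (2, 0)]       # a directed 3-cycle
print(adjacency_matrix(3, edges))       # "010001100"
print(adjacency_lists(3, edges))        # [[1], [2], [0]]
```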


Representing lists and nested lists

If we have a way of representing objects from a set \(\mathcal{O}\) as binary strings, then we can represent lists of these objects by applying a prefix-free transformation. Moreover, we can use a trick similar to the above to handle nested lists. The idea is that if we have some representation \(E:\mathcal{O} \rightarrow \{0,1\}^*\) , then we can represent nested lists of items from \(\mathcal{O}\) using strings over the five-element alphabet \(\Sigma\) consisting of the symbols 0 , 1 , [ , ] and , (the comma). For example, if \(o_1\) is represented by 0011 , \(o_2\) is represented by 10011 , and \(o_3\) is represented by 00111 , then we can represent the nested list \((o_1,(o_2,o_3))\) as the string "[0011,[10011,00111]]" over the alphabet \(\Sigma\) . By encoding every element of \(\Sigma\) itself as a three-bit string, we can transform any representation for objects in \(\mathcal{O}\) into a representation that enables representing (potentially nested) lists of these objects.
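Here is a small sketch (ours) of the last step, assuming we assign each of the five symbols an arbitrary fixed three-bit code:

```python
# Sketch (ours): flatten nested-list strings over the alphabet {0, 1, [, ], ","}
# into binary strings by giving every symbol a fixed 3-bit code.
SYMBOL_TO_BITS = {"0": "000", "1": "001", "[": "010", "]": "011", ",": "100"}
BITS_TO_SYMBOL = {bits: sym for sym, bits in SYMBOL_TO_BITS.items()}

def encode_nested(s):
    """Encode a string such as "[0011,[10011,00111]]" as a binary string."""
    return "".join(SYMBOL_TO_BITS[c] for c in s)

def decode_nested(y):
    return "".join(BITS_TO_SYMBOL[y[i:i + 3]] for i in range(0, len(y), 3))

s = "[0011,[10011,00111]]"
assert decode_nested(encode_nested(s)) == s
```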

We will typically identify an object with its representation as a string. For example, if \(F:\{0,1\}^* \rightarrow \{0,1\}^*\) is some function that maps strings to strings and \(n\) is an integer, we might make statements such as “ \(F(n)+1\) is prime” to mean that if we represent \(n\) as a string \(x\) , then the integer \(m\) represented by the string \(F(x)\) satisfies that \(m+1\) is prime. (You can see how this convention of identifying objects with their representation can save us a lot of cumbersome formalism.) Similarly, if \(x,y\) are some objects and \(F\) is a function that takes strings as inputs, then by \(F(x,y)\) we will mean the result of applying \(F\) to the representation of the ordered pair \((x,y)\) . We use the same notation to invoke functions on \(k\) -tuples of objects for every \(k\) .

This convention of identifying an object with its representation as a string is one that we humans follow all the time. For example, when people say a statement such as “ \(17\) is a prime number”, what they really mean is that the integer whose decimal representation is the string “ 17 ”, is prime.

When we say “ \(A\) is an algorithm that computes the multiplication function on natural numbers,” what we really mean is that \(A\) is an algorithm that computes the function \(F:\{0,1\}^* \rightarrow \{0,1\}^*\) such that for every pair \(a,b \in \N\) , if \(x\in \{0,1\}^*\) is a string representing the pair \((a,b)\) then \(F(x)\) will be a string representing their product \(a\cdot b\) .

Defining computational tasks as mathematical functions

Abstractly, a computational process is some process that takes an input which is a string of bits and produces an output which is a string of bits. This transformation of input to output can be done using a modern computer, a person following instructions, the evolution of some natural system, or any other means.

In future chapters, we will turn to mathematically defining computational processes, but, as we discussed above, at the moment we focus on computational tasks . That is, we focus on the specification and not the implementation . Again, at an abstract level, a computational task can specify any relation that the output needs to have with the input. However, for most of this book, we will focus on the simplest and most common task of computing a function . Here are some examples:

Given (a representation of) two integers \(x,y\) , compute the product \(x\times y\) . Using our representation above, this corresponds to computing a function from \(\{0,1\}^*\) to \(\{0,1\}^*\) . We have seen that there is more than one way to solve this computational task, and in fact, we still do not know the best algorithm for this problem.

Given (a representation of) an integer \(z>1\) , compute its factorization ; i.e., the list of primes \(p_1 \leq \cdots \leq p_k\) such that \(z = p_1\cdots p_k\) . This again corresponds to computing a function from \(\{0,1\}^*\) to \(\{0,1\}^*\) . The gaps in our knowledge of the complexity of this problem are even larger.

Given (a representation of) a graph \(G\) and two vertices \(s\) and \(t\) , compute the length of the shortest path in \(G\) between \(s\) and \(t\) , or do the same for the longest path (with no repeated vertices) between \(s\) and \(t\) . Both these tasks correspond to computing a function from \(\{0,1\}^*\) to \(\{0,1\}^*\) , though it turns out that there is a vast difference in their computational difficulty.

Given the code of a Python program, determine whether there is an input that would force it into an infinite loop. This task corresponds to computing a partial function from \(\{0,1\}^*\) to \(\{0,1\}\) since not every string corresponds to a syntactically valid Python program. We will see that we do understand the computational status of this problem, but the answer is quite surprising.

Given (a representation of) an image \(I\) , decide if \(I\) is a photo of a cat or a dog. This corresponds to computing some (partial) function from \(\{0,1\}^*\) to \(\{0,1\}\) .

An important special case of computational tasks corresponds to computing Boolean functions, whose output is a single bit \(\{0,1\}\) . Computing such functions corresponds to answering a YES/NO question, and hence this task is also known as a decision problem . Given any function \(F:\{0,1\}^* \rightarrow \{0,1\}\) and \(x\in \{0,1\}^*\) , the task of computing \(F(x)\) corresponds to the task of deciding whether or not \(x\in L\) where \(L = \{ x : F(x)=1 \}\) is known as the language that corresponds to the function \(F\) . (The language terminology is due to historical connections between the theory of computation and formal linguistics as developed by Noam Chomsky.) Hence many texts refer to such a computational task as deciding a language .
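As a toy illustration (ours) of this correspondence, consider the Boolean function that outputs \(1\) exactly on palindromes; deciding the associated language amounts to answering a YES/NO membership question.

```python
# Toy illustration (ours): a Boolean function and the language it defines.
def PAL(x):
    """Return 1 if the binary string x is a palindrome, and 0 otherwise."""
    return 1 if x == x[::-1] else 0

# Deciding the language L = { x : PAL(x) = 1 } means answering, for any given
# input string x, the YES/NO question "is x a member of L?".
print(PAL("10101"))   # 1, so "10101" is in L
print(PAL("10"))      # 0, so "10" is not in L
```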


For every particular function \(F\) , there can be several possible algorithms to compute \(F\) . We will be interested in questions such as:

For a given function \(F\) , can it be the case that there is no algorithm to compute \(F\) ?

If there is an algorithm, what is the best one? Could it be that \(F\) is “effectively uncomputable” in the sense that every algorithm for computing \(F\) requires a prohibitively large amount of resources?

If we cannot answer this question, can we show equivalence between different functions \(F\) and \(F'\) in the sense that either they are both easy (i.e., have fast algorithms) or they are both hard?

Can a function being hard to compute ever be a good thing ? Can we use it for applications in areas such as cryptography?

In order to answer such questions, we will need to mathematically define the notion of an algorithm , which is what we will do in Chapter 3 .

Distinguish functions from programs!

You should always watch out for potential confusions between specifications and implementations or equivalently between mathematical functions and algorithms/programs . It does not help that programming languages (Python included) use the term “functions” to denote (parts of) programs . This confusion also stems from thousands of years of mathematical history, where people typically defined functions by means of a way to compute them.

For example, consider the multiplication function on natural numbers. This is the function \(\ensuremath{\mathit{MULT}}:\N\times \N \rightarrow \N\) that maps a pair \((x,y)\) of natural numbers to the number \(x \cdot y\) . As we mentioned, it can be implemented in more than one way:
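The two listings the text refers to are not reproduced in this version; the sketch below (ours, not necessarily the original code) gives one deliberately slow implementation by repeated addition and one that simply uses Python's built-in multiplication.

```python
# Illustrative sketch (ours): two different programs that compute the same
# mathematical function MULT(x, y) = x * y on natural numbers.

def mult1(x, y):
    """Multiply by repeated addition: correct, but very slow for large y."""
    res = 0
    for _ in range(y):
        res += x
    return res

def mult2(x, y):
    """Multiply using Python's built-in (much faster) multiplication."""
    return x * y

assert mult1(17, 23) == mult2(17, 23) == 391
```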

Both mult1 and mult2 produce the same output given the same pair of natural number inputs. (Though mult1 will take far longer to do so when the numbers become large.) Hence, even though these are two different programs , they compute the same mathematical function . This distinction between a program or algorithm \(A\) , and the function \(F\) that \(A\) computes will be absolutely crucial for us in this course (see also Figure 2.13 ).


A function is not the same as a program . A program computes a function.

Distinguishing functions from programs (or other ways for computing, including circuits and machines ) is a crucial theme for this course. For this reason, this is often a running theme in questions that I (and many other instructors) assign in homework and exams (hint, hint).

Functions capture quite a lot of computational tasks, but one can consider more general settings as well. For starters, we can and will talk about partial functions, which are not defined on all inputs. When computing a partial function, we only need to worry about the inputs on which the function is defined. Another way to say it is that we can design an algorithm for a partial function \(F\) under the assumption that someone “promised” us that all inputs \(x\) would be such that \(F(x)\) is defined (as otherwise, we do not care about the result). Hence such tasks are also known as promise problems .

Another generalization is to consider relations that may have more than one possible admissible output. For example, consider the task of finding any solution for a given set of equations. A relation \(R\) maps a string \(x\in \{0,1\}^*\) into a set of strings \(R(x)\) (for example, \(x\) might describe a set of equations, in which case \(R(x)\) would correspond to the set of all solutions to \(x\) ). We can also identify a relation \(R\) with the set of pairs of strings \((x,y)\) where \(y\in R(x)\) . A computational process solves a relation if for every \(x\in \{0,1\}^*\) , it outputs some string \(y\in R(x)\) .

Later in this book, we will consider even more general tasks, including interactive tasks, such as finding a good strategy in a game, tasks defined using probabilistic notions, and others. However, for much of this book, we will focus on the task of computing a function, and often even a Boolean function, that has only a single bit of output. It turns out that a great deal of the theory of computation can be studied in the context of this task, and the insights learned are applicable in the more general settings.

  • We can represent objects we want to compute on using binary strings.
  • A representation scheme for a set of objects \(\mathcal{O}\) is a one-to-one map from \(\mathcal{O}\) to \(\{0,1\}^*\) .
  • We can use prefix-free encoding to “boost” a representation for a set \(\mathcal{O}\) into representations of lists of elements in \(\mathcal{O}\) .
  • A basic computational task is the task of computing a function \(F:\{0,1\}^* \rightarrow \{0,1\}^*\) . This task encompasses not just arithmetical computations such as multiplication, factoring, etc. but a great many other tasks arising in areas as diverse as scientific computing, artificial intelligence, image processing, data mining and many many more.
  • We will study the question of finding (or at least giving bounds on) the best algorithm for computing \(F\) for various interesting functions \(F\) .

Which one of these objects can be represented by a binary string?

An integer \(x\)

An undirected graph \(G\) .

A directed graph \(H\)

All of the above.

Prove that the function \(NtS:\N \rightarrow \{0,1\}^*\) of the binary representation defined in Equation 2.1 satisfies that for every \(n\in \N\) , if \(x = NtS(n)\) then \(|x| =1+\max(0,\floor{\log_2 n})\) and \(x_i = \floor{n/2^{\floor{\log_2 n}-i}} \mod 2\) .

Prove that \(NtS\) is a one to one function by coming up with a function \(StN:\{0,1\}^* \rightarrow \N\) such that \(StN(NtS(n))=n\) for every \(n\in \N\) .

The ASCII encoding can be used to encode a string of \(n\) English letters as a \(7n\) bit binary string, but in this exercise, we ask about finding a more compact representation for strings of English lowercase letters.

Prove that there exists a representation scheme \((E,D)\) for strings over the 26-letter alphabet \(\{ a, b,c,\ldots,z \}\) as binary strings such that for every \(n>0\) and length- \(n\) string \(x \in \{ a,b,\ldots,z \}^n\) , the representation \(E(x)\) is a binary string of length at most \(4.8n+1000\) . In other words, prove that for every \(n\) , there exists a one-to-one function \(E:\{a,b,\ldots, z\}^n \rightarrow \{0,1\}^{\lfloor 4.8n +1000 \rfloor}\) .

Prove that there exists no representation scheme for strings over the alphabet \(\{ a, b,\ldots,z \}\) as binary strings such that for every length- \(n\) string \(x \in \{ a,b,\ldots, z\}^n\) , the representation \(E(x)\) is a binary string of length \(\lfloor 4.6n+1000 \rfloor\) . In other words, prove that there exists some \(n>0\) such that there is no one-to-one function \(E:\{a,b,\ldots,z \}^n \rightarrow \{0,1\}^{\lfloor 4.6n + 1000 \rfloor}\) .

Python’s bz2.compress function is a mapping from strings to strings, which uses the lossless (and hence one-to-one ) bzip2 algorithm for compression. After converting to lowercase, and truncating spaces and numbers, the text of Tolstoy’s “War and Peace” contains \(n=2,517,262\) letters. Yet, if we run bz2.compress on the string of the text of “War and Peace” we get a string of length \(k=6,274,768\) bits, which is only \(2.49n\) (and in particular much smaller than \(4.6n\) ). Explain why this does not contradict your answer to the previous question.

Interestingly, if we try to apply bz2.compress on a random string, we get much worse performance. In my experiments, I got a ratio of about \(4.78\) between the number of bits in the output and the number of characters in the input. However, one could imagine that one could do better and that there exists a company called “Pied Piper” with an algorithm that can losslessly compress a string of \(n\) random lowercase letters to fewer than \(4.6n\) bits. 5 Show that this is not the case by proving that for every \(n>100\) and one to one function \(Encode:\{a,\ldots,z\}^{n} \rightarrow \{0,1\}^*\) , if we let \(Z\) be the random variable \(|Encode(x)|\) (i.e., the length of \(Encode(x)\) ) for \(x\) chosen uniformly at random from the set \(\{a,\ldots,z\}^n\) , then the expected value of \(Z\) is at least \(4.6n\) .

Show that there is a string representation of directed graphs with vertex set \([n]\) and degree at most \(10\) that uses at most \(1000 n\log n\) bits. More formally, show the following: Suppose we define for every \(n\in\mathbb{N}\) , the set \(G_n\) as the set containing all directed graphs (with no self loops) over the vertex set \([n]\) where every vertex has degree at most \(10\) . Then, prove that for every sufficiently large \(n\) , there exists a one-to-one function \(E:G_n \rightarrow \{0,1\}^{\lfloor 1000 n \log n \rfloor}\) .

Define \(S_n\) to be the set of one-to-one and onto functions mapping \([n]\) to \([n]\) . Prove that there is a one-to-one mapping from \(S_n\) to \(G_{2n}\) , where \(G_{2n}\) is the set defined in Exercise 2.4 above.

In this question you will show that one cannot improve the representation of Exercise 2.4 to length \(o(n \log n)\) . Specifically, prove for every sufficiently large \(n\in \mathbb{N}\) there is no one-to-one function \(E:G_n \rightarrow \{0,1\}^{\lfloor 0.001 n \log n \rfloor +1000}\) .

Recall that the grade-school algorithm for multiplying two numbers requires \(O(n^2)\) operations. Suppose that instead of using decimal representation, we use one of the following representations \(R(x)\) to represent a number \(x\) between \(0\) and \(10^n-1\) . For which of these representations can you still multiply the numbers in \(O(n^2)\) operations?

The standard binary representation: \(B(x)=(x_0,\ldots,x_{k})\) where \(x = \sum_{i=0}^{k} x_i 2^i\) and \(k\) is the largest number s.t. \(x \geq 2^k\) .

The reverse binary representation: \(B(x) = (x_{k},\ldots,x_0)\) where \(x_i\) is defined as above for \(i=0,\ldots,k-1\) .

Binary coded decimal representation: \(B(x)=(y_0,\ldots,y_{n-1})\) where \(y_i \in \{0,1\}^4\) represents the \(i^{th}\) decimal digit of \(x\) mapping \(0\) to \(0000\) , \(1\) to \(0001\) , \(2\) to \(0010\) , etc. (i.e.  \(9\) maps to \(1001\) )

Suppose that \(R:\N \rightarrow \{0,1\}^*\) corresponds to representing a number \(x\) as a string of \(x\) \(1\) ’s, (e.g., \(R(4)=1111\) , \(R(7)=1111111\) , etc.). If \(x,y\) are numbers between \(0\) and \(10^n -1\) , can we still multiply \(x\) and \(y\) using \(O(n^2)\) operations if we are given them in the representation \(R(\cdot)\) ?

Recall that if \(F\) is a one-to-one and onto function mapping elements of a finite set \(U\) into a finite set \(V\) then the sizes of \(U\) and \(V\) are the same. Let \(B:\N\rightarrow\{0,1\}^*\) be the function such that for every \(x\in\N\) , \(B(x)\) is the binary representation of \(x\) .

Prove that \(x < 2^k\) if and only if \(|B(x)| \leq k\) .

Use 1. to compute the size of the set \(\{ y \in \{0,1\}^* : |y| \leq k \}\) where \(|y|\) denotes the length of the string \(y\) .

Use 1. and 2. to prove that \(2^k-1 = 1 + 2 + 4+ \cdots + 2^{k-1}\) .

Suppose that \(F:\N\rightarrow\{0,1\}^*\) is a one-to-one function that is prefix-free in the sense that there is no \(a\neq b\) s.t. \(F(a)\) is a prefix of \(F(b)\) .

Prove that \(F_2:\N\times \N \rightarrow \{0,1\}^*\) , defined as \(F_2(a,b) = F(a)F(b)\) (i.e., the concatenation of \(F(a)\) and \(F(b)\) ) is a one-to-one function.

Prove that \(F_*:\N^*\rightarrow\{0,1\}^*\) defined as \(F_*(a_1,\ldots,a_k) = F(a_1)\cdots F(a_k)\) is a one-to-one function, where \(\N^*\) denotes the set of all finite-length lists of natural numbers.

Suppose that \(F:O\rightarrow\{0,1\}^*\) is some (not necessarily prefix-free) representation of the objects in the set \(O\) , and \(G:\N\rightarrow\{0,1\}^*\) is a prefix-free representation of the natural numbers. Define \(F'(o)=G(|F(o)|)F(o)\) (i.e., the concatenation of the representation of the length of \(F(o)\) with \(F(o)\) itself).

Prove that \(F'\) is a prefix-free representation of \(O\) .

Show that we can transform any representation to a prefix-free one by a modification that takes a \(k\) bit string into a string of length at most \(k+O(\log k)\) .

Show that we can transform any representation to a prefix-free one by a modification that takes a \(k\) bit string into a string of length at most \(k+ \log k + O(\log\log k)\) . 6

Suppose that \(S \subseteq \{0,1\}^*\) is some finite prefix-free set, and let \(n\) be some number larger than \(\max \{ |x| : x\in S \}\) .

For every \(x\in S\) , let \(L(x) \subseteq \{0,1\}^n\) denote all the length- \(n\) strings whose first \(|x|\) bits are \(x_0,\ldots,x_{|x|-1}\) . Prove that (1) \(|L(x)|=2^{n-|x|}\) and (2) for every distinct \(x,x' \in S\) , \(L(x)\) is disjoint from \(L(x')\) .

Prove that \(\sum_{x\in S}2^{-|x|} \leq 1\) . ( Hint: first show that \(\sum_{x \in S} |L(x)| \leq 2^n\) .)

Prove that there is no prefix-free encoding of strings with less than logarithmic overhead. That is, prove that there is no function \(\ensuremath{\mathit{PF}}:\{0,1\}^* \rightarrow \{0,1\}^*\) s.t. \(|\ensuremath{\mathit{PF}}(x)| \leq |x|+0.9\log |x|\) for every sufficiently large \(x\in \{0,1\}^*\) and such that the set \(\{ \ensuremath{\mathit{PF}}(x) : x\in \{0,1\}^* \}\) is prefix-free. The factor \(0.9\) is arbitrary; all that matters is that it is less than \(1\) .

Prove that for every two one-to-one functions \(F:S \rightarrow T\) and \(G:T \rightarrow U\) , the function \(H:S \rightarrow U\) defined as \(H(x)=G(F(x))\) is one to one.

We have shown that the natural numbers can be represented as strings. Prove that the other direction holds as well: that there is a one-to-one map \(StN:\{0,1\}^* \rightarrow \N\) . ( \(StN\) stands for “strings to numbers.”)

Recall that Cantor proved that there is no one-to-one map \(RtN:\R \rightarrow \N\) . Show that Cantor’s result implies Theorem 2.5 .

Recall that for every set \(S\) , the set \(S^*\) is defined as the set of all finite sequences of members of \(S\) (i.e., \(S^* = \{ (x_0,\ldots,x_{n-1}) \;|\; n\in\mathbb{N} \;,\; \forall_{i\in [n]} x_i \in S \}\) ). Prove that there is a one-to-one map from \(\mathbb{Z}^*\) to \(\mathbb{N}\) , where \(\mathbb{Z} = \{ \ldots, -3 , -2 , -1,0,+1,+2,+3,\ldots \}\) is the set of all integers.

Bibliographical notes

The study of representing data as strings, including issues such as compression and error correction , falls under the purview of information theory , as covered in the classic textbook of Cover and Thomas ( Cover, Thomas, 2006 ) . Representations are also studied in the field of data structures design , as covered in texts such as ( Cormen, Leiserson, Rivest, Stein, 2009 ) .

The question of whether to represent integers with the most significant digit first or last is known as Big Endian vs. Little Endian representation. This terminology comes from Cohen’s ( Cohen, 1981 ) entertaining and informative paper about the conflict between adherents of both schools which he compared to the warring tribes in Jonathan Swift’s “Gulliver’s Travels” . The two’s complement representation of signed integers was suggested in von Neumann’s classic report ( von Neumann, 1945 ) that detailed the design approaches for a stored-program computer, though similar representations have been used even earlier in abacus and other mechanical computation devices.

The idea that we should separate the definition or specification of a function from its implementation or computation might seem “obvious,” but it took quite a lot of time for mathematicians to arrive at this viewpoint. Historically, a function \(F\) was identified by rules or formulas showing how to derive the output from the input. As we discuss in greater depth in Chapter 9 , in the 1800s this somewhat informal notion of a function started “breaking at the seams,” and eventually mathematicians arrived at the more rigorous definition of a function as an arbitrary assignment of input to outputs. While many functions may be described (or computed) by one or more formulas, today we do not consider that to be an essential property of functions, and also allow functions that do not correspond to any “nice” formula.

We have mentioned that all representations of the real numbers are inherently approximate . Thus an important endeavor is to understand what guarantees we can offer on the approximation quality of the output of an algorithm, as a function of the approximation quality of the inputs. This question is known as the question of determining the numerical stability of given equations. The Floating-Point Guide website contains an extensive description of the floating-point representation, as well as the many ways in which it could subtly fail, see also the website 0.30000000000000004.com .

Dauben ( Dauben, 1990 ) gives a biography of Cantor with emphasis on the development of his mathematical ideas. ( Halmos, 1960 ) is a classic textbook on set theory, also including Cantor’s theorem. Cantor’s Theorem is also covered in many texts on discrete mathematics, including ( Lehman, Leighton, Meyer, 2018 ) ( Lewis, Zax, 2019 ) .

The adjacency matrix representation of graphs is not merely a convenient way to map a graph into a binary string, but it turns out that many natural notions and operations on matrices are useful for graphs as well. (For example, Google’s PageRank algorithm relies on this viewpoint.) The notes of Spielman’s course are an excellent source for this area, known as spectral graph theory . We will return to this view much later in this book when we talk about random walks .

While the Babylonians already invented a positional system much earlier, the decimal positional system we use today was invented by Indian mathematicians around the third century. Arab mathematicians took it up in the 8th century. It first received significant attention in Europe with the publication of the 1202 book “Liber Abaci” by Leonardo of Pisa, also known as Fibonacci, but it did not displace Roman numerals in common usage until the 15th century.

\(RtS\) stands for “real numbers to strings”.

\(FtS\) stands for “functions to strings”.

\(FtR\) stands for “functions to reals.”

Actually that particular fictional company uses a metric that focuses more on compression speed than ratio , see here and here .

Hint: Think recursively how to represent the length of the string.

art in context logo retina

Naturalistic Art – Capturing the World’s Imperfections Through Art

Avatar for Isabella Meyer

Figure and landscape paintings before the early nineteenth century had an air of romanticism about them. The nineteenth-century saw a move towards a more realistic representation of the world around us. Naturalism became one of the most prominent movements of this century, and many believe that the combination of Naturalism and Realism led to modern art and Impressionism.

Table of Contents

  • 1.1 Naturalism and Idealism: Towards a Naturalism Definition
  • 1.2 What Is the Difference Between Realism and Naturalism?
  • 1.3 The Development of Photography
  • 1.4 The Role of Nationalism in Naturalism
  • 1.5 Figurative and Landscape Naturalistic Paintings
  • 2.1.1 The Norwich School (c.1803-1833)
  • 2.1.2 The Hudson River School (c. 1825-1875)
  • 2.1.3 The Barbizon School (c. 1830-1875)
  • 2.1.4 The Hague School (c. 1860-1900)
  • 2.1.5 The Russian Wanderers (c. 1863-1890)
  • 2.1.6 Impressionism (c. 1873-1886)
  • 2.1.7 The Glasgow School of Painting (c. 1880-1915)
  • 2.1.8 The Newlyn School (c. 1884-1914)
  • 2.1.9 The Heidelberg School (c. 1886-1900)
  • 3.1 The Hay Wain (1821) by John Constable
  • 3.2 Sunrise in the Catskills (1826) by Thomas Cole
  • 3.3 View of the Forest of Fontainebleau (1830) by Jean-Baptiste-Camille Corot
  • 3.4 Pardon in Brittany (1884) by Pascal-Adolphe-Jean Dagnan-Bouveret

A Brief Introduction to the Key Ideas of Naturalism

Naturalism is a complex art movement, and it can be tricky to settle on an exact Naturalism art definition. What is Naturalistic art all about? We find that Naturalism can be best understood in relation to other art movements and styles.

Naturalism and Idealism: Towards a Naturalism Definition

What is Naturalistic art? One of the easiest ways to understand Naturalism is to consider it in comparison to Idealism. Idealism is an artistic concept often related to figure painting, wherein an artist attempts to create a perfect image. Naturalism is, in essence, at the other end of the spectrum to Idealism. Rather than trying to make the world seem perfect, Naturalist painters prefer to depict all the world’s imperfections true to form.

What Is the Difference Between Realism and Naturalism?

Naturalism inherited much from the Realism movement, including the focus on depicting everyday people in everyday situations. Compared to Realism, however, Naturalism was more concerned with hyperreal precision in composition. Furthermore, Naturalism was unique in its attempts to integrate the human form into these scenes and landscapes. It is possible to understand Naturalist art as a fusion of Romantic landscape painting effects and techniques with the Realism ideology.

Although the movements are related and share several similarities, Naturalism in art is distinct from Realism in two critical ways. Realism is a style of true-to-life art that focuses on presenting the world and its social conditions and objects in the most realistic way possible.

The first distinct difference between Naturalism and Realism lies in the content of the paintings. While Naturalism tends to focus on the painting method, including the invention of en plein air painting, Realism focuses on the subject. Typically, Realist painters present ordinary people in everyday situations rather than idealized heroes.

The second distinction lies in the social awareness infused into Realism paintings. Realist painters often promoted political and social issues through their works. Socialist Realism and American Scene Painting are examples of social movements spurred on by developments in the Realism art movement. Naturalist painters were entirely concerned with developing the most natural style of painting.

Naturalism Art

The Development of Photography

Part of the Naturalistic art definition hinges on photography. Technological developments in taking photographs greatly influenced this art movement. Developments in photography challenged Realist landscape painters , and many began to experiment with different styles.

Failing to capture the world with the accuracy and speed of a camera, artists tried different techniques. Naturalist painters, however, reveled in this new technology. Using photography on its terms, Naturalist artists produced lifelike paintings unparalleled in any other period.

The Role of Nationalism in Naturalism

Another principal element of Naturalism in art is the way it incorporated regionalist and nationalist sentiments. Naturalist painters would tie their aesthetics to particular locations, often areas that the artists knew intimately, and carried sentimental value. Art historians believe that this tendency to paint scenes of familiarity to many people was integral to the democratization of art. The subjects of Naturalist paintings were familiar and sentimental to a wider viewership.

Naturalistic Paintings

Figurative and Landscape Naturalistic Paintings

It would be incorrect to assume that the subjects of Naturalist paintings were exclusively landscapes and scenes of nature. The naturalistic art definition is not synonymous with landscape paintings. Although landscape paintings were the most common among Naturalist painters, portraits and other genre paintings were also frequent subjects.

As not all Naturalist paintings are landscapes, so too are all landscapes, not Naturalist. The subjective view of each artist influenced their work in subtle ways. For example, the post-apocalyptic landscapes of John Martin, a religious visionary artist, are thought to represent God’s power. Furthermore, the landscape paintings of Turner lean more towards Expressionist experiments with light and color.

Despite producing landscape paintings, neither of these artists can be considered Naturalist. These artists create works of self-expression rather than attempting to capture the world for the sake of accurate representation.

The Historical Development of Naturalism

Naturalism has a complex history, depending on the artistic medium you consider. In terms of Naturalist sculpture, we could cite the exponents of classic Greek sculpture and their impeccable ability to replicate the human figure. Figurative Naturalist painting took its first steps in the Italian Renaissance with artists like Albrecht Durer, Leonardo da Vinci, and Michelangelo.

The Northern Renaissance, steeped in the Protestant Reformation in Northern Europe, saw a surge in replicating nature in perfect accuracy. Paintings of landscapes, figures, and everyday situations were proliferating during this period. Artists like Jan Vermeer, Aelbert Cuyp, and Willem Kalf created still life, landscapes, and paintings of the human figure in a recognizable Naturalist style. The Counter-Reformation of the Catholic church heralded Idealism, and romanticism reigned for the next century.

Naturalist Art

Modern Naturalism: Important Groups

The Naturalism definition and traditions of the modern era have their roots in several groups of artists. These artist groups tried to depict the natural world without subjective interpretation or distortion.

The Norwich School (c.1803-1833)

This painting school was led first by John Crome and later by John Sell Cotman, a watercolor artist . Taking inspiration from the Norfolk Broads, East Anglian landscapes, and salt marshes, the Norwich school also found inspiration in Dutch landscape painters from the 17th century, including Jacob van Ruisdael and Meindert Hobbema.

The Hudson River School (c. 1825-1875)

Led by Thomas Cole , an English artist, this school of romantic artists was based in New York City. This school of landscape artists took inspiration from romanticism, and many of their paintings depict the Hudson River Valley.

The Barbizon School (c. 1830-1875)

Art historians believe this to be the most influential Naturalist group, inspiring artists throughout America, Europe, and Australia. The French Naturalist school is renowned for the spontaneous en plein air landscape compositions of its members. Theodore Rousseau led this school, which included artists like Charles Daubigny, Jean-Francois Millet, and Camille Corot. While the Hague school in the Netherlands was adopting a more Impressionist style, the Barbozion school produced a Naturalist alternative that was more true-to-life.

The Hague School (c. 1860-1900)

The Hague school took inspiration from the Barbizon school, but with a more Post-Impressionist style. Anton Mauve, Johannes Bosboom, the Maris brothers, Hendrik Willem Mesdag, and Johan Hendrik Weissenbruch were prominent members of the Hague school.

The Russian Wanderers (c. 1863-1890)

This group of young Russian artists from the St Petersburg Imperial Academy of Arts traveled throughout Russia, painting people and landscapes. The most prominent landscape artists among them were Ivan Shishkin, Feodor Vasilyev, and Isaac Levitan. Other members of this artist group included Vasily Polenov, Vasily Perov, Nikolai Gay, Ilya Repin, and Vasily Surikov.


Impressionism (c. 1873-1886)

By far one of the most famous Naturalist movements, Impressionism is best exemplified by the landscape paintings of Claude Monet, Alfred Sisley, Renoir, and Camille Pissarro. The greatest contribution of the Impressionists to the Naturalist movement was their ability to reproduce the effects of light.

Many of the other features of Impressionist landscapes do not, however, fit with our Naturalism definition. Impressionist painters used a range of non-naturalist colors, and the painterly techniques, including brush strokes, lend the paintings an air of expressionism and atmosphere.

The Glasgow School of Painting (c. 1880-1915)

Concerned with the Naturalistic depiction of life and work in the country, the loose-knit group of progressive artists was led by John Lavery and James Guthrie. These leading members were familiar with the Hague school, the Barbizon, Impressionism, and the German Worpswede group.

The Newlyn School (c. 1884-1914)

Set in the stunning Cornish countryside, this group of artists specialized in capturing the landscapes around them. You can see the impressive natural light of Newlyn in the paintings created by these artists who worked directly with the natural world. Stanhope Forbes, Norman Garstin, Frank Bramley, and Walter Langley were leading members of the Newlyn school.

The Heidelberg School (c. 1886-1900)

This group of Australian Naturalist artists practiced en plein air painting in a style that fused Impressionist brushwork with Barbizon details. Leading members include Arthur Streeton, Tom Roberts, Fred McCubbin, and Charles Conder.

Famous Examples of Naturalistic Paintings

Now that we have explored the theoretical foundations of Naturalism, let us look at some of the most famous Naturalistic paintings. The following paintings were created between 1821 and 1884 and include landscapes and figures.

The Hay Wain (1821) by John Constable

A perfect example of Naturalist painting, this work depicts a horse-drawn hay cart crossing a river in front of a stunningly accurate agricultural landscape. Dappled sunlight floats on top of the river water, interspersed with subtle reflections of the trees and embankment.

In line with the nationalism that often accompanied Naturalistic paintings, this painting represents an area of Suffolk now known as “Constable Country”. The way that the human figure is subtly placed within the natural landscape is typical of Naturalistic artworks. Although en plein air painting was a common practice for many Naturalist artists, Constable completed his paintings in his London studio.

Regardless of where it was painted, this painting is a perfect example of how Naturalist artists presented an informal snapshot of the natural world. The accuracy of the painting subtly communicates the emotive and poetic qualities of the landscape.

Sunrise in the Catskills (1826) by Thomas Cole

This painting is one of the first by the English-American landscape painter. Finding inspiration in the areas around the Hudson River Valley, Cole was one of the first artists to present these landscapes in the style of European Romanticism. Many regard Cole as the father of the Hudson River school and we cannot overstress his role in developing the Naturalist style.

The painting presents a view of a rocky outcrop bathed in the early morning sunlight. You can see the swirling mist rising from the valleys of the Catskill mountains and catching the sunlight. The dark looming mountains are framed by tangled brush, fallen trees, and jagged rocks. The view we have through this painting is of an untainted American Wilderness.


View of the Forest of Fontainebleau (1830) by Jean-Baptiste-Camille Corot

A single reclined figure lies before intimidating oak trees and the embankment of a river. The figure is secondary to the composition of the landscape, which is typical of Naturalistic paintings. The river is highlighted with sunlight, while the large oak trees cast deep shadows across the landscape. The landscape in question is inspired by the countryside surrounding the Barbizon school.

This painting is the outcome of years of oil studies and sketches. Although it was not completed en plein air, the extensive preparation allowed Corot to capture an incredibly lifelike landscape. Many believe Corot’s work represents a transition between Neo-Classicism and Impressionism, and in so doing, he contributed to the Naturalist movement in significant ways.

Pardon in Brittany (1884) by Pascal-Adolphe-Jean Dagnan-Bouveret

This Naturalist artwork is the only one on our list that does not focus on the natural landscape. Instead, the human experience is the subject of this Naturalistic painting. The painting presents a religious custom from Brittany where citizens from the surrounding rural villages would proceed around the church begging for forgiveness through prayer. The human figures, although barefooted, are dressed in ceremonial clothes.

Pascal-Adolphe-Jean Dagnan-Bouveret was a Naturalist artist whose favored subjects were often the life of peasants in Brittany. Dagnan-Bouveret placed great emphasis on Naturalistic painting with photographic exactitude. This painting appears to depict a snapshot moment, but was actually the result of studio composition and an artificially crafted tableau.


The desire to capture the natural world without the influence of human subjectivity permeates Naturalist paintings. Although landscapes are the most popular subject for Naturalistic artists, portraits and still life compositions also had their place. The history of Naturalism is complex, and not entirely over. The incredible talent and beauty of Naturalist art is undeniable and has profoundly influenced many modern art movements.


Cite this Article

Meyer, Isabella. “Naturalistic Art – Capturing the World’s Imperfections Through Art.” Art in Context, February 8, 2021. https://artincontext.org/naturalistic-art/



Revealing the multidimensional mental representations of natural objects underlying human similarity judgements

Martin N. Hebart, Charles Y. Zheng, Francisco Pereira & Chris I. Baker

Nature Human Behaviour 4, 1173–1185 (2020)


Subjects: Computational models, Human behaviour

Objects can be characterized according to a vast number of possible criteria (such as animacy, shape, colour and function), but some dimensions are more useful than others for making sense of the objects around us. To identify these core dimensions of object representations, we developed a data-driven computational model of similarity judgements for real-world images of 1,854 objects. The model captured most explainable variance in similarity judgements and produced 49 highly reproducible and meaningful object dimensions that reflect various conceptual and perceptual properties of those objects. These dimensions predicted external categorization behaviour and reflected typicality judgements of those categories. Furthermore, humans can accurately rate objects along these dimensions, highlighting their interpretability and opening up a way to generate similarity estimates from object dimensions alone. Collectively, these results demonstrate that human similarity judgements can be captured by a fairly low-dimensional, interpretable embedding that generalizes to external behaviour.
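To make the idea concrete, here is a minimal sketch (not the authors’ published code; the dimension weights below are invented for illustration) of how an embedding of non-negative object dimensions can be used to predict behaviour in a triplet odd-one-out task: the predicted odd one out is the object excluded from the pair with the highest similarity, scored here as a simple dot product of dimension vectors.

```python
import numpy as np

# Hypothetical non-negative dimension weights for three objects
# (4 toy dimensions, not the 49 dimensions reported in the paper).
embedding = {
    "dog":    np.array([0.9, 0.1, 0.0, 0.2]),
    "cat":    np.array([0.8, 0.2, 0.1, 0.1]),
    "hammer": np.array([0.0, 0.1, 0.9, 0.7]),
}

def odd_one_out(a, b, c, emb):
    """Predict the odd one out of a triplet: the object left out of the pair
    whose dot-product similarity is highest."""
    names = [a, b, c]
    pairs = [(0, 1), (0, 2), (1, 2)]
    sims = [float(emb[names[i]] @ emb[names[j]]) for i, j in pairs]
    i, j = pairs[int(np.argmax(sims))]
    return next(n for k, n in enumerate(names) if k not in (i, j))

print(odd_one_out("dog", "cat", "hammer", embedding))  # -> hammer
```

The published embedding, typicality scores, and analysis scripts are available from the OSF link given under Data availability and Code availability below.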



Data availability

The learned embedding, triplet odd-one-out behavioural data for testing model performance, typicality scores, participant-generated dimension labels and dimension ratings are available at https://osf.io/z2784 . The behavioural data used for training the model are available from the corresponding author upon request.

Code availability

To reproduce the relevant analyses and figures, the relevant MATLAB scripts and functions are available at https://osf.io/z2784 . The computational modelling code to create an embedding is available from the corresponding author upon request.


Acknowledgements

We thank A. Corriveau for help with the data collection in the laboratory experiment, L. Stoinksi and J. Perkuhn for help with finding public-domain images for this publication, and I. Charest, B. Love, A. Martin and P. McClure for helpful discussions and/or comments on the manuscript. This research was supported by the Intramural Research Program of the National Institutes of Health (grant nos ZIA-MH-002909 and ZIC-MH002968), under National Institute of Mental Health Clinical Study Protocol 93-M-1070 (NCT00001360), and by a research group grant awarded to M.N.H. by the Max Planck Society. The funders had no role in study design, data collection and analysis, decision to publish or preparation of the manuscript.

Author information

Authors and Affiliations

Laboratory of Brain and Cognition, National Institute of Mental Health, National Institutes of Health, Bethesda, MD, USA

Martin N. Hebart & Chris I. Baker

Vision and Computational Cognition Group, Max Planck Institute for Human Cognitive and Brain Sciences, Leipzig, Germany

  • Martin N. Hebart

Machine Learning Core, National Institute of Mental Health, National Institutes of Health, Bethesda, MD, USA

Charles Y. Zheng & Francisco Pereira


Contributions

M.N.H. and C.I.B. conceived and designed the study. M.N.H. collected the data. C.Y.Z., M.N.H. and F.P. designed the computational model. M.N.H., C.Y.Z. and F.P. analysed the data. M.N.H., C.I.B., F.P. and C.Y.Z. wrote the manuscript and provided critical revisions.

Corresponding author

Correspondence to Martin N. Hebart .

Ethics declarations

Competing interests.

The authors declare no competing interests.

Additional information

Peer review information Primary Handling Editor: Marike Schiffer.

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Extended data

Extended Data Fig. 1 Reproducibility of dimensions.

Reproducibility of dimensions in the chosen 49-dimensional embedding across 20 random initializations (see Extended Data Fig. 2 for a list of all dimension labels). Shaded areas reflect 95% confidence intervals.

Extended Data Fig. 2 Labels and word clouds for all 49 model dimensions.

Labels for all 49 dimensions, with respective word clouds reflecting the naming frequency across 20 participants. The dimensions appear to reflect both perceptual and conceptual properties of objects. A visual comparison between labels and word clouds indicates a generally good agreement between participant naming and the labels we provided for the dimensions.

Extended Data Fig. 3 Category-typicality correlations.

Detailed results of inferential statistical analyses correlating category-related dimensions with typicality of their category. One-sided p-values were generated using randomization tests and were controlled for false discovery rate (FDR) across multiple tests. 90% confidence intervals were used to complement one-sided tests.

Extended Data Fig. 4 Model performance and dimensionality as a function of training data size.

Model performance and dimensionality varied as a function of the amount of data used for training the model. Models were trained in steps of 100,000 trials. Six models with random initialization and random subsets of data were trained per step, and all models were applied to the same test data as in the main text, making a total of 78 trained models. For each step, computation of up to two models did not complete on the compute server for technical reasons, leaving between 4 and 6 models per step. Results for each individual model and the average for each step are shown in the figure. (a) Model performance was already high with 100,000 training trials but increased with more data, saturating around the final model performance. (b) Dimensionality increased steadily with the amount of training data.


About this article

Cite this article.

Hebart, M.N., Zheng, C.Y., Pereira, F. et al. Revealing the multidimensional mental representations of natural objects underlying human similarity judgements. Nat Hum Behav 4 , 1173–1185 (2020). https://doi.org/10.1038/s41562-020-00951-3


Received: 18 March 2020

Accepted: 17 August 2020

Published: 12 October 2020

Issue Date: November 2020

DOI: https://doi.org/10.1038/s41562-020-00951-3




The Nature of Representation

  • First Online: 11 September 2019


Michael Potter


For a fuller understanding of what constitutes ‘representation’, as assessed in this book, this chapter explores the theory and meanings of the term in the political sphere. Various theories of representation are examined, such as descriptive and substantive representation, presence theory, and critical mass theory. Theories of democracy are then considered from the perspective of inclusion, and deliberative democracy is discussed as the basis for the analytical framework. Finally, the framework adopted for this book, drawn from Yvonne Galligan and Sara Clavero’s gender democracy analysis, is described.


Note: The 2017 election was an exception, where 30% of those elected were women, but the Assembly did not sit.


Author information

Authors and Affiliations

Centre for the Study of Ethnic Conflict, Queen’s University Belfast, Belfast, UK

Michael Potter


Corresponding author

Correspondence to Michael Potter .


Copyright information

© 2020 The Author(s)

About this chapter

Potter, M. (2020). The Nature of Representation. In: Inclusion in Post-Conflict Legislatures. Palgrave Macmillan, Cham. https://doi.org/10.1007/978-3-030-25536-7_4


DOI: https://doi.org/10.1007/978-3-030-25536-7_4

Published: 11 September 2019

Publisher Name: Palgrave Macmillan, Cham

Print ISBN: 978-3-030-25535-0

Online ISBN: 978-3-030-25536-7

eBook Packages: Political Science and International Studies; Political Science and International Studies (R0)



Naturalism, realism, abstraction and idealization


Caravaggio, The Doubting of Thomas , c. 1601-02, oil on canvas, 107 x 146 cm (Sanssouci Picture Gallery)


Hakuin Ekaku, Portrait of Daruma , mid-18th century, Edo period Japan, hanging scroll, ink on paper, 117.5 x 54 cm (The Metropolitan Museum of Art)

Naturalism is resemblance to the “real world,” as we see it around us. The more naturalistic a work, the more it looks like our world, and the less naturalistic, the less so.

Representational versus non-representational

For extremes, we will compare a work by Caravaggio with a work by Hakuin Ekaku, a Japanese Zen Buddhist monk. Caravaggio’s Italian Baroque The Doubting of Thomas is highly naturalistic, whereas Hakuin’s portrait of Daruma rejects naturalism. Both paintings are representational, which means that they both depict something, as opposed to non-representational works like Pollock’s Autumn Rhythm (Number 30) , where there is no visual picture hidden in the lines.

In Caravaggio’s image, the face of Thomas, who leans down to look directly at the wound in Jesus’ side, is rendered in minute, realistic detail. His hairs are individually rendered, his features distinct and recognizable, his skin browned from the sun and marked with dirt.

Caravaggio found his models for holy figures in the streets, and their humble conditions are carefully depicted in his paintings. All this adds up to naturalism, and perhaps even to the highest degree of naturalism, referred to as verism.

Moving toward abstraction

Ekaku’s portrait of Daruma, on the other hand, is recognizable as a human face, but all the elements have all been rendered somewhat abstractly — that is, in some way deviating from or ignoring the natural world, simplifying or altering it for effect. We might also say that Ekaku’s image is stylized — designed according to the principles of a particular style rather than being beholden to the way things look in “the real world.”

This does not make it less “good” as a work of art, and also does not mean that the artist was less skilled. Rather, it is simply a different, less naturalistic style, designed to suit different goals. Here, the emphasis is not on the great Buddhist sage’s literal appearance (he died over a millennium before this work was made), but on his wisdom, embodied in his giant, upturned eyes and vast forehead.


Kazimir Malevich, Painterly Realism of a Boy with a Knapsack – Color Masses in the Fourth Dimension, 1915, oil on canvas, 71.1 x 44.5 cm (The Museum of Modern Art)


Polykleitos,  Doryphoros , c. 450-40 B.C.E., ancient Roman marble copy of a lost bronze original, 211 cm (Archaeological Museum, Naples)

Non-representational art

The farthest extent of abstraction is non-representational or non-objective art, in which the subject is not merely abstract, but wholly absent.

This type of art, which many viewers find highly challenging, makes little or no reference to the natural world; Kazimir Malevich’s geometric paintings, such as Painterly Realism of a Boy with a Knapsack – Color Masses in the Fourth Dimension, are examples.

In such works, we are in a purely formal world, with no recognizable subject but the work itself, and where the tools of visual analysis are the only points of entry into the work.

Idealization

There is one other way, though, that we can speak of naturalism. Instead of contrasting it with abstraction, we can contrast it with realism.

We can see that the Doryphoros is highly naturalistic, in that he looks very much like a living person. On the other hand, he presents his culture’s physical ideal of symmetry, youth, fitness, and, indeed, relaxation.

If we are being honest, we would have to admit that while a person could look like this, rather few of us do. It is therefore highly idealized. In this sense, the work lacks realism — since, in a way, according to ancient Greek views, it is too perfect.


Satsubari, The Second of the Sixteen Rakan , late 14th century, hanging scroll; ink, color, and gold on silk, 115.3 x 49.3cm (Mary Griggs Burke collection)

Naturalism and realism

Let’s take another example — a fourteenth century Japanese painting, The Second of the Sixteen Rakan ( Rakan are figures in Buddhism who protect Buddhist law and bless donors). Here the figure does not appear naturalistic, in that the work is clearly a painting, with a loose style and exaggeration of features.

On the other hand, the work is highly realistic, in that the artist has presented us with a figure that is ordinary or even slightly grotesque, rather than idealized. He is old, wrinkled and bent, his earlobe hanging low (stretched out by large gold earrings) as a marker of his past wealth. His clothes are rumpled, his socks slipping downward. While the image is not very naturalistic, it is still highly realistic.

Disentangling naturalism from realism is tricky, and the terms are sometimes used interchangeably, despite their important differences.

Naturalism and realism might or might not appear in the same work. An image might be naturalistic and realistic, like The Doubting of Thomas , or naturalistic and idealized, as is Doryphoros .

Alternately, a work might be abstracted and realistic, like the Second Rakan, or abstracted and idealized, like St. John in the Lindisfarne Gospels. A diagram presents the nexus of these terms, and we might plot just about every work of art in this book somewhere within it.

[Diagram: the nexus of naturalism, realism, abstraction, and idealization]

Duane Hanson, Slab Man, 1974–75, vinyl resin and fiberglass, polychromed in oil (Cantor Arts Center)

Some works, of course, like those by Duane Hanson, have both: they are naturalistic and realistic at once.

His life-size sculptures of people revel in their ordinary nature, and are also startlingly lifelike — veristic — so much so that they are frequently mistaken for real people when on display in museum exhibitions.

I have been fooled by two of them, over the years. One was dressed as a museum guard and another as a workman. Even though I had seen Slab Man dozens of times, when his gallery was under construction and he was surrounded by carts of equipment and tools, I was fooled anew.



Language Representation Models: An Overview

Associated Data

Publicly available datasets were used in this study. These data can be found here: https://gluebenchmark.com/ (last accessed on 2 January 2021).

In the last few decades, text mining has been used to extract knowledge from free texts. Applying neural networks and deep learning to natural language processing (NLP) tasks has led to many accomplishments for real-world language problems over the years. The developments of the last five years have resulted in techniques that have allowed for the practical application of transfer learning in NLP. The advances in the field have been substantial, and the milestone of outperforming human baseline performance based on the general language understanding evaluation has been achieved. This paper implements a targeted literature review to outline, describe, explain, and put into context the crucial techniques that helped achieve this milestone. The research presented here is a targeted review of neural language models that present vital steps towards a general language representation model.

1. Introduction

Text mining, as a subset of data mining, is the process in which algorithms try to discover and extract knowledge from a text that has no apparent structure. Text mining thus deals with information retrieval (e.g., documents or websites), classification (e.g., news tagging), clustering, and other similar tasks. However, text mining has a limit when it comes to “understanding” the mined results.

The discipline of natural language processing (NLP) tries to understand and exploit natural language. It covers a wide range of tasks such as translation, summarizing, or question-answering. Thus, text mining and NLP are closely related and intertwined approaches to text processing. Researchers and practitioners of NLP use tools available from linguistic toolboxes, such as the part of speech, which denotes how a given word functions within a sentence in its two essences: meaning and structure (grammar). Grammatical structures are presented as phrases or dependency relations. NLP tackles the problems of anaphora (an expression that depends on an antecedent expression), cataphora (an expression that depends on a postcedent expression), and ambiguities of words and of grammatical structures (what a given word or prepositional phrase is modifying). In dealing with these problems, NLP uses knowledge representations (such as word lexicons with word meanings, grammatical properties, grammar rules, a thesaurus of synonyms, a list of abbreviations, and many kinds of ontologies).

Applying neural networks and deep learning techniques to NLP led to a new way of solving NLP tasks: neural language models . These models are very successful in solving numerous NLP tasks. The introduction of transfer learning has brought a major breakthrough: machines have exceeded human performance in several NLP tasks. The progress of neural language models is being actively monitored and assessed by the open General Language Understanding Evaluation (GLUE) benchmark platform [ 1 ] ( https://gluebenchmark.com , accessed on 2 January 2021). The goal of this paper is to provide readers with an overview of the most-used concepts and milestones of the last five years in the area of NLP. This paper aims to explain the most important concepts behind the language representation models that have helped achieve the milestone of outperforming the human baseline on the GLUE benchmark ( https://gluebenchmark.com/leaderboard , accessed on 2 January 2021). The research question is as follows: which concepts and milestones in the area of natural language processing are the most important ones for outperforming the human baseline performance?

2. Related Works

Language representation and neural language models are rapidly growing fields that will probably play a substantial role in the future of NLP. A few studies have conducted surveys of the field.

For instance, ref. [ 2 ] conducted a short survey of neural language models and possible evaluation tasks and corpora.

Another example is [ 3 ]; in this work, a taxonomy of pre-trained neural language models was developed.

Similarly, ref. [ 4 ] provided a systemization of many language representation models by representation level, input level, model type, model architecture, and model supervision.

All three of these surveys lack information about the novel techniques and models that contributed to the breakthrough of surpassing baseline human performance.

The authors of [ 5 ] conducted a rich survey of language representation models, including many models from 2020 and 2021. They additionally provided useful insights about pre-processing, classification techniques, evaluation metrics, and applications. Again, the survey neither focused on breakthrough methods and techniques nor on neural language models.

In contrast to these related works, our work focuses on language representation models that outperform the human benchmark on the GLUE score. It explains the concepts and background of these models in an articulated and systematic way.

3. Targeted Literature Review

The scope of this targeted review of neural language models covers papers describing the concepts and internals of models that are evaluated using the GLUE benchmark [ 1 ]. There are 77 models listed on the benchmark’s leaderboard. We discarded the models that do not outperform the human baseline (ranked 16/77). Of the 15 remaining models, we review those developed in the last five years, i.e., since 2016, starting with the attention-based concepts [ 6 ] and the transformer architecture in 2017 [ 7 ] and ending with DeBERTa in 2021 [ 8 ]. The reason for limiting the targeted review to the last 5 years is to show the rapid progress achieved recently. The concepts and milestones of the latest research in neural language models are described in this paper.

Figure 1 depicts a few of these models and their novel techniques or concepts. Additionally, it shows how rapidly new milestones and novel techniques have been published in recent years. Table 1 gives an overview of the GLUE benchmark score, the number of parameters, and the number of training steps of each of the models discussed in this paper. The models and concepts in the orange highlighted boxes are further discussed in this paper. The concepts of RoBERTa (hyper-parameter optimization, 2019) and Funnel (transformer redundancy filtering, 2020) are not covered in detail because they did not introduce a substantial improvement over the initial proposal, as is also the case with GPT-2 (subword embeddings, 2018) and ERNIE (knowledge graph incorporation, 2019). Table 1 allows the reader to obtain a general overview of the performance and efficiency of the key models.

Figure 1. Selected overview of the milestones in neural language models of the last 5 years, in chronological order.

Table 1. Comparison of the GLUE score (* as reported by the authors in the papers), the number of training steps, and the number of parameters of the baseline version of the key models that are further discussed in this paper.

Figure 2 presents the inclusion/exclusion criteria of models in our overview.

Figure 2. Inclusion and exclusion criteria.

4. Key Concepts of Neural Language Models

In this section, we present the key concepts of neural language models, how they connect, and how each concept was further developed and refined by the next one. First, we introduce the basic and starting concept of word representation in Section 4.1 . Building on this topic, Section 4.2 discusses sequence-to-sequence models and how they process sequential information (e.g., texts). Section 4.3 describes how encoder-decoder models boost the capabilities of sequence-to-sequence models. Next, Section 4.4 shows how attention helps encoder-decoder models to better understand longer sequences. Section 4.5 shows how transfer learning becomes applicable in NLP and how it helps in developing better-performing models. Section 4.6 explains how a bidirectional, and thereby larger, context scope for word representations helps models gain a better understanding of their input words and phrases; this bidirectionality becomes possible by using masked language modelling (MLM). Section 4.7 shows how further refining these word representations can significantly increase a model’s performance. Section 4.8 illustrates how to improve a model’s performance and efficiency by expanding the MLM task into a generator and a discriminator. Next, Section 4.9 showcases the positive effect of a few parameter-reduction techniques and how they help in building more efficient models. Additionally, Section 4.10 explores the current possibilities and results of multi-task trained models. Section 4.11 delivers a recent use case of applying transformer models to machine translation. In Section 6 , we discuss recent milestones and provide the reader with an outlook on which areas could become important in the future of neural language models.

4.1. Word Representations

The basis for each neural language model is a vocabulary. A vocabulary is a table that assigns each token or word a representation. This representation could be a number or a multi-dimensional vector. These representations are usually called word embeddings . There are a few different implementations of word embeddings, which can be categorized into context-free and contextualized word embeddings. Context-free representations such as GloVe [ 14 ] and Word2Vec [ 15 ] are based only on the characters the word consists of. Contextualized representations are based on the character span of the word and additionally enrich the representation with the words surrounding it. Figure 3 illustrates the different representations of the word “bank”: in the left sentence it means a credit institution, and in the right sentence a geographical phenomenon, although it has the same span of characters.

Figure 3. Example comparison of the representations of the same span of characters in a contextualized and in a non-contextualized model.
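As a toy illustration of this distinction (hand-made two-dimensional vectors, not a trained model such as GloVe or BERT), a context-free lookup always returns the same vector for “bank”, whereas a contextualized representation also depends on the surrounding words:

```python
import numpy as np

# Hand-made toy vectors; a real model would learn these from large corpora.
static = {
    "bank":  np.array([1.0, 1.0]),
    "money": np.array([2.0, 0.0]),
    "river": np.array([0.0, 2.0]),
    "the":   np.array([0.1, 0.1]),
}

def context_free(word):
    # Context-free embedding: identical output regardless of the sentence.
    return static[word]

def contextualized(word, sentence, alpha=0.5):
    # Crude stand-in for a contextual encoder: blend the word's own vector
    # with the mean vector of the other words in the sentence.
    context = np.mean([static[w] for w in sentence if w != word], axis=0)
    return (1 - alpha) * static[word] + alpha * context

print(context_free("bank"))                              # always [1. 1.]
print(contextualized("bank", ["the", "money", "bank"]))  # pulled towards "money"
print(contextualized("bank", ["the", "river", "bank"]))  # pulled towards "river"
```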

4.2. Sequence-To-Sequence Models

Sequence-to-sequence models are specialized in processing a sequential input, e.g., a list of numbers or a sentence, and producing a sequential output. Probably the most widely used architecture family is the recurrent neural network (RNN), introduced by [ 16 ]. RNNs are well suited to language tasks because many such tasks rely on sequential data such as continuous texts. A key strength of most RNNs is that they can process and generalize across sequences of values of variable lengths. This strength relies on the model’s ability to share parameters across different parts of the model. Thereby, RNNs can connect information that is distributed across the sequence. For example, suppose the model’s task is to extract temporal information from a sequence and it receives the following two sentences as input: “Today I am feeling sick” and “I am feeling sick today”. In both cases, it should extract “today” independently of its position in the sequence. Ref. [ 17 ], p. 373, describes RNNs as a chain of multiple copies of the same network, each one sharing its parameters with its succeeding copy. Figure 4 depicts such a chained neural network.

Figure 4. Schematic depiction of a recurrent neural network (RNN) (modified from [ 18 ]).
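The parameter sharing described above can be sketched in a few lines of Python (toy dimensions and random weights; this is an illustration of the recurrence, not a trained model): the same weight matrices are reused at every position, so the loop works for sequences of any length.

```python
import numpy as np

rng = np.random.default_rng(0)
d_in, d_h = 3, 4                      # toy input and hidden sizes
W_xh = rng.normal(size=(d_h, d_in))   # shared across all time steps
W_hh = rng.normal(size=(d_h, d_h))
b_h = np.zeros(d_h)

def rnn(inputs):
    """Run a simple tanh RNN over a sequence and return all hidden states."""
    h = np.zeros(d_h)
    states = []
    for x in inputs:                  # the same W_xh and W_hh are reused each step
        h = np.tanh(W_xh @ x + W_hh @ h + b_h)
        states.append(h)
    return states

states = rnn([rng.normal(size=d_in) for _ in range(5)])  # any sequence length works
print(len(states), states[-1].shape)                     # 5 (4,)
```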

4.3. Encoder-Decoder Models

The major limitation of the previously discussed RNNs is that their output length is determined by the input length, while many tasks, such as translation, require an output whose length is independent of the input. The first achievements to overcome this limitation were [ 19 , 20 ], which introduced the encoder-decoder framework and applied it to the task of neural machine translation. These frameworks consist of three main components, as depicted in Figure 5 and sketched in code after the figure:

  • Encoder: Extracts features by reading and converting the input into distributed representations, with one feature vector associated with each word position.
  • Context: Either a feature vector or a list of feature vectors based on the extracted features. If it is a list of feature vectors, it has the benefit that each feature vector can be accessed independently of its position in the input.
  • Decoder: Sequentially processes the context to generate the final output and solve the task.

Figure 5. Illustration of the encoder-decoder architecture.
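The three components can be sketched schematically as follows (a toy RNN encoder and a greedy decoder with random weights; real systems such as [ 19 , 20 ] are of course far more elaborate). The point of the sketch is that the number of decoding steps is no longer tied to the input length.

```python
import numpy as np

rng = np.random.default_rng(1)
d_x, d_h, vocab = 3, 4, 6
W_e = rng.normal(size=(d_h, d_x))    # encoder input weights
U_e = rng.normal(size=(d_h, d_h))    # encoder recurrent weights
W_d = rng.normal(size=(d_h, d_h))    # decoder recurrent weights
W_o = rng.normal(size=(vocab, d_h))  # output projection to token ids

def encode(inputs):
    # Encoder: read the input and produce one feature vector per position.
    h, feats = np.zeros(d_h), []
    for x in inputs:
        h = np.tanh(W_e @ x + U_e @ h)
        feats.append(h)
    return feats

def decode(context, steps=4):
    # Decoder: generate output token ids from a single context vector.
    h, out = context, []
    for _ in range(steps):
        h = np.tanh(W_d @ h)                  # update decoder state
        out.append(int(np.argmax(W_o @ h)))   # greedy choice of next token id
    return out

features = encode([rng.normal(size=d_x) for _ in range(5)])
print(decode(features[-1], steps=3))  # output length set by the decoder, not the input
```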

4.4. Attention

The encoder-decoder framework performs efficiently on a variety of tasks, e.g., machine translation [ 21 ]. However, the fact that all the information is stored in a single context can limit the framework’s ability to understand long and complex inputs. One approach to overcome this limitation is the attention mechanism , which was first applied to machine translation by [ 6 , 22 ]. The main idea is similar to the way humans perceive and understand their environment: not all information has the same influence on the output. The mechanism is built on the encoder-decoder idea but differs from it in two main ways. Firstly, the encoder passes all the hidden states to the decoder. Secondly, an additional attention step is performed by the decoder, in which each annotation receives a weight that determines its amount of influence on the output. The box with the dashed outline in Figure 6 illustrates the attention step in additive attention, the kind of attention used by [ 6 ] (see Figure 7 ). There are different kinds of attention, which all share the core idea of calculating and assigning weights to annotations.

Figure 6. Visualization of the attention mechanism applied to an encoder-decoder model.

Figure 7. Visualization of the additive attention mechanism (modified from [ 6 ]). The emphasized text gives further information about parts of the process.

Currently, probably the most prominent application of the attention mechanism is the transformer architecture, which was introduced by [ 7 ]. It uses self-attention , an extension of the attention mechanism. The main advantage of self-attention is that, while encoding the current input element, the model can relate it to the other elements of the sequence. The example in Figure 8 illustrates self-attention: the word “its” is correctly associated with “the law”, allowing the model to understand that “its” refers to “the law”. Ref. [ 23 ] describes the process of self-attention by dividing it into six steps. In step one, each encoder input word (represented by a vector) is multiplied by three weight matrices, resulting in three different vectors: a query vector, a key vector, and a value vector for each word. These weight matrices are learned during training. In step two, the self-attention scores are calculated for each word by individually computing the dot product of the word’s query vector q_i with the key vector of each word (k_1, ⋯, k_n) in the sentence, resulting in n scores. In steps three and four, the scores are divided by the square root of the key dimension (e.g., the dimension is 64 in [ 7 ], so the scores are divided by 8) and then normalized using Softmax. These scores indicate how much each word is expressed at the current position. In step five, each value vector is multiplied by its Softmax score. In the final step, the weighted value vectors are summed and form the output of the self-attention layer at this position.
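The six steps can be condensed into a short NumPy sketch of scaled dot-product self-attention (the function and matrix names are illustrative assumptions; the multi-head structure and masking of [ 7 ] are omitted):

    import numpy as np

    def softmax(x, axis=-1):
        e = np.exp(x - x.max(axis=axis, keepdims=True))
        return e / e.sum(axis=axis, keepdims=True)

    def self_attention(X, W_q, W_k, W_v):
        """Scaled dot-product self-attention following the six steps in the text.
        X has one row per input token; W_q, W_k, W_v are the learned projections."""
        Q, K, V = X @ W_q, X @ W_k, X @ W_v        # step 1: query/key/value per token
        scores = Q @ K.T                           # step 2: every query against every key
        scores = scores / np.sqrt(K.shape[-1])     # step 3: scale by sqrt(d_k)
        weights = softmax(scores, axis=-1)         # step 4: normalize per position
        return weights @ V                         # steps 5-6: weight and sum the values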

Figure 8. Example of how self-attention influences the representation of the word “its” (modified from [ 7 ]).

In addition to the content of the word or phrase, ref. [ 7 ] injected positional information into the embedding. This positional information is important to enable the model to make use of the order of the input sequence. Their positional encoding represents the absolute position of the word or phrase in the sequence. Later papers discussed the use of relative position embeddings (e.g., [ 24 ]). The results of [ 25 ] showed that relative position representations are more effective for natural language understanding and generation tasks. This injection of linguistic information is a core feature of so-called language-representation models , models that are specialized in representing valuable natural language information. Another feature associated with the transformer of [ 7 ] is sharing parameters across layers. With parameter sharing , the overall number of parameters a model has to learn can be significantly reduced.
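The absolute positional encoding of [ 7 ] can be sketched as fixed sine/cosine patterns that are added to the token embeddings (NumPy sketch; an even embedding dimension is assumed):

    import numpy as np

    def sinusoidal_positional_encoding(seq_len, d_model):
        """Absolute positional encoding as in the original transformer: each
        position receives a fixed pattern of sines and cosines that is added
        to the token embedding. Assumes d_model is even."""
        positions = np.arange(seq_len)[:, None]             # (seq_len, 1)
        dims = np.arange(0, d_model, 2)[None, :]             # even embedding dimensions
        angle = positions / np.power(10000.0, dims / d_model)
        pe = np.zeros((seq_len, d_model))
        pe[:, 0::2] = np.sin(angle)
        pe[:, 1::2] = np.cos(angle)
        return pe

    # embeddings = token_embeddings + sinusoidal_positional_encoding(seq_len, d_model)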

4.5. Transfer Learning and Pre-Trained Models

Textual data exist abundantly. However, task-specific and labelled data are sparse and expensive to acquire. One way to boost the model’s performance, when only few task-specific data are available, is pre-training . The approach introduced by [ 26 ] used the transformer architecture in their pre-training setup. They first pre-trained the model in an unsupervised way on a large data set to gain a general or world understanding. Then, the model’s weights are used as the foundation of the semi-supervised fine-tuning on the task-specific data to solve the task. This approach is called transfer learning . Human intelligence is very capable of it. By writing many essays in school, humans not only become better at the concrete task but are able to generalize information and capabilities: for example, applying the knowledge to other similar tasks such as writing a business e-mail or a letter of application.
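In practice, the pre-train/fine-tune recipe often looks like the following sketch, which uses the Hugging Face transformers library purely for illustration (the library, the checkpoint name, and the toy data are assumptions of this overview, not part of [ 26 ]):

    import torch
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
    model = AutoModelForSequenceClassification.from_pretrained(
        "bert-base-uncased", num_labels=2)       # pre-trained weights as the starting point

    texts, labels = ["great movie", "terrible plot"], torch.tensor([1, 0])
    batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")

    optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)
    model.train()
    outputs = model(**batch, labels=labels)      # task-specific loss on the small labelled set
    outputs.loss.backward()
    optimizer.step()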

4.6. Bidirectional Representations

One significant factor for the performance of neural language models is the quality of their representations or embeddings. Refining the embeddings can help the model gain a more detailed understanding of words. One way to achieve this is to increase the scope of the contextualized word embeddings. Ref. [ 13 ] was the first to implement a deeply bidirectional model based on contextualized and pre-trained representations with their bidirectional encoder representations from transformers ( BERT ). Technically, BERT is a transformer encoder stack; its architecture is depicted in Figure 9 a. It encodes an input into its representation. BERT uses several sophisticated concepts to produce representations. One major concept is bidirectionality, which means that both the preceding and the following words determine the representation of a word.

Figure 9. Comparison of the architectures of BERT and DeBERTa. ( a ) schematically depicts the architecture of BERT (modified from [ 13 ]). ( b ) shows in a similar way the architecture of DeBERTa (modified from [ 29 ]).

One example is the word “bank”. Its different meanings, and thereby representations, in the two sentences “I am going to the bank to sit down” and “I am going to the bank to withdraw cash” can only be distinguished by allowing the succeeding words “sit” and “withdraw” to influence the representation. Unidirectional models would only include the part of the sentence that precedes the word “bank”, which in both cases is “I am going to the”. By considering both directions, i.e., the preceding and succeeding neighbours, a more complete and differentiated representation of the word “bank” is possible.

BERT accomplishes bidirectionality by using the masked language modelling ( MLM ) pre-training objective. MLM randomly selects a subset of the input tokens and either replaces them with the <MASK> token, replaces them with a random token, or leaves them unchanged. This process is called masking. MLM aims to predict the original identity of each masked token based only on its context. After the prediction, the masked token is replaced with the predicted token, except in cases where the selected token was never replaced in the first place. The embedding of the token thus incorporates information about both the left and the right neighbouring tokens, in other words, both directions of its context. The idea behind not always replacing the selected tokens with <MASK> is to mitigate the mismatch between pre-training and fine-tuning, since the <MASK> token does not appear during fine-tuning. There are also other ways to achieve bidirectionality than MLM. One example is XLNet (Generalized Auto-regressive Pre-training for Language Understanding) [ 27 ], which uses permutations of each sentence to take both contexts into account and two-stream self-attention to capture the tokens’ positional information.
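A minimal sketch of this masking recipe is shown below (plain Python; the 15%/80%/10%/10% ratios are the ones reported in [ 13 ], and all function and variable names are illustrative):

    import random

    def mask_for_mlm(token_ids, mask_id, vocab_size, select_prob=0.15):
        """BERT-style masking: a subset of tokens is selected as prediction
        targets; of the selected tokens, 80% are replaced by <MASK>, 10% by a
        random token, and 10% are kept unchanged (ratios as reported in [13])."""
        inputs, targets = list(token_ids), [None] * len(token_ids)
        for i, tok in enumerate(token_ids):
            if random.random() < select_prob:
                targets[i] = tok                              # model must predict the original token
                r = random.random()
                if r < 0.8:
                    inputs[i] = mask_id                       # replace with <MASK>
                elif r < 0.9:
                    inputs[i] = random.randrange(vocab_size)  # replace with a random token
                # else: leave the token unchanged
        return inputs, targets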

In addition to bidirectionality, ref. [ 13 ] introduced another concept that significantly helped to improve BERT’s performance: next sentence prediction ( NSP ). It increases the model’s capability of understanding the relationship between two sentences, on which many NLP tasks rely heavily. NSP is a binary task in which the model decides whether one sentence is actually followed by another (label: IsNext ) or not (label: NotNext ).

4.7. Disentangled Bidirectional Representations

Ref. [ 8 ] presented a new model architecture, DeBERTa (decoding-enhanced BERT with disentangled attention), that builds on BERT [ 13 ] and RoBERTa [ 28 ] and improves their results by introducing two novel techniques: a disentangled attention mechanism and an enhanced masked decoder.

Disentangled attention. BERT represents each word by a single vector, computed as the sum of its word embedding and an absolute position embedding. DeBERTa separates, or disentangles, these embeddings into two different vectors, thereby representing each word by two vectors. The attention weights are then computed from disentangled matrices based on the words’ contents and their relative position . The idea behind this approach is that the attention weight of a word pair depends not only on the content of the words but also on their relative position. Consider the words “deep” and “learning”: their dependency is much stronger when they appear close to each other than when they are far apart.
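The following simplified NumPy sketch illustrates the idea of summing content-to-content, content-to-position, and position-to-content terms; the exact scaling, bucketing, and efficient implementation of [ 8 ] are omitted, and all names are assumptions:

    import numpy as np

    def disentangled_scores(Hc, Pr, Wq_c, Wk_c, Wq_r, Wk_r, rel_index):
        """Sketch of disentangled attention scores. Hc holds the content vectors
        of the tokens, Pr the embeddings of the relative positions, and
        rel_index[i, j] maps the pair (i, j) to its relative-position bucket."""
        Qc, Kc = Hc @ Wq_c, Hc @ Wk_c
        Qr, Kr = Pr @ Wq_r, Pr @ Wk_r
        n = Hc.shape[0]
        scores = Qc @ Kc.T                                       # content-to-content
        for i in range(n):
            for j in range(n):
                scores[i, j] += Qc[i] @ Kr[rel_index[i, j]]      # content-to-position
                scores[i, j] += Kc[j] @ Qr[rel_index[j, i]]      # position-to-content
        return scores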

Enhanced mask decoder. BERT uses masked language modelling (MLM). This is a fill-in-the-blank task in which the model learns to predict which word a mask token hides, based on its context, i.e., the words surrounding the mask token. Ref. [ 8 ], p. 2, also uses MLM during pre-training. Their disentangled attention mechanism already considers the contents and relative positions of the context words but not their absolute positions. The absolute positions are in many cases crucial and can help the model learn syntactical nuances that would not be accessible if it only considered relative positions. Consider this example:

  • “a new store opened beside the new mall”
  • “a new <mask1> opened beside the new <mask2> ”

The masked tokens “store” and “mall” have similar contexts, but “store” is the subject and “mall” is the object. This example is illustrated in Figure 10 . Such syntactical nuances can only be learned by considering the absolute positions of the masked tokens. DeBERTa incorporates these absolute position embeddings before the Softmax layer, right before the model decodes the masked words based on the contextual embedding (word contents and positions). Thereby, ref. [ 8 ], p. 2, not only considers the findings of [ 25 ] but combines relative and absolute positions to exceed the results of previous models. Figure 9 illustrates the DeBERTa architecture and shows how the BERT-like representation is supplemented by the relative position embedding and the enhanced mask decoder.

Figure 10. Illustration of processing an example with enhanced MLM.

4.8. Token Replacement

BERT and DeBERTa [ 8 ] both use MLM. Ref. [ 12 ] proposed a new approach to further exploit the benefits of MLM. Their ELECTRA model consists of two encoders: the generator and the discriminator . Firstly, some of the input tokens are replaced with the [MASK] token. The generator then learns to predict the original identities of the masked-out tokens and replaces them with its predictions. Then, the discriminator is trained to identify which tokens have been replaced by the generator. Figure 11 depicts the process of masking, replacing, and discriminating tokens. Ref. [ 12 ] showed that this approach works well with relatively modest computing resources and still outperforms previous models such as BERT.
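Conceptually, one training step of replaced token detection can be sketched as follows (PyTorch-flavoured pseudocode; generator and discriminator are placeholder models, and the generator’s sampling is simplified to an argmax, so this is not the exact procedure of [ 12 ]):

    import torch

    def electra_step(tokens, mask_prob, mask_id, generator, discriminator):
        """One conceptual ELECTRA step (sketch): mask some tokens, let the
        generator fill them in, then let the discriminator flag which tokens
        were replaced. `generator` and `discriminator` are placeholders."""
        is_masked = torch.rand(tokens.shape) < mask_prob
        corrupted = torch.where(is_masked, torch.full_like(tokens, mask_id), tokens)
        gen_logits = generator(corrupted)                  # predict original identities
        sampled = gen_logits.argmax(dim=-1)                # simplified; [12] samples from the generator
        replaced_input = torch.where(is_masked, sampled, tokens)
        is_replaced = (replaced_input != tokens).float()   # labels for the discriminator
        disc_logits = discriminator(replaced_input)        # per-token "original vs. replaced" prediction
        return replaced_input, is_replaced, disc_logits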

Figure 11. Illustration of the replaced token detection mechanism used in ELECTRA (modified from [ 12 ]).

4.9. Parameter Reduction

Approaches like DeBERTa [ 8 ] and other large models explored the boundaries of how large a model can become thanks to modern hardware. They showed what a positive impact a large number of parameters and a big data set can have on model performance. To empower teams and organizations with smaller resources, ref. [ 10 ] proposed ALBERT (A lite BERT for self-supervised learning of language representations), in which they presented strategies for optimizing the efficiency of existing models. ALBERT optimizes the original BERT approach with two parameter-reduction techniques:

Factorized Embedding Parameterization: ref. [ 10 ] separated the size of the hidden layers from the size of the vocabulary embedding. This is done by decomposing the large vocabulary embedding matrix into two small matrices. Through this approach, it becomes easier to grow the hidden size without significantly increasing the number of parameters in the vocabulary embedding.
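A quick back-of-the-envelope comparison illustrates the saving; the concrete sizes below are illustrative assumptions, not the configurations reported in [ 10 ]:

    # V = vocabulary size, H = hidden size, E = (small) embedding size.
    V, H, E = 30_000, 4_096, 128

    direct = V * H                 # one big V x H embedding matrix (BERT-style)
    factorized = V * E + E * H     # V x E lookup followed by an E x H projection (ALBERT-style)
    print(direct, factorized)      # 122,880,000 vs. 4,364,288 parameters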

Cross-Layer Parameter Sharing: The idea of sharing parameters across layers had been previously explored with the transformer architecture [ 7 ]. However, ref. [ 10 ] focused more on the pre-training/fine-tuning setting, whereas prior works focused more on training for standard encoder-decoder tasks. Sharing parameters across layers prevents the number of parameters from growing with the depth of the network.
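In code, cross-layer parameter sharing can be sketched as reusing a single layer module at every depth step (a PyTorch sketch with assumed, illustrative dimensions; not ALBERT’s actual implementation):

    import torch.nn as nn

    class SharedLayerEncoder(nn.Module):
        """Cross-layer parameter sharing (sketch): a single transformer layer is
        applied `depth` times, so the parameter count does not grow with depth."""
        def __init__(self, d_model=256, n_heads=4, depth=12):
            super().__init__()
            self.layer = nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True)
            self.depth = depth

        def forward(self, x):
            for _ in range(self.depth):   # the same weights are reused at every depth step
                x = self.layer(x)
            return x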

In addition to the parameter-reduction techniques, ALBERT also includes an approach to exceed BERT in terms of sentence-level understanding. Refs. [ 28 , 30 ] discussed the ineffectiveness of the next sentence prediction loss in BERT [ 13 ]. To overcome this weakness, ref. [ 10 ] proposed a self-supervised loss for sentence-order prediction (SOP), which primarily focuses on inter-sentence coherence. As a consequence of applying all these techniques, ALBERT outperforms the original BERT on well-known benchmarks such as GLUE while having considerably fewer parameters.

4.10. Multi-Task Learning

The previously discussed models are based on the sequence-to-sequence framework: a certain input sequence leads to a certain output sequence produced by the model. The task the model tries to solve is fixed and cannot be changed. By extending this framework into the text-to-text framework in their “Text-to-Text Transfer Transformer” ( T5 ), the authors of [ 11 ] try to overcome this limitation of models that can only be applied to one fixed task. T5 is a transformer-based model using this text-to-text framework. The basic idea of the text-to-text framework is to treat every NLP problem as a “text-to-text” problem, i.e., text as input and new text as output. The input incorporates not only the data to be processed but also information indicating the specific task, thereby enabling multi-task learning , because many different tasks can now be cast as one general “text-to-text” task. Figure 12 shows example inputs and outputs of T5’s text-to-text approach. There are several alternative approaches for multi-task learning with neural language models. For instance, MT-DNN [ 31 ] uses task-specific layers that share an encoder layer stack.

Figure 12. Example of inputs and outputs of T5 (Source: [ 11 ], p. 3).

Before any input is processed, it has to be brought, or unified, into the text-to-text structure. One way of unifying the input is to transform every task into one static structure. Ref. [ 32 ] did this and formulated everything as a question-answer structure. Ref. [ 11 ], p. 11, did it differently: they selected a specific structure for each of their tasks, which allowed them to always train using maximum likelihood. Here are a few examples of how tasks are pre-processed before generating a corresponding output with T5 (a small formatting sketch follows the list):

  • SQuAD: question: <question> context: <context>
  • CNN/Daily Mail: summarize: <text>
  • WMT English to German: translate English to German: <text>
  • MRPC: mrpc sentence1: <sentence1> sentence2: <sentence2>
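A trivial pre-processing helper along these lines could look as follows (the function is purely illustrative and not part of [ 11 ]; the prefix strings mirror the examples above):

    def to_text_to_text(task, **fields):
        """Format task inputs with the prefixes listed above (sketch)."""
        if task == "squad":
            return f"question: {fields['question']} context: {fields['context']}"
        if task == "summarize":
            return f"summarize: {fields['text']}"
        if task == "translate_en_de":
            return f"translate English to German: {fields['text']}"
        if task == "mrpc":
            return f"mrpc sentence1: {fields['sentence1']} sentence2: {fields['sentence2']}"
        raise ValueError(f"unknown task: {task}")

    # to_text_to_text("summarize", text="The quick brown fox ...")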

All these different tasks are mixed into one large fine-tuning data set, which is possible because of their common text-to-text schema. The text-to-text framework allows [ 11 ], p. 2, to apply the same model, objective, training procedure, and decoding to every task that can be formulated in a text-to-text structure. Therefore, T5 can be fine-tuned on a broad variety of English-based NLP tasks. Ref. [ 11 ], p. 7, used the “Colossal Clean Crawled Corpus” (or C4 for short) for pre-training the model. This data set was also introduced in [ 11 ], p. 7, and is a lot larger (750 GB) than most of the data sets used for NLP pre-training. In comparison, GPT was trained on the BookCorpus (about 4 GB) [ 11 ], p. 26, and BERT on Wikipedia data and the BookCorpus combined (about 20 GB) [ 11 ], p. 26. Since C4 is based on the Common Crawl project, it not only contains book-like or article-like texts but a much wider variety of English texts. Despite its colossal size, the authors of [ 11 ], p. 7, ensured through their filtering techniques that C4 consists only of reasonably clean and natural English text.

Although many of the top-performing transformer-based model architectures for NLP consist of only a single “stack” (e.g., the encoder stack in BERT [ 13 ]), the authors of [ 11 ] decided to use the standard encoder-decoder structure because their experiments indicated that it achieved good results on generative as well as classification tasks. The base model in [ 11 ], p. 7, is called T5-base . Its encoder and decoder are both similar in size to bert-base [ 11 ], p. 11, resulting in a model twice as large as bert-base , because BERT consists only of an encoder. In addition to these novel features, T5 incorporates the techniques of the previously discussed ALBERT [ 10 ] ( Section 4.9 ).

4.11. Use Case of Transformer Models

There are several task-specific enhancements or variations of the previously discussed approaches and architectures. One specific example is BertSUM, introduced in [ 33 ]. It optimizes and modifies the original BERT architecture to achieve better results in extractive text summarization. We applied BertSUM in the field of medieval Latin documents [ 34 ]: by using a multi-lingual pre-trained BERT version and a translation service, we were able to generate German summaries of the Latin documents. Figure 13 illustrates the architecture of [ 34 ] and shows how summarization and translation can be combined.

Figure 13. Illustration of the architecture used in [ 34 ].

4.12. Further Techniques

There are several interesting and promising techniques used in language representation models that are not directly linked to high-performing models on the GLUE leaderboard: for instance, the family of generative pre-trained transformers (GPT). It consists of generative models that were trained in an unsupervised manner. The most recent are GPT-2 [ 35 ] and GPT-3 [ 36 ]. GPT-2 was trained to predict the next word in a large text corpus. GPT-3 further improved the concepts of GPT-2 and is currently the largest language model, with 175 billion parameters. Even though it was not fine-tuned, GPT-3 achieved performance similar to many state-of-the-art models when using few-shot learning. For example, [ 37 ], inspired by GPT-3, presented a suite called LM-BFF with simple and complementary techniques for fine-tuning language models on a small number of annotated examples, further substantially improving performance.

Another noteworthy model is BART [ 38 ]. It takes the idea behind MLM further by first corrupting the text with an arbitrary noising function and then training a model to de-noise (reconstruct) the original text.
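One possible noising function in this spirit, shuffling sentence order and masking a fraction of tokens, is sketched below (plain Python; the concrete noise operations and ratios are illustrative assumptions, not those of [ 38 ]):

    import random

    def corrupt(sentences, mask_token="<mask>", drop_prob=0.15):
        """An arbitrary noising function (sketch): permute the sentence order
        and mask a fraction of the tokens. The reconstruction model is then
        trained to recover the original text."""
        shuffled = random.sample(sentences, len(sentences))   # permute sentences
        noised = []
        for sent in shuffled:
            tokens = [mask_token if random.random() < drop_prob else t for t in sent.split()]
            noised.append(" ".join(tokens))
        return " ".join(noised)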

5. Limitations

This work does not claim to cover all aspects of successful neural language models. It aimed at explaining, discussing, and contextualizing the most important aspects, derived in the way described in Section 3 , with a focus on the models’ performance as measured by the GLUE score. One threat to this approach is that it is biased towards models that are specialized and optimized to perform well under the measurement metrics of GLUE. The overview therefore lacks coverage of models or solutions that might perform well in other specific tasks, as well as of models that were never submitted for evaluation under the GLUE score and are hence not listed on the GLUE leaderboard. Furthermore, the coverage only includes recent models developed since 2016. Older models might still have room for improvement and could potentially be interesting for further research and development. Nevertheless, the GLUE benchmark is used to objectively measure the models’ performance against a determined baseline. Including other models would be difficult, since the comparisons would have to be based on parameters chosen arbitrarily rather than ones agreed upon by the community.

6. Discussion

The most recently published research on transfer learning and bidirectional models has by far outperformed previous approaches in different disciplines. Further development in these two fields could be very promising for improving upon today’s state-of-the-art models. Future developments in the following areas are currently of special interest:

Model efficiency : Ref. [ 11 ], p. 42, showed with their experiments on the C4 data set and the T5 model that larger models tend to perform better. The rapid development in making hardware faster and cheaper makes work with larger models possible. Transfer learning increases the performance of models on low-resource tasks (settings that lack the resources to label more data). Thus, further research into developing smaller and cheaper models makes it possible to apply transfer learning where it has the most impact. Ref. [ 12 ], for instance, showed how increasing the complexity of a sub-task in a model (in their case, MLM) can lead to a decrease in the overall computing costs of developing a model. Ref. [ 10 ] showed that, by optimizing the vocabulary embedding and the parameter sharing, a model can significantly reduce its overall number of parameters while still achieving state-of-the-art results.

Compositional Generalization : Ref. [ 8 ], p. 10, marked an important milestone by surpassing human performance on SuperGLUE. Nonetheless, machine learning models are not close to reaching human-level intelligence in natural language understanding (NLU). Humans are especially good at compositional generalization: the ability to leverage knowledge learned from different tasks to solve novel compositions (new tasks) of familiar constituents (subtasks or basic problem-solving skills). Models that incorporate compositional structures in a more explicit manner, which could allow for combining neural and symbolic computation of natural language, could come even closer to reaching human-level intelligence in NLU. Such compositional structures could, for instance, be developed based on the progress in the field of multi-task learning, such as in [ 11 ].

Measurement of Model Capabilities : Due to the quick progress of neural language models, methods for measuring the quality of model outputs are reaching their limits. As discussed earlier, the fact that DeBERTa [ 8 ], p. 10, beat human performance on the SuperGLUE task is remarkable. However, the NLU capability of the model is still far behind that of humans. Most quality metrics, such as ROUGE, work by comparing the generated output to a gold standard; thus, they always have a bias towards this gold standard and the similarity metric. One way of overcoming this bias is human-in-the-loop methods. Ref. [ 39 ] showed how to additionally fine-tune pre-trained models with direct human feedback. Their results beat models without this approach on classic metrics such as GLUE. They used online reinforcement learning during fine-tuning, which led to complex machine learning and software systems. They described their system as difficult to debug and noted that the quality of the human feedback was hard to control.

Generalized Models : With the rise of home assistants, models that can solve multiple tasks are in demand. These models should support users in various everyday situations. The multi-task approach of [ 11 ], p. 42, showed very promising results, and their way of encoding multiple tasks into a text-to-text format seems well applicable to home assistants. Further research can be done on developing a wider variety of tasks. Another way of generalizing models is to increase their capability of understanding multiple languages. In the future, home assistants could expand to environments where they are exposed to many languages, such as hotel lobbies or airport counters. In those environments, their ability to solve multiple tasks in multiple languages could be a primary factor determining their usefulness.

The rapid development in the field of neural language models, especially in the last five years, indicates that future models could become better and better at performing in these rather difficult areas.


Author Contributions

Writing—original draft preparation, T.S. and M.T.-F.; writing—review and editing, T.S. and M.T.-F. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Data Availability Statement

Conflicts of Interest

The authors declare no conflict of interest.

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.


Defining neural ‘representation’

Neuroscientists frequently say that neural activity ‘represents’ certain phenomena. PIK Professor Konrad Kording and postdoc Ben Baker led a study that took a philosophical approach to tease out what the term means.


One of neuroscience’s greatest challenges is to bridge the gaps between the external environment, the brain’s internal electrical activity, and the abstract workings of behavior and cognition. Many neuroscientists rely on the word “representation” to connect these phenomena: A burst of neural activity in the visual cortex may represent the face of a friend or neurons in the brain’s memory centers may represent a childhood memory.

But with the many complex relationships between mind, brain, and environment, it’s not always clear what neuroscientists mean when they say neural activity “represents” something. Lack of clarity around this concept can lead to miscommunication, flawed conclusions, and unnecessary disagreements.

To tackle this issue, an interdisciplinary paper takes a philosophical approach to delineating the many aspects of the word “representation” in neuroscience. The work, published in Trends in Cognitive Sciences , comes from the lab of Konrad Kording , a Penn Integrates Knowledge University Professor and senior author on the study whose research lies at the intersection of neuroscience and machine learning.

“The term ‘representation’ is probably one of the most common words in all of neuroscience,” says Kording, who has appointments in the Perelman School of Medicine and School of Engineering and Applied Science . “But it might mean something very different from one professor to another.”

It’s also a term that is “philosophically loaded,” says Ben Baker , lead study author and a postdoctoral researcher in Kording’s lab . Discourse on mental representation dates back as far as Aristotle, he says, and those early discussions laid the groundwork for modern cognitive science. However, as neuroscientists adopted the term, its evolving definition was never fully fleshed out.

Using his background in philosophy of mind and philosophy of science, Baker conducted a literature review, analyzing foundational experiments and major themes in neuroscience to determine how researchers applied the term “representation.” He and his coauthors compared these usages to the term’s function in philosophy and identified three aspects to the use of the word in neuroscience, each of which builds from the previous one.

The first, most straightforward aspect is correlational, in which a neural state correlates with an event or feature of the environment. For example, a researcher may find that a group of neurons fires whenever a mouse sees a predator. Because that particular neural state is correlated to seeing the predator, the researcher may say the neural state represents the predator.

In the past, technology limited early neuroscientists primarily to recording neural activity, rather than inducing or modeling it. These experiments produced mostly correlative conclusions, which shaped neuroscientists’ early concept of representation, according to Kording and Baker.

But those who hope to build comprehensive models of behavior are interested in more than just correlation; they want to know how neural activity causes behavior related to a particular event or feature. The second aspect of representation that Baker and Kording identified also includes a causal component. When researchers use the term in this sense, they mean that neural activity related to some event or feature causes behavior related to that event or feature. Building off the previous example, if a mouse’s neural state represents seeing a predator, then that neural state may cause an action such as hiding or running away.

This definition is more common in contemporary neuroscience studies. Modern experimental techniques such as optogenetics allow for precise stimulation of specific populations of neurons, making it easier for scientists to prove causal links between neural activity and behavior.

Beyond what behaviors a neural state may cause, many neuroscientists also want to know why animals behave that way. The third aspect of representation encompasses this idea by adding a teleological component, which emphasizes an action’s purpose rather than its cause.

In this usage, there is a reason the neural state and its corresponding behavior correlate with an event or feature. Returning to the mouse example, the neural activity that represents seeing a predator and causes the mouse to run away exists so the mouse will survive. That’s its purpose .

Teleology is more often discussed in philosophy than in neuroscience, but neuroscientists’ reasoning often implies teleology, according to Kording and Baker. Teleological definitions of representation are especially common in neuroscience research that tries to model abstract components of cognitive tasks. Many evolutionary explanations of behavior rely on teleological reasoning as well.

The researchers say they hope their paper will help tamp down ambiguous use of the word “representation” and promote more rigorous discourse in the field. “The end goal is clear communication,” Baker says. Once that happens, Kording adds, “there's many different discussions you can have.”

Konrad Kording is a Penn Integrates Knowledge University Professor with joint appointments in the Department of Neuroscience in the Perelman School of Medicine and in the Department of Bioengineering in the School of Engineering and Applied Science .

Ben Baker is a postdoctoral researcher in the Kording lab and a Provost Postdoctoral Fellow . Baker received his Ph.D. in philosophy from Penn.

Also coauthor on the paper is Benjamin Lansdell , a data scientist in the Department of Developmental Neurobiology at St. Jude Children’s Hospital and former postdoctoral researcher in the Kording lab.

Funding for this study came from the National Institutes of Health (awards 1-R01-EB028162-01 and R01EY021579) and the University of Pennsylvania Office of the Vice Provost for Research.
