### Announcements

#### Bayesian analysis tutorial using PyMC

posted Jul 1, 2011, 2:54 AM by Olaf Bochmann

 This is a short Bayesian analysis tutorial developed around the following problem: Consider the following dataset, which is a time series of recorded coal mining disasters in the UK from 1851 to 1962 [Jarrett, 1979]. The first step in the analysis is the Bayesian model building. We assume that the occurrences of the disasters can be modelled as a poisson process. Our hypothesis is, that at some point in time the mining process switched to a new technology, resulting in a lower rate of disasters in the later part of the time series. We define the following variables for the Bayesian stochastic model:D_t: The number of disasters in year t.r_t: The rate parameter of the Poisson distribution of disasters in year t. s: The year in which the rate parameter changes (switch point)e: The early rate parameter prior to s.l: The late rate parameter posterior to s.t_l, t_h: The lower and upper boundaries of year t.r_e,r_l: The rate parameters of the priors of the early and late rates.The model can then be defined as where D is dependent on s, e, and l. In a Bayesian network the probabilistic variables s, e, and l are considered as 'parent' nodes od D and D is a 'child' node of s, e, and l. Similarly, the 'parents' of s are t_l and t_h, and s is the 'child' of t_l and t_h. The nest step is fitting the probability model (linked collection of probabilistic variable) to the recorded mining disaster time series. This means we are trying to represent the posterior distribution. Markov chain Monte Carlo (MCMC) algorithm [Gamerman 1997] is the method of choice. In this case we represent the posterior p(s,e,l|D) by a set of joint samples of it. The MCMC sampler produces these samples by randomly updating the values of s, e and l for a number of iterations. This updating algorithm is called Metropolis-Hasting [Gelman et al. 2004].With a reasonable large number of samples, the MCMC distribution of s, e and l converges to the stationary distribution. There are many general-purpose MCMC packages. Here I use PyMC - a python module created by David Huard and Anand Patil and Chris Fonnesbeck. I prefer a programming interface to stand alone programs like WinBugs for its flexibility. It uses high performance numerical libraries like numpy and optimized Fortran routines. The following code imports the model, instantiates the MCMC object and run the sampler algorithm:>>> from pymc.examples import DisasterModel >>> from pymc import MCMC >>> M = MCMC(DisasterModel) >>> M.isample(iter=10000, burn=1000, thin=10)The iter-parameter specifies the number of updating iterations for the sampling algorithm. The algorithm assumes that the burn-parameter specifies a sufficient large number of iterations for convergence to stationarity of distribution s, e and l. To avoid strong autocorrelation among samples in the Markov chain, successive samples can be thinned. This means only every k-th sample is retained, where k is given by the thin-parameter. The number of samples for each variable is therefore (iter-burn)/thin.Below are the 900 samples (left) of variable s - the year in which the rate parameter changed. The histogram (right) with a mean at 40 and 95% HPD interval (35, 44).Next, the 900 samples (left) of variable e - the early rate parameter prior to s. The histogram (right) with a mean at 3.0 and 95% HPD interval (2.5, 3.7). Finally, the 900 samples (left) of variable l - the late rate parameter posterior to s. The histogram (right) with a mean at 0.9 and 95% HPD interval (0.7, 1.2). To conclude our analysis, the data of the UK mining disasters suggest that between the years 1886 and 1895 some technology may have been introduced, that lowered the disaster rate from approximately 3.0 to 0.9 per year. Well done Brits!

#### Econometric Multiplayer Game

posted Jun 24, 2011, 1:40 AM by Olaf Bochmann   [ updated Aug 8, 2011, 8:19 AM ]

In order to harvest behavioral patterns of market participants Doyne Farmer came up with the idea to use an agent based economic model in a gaming mode. The idea is to replace some or all of the artificial agents by real people. In a "war-gaming style" user of a multiplayer game will be able to develop an intuition for consequences of individual economic decisions in a dynamic interconnected gaming environment. The data can be used to understand human decision making. Behavioral information can be useful in calibrating econometric agent based models.
Clearly, it will be quite challenging to design a game engaging enough to draw the attention of the gaming community. Also there are serious doubts that the data can be used for a real world model, because behavioral patterns in a game environment seem to be quite different to behavior in reality. Nevertheless, I think the idea is interesting enough to give it a try. Below I will outline some architectural considerations of such a game. If you are just interested in playing the game you can go right here: http://game.ComplexLab.org. Keep in mind that this is a first prototype. It will be evolving constantly and there is no guaranty of service.

## Architecture of Multiplayer Games

At some point of the game the outcome of individual actions need to be synchronized among different computers/processes. In general, there are two common architectures for resolving arbitration: (1) client-server and (2) peer-to-peer. Client-server is conceptual simpler, easier to implement and better suited to generate game data.

### Client-Server

Each user of the game and each artificial agent runs a local client program. The client programs are connected to a central machine - the server program. The server program is maintaining the state of the game and is broadcasting this information to the individual clients.
This design makes the server the bottleneck, both computational- and bandwidth-wise and it may turn out to be a serious scaling problem. On the other hand it is easy to maintain game state and access control.
The minimal example of a client program is a terminal. It transmits user inputs to the server and reports server messages to the user. The main loop of the client program would look like this:

while not done

   if user has typed any text

      send typed text to server

   if output from server exists

      print output

The main loop of the server program would look like this:

while not done

   for each user in world

      if input exists

         get user command

         execute user command

         log user command and world state

         tell user of the results

   simulate the world

   broadcast to all users

This is as simple as it can get. The client does not perform any simulation. It just performs input/output. MUD is a classic example of a client-server multiplayer game. You can connect to a MUD server using your local telnet program, a unix client or a Java applet.

#### Realtime Push Notification

Ajax Push Engine (APE) is an open source realtime push notification system for streaming data to any browser using web standards only (Asynchronous JavaScript and XML (AJAX)). It includes a comet server and a Javascript Framework. Here is a demo that shows how APE can handle massive multi-user moving on a web page in real-time.
Peer-to-Peer
In a pear-to-pear system (P2P) each user of the game runs the same peer program, or at least groups of users run the same program. The peer program maintains the local state e.g. the position of the player. When moving, the peer program is also responsible for collision avoidance. Therefore, peers need to broadcast their state to other peers. A minimalistic example of a peer loop would look like this:

while not done

   collect player input

   collect network input about other player

   simulate player

      update player's state if informed it's been hit

      inform other peers if they've been hit

      move player

      log local state

      update other player state (ammo, armor, etc.)

   report player state to other clients

There are several issues with this type of architecture that need to be adressed:
• connectivity: how to find and connect to nodes that are running peer programs somewhere in the network?
• instability: nodes may go on and off all the time.
• message routing: how to route messages between nodes, even if they are not directly known to each other?
• searching: how to find information on other nodes?
• security: peer programs can be hacked to report false states to other nodes.
Most gaming architectures are hybrids of client-server and peer-to-peer.

## Possible Gaming Scenarios

Farmer at all [1] proposed a number of gaming scenarios:
• inter bank lending game: players represent banks that can borrow and lend to each other to cover their balance sheets (study the role of trust).
• leverage cycle game: [Geanakoplos, 2009] design a more stable market (study role of leverage in generating financial crisis)
• financial network game: design a financial feedback system
To be continued.

#### Computational Mechanics

posted Feb 22, 2011, 4:38 AM by Olaf Bochmann   [ updated Feb 28, 2011, 1:47 AM ]

Computational Mechanics is an extension of the approaches typically found in statistical mechanics. It is a technique to describe both temporal and spacial organization in systems. It is a more detailed structural analysis of behavior than those captured solely in terms of probability and degrees of randomness. It complements and augments more traditional techniques in statistical physics by describing the structural regularies inherent in the system or data.
Beyond focusing on measures of disorder, computational mechanics aims to detect and quantify structure in natural processes. Natural information processing mechanisms can be represented and analyzed in a diverse set of model classes. Contemporary notions of computation and of useful information processing must be extended in order to be useful within natural processes and their instantiation in multi-agent systems. This is because natural processes are systems that can be spatially extended, continuous, stochastic, or even some combination of these and other characteristics fall often outside the scope of discrete computation theory.
Computational mechanics attempts to construct the minimal model capable of statistically reproducing the observed data. This is called an ε-machine or causal state machine. This approach leads one naturally to address the issue of the information processing capability of the system.
There is an vast amount of literature on computational mechanics. This includes both,  theoretical developments as well as system applications. Jim Crutchfield maintains a rather exhaustive page on computational mechanics called the Computational Mechanics Archive, and almost all papers of interest are listed there.

#### ε-machine (causal state machine)

The ε-machine is a mathematical object that embodies the Occam's Razor principle in that its description of the process has minimal statistical complexity subject to the constraint of maximally accurate prediction. The ε-machine provides
• causal states: partitions of the space of all histories of the process into sets of histories with the same conditional distribution of futures
• minimally-stochastic set of transition matrices, which give the probabilities of going from causal state i to j (while entering the symbols associated with that particular transition.)
This the ideal representation of a process's structure, and the causal states are the minimal sufficient statistics for predicting the process's future.

ε-machine reconstruction

The problem is inferring ε-machines from limited data is called ε-machine reconstruction. The focus is on developing inference algorithms (state-merging versus state-creating) and accompanying statistical error theories, indicatiing how much data and computing time is required to attain a given level of confidence in an ε-machine with an finite number of causal states.

#### causality

Causality is defined in temporal terms: Something causes something else by preceding it. A sequence of causal states has the property that causal states at different times are independent, conditioned on the intermediate causal states. This is a Markov chain: In knowing the whole history of the process up to a point has no more predictive capacity than knowing the causal state it's in at that point. The causal structure is the equivalence relation causal states establish among different histories which have the same conditional distribution of future observables and the symbol emitting state transition probability. A directed graph where states are represented as nodes and edges labeled with emitted symbols and transition probabilities represent stete transitions can be used to visualize the causal information in the ε-machine.

#### type of raw data for inferring causality (suitable processes)

Processes best captured by this method are sequences of a symbolic logic character, with complex pattern and an unknown generative mechanism, such as an extremely
long string on not-totally-random 1's and 0's.
Processes are assumed to be stationary, however online reconstruction can relax this constraint. Algorithmic complexity of the reconstruction increases exponentially with the size of the symbol set.
The even process is reconstructed (using CSSR) from sequences of 10000 samples with correct causal states and transition probabilities within 3% of the correct probabilities. The even process is a process which generates a string of 1's and 0's according to the rule that after a 0 is emitted, there is a coin flip to see whether a 1 or 0 follows. After an odd number of 1's being emitted, the process must emit a 1, but after an even number of 1's it is back to the coin flip.

#### representation of randomness

Randomness is embodied in the minimally stochastic transitions form one state to another. The given state can lead to many states, but a given symbol leads to at most one of those states.

#### limitations

Unlike for graphical causal models a "factor" is not a meaningful term for the definition of an ε-machine. In computational mechanics, the causal state definition sidesteps this, because it doesn't matter what determines the causal state to be a causal state, it just depends on the future morph of a given history.
Pearl introduces the notion of "stability" (robustness of independence patterns) is not satisfied in ε-machine reconstruction. This is when the model's structure can retain its independence pattern in the face of changing parameters. It is unclear in this context what the "parameters" actually are.

#### Agent-Based Model of the Housing Bubble

posted Dec 15, 2010, 2:02 AM by Olaf Bochmann

This article outlines an approach in agent-based computational economics to build a macroeconomic model of the recent housing bubble. The goal of this effort is to gain a better understanding of it's causes and to formulate policy prescriptions. The common view of economists with respect to the housing bubble causes is that the Federal Reserve policies and measures that started in 1993 with respect to interest rates has led to the current crisis. This hypothesis needs to be verified (by observing some emergent properties). This is a model where we observe patterns of development out of the individual interactions of agents qualitatively and try to match them with the empirical data.

In contrast to the conventional approach that produces quantitative house price projections based on trends in incomes, interest rates and housing supply and demand, the agent-based approach simulates the interaction of individual agents who seek to buy and sell properties. House prices emerge from this market process.

### Model Definition

According to McMahon et. all (2009) the main actors in the housing market (people, banks) and the components (houses, mortgages) are modelled as agents. People are either renters or owners of one or more houses. Mortgages are Adjustable Rate Mortgages (ARMs) based on interest rate. Each house is associated with zero or one mortgage that is owned by a bank.
The prices of the houses and the mortgages are determined endogenously by the interacting agents. Disturbances are induced only by controlling for the interest rates exogenously.

#### People

People have a fixed income that follows a uniform distribution within some range. They may relocate, which in turn requires to rent or own houses. This decision requires the evaluation of the financial situation with respect to the rent or ownership situation. House owner may evaluate the decision to buy an extra house for investment and rent it out to other people. House owner may decide to sell a house.

#### Houses

Houses can be rented or owned. Initial house prices follow a uniform distribution within some range. Houses are foreclosed, due to insufficient funds to pay mortgages or rents.

#### Mortgages

The mortgage is owned by a bank, and is associated with a particular person and a house. Mortgage payments are adjusted to represent the notion of ARMs, possible with some time lag.

#### Banks

Banks maintain a balance sheet to keep track of their assets (mortgage payments) and liabilities (mortgage value of the houses owned by the bank).

### Model initialization

Houses are created at a certain density (patchDensity) on the map. Each house is then assigned a price. A fraction of houses are assigned as rentals (rentalDensity) and some percentage is occupied (occupiedPerc) by people. Each person is then assigned an income and People are assigned to houses according to matching income - rental/ownership costs. The parameter patchDensity, rentalDensity, and occupiedPerc can be calibrated to empirical data.
The model generates the following output: average house Price, average mortgage cost, number of owned vs. rented houses, banks balance sheet, and percentage of bankrupt people, and average location of houses.

### First Results

With the Axtell & Epstein (1994) model, we can observe patterns of development out of the individual interactions of agents qualitatively and try to test them with the empirical data. Using the actual interest rate scenario 1993-2009 a big drop in the banks balance sheets can be observed, that corresponds to the decrease in the house prices and the increase in the average mortgage rates. A policy of controlling exogenously for the interest rates leads to the emergence of bubbles and foreclosures.

### Things to do

As first results have shown, an exogenous control for the interest rates was one of the factors that led to the housing crisis. However, this is considered to be a necessity, but not a sufficiency condition: Future models should link the subprime crisis to the housing market. The model can be further enhanced by including the supply side of the market represented by the construction companies, in order to refine the endogenous emergence of the house prices.

### References:

#### Agent-based Computational Economics

posted Dec 13, 2010, 6:30 AM by Olaf Bochmann

The goal of agent based modelin in economics is to  investigate, how a decentralized economy coordinates economic activities. This question is basically the same as the one Keynes (1934) once posed - to what extent, and under what circumstances, particularly under what policy regimes, is the economy self-adjusting?  Agent-based computational economics as an approach to this problem, which is a set of techniques for studying a complex adaptive system involving many interacting agents with exogenously given behavioral rules. The fundamental idea behind this approach is that the economic system can exhibit behavioral patterns that are to complex for the individual agent to comprehend. Therefore the approach assumes simple behavioral rules and allows a coordinated equilibrium to be a possibly emergent property of the system itself.
An agent in a rational-expectations-equilibrium model has a behavioral rule that is not independent of what everyone else is doing. In contrast, autonomous agents are endowed with behavioral rules that can tell them what to do in any given situation, independently of each others’ rules, even when no one has access to a correct model of the economy.

### Coordination Process

The modeling is about autonomous agents and the way they interact. An autonomous agent has simple rules for ﬁnding other people, sending communications to them, and responding to their communications. The interesting questions are what sort of exchange patterns emerge from the interaction between these rules, how orderly and stable are they, and how do they evolve over time.
Modern economic life is largely a sequence of exchanges with other people. Such exchange involves a specialized trader. Traders are the agents that undertake to match buyers and sellers, arrange terms of exchange and bear the costs of adjustment in the face of day-to-day imbalances between spending plans and productive capacity.
It can be modeled, how such a network of coordinating traders might emerge spontaneously, from elementary interactions between people following simple opportunistic rules of behavior. This is a self-organizing model of economy: If the organizational structure was not there, it would quite likely arise from the interaction of intelligent autonomous agents. This organizational structure exhibits the monetary structure of the real-world exchange mechanisms.
Trade is a useful but costly activity. People are motivated by the goal of maximizing consumption. They wander randomly and encounter each other. From time to time people turn into traders.
The economy has N perishable commodities and a discrete number of transactors, each one being identical except for type. A type (i, j) transactor is endowed with commodity i and can consume only commodity j.  There is the same number of each type for each ordered pair of distinct commodities. The model focuses on ﬁve activities: entrepreneurship, search, exchange, business failure and price setting. The model goes through a sequence of periods, in each of which these activities take place in the following sequence.
1. Entrepreneurship: Each transactor can turn into a trader with probability μ. If the transactor is of type (i, j) then the trade will be one where commodities i and j can be traded for each other. There is a ﬁxed cost of setting the shop up and a ﬁxed operating (overhead) cost. Some marked research will be necessary!
2. Search: Each transactor visits one location and encounters a small sample of other transactors. It may establish a relationship. Each relationship will last until the shop exits or the transactor chooses to sever the relationship to form a diﬀerent one while maximizing attainable consumption at currently posted prices.
3. Exchange: Each person can visit the shops with which it has a trading relationship. A type (i, j) transactor having a trading relationship with a shop trading i and j will deliver its endowment of i to that shop for p_i units of j, where p_i is the shop’s oﬀer price for i.
4. Business failure: Anyone who has set up a shop in the past has a chance to exit, and thereby avoid having to pay the ﬁxed operating cost of a shop.
5. Price setting: Each shop that remains in business now sets its prices for the following period (full-cost pricing).

### Emergence of Organization

It can be shown that the model has several absorbing states: Arrays of shops, prices, trading relationships and shopkeeper expectations persist indeﬁnitely once established. Different absorbing state can be observed: barter steady state and “monetary steady states, which is much like the monetary equilibrium in the trading post model. It constitutes a Pareto eﬃcient allocation of resources. Aggregate GDP in the economy is total consumption. Capacity GDP is the sum of all endowments minus the operating costs of the shops.
The emergence of this monetary structure is based on the network externality created by the ﬁxed costs of shops.

### Work to be done

The ultimate goal of this research is to understand how large complex economic systems are capable of
exhibiting such a high degree of coordination most of the time, and yet from time to time are capable of departing drastically from such a coordinated state. They are many free parameters in agent based models, which needs to be calibrated to actual data. So far first results indicate that agent-based computational economics is capable of addressing some of the questions which a rational-expectations equilibrium approach is incapable of addressing, such as how a multiplier process in real estate lending can cause cumulative deviations from equilibrium and how the degree of intrest ﬂexibility aﬀects the ability of an economy to coordinate activities.

#### Empirical Techniques to Probe Social Network Structure

posted Dec 6, 2010, 3:13 AM by Olaf Bochmann   [ updated Dec 13, 2010, 6:29 AM ]

Social networks are networks in which vertices represent people or groups of people and edges are social interaction among them, e.g. friendship. In sociological terms vertices are actors and edges are ties
The study of social networks goes back to 19th century psychiatrist Jacob Moreno who became interested in the dynamics of interactions within groups of people. Moreno [1934] called his diagrams sociograms which later become known as social networks. In his study on schoolchildren he used triangles and circles as vertices to represent boys and girls respectively. A friendship relationship is indicated by an edge connecting two vertices. The diagram reveals that there are many friendships between two boys and two girls, but few between a boy and a girl. Once drawn the diagram, it was easy to see. This is what persuaded social scientists that there was merit in Moreno's methods.
Depending on the question one is interested in answering there are many different ways to define an edge in such an network. Edges may represent friendship, professional relationships, exchange of commodities, communication patterns, romantic or sexual relationships, or many other types of connections between individuals.
The techniques to probe different types of interaction may involve direct questioning (e.g. interviews) [Rea97, Rap61], direct observation of experimental subjects, the use of archival records (e.g. the "southern woman study" [Davis41]), ego-centered data analysis [Burt84, Bern89], affiliation analysis [Davis41, Gal85], small-world experiments [Mil67, Trav69], snowball sampling [Erick78, Frank79, Thom00], contact tracing and random walk sampling [Klov89, Thom00]. This techniques have been applied to problems like friendship and acquaintance patterns on different scales of the population, e.g. students, professionals, bord of directors, collaboration of scientists, movie actors, musicians, sexual contact networks, dating patterns, covert and criminal networks such as drug users or terrorists, historical networks, online communities and others.
A classic problem of social network analysis is to discover clustering. In the reminder of this article we will focus on different empirical methods used to measure social networks.

### Interviews

Asking people questions is the most common way to accumulate data about social networks. This can be done in the form of direct interviews, by questionaries or a combination of both - each with advantages and disadvantages with respect to the quality of data. A good introduction to social survey design and implementation is Rea and Parker [1997].
Surveys typically employ a name generator - a mechanism to invite respondents to name other nodes in the network as well as their relationship to them in order to explore the network. In the study of Rapoport and Horvath a question to the schoolchildren was to name eight best friends within the school.
There are some interesting points to notice about name generators. Nominating other vertices by ties is an asymmetric process. Person A may nominate person B as friend but there is no need that person B has to nominate person A as friend. Therefore it makes sense to represent this data as directed networks. Vertexes in directed networks have two types of degree: in-degree - the number of individuals who identified the vertex as friend - and out-degree - the number of friends identified by the vertex.
The second point concerns the limit of responses given to the respondents. In the study above the limit was to name up to eight friends. Such fixed choice studies limit the out-degree of the vertices. This cut-off may lead to the loss of information about the small-world effect of the network, which is caused by a small number of highly connected vertices. However, the in-degree is not affected by such cut-offs.
Studies based on direct questions are not only laborious, inaccurate and costly. Most of all, the data contain uncontrolled biases.

### Ego-Centered

For determining network structure, sosiometric studies - such as in the previous section require a survey of all or nearly all of the population. A reconstruction of the complete networks of ties is not possible. Given the high costs to survey large networks, a study of personal networks or ego-centered networks may be a feasible alternative. An ego-centered network is a network about one individual (ego) and its surrounding immediate contacts (alters).
A typical survey would be to sample the population at random and ask them to identify all  those with whom they have a certain type of contact. Also, they are asked for information about characteristics of themselves and there alters. This type of of survey is useful in particular if we are interested in the degree of the network. A random sample of degrees can give a reasonable degree statistics. In case we also gather information about contacts between alters, we can also estimate clustering coefficients. If we have data on characteristics of egos and alters we can estimate assortative mixing.

### Observation

Direct observation over a period of time is an obvious method to construct social networks. This is a rather labor-intensive method. It is restricted to small groups, primarily ones with face-to-face interactions in public settings. It is the only viable experimental technique for social network studies in animals.

### Archival Data

A highly reliable source of social network data is archival records.

### Affiliation Networks

Affiliation networks are special kind of social networks to focus on cluster discovering.

### References

The best general introduction on network theory is the book from Mark Newman [2010]. There is an active research community mainly somehow affiliated with the Santa Fe Institute
• H. R. Bernard, E. C. Johnsen, P. D. Killworth, and S. Robinson. Estimating the size of an average personal network and of an event subpopulation: Some empirical results. Social Science Research, 20(2):109 – 121, 1991.
• R. S. Burt. Network items and the General Social Survey. Social Networksc, 6:293–339, 1984.
• A. DAVIS, B. B. GARDNER, and M. R. GARDNER. Deep South. University of Chicago Press, 1941.
• B. Erickson. Some problems of inerence from chain data. In K. F. Schuessler, editor, Sociological Methodology. Jossey-Bass, 1979.
• O. Frank. Estimation of population totals by use of snowball samples. In P. W. Holland and S. Leinhard, editors, Perspectives on social network research. Academic Press, 1979.
• J. Galaskiewicz. Social organization of an urban grants economy: A study of business philanthropy and nonprofit organizations. Academic Press (Orlando), 1985.
• A. S. Klovdahl. rban social networks: Some methodological problems and possibilities. In M. Kochen, editor, he small world. Ablex Publishing, 1989.
• S. Milgram. The Small World Problem. Psychology Today, 2:60–67, 1967.
• J. Moreno. Who shall survive?  : a new approach to the problem of Human Interrelations, volume 58 of Nervous and mental disease monograph series. Nervous and Mental Disease Publ., Washington, 1934.
• M. E. J. Newman. Networks: An Introduction. Oxford University Press, 2010.
• A. Rapoport and W. J. Horvath. A study of a large sociogram. Behavioral Science, 6:279–291, 1961.
• L. M. Rea and R. A. Parker. Designing and Conducting Survey Research: A Comprehensive Guide. Jossey-Bass, 1997.
• S. K. Thompson and O. Frank. Model-based estimation with link-tracing sampling designs. Survey Methodology, 26(1), 2000.
• J. Travers and S. Milgram. An Experimental Study of the Small World Problem. Sociometry, 32(4):425–443, 1969.

#### Information Theory

posted Aug 8, 2010, 4:09 AM by Olaf Bochmann   [ updated Nov 10, 2010, 9:22 AM ]

## Entropy and Information

Information is reduction in uncertainty and has nothing to do with knowledge. Imagine you are about to observe the outcome of a coin flip and the outcome of a die roll. Which event contains more information?

After Abramson, the information contained in the outcome of an event E with probability P(E) is equal to log 1/P(E) bits of information. For the unit bits we use log base 2. The result of a fair coin flip we get (log 2 = 1 bit) and for the die roll (log 6 2.585 bits).

### Entropy

Now imagine a zero-memory information source X. The source emits symbols from an alphabet {x_1, x_2, . . . , x_k} with probabilities {p_1, p_2, . . . , p_k}. The symbols emitted are statistically independent. What is the average amount of information in observing the output of the source X?

Shannon formulated the most fundamental notion in information theory for a discrete random variable, taking values from $\mathcal{X}$. The entropy of X is

Alternative explanations of entropy can be the average amount of info provided per symbol, average amount of surprise when observing a symbol, uncertainty an observer has before seeing the symbol, avg # of bits needed to communicate each symbol.

#### Proposition

• H[X]>=0, and = 0 only when Pr (X = x) = 1 for some x
• H[X] is maximal when all X are equally probable, and then H[X] = log2 #\mathcal{X}H[f (X)]
• H[X], equality if and only if f is 1-1

#### Interpretation

H[X] measures:

• How random X is
• How variable X is
• How uncertain we should be about X

“paleface” problem
consistent resolution leads to a completely subjective probability theory

So, a sequence of (independent) 0’s-and-1’s can provide up to 1 bit of information per digit, provided the 0’s and 1’s are equally likely at any point. If they are not equally likely, the sequence provides less information and can be compressed.

### Description Length

H[X] = how concisely can we describe X?

Imagine X as text message:

• in Reno
• in Reno send money
• in Reno divorce final
• marry me?
• in Reno send lawyers guns and money kthxbai

Known and finite number of possible messages (#X). I know what X is but won’t show it to you. You can guess it by asking yes/no (binary) questions

First goal: ask as few questions as possible

• Starting with “is it y?” is optimal iff X = y
• Can always achieve no worse than log2 #X questions

New goal: minimize the mean number of questions

• Ask about more probable messages first
• Still takes -log2 Pr (X = x) questions to reach x
• Mean is then H[X]

Theorem: H[X] is the minimum mean number of binary distinctions needed to describe X. (Units of H[X] are bits)

### Multiple Variables

Joint entropy of two variables X and Y:

Entropy of joint distribution: This is the minimum mean length to describe both X and Y

• H[X;Y] >=. H[X]
• H[X;Y] .<= H[X] + H[Y]
• H[f (X);X] = H[X]

## Entropy and Ergodicity

(Dynamical systems as information sources, long-run randomness)

## Relative Entropy and Statistics

(The connection to statistics)

## References

Cover and Thomas (1991) is the best single book on information theory.

### Research Groups

CSSS C.S.

#### Processes and their Causal Architecture

posted Feb 28, 2010, 2:55 AM by Olaf Bochmann   [ updated Mar 1, 2010, 8:18 AM ]

The behavior of complex systems is a result of their internal structure. This structure reflects how processes compute information.  This is determined by answering three questions:
1. How much historical information does a process store?
2. How is that information stored?
3. How is the stored information used to produce future behavior?
The answers to these questions tell one how a system intrinsically computes.
intrinsic unpredictability (deterministic chaos)
emergence of structure (self-organization)

#### Main Idea

• Structure = Information + Computation
• How Nature is Structured is How Nature Computes

Readings
1. Nonlinear Dynamics and Chaos: with applications to physics, biology, chemistry, and engineering, S. H. Strogatz, Addison-Wesley, Reading, Massachusetts (2001).
2. Elements of Information Theory, T. M. Cover and J. A. Thomas, Second Edition, Wiley-Interscience, New York (2006).
3. J. P. Crutchfield, JD Farmer, NH Packard, RS Shaw, "Chaos", Scientific American 255 (1986) 46–57.
4. J. P. Crutchfield, "Z-Transforms for Stochastic Automata", Research Notes (1991) 1-7.
5. J. P. Crutchfield, "Is Anything Ever New? Considering Emergence", in Complexity: Metaphors, Models, and Reality, G. Cowan, D. Pines, and D. Melzner, editors, SFI Series in the Sciences of Complexity XIX, Addison-Wesley, Redwood City (1994) 479-497.
6. J. P. Crutchfield and D. P. Feldman, "Regularities Unseen, Randomness Observed: Levels of Entropy Convergence", CHAOS 13:1 (2003) 25-54.
7. Stanislaw Lem, "Odds", New Yorker 54 (1978) 38–54.
8. Stanislaw Lem, "Chance and Order", The New Yorker 59 (1984) 88-98.
9. D. P. Varn, G. S. Canright, and J. P. Crutchfield, "Inferring Pattern and Disorder in Close-Packed Structures from X-ray Diffraction Studies, Part I: epsilon-Machine Spectral Reconstruction Theory"
10. D. P. Varn, G. S. Canright, and J. P. Crutchfield, "Inferring Pattern and Disorder in Close-Packed Structures from X-ray Diffraction Studies, Part II: Structure and Intrinsic Computation in Zinc Sulphide"
11. C. R. Shalizi and J. P. Crutchfield, "Computational Mechanics: Pattern and Prediction, Structure and Simplicity", Journal of Statistical Physics 104 (2001) 819--881.
12. A. Archibald, "A Review of Residues and Integration---A Procedural Approach", Online Research Notes (date unknown) 1-9.
13. Mark Buchanan, "Revealing the Order in Chaos", New Scientist (26 February 2005).

#### Cellular Automata

posted Jan 19, 2010, 9:08 AM by Olaf Bochmann   [ updated Jan 19, 2010, 6:18 PM ]

A one-dimensional (elementary) cellular automaton consist of a row of cells. Each cell can be in, say one of two states - state '0' and state '1'. At each step there is a rule to update the state of each cell, say based on that cell and its immediate left and right neighbors.
The table below is a representation of the rule for this kind of cellular automata. It is a kind of lookup table. The top row contains all possible state combinations for a cell and its left and right neighbor at step n. The botom row specifies the state of that cell at the next step in each of these cases.
 n 000 001 010 011 100 101 110 111 n+1 0 0 0 0 0 1 1 1
The botom row contains a sequence of '1' and '0' that can be considered as a binary number - from right to left 1110 0000. The decimal equivalent represents the rule number 224.

#### Nonlinear Dynamics

posted Dec 6, 2009, 3:57 AM by Olaf Bochmann   [ updated Mar 5, 2010, 8:01 AM ]

The field of nonlinear Dynamics is concerned with analysing Systems that can not be described by linear equations. Nonlinear equations - with the exception of soliton equations - do not follow the superposition principle. In general, they show much more complex behavior.

The systematic development of the theory of nonlinear dynamics is proposed as follows:
1. first order differential equation and their bifurcations
2. phase plane analysis
3. limit cycles and their bifurcations and Lorenz equations
4. chaos
5. iterated maps
6. period doubling
7. renormalization
8. fractals
9. strange attractors
Dynamics is the subject that deals with systems that change in time, e.g. it settles down to equilibrium, it repeats in cycles, and so on. This behavior can be found in mechanics, chemical kinetics, population biology, and others.
There are two main types of dynamical systems: differential equations and difference equations (iterates maps). Differential equations describe the evolution of a system in continuous time, whereas difference equations describe the evolution in discrete time. For differential equations, one can distinct between ordinary differential equations and partial differential equations. For instance, the equation for a damped harmonic oscillator or the swinging pendulum is an ordinary differential equation, because there is only one independent variable - time. The heat equation is an example of an partial differential equation. It has both, time and space as independent variables.
In linear systems, as for the damped harmonic oscillator, the right hand side of the general system of first order differential equation is a linear function of the states. These systems can be broken down into parts and the parts can be solveld separately. A linear system is precisely equal to the sum of its parts. This idea enables such methods as normal modes, Laplace transforms, superposition arguments, and Fourier analysis. However, most of everyday live e.g. the swinging pendulum is nonlinear (trigonometric function) and the above mentioned methods fail.

#### Maps

Maps are discrete time systems - time proceeds in clicks (maps). The modelling tool is difference equation.
In a dynamical system, a bifurcation is a period doubling, quadrupling, etc., that accompanies the onset of chaos. It represents the sudden appearance of a qualitatively different solution for a nonlinear system as some parameter is varied.
Iterates of the logistic map present periodic cycles. A period tree in a map implies regular cycles of every other length as well as completely chaotic cycles. You can see different periodic cycles on the diagram of bifurcation by observing the number of points for different values of the parameter . The complete diagram of bifurcation can also be seen.

#### Flows

Flows are continuous time systems - time procedes smoothly (flows). The modelling tool is differential equation.
"Finite systems of deterministic ordinary nonlinear differential equations may be designed to represent forced dissipative hydrodynamic flow. Solutions of these equations can be identified with trajectories in phase space. For those systems with bounded solutions, it is found that nonperiodic solutions are ordinarily unstable with respect to small modifications, so that slightly differing initial states can evolve into considerably different states. Systems with bounded solutions are shown to possess bounded numerical solutions.
A simple system representing cellular convection is solved numerically. All of the solutions are found to be unstable, and almost all of them are nonperiodic.
The feasibility of very-long-range weather prediction is examined in the light of these results." Edward N. Lorenz [32]

sdfg

#### Application and Examples

The idea of the Santa Fe time series competition [53] was to put out a bunch of data sets and invites the participants to predict their future. The data sets included laboratory laser, physiological data (sleep apnea), currency rate exchange, RK4 on some chaotic ODE, intensity of some star, and a Bach fugue. Other questions arising from this was what kind of systems produces the data and how much can we learn about the system. Quantitative answers permit direct comparisons.
In his competition entry T. Sauer [45] used a careful implementation of local-linear fitting that had five steps:
Low-pass embed the data to help reduce measurement and quantification noise (smoothing).
Generate more points in embedding space by Fourier interpolation between the points obtained in step 1 (increase the coverage in embedding space).
Find the k-nearest neighbors to the point of prediction (tradeoff between increasing bias and decreasing variance that comes from larger neighborhood).
Use a local SVD to project points onto the local surface.
Regress a linear model for the neighborhood and use it to generate the forecast.Filtering is another application. Don't use a linear filter for chaotic systems. For nonlinear filter use the stable and the unstable manifold structure on a chaotic attractor.
If you have a model of the system, you can simulate what happens to each point in forward and backward time. If your system has transverse stable and unstable manifolds, that does useful things to the noise balls. Since all three versions of that data should be identical at the middle time, can average them. Works best if manifolds are perpendicular, butrequires only transversality.
FRACTAL Villages by R. Eglash "I am a mathematician, and I would like to stand on your roof." That is how Ron Eglash greeted many African families he met while researching the fractal patterns hed noticed in villages across the continent.

#### Summary of Topics

(CSSS'09 by Liz Bradley)

Introduction; Dynamics of Maps chs 1 & 10 of [50]
a brief tour of nonlinear dynamics [32] (in [17])

an extended example: the logistic map

how to plot its behavior

initial conditions, transients, and ﬁxed points

bifurcations and attractors

chaos: sensitive dependence on initial conditions, $\lambda$, and all that

pitchforks, Feigenbaum, and universality [22] (in [17])

the connection between chaos and fractals [23], ch 11 of [50]

period-3, chaos, and the u-sequence [31, 34] (latter is in [17])

maybe: unstable periodic orbits [2, 25, 49]
Dynamics of Flows [50], sections 2.0-2.3, 2.8, 5, and 6 (except 6.6 and 6.8)
maps vs. ﬂows

time: discrete vs. continuous

axes: state/phase space [9]

an example: the simple harmonic oscillator
some math & physics review [8]

portraying & visualizing the dynamics [9]

trajectories, attractors, basins, and boundaries [9]
dissipation and attractors [42]
bifurcations
how sensitive dependence and the Lyapunov exponent manifest in ﬂows

anatomy of a chaotic attractor: [23]

stretching/folding and the un/stable manifolds

fractal structure and the fractal dimension ch 11 of [50]

unstable periodic orbits [2, 25, 49]

shadowing
symbol dynamics [26] (in [13]); [28]

Tools
ODE solvers and their dynamics [8, 33, 35, 44]

PDE solvers [8, 44]

Poincar´e sections [27]

stability, eigenstuﬀ, un/stable manifolds and a bit of control theory

embedology [29, 30, 39, 46, 47, 45, 52] ([39] is in [37] and [45] is in [53];)

calculating Lyapunov exponents and fractal dimensions [1, 9, 37, 40]

Applications
prediction [3, 4, 5, 14, 15, 53]

ﬁltering [20, 21, 24]

control [7, 6, 11, 36, 48] ([36] is in [37])

communication [16, 41]

classical mechanics [10, 43, 51, 54, 55]

music, dance, and image [12, 18, 19]

#### References

[1] H. Abarbanel. Analysis of Observed Chaotic Data. Springer, 1995.

[2] D. Auerbach, P. Cvitanovic, J.-P. Eckmann, G. Gunaratne, and I. Procaccia. Exploring chaotic motion through periodic orbits. Physical Review Letters, 58:2387–2389, 1987.

[3] T. Bass. The Eudaemonic Pie. Penguin, New York, 1992.

[4] T. Bass. The Predictors. Owl Books, 2000.

[5] D. Berreby. Chaos hits Wall Street. Discover, 14:76–84, March 1993.

[6] E. Bollt and J. Meiss. Targeting chaotic orbits to the moon. Physics Letters A, 204:373– 378, 1995.

[7] E. Bradley. Using chaos to broaden the capture range of a phase-locked loop. IEEE Transactions on Circuits and Systems, 40:808–818, 1993.

[8] E. Bradley. Numerical solution of diﬀerential equations. Research Report on Curricula and Teaching CT003-98, University of Colorado, Department of Computer Science, 1998. www.cs.colorado.edu/~lizb/na/ode-notes.ps.

[9] E. Bradley. Analysis of time series. In Intelligent Data Analysis: An Introduction, pages 199–226. Springer-Verlag, 2 edition, 2000. M. Berthold and D. Hand, eds.

[10] E. Bradley. Classical mechanics. Research Report on Curricula and Teaching CT007-00, University of Colorado, Department of Computer Science, 2000. www.cs.colorado.edu/~lizb/chaos/classmech.ps.

[11] E. Bradley and D. Straub. Using chaos to broaden the capture range of a phase-locked loop: Experimental veriﬁcation. IEEE Transactions on Circuits and Systems, 43:914–822, 1996.

[12] E. Bradley and J. Stuart. Using chaos to generate variations on movement sequences. Chaos, 8:800–807, 1998.

[13] D. Campbell, R. Ecke, and J. Hyman. Nonlinear Science: The Next Decade. M.I.T. Press, 1992.

[14] M. Casdagli. Nonlinear prediction of chaotic time series. Physica D, 35:335–356, 1989.

[15] M. Casdagli and S. Eubank, editors. Nonlinear Modeling and Forecasting. Addison Wesley, 1992.

[16] K. M. Cuomo and A. V. Oppenheim. Circuit implementation of synchronized chaos with applications to communications. Physical Review Letters, 71:65–68, 1993.

[17] P. Cvitanovic. Introduction. In Universality in Chaos. Adam Hilger, Bristol U.K., 1984.

[18] D. Dabby. Musical variations from a chaotic mapping. Chaos, 6:95–107, 1996.

[19] D. Dabby. A chaotic mapping for musical and image variation. In Proceedings of the Fourth Experimental Chaos Conference, 1997.

[20] J. Farmer and J. Sidorowich. Predicting chaotic time series. Physical Review Letters, 59:845–848, 1987.

[21] J. Farmer and J. Sidorowich. Exploiting chaos to predict the future and reduce noise. In Evolution, Learning and Cognition. World Scientiﬁc, 1988.

[22] M. J. Feigenbaum. Universal behavior in nonlinear systems. Los Alamos Science, 1:4–27, 1980.

[23] C. Grebogi, E. Ott, and J. A. Yorke. Chaos, strange attractors and fractal basin boundaries in nonlinear dynamics. Science, 238:632–638, 1987.

[24] J. Guckenheimer. Noise in chaotic systems. Nature, 298:358–361, 1982.

[25] G. Gunaratne, P. Linsay, and M. Vinson. Chaos beyond onset: A comparison of theory and experiment. Physical Review Letters, 63:1, 1989.

[26] B.-L. Hao. Symbolic dynamics and characterization of complexity. Physica D, 51:161–176, 1991.

[27] M. H´enon. On the numerical computation of Poincar´e maps. Physica D, 5:412–414, 1982.

[28] C. Hsu. Cell-to-Cell Mapping. Springer-Verlag, New York, 1987.

[29] J. Iwanski and E. Bradley. Recurrence plots of experimental data: To embed or not to embed? Chaos, 8:861–871, 1998.

[30] H. Kantz and T. Schreiber. Nonlinear Time Series Analysis. Cambridge University Press, Cambridge, 1997.

[31] T.-Y. Li and J. A. Yorke. Period three implies chaos. American Mathematical Monthly, 82:985–992, 1975.

[32] E. N. Lorenz. Deterministic nonperiodic ﬂow. Journal of the Atmospheric Sciences, 20:130–141, 1963.

[33] E. N. Lorenz. Computational chaos – A prelude to computational instability. Physica D, 35:229–317, 1989.

[34] N. Metropolis, M. Stein, and P. Stein. On ﬁnite limit sets for transformations of the unit interval. J. Combinational Theory, 15:25–44, 1973.

[35] R. Miller. A horror story about integration methods. J. Computational Physics, 93:469– 476, 1991.

[36] E. Ott, C. Grebogi, and J. A. Yorke. Controlling chaos. Physical Review Letters, 64:1196– 1199, 1990.

[37] E. Ott, T. Sauer, and J. Yorke. Coping with chaos. Wiley, 1994.

[38] J. M. Ottino. The Kinematics of Mixing: Stretching, Chaos, and Transport. Cambridge, Cambridge U.K., 1992.

[39] N. Packard, J. Crutchﬁeld, J. Farmer, and R. Shaw. Geometry from a time series. Physical Review Letters, 45:712, 1980.

[40] T. S. Parker and L. O. Chua. Practical Numerical Algorithms for Chaotic Systems. Springer-Verlag, New York, 1989.

[41] L. M. Pecora and T. L. Carroll. Synchronization in chaotic systems. Physical Review Letters, 64:821–824, 1990.

[42] I. Percival. Chaos in Hamiltonian systems. Proceedings of the Royal Society, London, 413:131–144, 1987.

[43] I. Peterson. Newton’s Clock: Chaos in the Solar System. W. H. Freeman, New York, 1993.

[44] W. H. Press, B. P. Flannery, S. A. Teukolsky, and W. T. Vetterling. Numerical Recipes: The Art of Scientiﬁc Computing. Cambridge University Press, Cambridge U.K., 1988.

[45] T. Sauer. Time-series prediction by using delay-coordinate embedding. In Time Series Prediction: Forecasting the Future and Understanding the Past. Santa Fe Institute Studies in the Sciences of Complexity, Santa Fe, NM, 1993.

[46] T. Sauer. Interspike interval embedding of chaotic signals. Chaos, 5:127, 1995.

[47] T. Sauer, J. Yorke, and M. Casdagli. Embedology. Journal of Statistical Physics, 65:579– 616, 1991.

[48] T. Shinbrot. Progress in the control of chaos. Advances in Physics, 44:73–111, 1995.

[49] P. So, E. Ott, S. Schiﬀ, D. Kaplan, T. Sauer, and C. Grebogi. Detecting unstable periodic orbits in chaotic experimental data. Physical Review Letters, 76:4705–4708, 1996.

[50] S. Strogatz. Nonlinear Dynamics and Chaos. Addison-Wesley, Reading, MA, 1994.

[51] G. J. Sussman and J. Wisdom. Chaotic evolution of the solar system. Science, 257:56–62, 1992.

[52] F. Takens. Detecting strange attractors in ﬂuid turbulence. In D. Rand and L.-S. Young, editors, Dynamical Systems and Turbulence, pages 366–381. Springer, Berlin, 1981.

[53] A. S. Weigend and N. S. Gershenfeld, editors. Time Series Prediction: Forecasting the Future and Understanding the Past. Santa Fe Institute Studies in the Sciences of Complexity, Santa Fe, NM, 1993.

[54] J. Wisdom. Chaotic behavior in the solar system. Nuclear Physics B, 2:391–414, 1987.

[55] J. Wisdom. Is the solar system stable? and Can we use chaos to make measurements? In D. K. Campbell, editor, Chaos/XAOC : Soviet-American perspectives on nonlinear science. American Institute of Physics, New York, 1990.

1-10 of 15