Information

08/06/04


 

  1. Shannon, C.E. (1949)

    1. Inventor of information theory (communication theory)

    2. Published in 1948 while at Bell Laboratories

    3. The theory is based on signal transmission and processing for telecommunication, such as radio, television, computers, and other data processing devices

    4. Considered the first to separate the problem of delivering a message from the meaning of the message

    5. Movement scientists adopted the concepts and transferred them to the human system, where nerves are the information transmission channels

    6. Shannon-Hartley Theorem

      1. Information rate tells us how fast information can travel over a given channel

      2. When a source sends r messages per second and the entropy (uncertainty) of each message is H bits per message, then the information rate is

        1. R = r*H     [1]

      3. It might seem that errors must increase as R increases, but this does not necessarily happen as long as R ≤ C (C is the channel capacity, the maximum information rate)

      4. If signals are transmitted at an information rate R such that R ≤ C, the error probability can be made arbitrarily small, provided that the information is coded intelligently

      5. Shannon-Hartley Theorem
        C = W*log2(1+S/N)     [2]
        where W is the bandwidth (the frequency range over which the signal power stays above half of its maximum), S is the signal power, and N is the noise power
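
Equations [1] and [2] can be illustrated with a short Python sketch. The message rate, entropy, bandwidth, and power values below are hypothetical and chosen only so the arithmetic is easy to follow; the function names are illustrative as well.

```python
import math

def information_rate(messages_per_second, bits_per_message):
    """Equation [1]: R = r * H, in bits per second."""
    return messages_per_second * bits_per_message

def channel_capacity(bandwidth_hz, signal_power, noise_power):
    """Equation [2]: C = W * log2(1 + S/N), in bits per second."""
    return bandwidth_hz * math.log2(1 + signal_power / noise_power)

# Hypothetical values, for illustration only.
R = information_rate(messages_per_second=1000, bits_per_message=3)
C = channel_capacity(bandwidth_hz=1000, signal_power=15, noise_power=1)

print("R =", R, "bits/s")
print("C =", C, "bits/s")
print("R <= C:", R <= C)  # reliable transmission is possible in principle only if True
```

With these numbers S/N = 15, so C = 1000*log2(16) = 4000 bits/s, which exceeds R = 3000 bits/s.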
         

  2. Basic ideas of information theory

    1. Information theory provides a yardstick for measuring organization

    2. A well-organized system is very predictable, so observing it tells you little that is new: such a system carries little information

    3. On the other hand, a poorly organized system is not predictable, so observing it tells you much that is new: such a system carries much information

    4. Logarithm (A^x = B -> logA(B) = x)

      1. Base 10, e, or 2 (binary)

      2. log(m) + log(n) =  log(m*n)

      3. log(m) - log(n) =  log(m/n)

      4. log(1) = 0

      5. n*log(m) = log(m^n)

    5. Probability

      1. p(A and B) = p(A) × p(B), when A and B are independent

      2. p(A or B) = p(A) + p(B) - p(A and B)

      3. Probability of "not A": p(A') = 1-p(A)

      4. Conditional probability (probability of A given B): pB(A), also written p(A|B)

    6. An amount of information depends on the reduction of Uncertainty

      1. Suppose there are 8 cards numbered from 1 to 8. I pick one in my mind and ask you to guess which one I picked. On average, you would expect to ask about 4 questions such as "Is it 2?" (Strategy 1)

      2. If we based the unit of information on the average number of answers before the number is found, we arrive at a measure of difficulty

      3. You can use a different strategy of questioning, such as "Is the number smaller than 5?" (Strategy 2) to reduce the number of questions

      4. It seems reasonable to base the unit of difficulty on the number of questions (answers) required when the optimal strategy is used; this is the basis of the standard unit of selective information

      5. Since the amount of information is closely related to the number of cards, we write the amount of information as a function of the number of cards as follows:
        H(8) = 3 units of information/uncertainty per number
        H(16) = 4 units of information/uncertainty per number

      6. The amount of information is called "Uncertainty" (Entropy) in this sense and denoted by H

      7. Norbert Wiener's point: use the name 'negative entropy' for H since 'Just as the amount of information in a system is a measure of its degree of organization, so the entropy of a system is a measure of its degree of disorganization; and the one is simply the negative of the other' (Wiener, 1948)

      8.  

      9. One way of measuring is taking ratios or halves; one unit of information is gained when half of the alternatives are eliminated, as in Strategy 2:
        bits = log2(x), where x is the number of possible states     [3]

      10. A message has a probability p when it is one out of 1/p possibilities:
        information of message = log2(1/p) = -log2(p)     [4]

      11. Average amount of information = mean logarithmic probability for all messages from one source
        H(x) = mean of [-log2(p)] = ∑p[-log2(p)]     [5]
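
The card-guessing example and equations [3] to [5] can be checked numerically. The following Python sketch is only an illustration; the function names are made up here, and the count for Strategy 1 assumes that the last remaining card needs no question once all the others have been ruled out.

```python
import math

def uniform_information(n_states):
    """Equation [3]: bits = log2(x) for x equally likely states."""
    return math.log2(n_states)

def entropy(probabilities):
    """Equation [5]: H = sum of p * (-log2(p)); zero-probability states are skipped."""
    return sum(-p * math.log2(p) for p in probabilities if p > 0)

def mean_questions_one_by_one(n_cards):
    """Strategy 1: ask "Is it 1?", "Is it 2?", ... until the card is known."""
    counts = [min(k, n_cards - 1) for k in range(1, n_cards + 1)]
    return sum(counts) / n_cards

print(uniform_information(8))        # H(8)  = 3.0 bits (Strategy 2 needs 3 questions)
print(uniform_information(16))       # H(16) = 4.0 bits
print(entropy([1 / 8] * 8))          # 3.0, the same value obtained from equation [5]
print(entropy([0.5, 0.25, 0.25]))    # 1.5, unequal probabilities lower the average
print(mean_questions_one_by_one(8))  # 4.375, roughly the 4 questions of Strategy 1
```

Halving the alternatives with each question (Strategy 2) always needs exactly log2(8) = 3 questions, fewer than the average for one-by-one guessing.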

    7. Related sources X and Y

      1. Information of x + Information of y - the information shared by x and y (their overlap)

      2. H(x,y) = H(x) + H(y) - I(x:y)     [6]

      3. e.g., H(x) is the uncertainty of the stimulus (S), H(y) is the uncertainty of the response (R), and I(x:y) is the degree of dependency between S and R; for perception, I(x:y) can be considered a measure of discrimination

    8. Channel capacity (C)

      1.  

      2. The upper limit of information transmission is given by I(x:y)

      3. C = I(x:y) = H(x) + H(y) - H(x,y)     [7]

      4. C can be viewed as another version of the Weber fraction

    9. Redundancy: successive occurrences are not independent; they have some redundancy captured by I(x:y)
       

  3. Information theory (communication theory)

    1. Invented by Shannon

    2. If there is a systematic relationship between two variables x and y, something about x can be learned from the present state of y
                 channel
          x ---------------------> y

      1. What is the uncertainty of x? (How much do we not know about x?)

      2. How much of the uncertainty about x does y resolve?

    3. One random variable

      1. Let's consider a random variable, x

      2. Let's assume that x has a finite set of states: x0, x1, x2, ..., xn

      3. The probability of xi can be written px(xi)

      4. The probabilities of all states of x sum to 1
        ∑px(xi) = 1

      5. Probability distribution

    4. Two random variables

      1. Let's consider two random variables x and y

      2. Probability distribution

      3. The probability of co-occurrence of xi and yi pairs = pxy(xi,yi)

      4. If there is no relationship between x and y, that is, if they are independent of each other,
        pxy(xi,yi) = px(xi)*py(yi)

      5. However, pxy(xi,yi) = px(xi)*py(yi) does not hold in general; it fails whenever x and y are related

    5. Uncertainty (McGill and Quastler, 1955) or Entropy (Shannon, 1948)

      1. Uncertainty of xi
        h(xi) = -log[px(xi)]

      2. Average uncertainty of all xi
        H(x)=∑px(xi)*h(xi)=-∑px(xi)*log[px(xi)]     [8]

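      3. In the two-variable situation (joint uncertainty, summed over all pairs of states xi and yj)
        H(x,y) = -∑∑pxy(xi,yj)*log[pxy(xi,yj)]     [9]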
         

    6. Therefore,
      I(x:y) = H(x) + H(y) - H(x,y)     [10]
      I(x:y) is how much, on average, you learn about xi by seeing yi
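
Equation [10] can be evaluated directly from a joint probability table. The Python sketch below is a minimal illustration; the two 2 X 2 tables are hypothetical examples (one fully independent, one fully dependent), not data from any of the studies cited here.

```python
import math

def entropy(probabilities):
    """Equation [8]: H = -sum of p * log2(p); zero-probability states are skipped."""
    return sum(-p * math.log2(p) for p in probabilities if p > 0)

def mutual_information(joint):
    """Equation [10]: I(x:y) = H(x) + H(y) - H(x,y).

    joint[i][j] holds pxy(xi, yj); rows index x and columns index y.
    """
    px = [sum(row) for row in joint]        # marginal distribution of x
    py = [sum(col) for col in zip(*joint)]  # marginal distribution of y
    h_xy = entropy([p for row in joint for p in row])
    return entropy(px) + entropy(py) - h_xy

# Hypothetical joint distributions, for illustration only.
independent = [[0.25, 0.25],
               [0.25, 0.25]]  # pxy = px * py everywhere
dependent = [[0.5, 0.0],
             [0.0, 0.5]]      # y always equals x

print(mutual_information(independent))  # 0.0 bits: seeing y tells nothing about x
print(mutual_information(dependent))    # 1.0 bit: seeing y removes all uncertainty about x
```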
       

  4. Example

    1. If we drop two balls (x and y) independently on a k X k grid panel (k X k = N)
           [11]

       

    2. Let's assume that we are dropping two balls (x and y) on a k X k grid panel (k X k = N) at the same time

    3. Two conditions

      1. Condition 1: Drop two balls at the same time

      2. Condition 2: Tie two balls with a string of length 2 and drop them at the same time

    4. Condition 1

      1.      [12]


      2.      [13]
         

    5. Condition 2

      1.      [14]


      2.      [15]

    6. Therefore, what we know about y from x is larger in Condition 2 than in Condition 1
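
The comparison between the two conditions can be sketched by enumeration in Python. Everything specific here is an assumption made for illustration and is not taken from the original example: a 4 X 4 grid, balls that may share a cell, a string that keeps the two cells within a Euclidean distance of 2 cell units, and all allowed pairs of cells being equally likely.

```python
import math
from collections import Counter
from itertools import product

def entropy(probabilities):
    """H = -sum of p * log2(p); zero-probability states are skipped."""
    return sum(-p * math.log2(p) for p in probabilities if p > 0)

def mutual_information(pairs):
    """I(x:y) = H(x) + H(y) - H(x,y) for a list of equally likely (x, y) pairs."""
    n = len(pairs)
    h_x = entropy([c / n for c in Counter(x for x, _ in pairs).values()])
    h_y = entropy([c / n for c in Counter(y for _, y in pairs).values()])
    h_xy = entropy([c / n for c in Counter(pairs).values()])
    return h_x + h_y - h_xy

k = 4                                      # assumed grid size, N = k * k = 16 cells
cells = list(product(range(k), repeat=2))  # (row, column) of every cell

# Condition 1: the balls land independently, so every pair of cells is possible.
free_pairs = list(product(cells, cells))

# Condition 2 (assumed model): the string keeps the cells within distance 2.
tied_pairs = [(a, b) for a, b in free_pairs if math.dist(a, b) <= 2]

i_free = mutual_information(free_pairs)
i_tied = mutual_information(tied_pairs)
print("Condition 1: I(x:y) =", round(i_free, 3), "bits")
print("Condition 2: I(x:y) =", round(i_tied, 3), "bits")
```

Under these assumptions I(x:y) is zero for the free balls and clearly positive for the tied balls, because knowing where x landed restricts where y can be.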
       

  5. Speed-Accuracy Trade-Off (Fitts' Law)

    1. Woodworth (1899)

      1. Drawing different distances of lines at different speeds

      2. The drawing movement consisted of a ballistic phase (open loop, feedforward) and a current-control phase (closed loop, feedback)

      3. Accuracy decreases as drawing speed increases

    2. Fitts (1954)

      1. First application of information theory to the motor system

      2. Isolating motor processes, by using over-learned movements and keeping stimulus conditions more or less constant, reveals the capacity limits of the motor system

      3. Capacity is the ability to consistently produce one class of movements

      4. The greater the number of alternatives, the greater the information processing capacity required by the movement

      5. Information capacity can be inferred from the variability of successive responses in constant performance: the variability reflects the channel noise in optimum movement

      6. The rate of information transmission can be estimated from the ratio of the magnitude of noise to the possible range of responses

      7. Channel capacity depends not only on the average amplitude but also on the tolerance

      8. Tasks

        1. Repeated tapping with a stylus between two rectangles at maximum speed, with different target widths (W, tolerance range, noise)

        2. Disc transfer from one pin to another, with different pin and center-hole diameters

        3. Pin transfer from one set of holes to another, with different pin sizes and movement amplitudes (A)

                    
         

      9. Movement time was found to be a logarithmic function of movement amplitude when target width was held constant, and also a logarithmic function of target width when movement amplitude was held constant

      10. MT (movement time) = a + b*log2(2A/W)     [16]

      11. ID (index of difficulty) = log2(2A/W) = -log2(W/2A)      [17]

      12. IP (index of performance) = -(1/MT)*log2(W/2A)      [18]

      13. A higher ID requires more decisions, which demand more channel capacity; movement therefore slows down for difficult tasks

      14. When the rate of information processing is optimized (maximized by over-learning in this case), the speed has to be traded off against accuracy

      15. Fitts' Law JAVA demo (from http://ei.cs.vt.edu/~cs5724/g1/tap.html)
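
Equations [16] to [18] can be tried out with a few lines of Python. The sketch below is only an illustration: the intercept a and slope b are hypothetical constants, not values reported by Fitts (1954), and the amplitude and width pairs are made up to give whole-number IDs.

```python
import math

def index_of_difficulty(amplitude, width):
    """Equation [17]: ID = log2(2A / W), in bits."""
    return math.log2(2 * amplitude / width)

def movement_time(amplitude, width, a, b):
    """Equation [16]: MT = a + b * log2(2A / W); a and b are empirical constants."""
    return a + b * index_of_difficulty(amplitude, width)

def index_of_performance(amplitude, width, mt):
    """Equation [18]: IP = ID / MT, in bits per second when MT is in seconds."""
    return index_of_difficulty(amplitude, width) / mt

# Hypothetical intercept (s) and slope (s/bit), for illustration only.
a, b = 0.1, 0.1

for amplitude, width in [(8, 4), (8, 1), (16, 1)]:
    id_bits = index_of_difficulty(amplitude, width)
    mt = movement_time(amplitude, width, a, b)
    ip = index_of_performance(amplitude, width, mt)
    print(f"A={amplitude} W={width} ID={id_bits:.1f} bits MT={mt:.2f} s IP={ip:.2f} bits/s")
```

Halving W or doubling A adds exactly one bit to ID, so with a fixed slope b each such change adds the same increment to MT.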

  6. Crossman and Goodeve (1963, 1983)

  7. Schmidt, Zelaznik, Hawkins, Frank, and Quinn (1979)

  8. Carlton (1994)

  9. Newell et al. (1993): Space-time accuracy during rapid movements

    1. Task: fast pre-programmed movements: elbow movements lasting 150-400 ms

    2. Results

      1. Timing error is a decreasing function of movement speed

      2. Spatial error is an increasing function of movement speed

      3. Coefficient of determination between timing error and spatial error within a subject: r2=-0.98

    3. The results are contrary to Schmidt's impulse-variability theory, which holds that velocity has no effect on timing variability

  10. Kelso, Southard, and Goodman (1979): Two handed movement coordination

    1. Task: reach to one target or two targets with different indices of difficulty as fast as possible

    2. Conditions

      1. Two target widths

      2. Two target amplitudes (distances)

      3. Single and two-handed performance

      4. However, the smaller target was always at the longer distance and the larger target always at the shorter distance from the starting position


         

    3. Results

      1. One hand tasks satisfied Fitts' Law

      2. Two-hand movements were initiated at the same time and landed on the targets at the same time regardless of different widths and amplitudes

      3. MT (movement time = total response time - reaction time (RT)) did not differ between one-hand and two-hand movements with the same index of difficulty

      4. When the IDs of the two hands differed, the MT of the hand with the lower ID increased; the more difficult task therefore determined movement time in two-hand performance

      5. Hands are not controlled as separate units, but they perform as a synergy

 


This site was last updated 02/12/04