IX. Toxo’s effect on the performance of our soldiers and pilots

Working with the psychology clinic at the Central Military Hospital in Prague, we tested in much greater detail the differences in performance between Toxo positive and negative people. Over the years, thousands of recruits passed through our hands at this workplace, because those who were to serve as military drivers in garrisons near Prague, in the patrol service or in the Prague Castle Guard had their psychological examination in this hospital. The psychologists of the army hospital gave them a number of tests, which determined their psychological profile, mental endurance and performance in stressful situations. Through the medical personnel we asked the studied recruits for permission to include their data in our study. Those who agreed were examined for toxoplasmosis, and then we only had to compare the results of the psychological and performance tests of Toxo positive and negative people (Box 47 Informed consent). We used performance tests similar to those we used on the blood donors, as well as more complicated tests. The latter measured not only reaction time, but also several other physiological and psychological parameters, like short-term memory, resistance to fatigue, etc.

Box 47 Informed consent

When conducting research on people, we first need to get their informed consent. We need to familiarize them with the goal of the study, as well as with what we will ask them to do and how the results will be used. We must further inform them how and when they can withdraw their consent to participate (usually it’s possible at any time, until the data is made anonymous). If experimenting on children or people with limited legal competence, we must obtain the consent of their legal guardian. These rules, which were established in the Czech Republic through routine practice and adhere to the Convention on Human Rights and Biomedicine adopted by the Council of Europe, must be strictly followed, especially when the studies might disturb or endanger the test subject’s rights to physical or mental integrity or dignity. If there is no such danger, for instance if we’re only observing how people behave in natural or constructed situations that don’t put them at risk, then the requirements for informed consent aren’t as strict – but the ethical committee of the respective institution should always decide this, not the individual researcher. In some cases a strict and mechanical request for informed consent could take away the value of the entire study. For example, if we’re studying how trusting people are, and have them drink an unknown liquid or sign a blank sheet of paper, we clearly can’t tell them about the purpose of the experiment beforehand. If it’s drinking water, the danger is minimal, and the people can be informed about the purpose of the experiments in general terms, such as “you will participate in a set of ethological experiments studying the effects of toxoplasmosis on human behavior and psyche” (so that the test subjects won’t spend half their time studying a detailed informed consent form, and to limit the risk of revealing the experimental purpose to other participants). But if we’re filming people in a similar experiment, we must subsequently inform them of this and offer them the choice to delete the recording. If we’re unsure about any ethical aspects of our experiments, we shouldn’t base our judgments solely on the recommendations of the ethical committee (though it’s good to bear them in mind). Rather, we should consult Immanuel Kant’s categorical imperative: “Act only according to that maxim whereby you can, at the same time, will that it should become a universal law.”

We had a similar problem with yet another group, which we initially had great hopes for. It included military and civilian pilots, as well as ground crew and flight school students, who had to routinely undergo performance testing in the Institute of Aviation Medicine. We managed to begin cooperation with this workplace too, but the collaboration wasn’t nearly as close or productive as it was with the psychological department of the Central Military Hospital. None of our projects reached the publication phase; nevertheless, we

Box 48 One can never be too careful

Human experimental subjects, unlike lab mice or bacteria in test tubes, are happily inclined to cheat. When filling out a questionnaire, they often intentionally or unintentionally lie, to make themselves look better (to the evaluator or themselves) than they really are. If cheating and lies are not the topic of our study, we must try to make both very difficult. We will never completely succeed in preventing them, so we should at least have an idea of how much the test subjects are lying and cheating. Some psychological tests, for example, allow one to calculate a lie score, to determine how much the person, when filling out the questionnaire, tried to improve their image. At the least, we must ensure that cheating on the part of the test subjects does not distort the answer to our given question. Most important is that the cheating not give a false positive result. To prevent this from happening, the test subjects cannot know what group (experimental or control) they belong to when being tested; and the researchers must treat both groups of test subjects entirely the same. During the tests, we made sure that neither the subjects nor the person administering the test knew who was or wasn’t infected (known as a double-blind study). Even so, cheating could have distorted the result of the study, usually by increasing the variability of the results achieved by the test subjects, thus preventing us from revealing an existing effect. It couldn’t, however, lead to a false positive result, i.e. that we’d prove an effect that doesn’t really exist. Not only the test subjects, but also our coworkers, are capable of cheating. For example, if a nurse is paid according to the number of patients whose data she enters into the computer from index cards, she might invent part of the data. Usually we can discover this using the right statistical test.
For example, if one invents a list of ones and zeros, such as when making a list of Toxo positives and negatives, then the ones and zeros alternate much more regularly than if the order were determined by chance. This can be easily found out using the runs test. Similarly, if one invents multiple digit numbers, then usually certain digits repeat more than others, in a way that is not random. Usually we can reveal this easily, by having the computer draw us a graph of the frequency of individual digits (i.e. a histogram, see Fig. 26). In everyday life there exists the well-known saying, “Trust, but verify.” But in science it pays off to hold to the saying, “Don’t trust, and verify.”
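
The runs test mentioned in this box can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions, not the procedure actually used in our studies: the function name runs_test and the example list are made up, and the P value relies on the usual normal approximation of the Wald-Wolfowitz runs test, which is reasonable only for longer lists.

```python
import math


def runs_test(sequence):
    """Wald-Wolfowitz runs test for a list of 0s and 1s.

    Returns (number_of_runs, z_score, two_sided_p), using the normal
    approximation. A large positive z means the values alternate more
    regularly than chance would suggest.
    """
    n1 = sum(1 for x in sequence if x == 1)
    n0 = len(sequence) - n1
    if n0 == 0 or n1 == 0:
        raise ValueError("both values must occur at least once")
    # A new run starts wherever two neighbouring values differ.
    runs = 1 + sum(1 for a, b in zip(sequence, sequence[1:]) if a != b)
    n = n0 + n1
    expected = 2 * n0 * n1 / n + 1
    variance = 2 * n0 * n1 * (2 * n0 * n1 - n) / (n ** 2 * (n - 1))
    z = (runs - expected) / math.sqrt(variance)
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided tail of the normal curve
    return runs, z, p


# Hypothetical "invented" list that alternates far too regularly.
invented = [0, 1] * 20
runs, z, p = runs_test(invented)
print(f"runs={runs}, z={z:.2f}, p={p:.2g}")
```

A strongly positive z score points to too many runs, which is the pattern we would expect from someone typing in ones and zeros by hand; a strongly negative one points to suspiciously long blocks of the same value.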

Fig. 26 The difference in frequency of the digits in real (dark-colored bars) and in invented data (light-colored bars). The real numbers were created from averaged, standardized results of measuring the attractiveness of tiger urine odor and then removing the first significant figure. The invented numbers come from the attention test, in which tested persons were presented with a list of randomly alternating digits. Over the course of three minutes, they were to mark all pairs of subsequent digits that added to 10. The chi-squared (χ²) test demonstrates that the numbers from the real data have a homogeneous distribution of digit frequency, whereas the invented numbers show a strong bias towards certain digits.
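
A check of the kind summarized in Fig. 26 can be sketched as follows. It is an illustration only, not the analysis behind the figure: the function name and both digit lists are made up, the ten digits are assumed to be equally frequent in honest data, and instead of an exact P value the statistic is simply compared with the tabulated 5% critical value of the chi-squared distribution with 9 degrees of freedom (16.919).

```python
from collections import Counter

# Upper 5% point of the chi-squared distribution with 9 degrees of freedom.
CHI2_CRITICAL_DF9_P05 = 16.919


def digit_chi_squared(digits):
    """Chi-squared statistic for digits 0-9 against equal expected frequencies."""
    counts = Counter(digits)
    expected = len(digits) / 10
    return sum((counts.get(d, 0) - expected) ** 2 / expected for d in range(10))


# Hypothetical data: one perfectly even list, one favouring the digits 3 and 7.
even_digits = list(range(10)) * 20
biased_digits = [3] * 60 + [7] * 60 + list(range(10)) * 8

for name, digits in (("even", even_digits), ("biased", biased_digits)):
    chi2 = digit_chi_squared(digits)
    verdict = "suspicious" if chi2 > CHI2_CRITICAL_DF9_P05 else "no evidence of bias"
    print(f"{name}: chi2={chi2:.1f} -> {verdict}")
```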

obtained several sets of data from people examined there. We found basically no difference between Toxo positives and negatives in the performance test. However, we did obtain one interesting result. In one examined group, numbering about 200 people, the total frequency of toxoplasmosis was only about 3%. In a normal population we would expect around 15-25% to be infected (depending on average age). Unfortunately, we were never able to obtain further information about them, for the data we got were anonymized. We can only guess that they may have been students or fresh graduates from flight school, rather than pilots or flight crew. We expect that these people had recently taken performance tests, and that those who didn’t pass either weren’t accepted into the school or didn’t finish it. In this way, a large part of the Toxo positive people would have been eliminated, and so would be missing from this group. In groups of pilots and airline dispatchers taking regular performance tests, the frequency of Toxo positive people was no longer significantly decreased – apparently, a number of them had become infected since graduating from flight school. Yet even in these groups, the differences between the infected and uninfected people were not significant. Our explanation is that the pilots’ motivation to perform well in the tests, to maintain their prestige and a well-paid job, was high enough that individuals with a worsened performance trained harder for the tests, and perhaps even took them repeatedly. At a weak moment, the employees of the psychological department of the Institute revealed that almost everyone, if not the first time then in the retake, is able to pass even these very difficult tests (hence the Toxo positives weren’t removed from the test group, as was most likely the case with the flight school students). So if we couldn’t know whether a person was taking the test for the first or second time, after intensive training or just “offhand,” it isn’t surprising that we couldn’t discern possible differences in performance between the infected and uninfected people.

Our experience from the Institute of Aviation Medicine, among other things, showed that the system for testing pilots probably doesn’t work as well as it should. Aside from this practical conclusion, which should probably be important to the people and institutions responsible for air transportation safety, our experience is also noteworthy to scientists with research projects similar to ours. It is a reminder that we cannot ever forget that our test subjects often pass through a variety of “sieves” before they get to our laboratory. So if we don’t find a difference in the observed parameters between the two test groups (be the subjects humans or animals), it doesn’t have to mean that there is no such difference within the general population. Thanks to a “sieve,” our experimental group may get a different ratio of individuals from each group we’re comparing. For example, we get only the people who scored at least a hundred points in the entrance exam for flight school, which could mean only the fastest 10% of the Toxo positives, and the fastest 50% of the Toxo negatives that applied. We may not find any difference in reaction time between the two groups, even though a difference exists within the general population.
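
The effect of such a “sieve” is easy to demonstrate with a small simulation. The sketch below is purely illustrative: every number in it (a mean reaction time of 250 ms, a 15 ms delay in infected people, a standard deviation of 30 ms, a prevalence of 20% and an entrance cutoff of 250 ms) is a made-up assumption, not a value measured in our studies.

```python
import random
import statistics


def simulate_sieve(n=50_000, prevalence=0.2, mean_neg=250.0, delay=15.0,
                   sd=30.0, cutoff=250.0, seed=1):
    """Compare infected and uninfected people before and after an entrance test.

    All default values are hypothetical. Only people whose reaction time
    is at or below `cutoff` pass the sieve.
    """
    rng = random.Random(seed)
    population = []
    for _ in range(n):
        infected = rng.random() < prevalence
        rt = rng.gauss(mean_neg + (delay if infected else 0.0), sd)
        population.append((infected, rt))
    admitted = [(inf, rt) for inf, rt in population if rt <= cutoff]

    def summary(group):
        pos = [rt for inf, rt in group if inf]
        neg = [rt for inf, rt in group if not inf]
        return {
            "prevalence": len(pos) / len(group),
            "difference_ms": statistics.fmean(pos) - statistics.fmean(neg),
        }

    return summary(population), summary(admitted)


population, admitted = simulate_sieve()
print("whole population:", population)
print("after the sieve: ", admitted)
```

Under these assumptions the sieve does two things at once: it lowers the share of infected people in the admitted group, and it shrinks the difference in mean reaction time between the infected and the uninfected who got through.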

So how much does Toxoplasma affect people’s performance? For a moment, let us look away from the above-mentioned problems of varying motivation and reaction speed, along with the unmentioned problem of distinguishing reaction speed and the ability to maintain maximum concentration over long periods of time. Apart from these, we can calculate the strength of Toxo’s effect on human performance in our simple reaction time test quite easily (Box 49 How to determine the effect size in statistics).

Box 49 How to determine the effect size in statistics

Laymen, and unfortunately some scientists, often inaccurately believe that effect size, which measures the strength of the relationship between two variables, is indicated by the P value, i.e. statistical significance. This value, however, only (indirectly) reflects the probability of a Type I error (an error of the first kind), that is, how likely it is that an effect as large as the one we observe in the data (such as the worsened performance of Toxo positive people) would arise by chance alone if no real effect existed. The lower the P value obtained from the statistical test, the harder it is to dismiss the observed phenomenon as a mere turn of fate. Statistical significance (P), however, says almost nothing about the effect size. For the P value is influenced not just by the strength of the studied effect (such as by what percent toxoplasmosis lowers one’s performance), but also by the number of subjects in the test group and the variability of the dependent variable. When we have a test group of several thousand individuals, we can prove even a very weak effect, which may not have any practical significance for a person’s life. Conversely, in a small test group, we may not find even a large effect to be statistically significant. The effect size of the studied variables is expressed by other characteristics, such as the coefficient of determination in the ANOVA (how much of the target variable’s variability is explained by the studied factor); eta² in the GLM (something quite similar); or Cohen’s d in Student’s t-test (the difference in the target variable’s means in the experimental and control groups, divided by the target variable’s standard deviation from all the data).
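
The difference between effect size and significance can be shown on a toy example. The sketch below is an illustration under stated assumptions: the reaction times are made up, and Cohen’s d is computed with the pooled within-group standard deviation, a common variant that differs slightly from the “standard deviation from all the data” described above.

```python
import math
import statistics


def cohens_d(a, b):
    """Difference of means divided by the pooled within-group standard deviation."""
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * statistics.variance(a)
                  + (nb - 1) * statistics.variance(b)) / (na + nb - 2)
    return (statistics.fmean(a) - statistics.fmean(b)) / math.sqrt(pooled_var)


def t_statistic(a, b):
    """Student's t for two independent groups; grows with sample size."""
    return cohens_d(a, b) / math.sqrt(1 / len(a) + 1 / len(b))


def eta_squared(a, b):
    """Share of the total variability explained by group membership."""
    grand = statistics.fmean(a + b)
    ss_between = (len(a) * (statistics.fmean(a) - grand) ** 2
                  + len(b) * (statistics.fmean(b) - grand) ** 2)
    ss_total = sum((x - grand) ** 2 for x in a + b)
    return ss_between / ss_total


# Hypothetical reaction times in milliseconds.
infected = [250, 260, 270, 280]
uninfected = [245, 255, 265, 275]

# The same values repeated 100 times imitate a much larger study.
for label, a, b in (("small study", infected, uninfected),
                    ("large study", infected * 100, uninfected * 100)):
    print(f"{label}: d={cohens_d(a, b):.2f}, eta2={eta_squared(a, b):.3f}, "
          f"t={t_statistic(a, b):.2f}")
```

In both studies the effect size is practically the same, yet the t statistic, and with it the P value, changes dramatically simply because the second study has a hundred times more subjects.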

Our studies showed that in blood donors toxoplasmosis is responsible for about 7% of the differences in reaction time among individuals in the human population. In other words, people differ amongst each other in reaction times – some are faster, others slower – yet toxoplasmosis is responsible for only 7% of the differences. At first glance, 7% may seem like a small number. In reality, an effect of this size in biology is already considered to be of medium strength. The thing is, the strength of biological effects accumulates from generation to generation. If we measured the same size effect in engineering, for example, finding that the outside temperature causes 7% of the observed variability in manufactured screw size (the remaining variability being due to the diligence of the workers, the quality of the material, etc.), then we might disregard such a small effect. But dealing with biological phenomena constitutes an entirely different situation. Let’s say that in the natural Toxoplasma population there appears a mutant, a protozoan able to manipulate its host’s behavior. By slowing that host’s reactions, the parasite increases the probability that the host will be captured by a cat, the so-called final host of Toxoplasma. Even raising the probability of capture by just 1% is enough for this parasite to prevail in a couple dozen generations (a blink of an eye from an evolutionary standpoint) over the parasites unable to slow their host’s reactions. So 7% is fairly substantial – many of the effects we observe in evolutionary biology are responsible for 1-2% of total variability, which is quite enough for such an effect to manifest in evolution, for the organisms with the respective characteristics or abilities to prevail in nature. Hence even a 1% effect is strong for evolutionary biology, though most likely insignificant for the manufacturer of a screw.

The 7% effect that toxoplasmosis has on reaction time may be responsible for the fact that Toxo positives have a 2.65 times greater risk of traffic accidents, as we discovered in our pseudopredation studies. It’s clear that latent toxoplasmosis influences driving ability much less than, for example, alcohol or just the common flu. However, a person gets drunk only occasionally, and even then rarely gets behind the wheel in such a state. Similarly, the flu worsens our cognitive abilities and reaction time much more than does latent toxoplasmosis, but we catch the flu at most once or twice a year. In contrast, once infected by Toxoplasma, we’re stuck with latent toxoplasmosis for life. Whenever a Toxo positive gets behind the wheel or decides to cross the street, that person has a greater chance of becoming part of a traffic accident than does someone who is uninfected. And although the average person has a generally low risk of traffic accidents, a high prevalence of latent toxoplasmosis can add up to hundreds of thousands of deaths (Box 50 How many road traffic victims does the “harmless” toxoplasmosis have on its conscience?).

It seems that subconsciously the infected person does learn to allow for his worsened reaction time. In our experiments we discovered that although reaction time worsens with time after infection, and those infected the longest have the worst reaction times, the probability of a traffic accident gradually decreases with time after infection. When tracking the reaction times of soldiers, blood donors or students, we of course didn’t know how long a Toxo positive had been infected. Nevertheless, we always found the worst reaction times in people with low levels of antibodies – apparently infected long ago. On the other hand, people with the greatest risk of traffic accidents had the highest levels of antibodies, and so had apparently passed through acute toxoplasmosis relatively recently. In my opinion, infected drivers simply got used to their worsened reaction times. They discovered that they have a longer reaction time, and adapted their driving accordingly. Of course there exists a less optimistic explanation, that infected drivers who were already not very careful got killed or at least had their driver’s license taken away. But as a Toxo positive person, I prefer to reject the second explanation ahead of time, and categorically spurn it as entirely absurd and definitely improbable.

Box 50 How many road traffic victims does the “harmless” toxoplasmosis have on its conscience?

If we knew the percent by which toxoplasmosis increases the risk of a traffic accident, as well as the number of people who die each year from these accidents, we could approximately calculate how many traffic victims would be spared were it not for toxoplasmosis. This value is generally known as the population attributable risk, which we can calculate by plugging into the respective formula. The problem is that mankind doesn’t form a single population with the same incidences of toxoplasmosis and traffic accidents throughout. In some countries, the prevalence of toxoplasmosis is low, and that of fatal traffic accidents is high (for example as a result of poor transport infrastructure and outdated vehicles); in different countries, it’s the other way around. Furthermore, data from various countries have different levels of reliability, and change from year to year. When I looked for data on the number of traffic fatalities ten years ago, the website of the WHO showed that there were 3 million yearly. The same source indicates that there are currently 1.4 million yearly. While it’s not impossible, I’m not convinced that traffic fatalities decreased so much in 10 years. A further complication is that our data on the heightened risk of traffic accidents for infected people concerns only traffic accidents in which a participant was injured. It’s possible that the heightened risk of fatal accidents is lower or higher than this value. Hence, I prefer to stick with an approximate estimate of traffic fatalities which can be attributed to Toxoplasma – several hundred thousand dead each year (29).
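
The order of magnitude can be checked with a rough calculation. The sketch below uses one standard form of the attributable-risk formula (Levin’s formula) and treats the whole world as a single population, which, as explained above, it is not. The relative risk of 2.65 and the 1.4 million traffic deaths per year are the figures quoted in the text; the prevalence of 30% is purely an assumption for illustration, and the function name is made up.

```python
def population_attributable_fraction(prevalence, relative_risk):
    """Levin's formula: share of cases that would not occur without the exposure."""
    excess = prevalence * (relative_risk - 1.0)
    return excess / (1.0 + excess)


RELATIVE_RISK = 2.65          # risk of an accident with injury, from the text
DEATHS_PER_YEAR = 1_400_000   # WHO figure quoted in the text
ASSUMED_PREVALENCE = 0.30     # hypothetical worldwide prevalence, not a measured value

fraction = population_attributable_fraction(ASSUMED_PREVALENCE, RELATIVE_RISK)
attributable_deaths = fraction * DEATHS_PER_YEAR
print(f"attributable fraction: {fraction:.1%}")
print(f"attributable deaths per year: {attributable_deaths:,.0f}")
```

With these inputs the result falls in the range of several hundred thousand deaths per year, but it moves considerably with the assumed prevalence and with the assumption that the risk of a fatal accident rises as much as the risk of an accident with injury.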

And what about the other way around? Shouldn’t Toxo positives enjoy some advantages; for example, if they cause a traffic accident, shouldn’t their Toxo positivity prove a mitigating circumstance? I think not. One should always adapt his behavior, including how he drives, to his current ability. The problem is that in Rh-negative people, the worsening of reaction time due to toxoplasmosis can be very sudden, so the driver may not find out until the accident itself. Maybe I’m re-inventing the light bulb, but I think a good solution would be to install in every car, aside from seatbelts, a system to measure simple reaction time. The driver could only start the car if, for example, he pressed a button within 300 milliseconds of an auditory signal or a light in the dashboard turning on. The system could record the measured reaction times, so the driver could be immediately alerted if his psychomotor abilities are deteriorating over a long period of time, or if he is suspiciously indisposed just that day. In comparison with similar existing systems that monitor the presence of alcohol in one’s breath, this system would be significantly cheaper, and I think even more useful.

The possibility of using Toxo positivity as a defense has apparently already occurred to some clever lawyers. Very soon after we published our results, I received a letter from a lawyer in Germany, asking for additional details regarding Toxo’s effect. It seems to me that he was trying to dig out one of his clients who may have caused a traffic accident. In any case, the knowledge of whether a person is or isn’t Toxo positive could be important in the insurance industry. On one hand, insurers could give benefits to their uninfected clients, for apparently successfully avoiding possible sources of infection; on the other hand, the insured could select an appropriate insurance policy according to whether or not he’s Toxo positive, and thus whether or not he has a heightened risk of traffic accidents. If I know that as a Toxo positive I statistically have a 2.65 times higher probability of a crash than an uninfected person, I will most likely consider better, though maybe more expensive, insurance.

Years ago I tried to patent such a use for our findings, but my first attempt, as well as further appeals, was unsuccessful. I’m not sure whether the decision of the patent office was justified. It is clear that usage of the patent, especially by insurance companies, could be seen as ethically controversial. Nevertheless, I regret neither my unsuccessful efforts nor the expenses I invested. At least, after dealing with the good woman of the patent office, I understand why the Czech Republic is so far behind the rest of the world in number of awarded patents.
