BISHOPS OF THE HOLY CHURCH OF EMBODIED COGNITION AND EDITORS OF THE PROCEEDINGS OF THE NATIONAL ACADEMY OF CHRIST
Posted by Andrew on 27 November 2020, 9:23 am
Paul Alper points to a recent New York Times article about astrology as a sign that the world is going to hell in a handbasket.
My reply:
Astrology don’t bug me so much cos it doesn’t pretend to be science. I’m more bothered by PNAS-style fake science because it pretends to be real science. Same thing with religion. I don’t get so worked up about Biblical fundamentalists. If someone wants to believe that someone parted the Red Sea, whatever. Similarly, if Prof. Susan Fiske and Prof. Robert Sternberg were Rev. Susan Fiske and Rev. Robert Sternberg, bishops of the Holy Church of Embodied Cognition and editors of the Proceedings of the National Academy of Christ, I’d be less annoyed, because their experiments would be reported in the religion section of the paper, not the science section.

Alper adds:
> If you post on this, be sure to include a reference to truffle French fries, an item not normally found in the upper midwest. My guess is the stars will be aligned and many comments will be forthcoming whether or not Venus is retrograding.

We’ll see.
Filed under Zombies. 14 Comments
A VERY SHORT STATISTICAL CONSULTING STORY
Posted by Andrew on 26 November 2020, 9:40 am
I received the following email:

> Professor Gelman,
>
> My firm represents ** (Defendant) in a case pending in the U.S. District Court for the District of **. This case concerns .
>
> I’ve reviewed your background and think that your research and interests, in particular your statistical background, may offer a valuable perspective in this matter.
>
> I’ve attached a report drafted by Plaintiffs’ expert, **. The Plaintiffs have submitted this report in support of their Motion for Preliminary Injunction. Our response to the Plaintiffs’ Motion is due on **. This is the same date by which we would need to submit any rebuttal expert reports.
>
> Do you have a few moments when I could discuss this case in further detail with you? If so, please let me know when I could give you a call and the best number to reach you.
>
> Thank you,
> **
I replied:
> Hi—I took a look at **’s report and it looks pretty good. So I don’t know that you should be contesting it. He seems to have done a solid analysis.
>
> Andrew
That was an easy consulting job—actually, not a job at all, as I declined the opportunity to take this one on. People send me so many bad analyses to look at; it’s refreshing when they send me solid work for a change.
P.S. This all happened a year ago and appeared just now because of a combination of usual blog delay and bumping due to one of our coronavirus posts this spring.

Filed under Political Science. 9 Comments
2 PHD STUDENT POSITIONS ON BAYESIAN WORKFLOW! WITH PAUL BÜRKNER!
Posted by Andrew on 26 November 2020, 3:00 am
Paul Bürkner writes:

> The newly established work group for Bayesian Statistics of Dr. Paul-Christian Bürkner at the Cluster of Excellence SimTech, University of Stuttgart (Germany), is looking for 2 PhD students to work on Bayesian workflow and Stan-related topics. The positions are fully funded for at least 3 years and people with a Master’s degree in any quantitative field can apply. All details on the two positions can be found at
>
> https://www.simtech.uni-stuttgart.de/documents/Jobs/Doctoral-Researcher-Position_IJP4BMM_SimTech.pdf
>
> and
>
> https://www.simtech.uni-stuttgart.de/documents/Jobs/Doctoral-Researcher-Position_ML4BMB_SimTech.pdf

This sounds great! Some of our ideas on workflow are here.
Filed under Bayesian Statistics, Jobs, Stan, Statistical computing. Comment
IS CAUSALITY AS EXPLICIT IN FAKE DATA SIMULATION AS IT SHOULD BE?
Posted by Keith O’Rourke on 25 November 2020, 3:00 pm

Sander Greenland recently published a paper with a very clear and thoughtful exposition on why causality, logic and context need full consideration in any statistical analysis, even strictly descriptive or predictive analysis. For instance, in the concluding section: “Statistical science (as opposed to mathematical statistics) involves far more than data – it requires realistic causal models for the generation of that data and the deduction of their empirical consequences. Evaluating the realism of those models in turn requires immersion in the subject matter (context) under study.”

Now, when I was reading the paper I started to think how these three ingredients are or should be included in most or all fake data simulation. Whether one is simulating fake data for a randomized experiment or a non-randomized comparative study, the simulations need to adequately represent the likely underlying realities of the actual study. One only has to add “simulation” to this excerpt from the paper: “[Simulation] must deal with causation if it is to represent adequately the underlying reality of how we came to observe what was seen – that is, the causal network leading to the data.”

For instance, it is obvious that sex is determined before treatment assignment or selection (and should be so in the simulations), but some features may not be so obvious. Once someone offered me a proof that the simulated censored survival times they generated, where the censoring time was set before the survival time (or some weird variation on that), would meet the definition of non-informative censoring. Perhaps there was a flaw in the proof, but the assessed properties of the repeated trials we wanted to understand were noticeably different than when survival times were generated first, and censoring times were then generated and applied. In that way, simulations likely better reflect the underlying reality as we understand it, and others (including future selves) are more likely to raise criticisms about this.

So I then worried about how clear I had been in my seminars and talks on using fake data simulation to better understand statistical inference, both frequentist and Bayes. At first, I thought I had been, but on further thought I am not so sure. One possibly misleading footnote I gave on the bootstrap and cross-validation likely needs revision, as it did not reflect causation at all.

Continue reading ‘Is causality as explicit in fake data simulation as it should be?’ »
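To make the distinction about simulation order concrete, here is a minimal sketch in R. The exponential distributions and rates are made up purely for illustration, and this is not the example from the anecdote above, just the general shape of the issue:

# Scheme 1: survival times generated first, censoring times generated
# independently and applied afterwards (non-informative censoring).
set.seed(123)
n <- 1e5
t_surv <- rexp(n, rate = 0.10)
t_cens <- rexp(n, rate = 0.05)
y1 <- pmin(t_surv, t_cens)               # observed follow-up time
d1 <- as.numeric(t_surv <= t_cens)       # 1 = event observed, 0 = censored

# Scheme 2: censoring time set first, survival time then generated in a
# way that depends on it, so informative censoring is built into the
# data generation even though the output looks like ordinary censored data.
t_cens2 <- rexp(n, rate = 0.05)
t_surv2 <- rexp(n, rate = 0.10) + 0.5 * t_cens2
y2 <- pmin(t_surv2, t_cens2)
d2 <- as.numeric(t_surv2 <= t_cens2)

# The assessed repeated-trial properties differ between the two schemes:
c(event_rate_1 = mean(d1), event_rate_2 = mean(d2))
c(mean_followup_1 = mean(y1), mean_followup_2 = mean(y2))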
Filed under Miscellaneous Science, Miscellaneous Statistics, Teaching. 22 Comments
A NEW HOT HAND PARADOX
Posted by Andrew on 25 November 2020, 9:28 am
1. Effect sizes of just about everything are overestimated. Selection on statistical significance, motivation to find big effects to support favorite theories, researcher degrees of freedom, looking under the lamp-post, and various other biases. The Edlin factor is usually less than 1. (See here for a recent example.)

2. For the hot hand, it’s the opposite. Correlations between successive shots are low, but, along with Josh Miller and just about everybody else who’s played sports, I think the real effect is large.

How to reconcile 1 and 2? The answer has little to do with the conditional probability paradox that Miller and Sanjurjo discovered, and everything to do with measurement error. Here’s how it goes. Suppose you are “hot” half the time and “cold” half the time, with Pr(success) equal to 0.6 in your hot spells and 0.4 in your cold spells. Then the probability of two successive shots having the same result is 0.6^2 + 0.4^2 = 0.52. So if you define the hot hand as the probability of success conditional on a previous success, minus the probability of success conditional on a previous failure, you’ll think the effect is only 0.04, even though in this simple model the true effect is 0.20. This is known as attenuation bias in statistics and econometrics and is a well-known effect of conditioning on a background variable that is measured with error. The attenuation bias is particularly large here because a binary outcome is about the noisiest thing there is. This application of attenuation bias to the hot hand is not new (it’s in some of the hot hand literature that predates Miller and Sanjurjo, and they cite it); I’m focusing on it here because of its relevance to effect sizes.
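Here is a quick simulation in R of the toy model above (two equally likely latent states with success probabilities 0.6 and 0.4, shots independent within a state), confirming the 0.04 versus 0.20 arithmetic:

# Toy model: "hot" (p = 0.6) half the time, "cold" (p = 0.4) half the time.
set.seed(1)
n_spells <- 1e5
hot <- rbinom(n_spells, 1, 0.5)           # latent state for each pair of shots
p <- ifelse(hot == 1, 0.6, 0.4)
shot1 <- rbinom(n_spells, 1, p)
shot2 <- rbinom(n_spells, 1, p)

# Estimated "hot hand" via serial correlation:
# Pr(success | previous success) - Pr(success | previous failure)
mean(shot2[shot1 == 1]) - mean(shot2[shot1 == 0])   # approx 0.04 (attenuated)

# True effect, conditioning on the latent state instead of the noisy shot:
mean(shot2[hot == 1]) - mean(shot2[hot == 0])       # approx 0.20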
So one message here is that it’s a mistake to _define_ the hot hand in terms of serial correlation (so I disagree with Uri Simonsohn here).
Fundamentally, the hot hand hypothesis is that sometimes you’re hot and sometimes you’re not, and that this difference corresponds to some real aspect of your ability (i.e., you’re not just retroactively declaring yourself “hot” just because you made a shot). Serial correlation can be an effect of the hot hand, but it would be a mistake to define serial correlation _as_ the hot hand. One thing that’s often left open in hot hand discussions is to what extent the “hot hand” represents a latent state (sometimes you’re hot and sometimes you’re not, with this state unaffected by your shot) and to what extent it’s causal (you make a shot, or more generally you are playing well, and this temporarily increases your ability, whether because of better confidence or muscle memory or whatever). I guess it’s both things; that’s what Miller and Sanjurjo say too.
Also, remember our discussion from a couple years ago:
> The null model is that each player j has a probability p_j of making a given shot, and that p_j is constant for the player (considering only shots of some particular difficulty level). But where does p_j come from? Obviously players improve with practice, with game experience, with coaching, etc. So p_j isn’t really a constant. But if “p” varies among players, and “p” varies over the time scale of years or months for individual players, why shouldn’t “p” vary over shorter time scales too? In what sense is “constant probability” a sensible null model at all?
>
> I can see that “constant probability for any given player during a one-year period” is a better model than “p varies wildly from 0.2 to 0.8 for any player during the game.” But that’s a different story.

Ability varies during a game, during a season, and during a career. So it seems strange to think of constant p_j as a reasonable model.

OK, fine. The hot hand exists, and estimates based on correlations will dramatically underestimate it because of attenuation bias. But then, what about point 1 above, that the psychology and economics research literature (not about the hot hand; I’m talking here about applied estimates of causal effects more generally) typically overestimates effect sizes, sometimes by a huge amount. How is the hot hand problem different from all other problems? In all other problems, published estimates are overestimates. But in this problem, the published estimates are too small. Attenuation bias happens in other problems, no?

Indeed, I suspect that one reason econometricians have been so slow to recognize the importance of type M errors and the Edlin factor is that they’ve been taught about attenuation bias and they’ve been trained to believe that noisy estimates are too low. From econometrics training, it’s natural to believe that your published estimates are “if anything, too conservative.”

The difference, I think, is that in most problems of policy analysis and causal inference, the parameter to be estimated is clearly defined, or can be clearly defined. In the hot hand, we’re trying to estimate something latent. To put it another way, suppose the “true” hot hand effect really is a large 0.2, with your probability going from 40% to 60% when you go from cold to hot. There’s not so much that can be done with this in practice, given that you never really know your hot or cold state. So a large underlying hot hand effect would not necessarily be accessible. That doesn’t mean the hot hand is unimportant, just that it’s elusive. Concentration, flow, etc., these definitely seem real. It’s the difference between estimating a particular treatment effect (which is likely to be small) and an entire underlying phenomenon (which can be huge).

Filed under Sports, Zombies. 34 Comments
FURTHER FORMALIZATION OF THE “MULTIVERSE” IDEA IN STATISTICAL MODELING
Posted by Andrew on 24 November 2020, 9:36 am
Cristobal Young and Sheridan Stewart write:
> Social scientists face a dual problem of model uncertainty and methodological abundance. . . . This ‘uncertainty among abundance’ offers spiraling opportunities to discover a statistically significant result. The problem is acute when models with significant results are published, while those with non-significant results go unmentioned. Multiverse analysis addresses this by recognizing ‘many worlds’ of modeling assumptions, using computational tools to show the full set of plausible estimates. . . . Our empirical cases examine racial disparity in mortgage lending, the role of education in voting for Donald Trump, and the effect of unemployment on subjective wellbeing. Estimating over 4,300 unique model specifications, we find that OLS, logit, and probit are close substitutes, but matching is much more unstable. . . .

My quick thought is that the multiverse is more conceptual than precise. Or, to put it another way, I don’t think the multiverse can ever really be defined. For example, in our multiverse paper we considered 168 possible analyses, but there were many other researcher degrees of freedom that we did not even consider. One guiding idea we had in defining the multiverse for any particular analysis was to consider other papers in the same subfield. Quite often, if you look at different papers in a subfield, or different papers by a single author, or even different studies in a single paper, you’ll see alternative analytical choices. So these represent a sort of minimal multiverse. This has some similarities to research in diplomatic history, where historians use documentary evidence to consider what alternative courses of action might have been considered by policymakers.

Also, regarding that last bit above on matching estimation, let me emphasize, following Rubin (1970), that it’s not matching or regression, it’s matching and regression (see also here).
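For readers who want to see the mechanics, here is a minimal sketch in R of a multiverse-style loop. The data, variable names, and forking choices are all hypothetical, not those of Young and Stewart’s 4,300-specification analysis; the point is just the shape of the computation:

# Enumerate every combination of analysis decisions, fit a model for each,
# and report the full set of estimates rather than one cherry-picked result.
set.seed(1)
d <- data.frame(
  y = rnorm(500),
  x = rnorm(500),
  z = rnorm(500),
  group = sample(letters[1:4], 500, replace = TRUE)
)

specs <- expand.grid(
  adjust_z      = c(TRUE, FALSE),   # include covariate z?
  group_fe      = c(TRUE, FALSE),   # include group fixed effects?
  trim_outliers = c(TRUE, FALSE)    # drop |y| > 2.5 sd?
)

estimates <- sapply(seq_len(nrow(specs)), function(i) {
  s <- specs[i, ]
  dd <- if (s$trim_outliers) subset(d, abs(y) < 2.5 * sd(y)) else d
  rhs <- c("x", if (s$adjust_z) "z", if (s$group_fe) "group")
  fit <- lm(reformulate(rhs, response = "y"), data = dd)
  coef(fit)[["x"]]                  # the coefficient we care about
})

cbind(specs, estimate = round(estimates, 3))   # the whole (tiny) multiverse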
Filed under Miscellaneous Statistics, Multilevel Modeling. 8 Comments
GREEK STATISTICIAN IS IN TROUBLE FOR . . . TELLING THE TRUTH!
Posted by Andrew on 23 November 2020, 9:10 am
Paul Alper points us to this news article by Catherine Rampell, which tells this story:

> Georgiou is not a mobster. He’s not a hit man or a spy. He’s a statistician. And the sin at the heart of his supposed crimes was publishing correct budget numbers.
>
> The government has brought a relentless series of criminal prosecutions against him. His countrymen have sought their own vengeance by hacking his emails, dragging him into court, even threatening his life. His lawyers in Greece are now preparing for his latest trial, which begins this month . . .
>
> Politicians accused him of being a “Trojan horse” for international interests that wanted to place Greece under “foreign occupation.” It didn’t matter that his numbers were repeatedly validated by outside experts. Or that the deficit his agency calculated precisely matched the net amount Greece borrowed from capital markets in 2009.
>
> The government prosecuted, cleared and re-prosecuted him anyway, for causing “extraordinary damage” to the Greek state and for “violation of duty.” In one case, he was given a suspended prison sentence of two years. Two criminal investigations remain open.
I’m reminded of this story, The Commissar for Traffic Presents the Latest Five-Year Plan. There sometimes seem to be incentives to give inaccurate forecasts that tell people what they want to hear.
Getting back to the Greek story, Alper writes:

> Consider yourself personally lucky—Wansink, Brooks, Bem, etc.—that you don’t live in Greece because:
>
>> In layman’s terms, a court said he made statements that were true but that hurt someone’s reputation. (Yes, this is an actual crime in Greece.) If his appeal fails, he’ll be forced to pay and publicly apologize to his predecessor. This means the person who restored the credibility of Greek statistics will have to apologize to a person who had been fudging the data.

Wow. I guess whistleblowers have it hard there too.

Filed under Miscellaneous Statistics, Political Science. 18 Comments
THE 200-YEAR-OLD MENTOR
Posted by Andrew on 22 November 2020, 9:21 am
Carl Reiner died just this year and Mel Brooks is, amazingly, still alive. But in any case their torch will be carried forward, as long as there are social scientists who are not in full control of their data. The background is the much-discussed paper, “The association between early career informal mentorship in academic collaborations and junior author performance.” Dan Weeks decided to look into the data from this study. He reports:
> I think there are a number of problematic aspects with the data used in this paper.
>
> See Section 13 ‘Summary’ of https://danieleweeks.github.io/Mentorship/#summary
>
> How can one have a set of mentors with an average age > 200? How can one have 91 mentors?
>
> Always always graph your data!

Now whenever people discuss mentoring, I’m gonna hear that scratchy Mel Brooks voice in the back of my head.
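In the spirit of that advice, here is the kind of two-minute check in R that flags this sort of problem immediately. The data frame and its columns are made up for illustration; the real check would run on the paper’s mentor table:

# Hypothetical mentor table with a few impossible values planted in it.
mentors <- data.frame(
  id        = 1:1000,
  age       = c(rnorm(995, mean = 45, sd = 10), rep(220, 5)),  # impossible ages
  n_mentors = c(rpois(995, 2) + 1, rep(91, 5))                 # impossible counts
)

hist(mentors$age, breaks = 50, main = "Mentor ages")  # the 200+ cluster jumps out
summary(mentors$n_mentors)                            # a max of 91 is a red flag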
Filed under Zombies. 12 Comments
BEST COMICS OF 2010-2019?
Posted by Andrew on 21 November 2020, 9:41 am
X linked to this list by Sam Thielman of the best comics of the decade. The praise is a bit over the top (“brimming with wit and pathos” . . . “Every page in Ferris’s enormous debut is a wonder” . . . “An astounding feat of craftsmanship and patience” . . . “never has an artist created a world so vivid without a single word spoken” etc.), but that’s been the style in pop-music criticism for a few decades, so I’m not surprised to see it in other pop-cultural criticism as well: the critic is juicing up the positivity because he’s promoting the entire genre.

It’s interesting how different these are than Franco-Belgian BD’s. Lately I’ve been continuing to read Emile Bravo and Riad Sattouf, among others. U.S. comics are like indie movies, Franco-Belgian BD’s are like Hollywood productions. Even the BD’s written and drawn by a single person have certain production values, in contrast to the DIY attitude from independent comics in English.

Filed under Art, Literature. 1 Comment
TODAY IN SPAM
Posted by Andrew on 20 November 2020, 9:36 am
1. From “William Jessup,” subject line “Invitation: Would you like to join GlobalWonks?”:

> Dear Richard,
>
> I wanted to follow up one last time about my invitation to join our expert-network.
>
> We are happy to compensate you for up to $900 per hour for our client engagements. If you would like to join us, you may do so by signing up here.
>
> If you already signed up, please ignore this email.

Hey, for $900/hour, you can call me Richard, no problem. Whatever you say, William! I’ve kept in the link above in case any Richards in our readership would like to get in on this sweet, sweet deal. Just click and join; I’m sure the $900 checks will start rolling in.

2. From “Christina,” subject line “Re: Regarding Andrew Gelman’s Book”:

> Dear Dr. Andrew Gelman,
>
> I am Christina Batchelor, Editorial assistant from Index of Sciences Ltd. contacting you with the reference from our editorial department. Basing on your outstanding contribution to the scientific community, we would like to write a book for you.
>
> Many Researchers like you wanted to write and publish a book to show their scientific achievements. But only a few researchers have published their books and yet there are researchers who still have the thought of writing a book and publishing it, but due to their busy schedule, they never get the time to write the book by themselves and publish it.
>
> If you are one of those researchers who are very busy but still want to write a book and publish it? we can help you with the writing and publishing of your book.
>
> With our book writing service, we can convert your research contributions or papers into common man’s language and draft it like a book. . . .

Dear Christina:

If you really want to hook me for this sort of scam, try calling me Richard. That’ll get my attention. Also, if this Index of Sciences Ltd. thing ever stops working out, you should look around for other opportunities. Maybe Wolfram Research is hiring?

Filed under Economics, Zombies. 12 Comments
ARE FEMALE SCIENTISTS WORSE MENTORS? THIS STUDY PRETENDS TO KNOW
Posted by Jessica Hullman on 19 November 2020, 5:54 pm

A new paper in Nature Communications, The association between early career informal mentorship in academic collaborations and junior author performance, by AlShebli, Makovi, and Rahwan, caught my attention. There are a number of issues, but what bothered me the most is the post-hoc speculation about what might be driving the associations. Here’s the abstract:

> _We study mentorship in scientific collaborations, where a junior scientist is supported by potentially multiple senior collaborators, without them necessarily having formal supervisory roles. We identify 3 million mentor–protégé pairs and survey a random sample, verifying that their relationship involved some form of mentorship. We find that mentorship quality predicts the scientific impact of the papers written by protégés post mentorship without their mentors. We also find that increasing the proportion of female mentors is associated not only with a reduction in post-mentorship impact of female protégés, but also a reduction in the gain of female mentors. While current diversity policies encourage same-gender mentorships to retain women in academia, our findings raise the possibility that opposite-gender mentorship may actually increase the impact of women who pursue a scientific career. These findings add a new perspective to the policy debate on how to best elevate the status of women in science._

To find these mentor-protégé pairs, they first do gender disambiguation on names in their dataset of 222 million papers from the Microsoft Academic Graph, then define a junior scholar as anyone within 7 years of their first publication in the set, and a senior scholar as anyone past 7 years. They argue that this looser definition of mentorship, as anyone that a junior person published with who had passed the senior mark at the time, is okay because a lot of the time there is informal mentorship from those other than one’s advisor, in the form of somehow helping or giving advice, and one could interpret the co-authorship of the paper itself as helping. It seems a little silly that after saying this they present results of a survey sample of 167 authors to argue that their assumption is good. But beyond the potential for dichotomizing experience to introduce researcher degrees of freedom, I don’t really have a problem with these assumptions.

To analyze the data, they define two measures of mentor quality as independent variables. First the “big shot” measure, which is the average impact of the mentors prior to mentorship, operationalized as “their average number of citations per annum up to the year of their first publication with the protégé.” Then the hub experience, defined as the average degree of the mentors in the network of scientific collaborations up to the year of their first publication with the protégé.

They measure mentorship outcome, conceptualized as “the scientific impact of the protégé during their senior years without their mentors,” by calculating the average number of citations accumulated 5 years post publication of all the papers published when the academic age of the protégé was greater than 7 years which included none of the scientists who were identified as their mentors.

I have some slight issues with their introduction of terminology like mentorship quality here. Should we really call a citation-based measure of impact mentorship quality? Yes, it’s easy to remember what they are trying to get at when they call average citations per year “big shot” experience, but at the same time, gender is known to have a robust effect on citations. So defining mentorship quality based on average citations per year essentially bakes gender bias into the definition of quality – I would expect women to have lower big shot scores and lower mentorship outcomes on average based on their definitions. But whatever, this is mostly annoying labeling at this point.
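To make the operationalization concrete, here is a rough sketch in R of how a “big shot” score of this sort could be computed. The tables and column names are hypothetical, and this is my reading of the definition, not the authors’ code:

library(dplyr)

# Hypothetical tables standing in for the Microsoft Academic Graph extract.
citations <- data.frame(          # one row per (author, year)
  author = c("m1", "m1", "m1", "m2", "m2"),
  year   = c(2001, 2002, 2003, 2002, 2003),
  cites  = c(10, 30, 50, 5, 15)
)
pairs <- data.frame(              # one row per mentor-protégé pair
  mentor = c("m1", "m2"),
  protege = c("p1", "p1"),
  first_joint_year = c(2003, 2003)
)

big_shot <- pairs %>%
  left_join(citations, by = c("mentor" = "author")) %>%
  filter(year <= first_joint_year) %>%    # citations up to the first joint paper
  group_by(protege, mentor) %>%
  summarise(cites_per_annum = sum(cites) / n_distinct(year), .groups = "drop") %>%
  group_by(protege) %>%                   # then average over the protégé's mentors
  summarise(big_shot = mean(cites_per_annum))

big_shot   # one "big shot" score per protégé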
They then do ‘coarsened exact matching’, matching groups of protégés who received a certain level of mentorship quality with another group with lower mentorship quality but comparable in terms of other characteristics like the number of mentors, year they first published, discipline, gender, rank of affiliation on their first mentored publication, number of years active post mentorship, average academic age of their mentors, and hub experience or big shot experience, whichever one they are not analyzing at the time. To motivate this, they say “While this technique does not establish the existence of a causal effect, it is commonly used to infer causality from observational data.” Um, what?

They compare quintiles separately for big shot and hub experience, where treatment and control are the Q(i+1)th and Q(i)th quintiles. They do a bunch of significance tests, finding “an increase in big-shot experience is significantly associated with an increase in the post-mentorship impact of protégés by up to 35%. Similarly, the hub experience is associated with an increase in the post-mentorship impact of protégés, although the increase never exceeds 13%”. They conclude there’s a stronger association between mentorship outcome and big-shot experience than with hub experience, since changes to big shot experience have more impact given their quintile comparison approach.
Their main takeaways are about gender though, which involves matching sets of protégés where everything is comparable except for the number of female mentors. They present some heatmaps, one for male protégés and one for female protégés, where given a certain number of mentors, one can see how increasing the proportion of female mentors generally decreases the protégés’ outcomes (recall that’s citations on papers with none of the mentors once the protégé reaches senior status). Many “*” for the significance tests. Graph b is more red overall, implying that the association between having more female mentors and having less citations is weaker for males.

They also look at what mentoring a particular protégé does for the mentor, captured by the average impact (citations 5 years post publication) of the papers the mentor and protégé co-authored together during the mentorship period. They match male and female protégés on discipline, affiliation rank, number of mentors, and the year in which they published their first mentored paper, then compare separately the gains from male versus female protégés for male and female mentors. The downward extending bar chart shows that mentors of both genders see less citations for papers with female protégés, and would seem to suggest there’s a bigger difference between the citations a female mentor gets for papers with a female versus male protégé than that which a male mentor gets for papers with female versus male protégés.

These associations are kind of interesting. The supplemental material includes a bunch of versions of the charts broken down by discipline and where the authors vary their definitions of senior versus junior and of impact, by way of arguing that the patterns are robust. Based purely on my own experience, I can buy that there’s less payoff in terms of citations from co-authoring with females; I’ve come to generally expect that my papers with males, whether they are my PhD students or collaborators, will get more citations. But to what extent are these associations redundant with known gender effects in citations? Could, for example, the fact that someone had a female mentor mean they are more likely to collaborate later in their career with females who, according to past studies, tend to receive less citations on papers where they are in prominent author positions? The measures here are noisy, making it hard to ascertain what might be driving them more specifically. However, that doesn’t stop the authors from speculating what might be going on here:
> _Our study … suggests that female protégés who remain in academia reap more benefits when mentored by males rather than equally-impactful females. The specific drivers underlying this empirical fact could be multifold, such as female mentors serving on more committees, thereby reducing the time they are able to invest in their protégés, or women taking on less recognized topics that their protégés emulate, but these potential drivers are out of the scope of current study._

Seems like the authors are exercising their permission to draw some causal inferences here, because, hey, as they implied above, everybody else is doing it. Serving on more committees seems like grasping for straws – I have no reason to believe that women don’t get asked to do more service, but it seems implausible that inequity in time spent on service could be extreme enough to affect the citation counts of their mentees years later, given all the variation in a dataset like this. The possibility of “women taking on less recognized topics” seems less implausible (see for instance this linguistic analysis of nearly all US PhD recipients and their dissertations across three decades). Though I’d prefer to be spared these speculations.

> _Our findings also suggest that mentors benefit more when working with male protégés rather than working with comparable female protégés, especially if the mentor is female. These conclusions are all deduced from careful comparisons between protégés who published their first mentored paper in the same discipline, in the same cohort, and at the very same institution. Having said that, it should be noted that there are societal aspects that are not captured by our observational data, and the specific mechanisms behind these findings are yet to be uncovered. One potential explanation could be that, historically, male scientists had enjoyed more privileges and access to resources than their female counterparts, and thus were able to provide more support to their protégés. Alternatively, these findings may be attributed to sorting mechanisms within programs based on the quality of protégés and the gender of mentors._

So, again we jump to the conclusion that because there are associations with lower citations and working with female mentors or protégés, women must be doing a worse job somehow? What set of reviewers felt comfortable with these sudden jumps to causal inference? The dataset used here has some value, and the associations are interesting as an exploratory analysis, but seriously, I would expect more of undergrads or masters students I teach data science to. I’m with Sander Greenland here on the fact that what science often needs most from a study is its data, not for the authors to naively expound on the implications.

Filed under Miscellaneous Science. 62 Comments
MISTER P FOR THE 2020 PRESIDENTIAL ELECTION IN BELARUS
Posted by Andrew on 19 November 2020, 9:59 am
An anonymous group of authors writes:
> Political situation
>
> Belarus is often called the “last dictatorship” in Europe. Rightly so: Aliaksandr Lukashenka has served as the country’s president since 1994. In the 26 years of his rule, Lukashenka has consolidated and extended his power, which is today absolute. Rigging referendums has been an effective means of consolidating power. His re-elections have been no better — he has claimed about 80% of the vote in all of them, while none of them has been acknowledged by the international community as free and fair. Lukashenka’s dictatorial rule seemed unshakeable a mere half a year ago. Right now, all of this is history as Lukashenka is scrambling to prop up his regime under the stress of 100,000- to 250,000-strong protest rallies every weekend since August 9th. So what happened? In this post, we are discussing a preprint that we wrote under the pseudonym of Ales Zahorski to analyze the actual support levels for Lukashenka coming into the presidential election on August 9th, 2020.
>
> The 2020 presidential campaign proved to be unique for Belarus in many ways: The nonchalant approach of President Aliaksandr Lukashenka to the Covid-19 pandemic caused voluntary civil engagement in countering the threat of Covid-19. In turn, this led to increased political activity, providing fertile soil for the emergence of new political leaders, some of whom became presidential contenders. These new political leaders did not come from the conventional opposition and they had no obvious orientation towards Russia or the West. In addition, the new opposition leaders came from different backgrounds and had experience from a wide variety of professional fields. Thus they appealed to a much broader audience than their earlier counterparts.
>
> Lukashenka, however, eliminated the three strongest candidates from the presidential race. To his dismay, the teams of those candidates united around Sviatlana Tsikhanouskaya, the wife of Siarhei Tsikhanouski, who was the third most popular candidate according to Internet surveys (see Table 1). She registered as a stand-in for her husband after his arrest. The Central Electoral Committee (CEC), a puppet body meant to oversee elections, allowed her to enter the race, probably because Lukashenka did not consider her a real threat. Otherwise, CEC registered the three representatives from the conventional opposition: Siarhei Cherachen, Andrei Dmitriyeu and Hanna Kanapatskaya. None of them had any visible support in the population according to the media polls (see Table 1). From the early stages of the 2020 presidential campaign, it was clear that the fairness of the election would be in question. Independent candidates were barred from entering local election committees, which hinted at planned ballot stuffing. The Belarusian Ministry of Foreign Affairs did not invite any credible international observers.
>
> Sociology on political topics is banned
>
> Since independent sociology and independent surveys are banned in Belarus, we had to be inventive in order to obtain data on the popularity of each presidential candidate. There are some online polls performed by the media (which were as of June 1, 2020 also forbidden), but these can not be trusted as they lack sound scientific rigour.
>
> The absence of independent polling institutes and extremely contradictory results coming from different sources provided the impetus for the current study. The results of media polls are summarized in Table 1, while the Ecoom (a company hired by Belarusian authorities) polls are presented in Table 2. As one can see, these polls contradict each other. Thus, we came up with an initiative to carry out a national poll, and based on these data, we used the multilevel regression with poststratification (MRP) methodology to estimate the popularity of each candidate. With this study, it was our sincere aim to provide a politically unbiased account of what the presidential election results in a counterfactual world – a Belarus with free and fair elections – would likely have been.
>
> Data
>
> We employed two different methods for polling: (1) an online poll using Viber – the most popular messenger application in Belarus; and (2) a street poll taking place at different locations across the country. The questionnaires contained questions about what candidate the respondents intended to vote for, as well as questions about socio-economic and demographic status of the participants including age, gender, education level, region of residence and type of area of residence that correspond to the national census data. The latter allowed us to employ poststratification. We further added questions of common research interest. There were two additional questions in both the Viber and the street surveys about the family’s total monthly income and whether the respondent was willing to participate in early voting. The invitation to participate in the Viber poll was advertised in various communities on social media and was also sent via SMS to random Belarusian phone numbers (see details in the paper). As a result, we obtained around 45,000 answers. After disregarding answers from persons younger than 18 years old, people without Belarusian citizenship, and responses from phone numbers outside of Belarus (in the clean-up), 32,108 answers were kept. For the street poll, we aimed at collecting at least 500 responses to cover all possible categories of citizens with respect to gender, age, region, and type of area of residence. We used the official annual report for 2019 from Belstat (National Statistical Committee of the Republic of Belarus) to calculate the representative size of the statistical group for each category surveyed. As a result, we collected 1124 responses, providing a decent representativeness of the Belarusian population as compared to the official Belstat census data. Demographic biases in the collected samples against the official 2009 census and 2019 annual report are presented in Figure 1.
>
> After preprocessing the data from the Viber and street polls, we joined the two samples as follows: The filtered Viber sample was randomly divided into two parts consisting of 50% of the data each. One of these parts was kept as a holdout set for testing the predictive uncertainty handling of our MRP model, whilst the other one was merged with the street sample into a training set, where the street data was uniformly upsampled to the size of 50% of the whole Viber sample. By means of doing this kind of preprocessing we equalize the importance of the street and Viber data in the training set, whilst keeping approximately the same amount of information as in the Viber poll data.
>
> The scripts used for merging the data are implemented in R as part of the statistical modelling pipeline and are also freely available on the GitHub page of the project.
> MRP
>
> In short, the methodology we employ involves building a statistical model that attempts to atone for the fact that our survey respondents are not representative of the population as a whole. By properly weighting the predictions of our multilevel regression model, we generalise from the sample to the entire population. The procedure is called multilevel regression with poststratification (MRP). The inference was performed in INLA.
>
> We also adopted several recently published advancements from Gao et al. to improve MRP. In particular, the random effects corresponding to the ordinal categorical predictors (age and education) are assumed to have a latent AR1 structure between the categories, whilst other factors as well as the intercept term have an i.i.d. latent structure. Additionally, a latent Gaussian Besag-York-Mollié (BYM2) field is included in the model in order to account for the spatial dependence of the probabilities between the regions and the variance which is neither explained by the covariates nor by the common latent factors included in the random intercept.
>
> We also employed model selection using criteria including WAIC and MLIK to compare the suggested model to the baselines. The baselines were models without a latent AR1 structure between the categories and additionally without BYM2. The model with both AR1 and BYM2 included was found optimal with respect to these criteria.
>
> Results
>
> We found that the results of the election announced by CEC and the results of the pro-governmental BRSM (BRSM here stands for Belarusian Republican Youth Union) poll strongly disagree with the estimated pre-election ratings of the candidates, whilst the results of the independent polls are much more consistent with our estimated ratings. In particular, we found that both the officially announced results of the election and the officially reported early voting rates are improbable according to the estimates we obtained from the merged Viber and street poll data.
>
> As shown in the following figure, both the officially announced results of the election and early voting rates are highly improbable. With a probability of at least 95%, Sviatlana Tsikhanouskaya’s rating lies between 75% and 80%, whereas Aliaksandr Lukashenka’s rating lies between 13% and 18%, and the early voting rate predicted by the method ranges from 9% to 13% of those who took part in the election. These results contradict the officially announced outcomes, which are 10.12%, 80.11%, and 49.54% respectively, and lie far outside even the 99.9% credible intervals predicted by our model. The ratings of other candidates and voting “Against all” are insignificant and correspond to the official results. The same conclusions are valid when comparing the pre-election ratings to the pro-governmental BRSM poll.
>
> As shown below, the only groups of people where the upper bounds of the 99.9% credible intervals of the rating of Lukashenka predicted by MRP are above 50% are people older than 60 and uneducated people. For all other subgroups, including rural residents, even the upper bounds of 99.9% credible intervals for Lukashenka are far below 50%. The same is true for the population as a whole. Thus, with a probability of at least 99.9%, as predicted by MRP, Lukashenka could not have had enough electoral support to win the 2020 presidential election in Belarus.
>
> Criticism and our responses
>
> Important assumptions that must hold for our conclusions to be valid are discussed by Daniel Simpson in his scientific blogpost:
>
> Assumption 1: The demographic composition of the population is known.
>
> Assumption 2: The people who did not answer the survey in subgroup j correspond to a random sample of subgroup j and to a random sample of the people who were asked.
>
> Regarding Assumption 1, we used precise survey data from the 2009 Belstat census. We had to assume, however, that the demographics of Belarus have not changed significantly since then. In the first figure presented in this blogpost, we show this to be true at least marginally for four groups of the addressed demographic variables (when compared to the 2019 annual report), but the data on the fifth group (education levels) from 2019 is not yet available. Assumption 1 will also get an additional check when the results of the 2019 census in Belarus are published. Then, we will have the possibility to restratify the results if some significant changes in the demographics appear.
>
> Regarding Assumption 2, we agree with Simpson that this sort of missing at random assumption is almost impossible to verify in practice. Simpson mentions that there are various things one can do to relax this assumption, but generally this is the assumption that we are making. This assumption is quite likely met for the street survey. Nevertheless, there is room for several sources of bias: (1) selection of respondents by interviewers – a tendency to select more approachable/friendly-looking people, although we gave explicit instructions to select random people; (2) response/refusal of respondents when approached (those in a hurry, those afraid to answer because of their pro-opposition views, possibly pro-governmental respondents who are not eager to answer due to their distrust in polls and other activities around the election); (3) item non-response, i.e. respondents not answering specific questions (some respondents did not want to report their income levels). The net effect of (1)-(3) is, of course, unknown.
>
> Validating Assumption 2 in the Viber poll is much more difficult. According to Simpson’s blogpost, one option is to assess how well the prediction works on some left-out data in each subgroup. This is useful because poststratification explicitly estimates the response in the unobserved population. This viewpoint suggests that our goal is not necessarily unbiasedness but rather a good prediction of the population. It also means that if we can accept a reasonable bias, we will get the benefit of much tighter credible bounds on the population quantity than the survey weights can give. Hence, we return to the famous bias versus variance trade-off. We have tried to approach this assumption from several perspectives. First of all, in the Viber poll, we used sampling of random phone numbers to invite respondents and advertised at different venues frequented by people with various demographic and political backgrounds. Secondly, in the attempt to obtain better results in the bias-variance trade-off sense and to assess predictive properties of the underlying Bayesian regression, we uniformly upsampled the street data to the size of 50% of the Viber data and randomly divided the Viber data into two halves: One half was merged with the upsampled street data to form the training sample. The other one was left as a hold-out set to test predictive uncertainty handling by the modified Brier score introduced in the paper. Here, we aimed at reducing the variance by possibly introducing some bias and at testing the predictive qualities of the model. Lastly, to assess and confirm our findings on the joint sample, we performed the same analysis based on MRP fitted on the street data only. This analysis is much more likely to have no violations of Assumption 2 above; however, the sample is significantly smaller, and in the sense of a bias-variance trade-off, we are likely to have a significantly increased variation in the posterior distributions of the focus parameters. At the same time, we can validate the results obtained by MRP on the joint sample. As a result we get the posterior quantiles of interest presented in Figure 4. In short, one can see that even though the level of uncertainty is significantly increased due to the reduced sample size, ultimately all of the conclusions are equivalent to those presented above for the MRP trained on the joint sample, though for some important conclusions the level of significance drops from 99.9% to 99% or 95%. Moreover, the 99.9%, 99%, 95%, and 90% credible intervals of the MRP trained on the joint sample are almost always inside the corresponding credible intervals obtained on the street data. This allows us to conclude that we have obtained a very reasonable bias-variance trade-off on the joint data, corroborating the conclusions we have drawn from the joint sample.
The full article is here. It contains some tables and graphs. I have not checked this analysis myself, and of course all conclusions depend on assumptions, but I like the general approach of adjusting survey data in this way, and even if this analysis has its imperfections it can be the starting point for further work and it can motivate similar studies in other countries.
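For readers who haven't seen the mechanics of MRP, here is a minimal sketch in R. It uses lme4 rather than the INLA model described above, with entirely made-up survey data, candidate support rates, and census counts; it shows only the regress-then-poststratify logic, not the authors' actual model:

library(lme4)

# Step 0: fake survey data (hypothetical; two demographic variables only).
set.seed(1)
survey <- data.frame(
  age    = sample(c("18-29", "30-44", "45-59", "60+"), 2000, replace = TRUE),
  region = sample(paste0("region", 1:6), 2000, replace = TRUE)
)
survey$y <- rbinom(2000, 1, ifelse(survey$age == "60+", 0.3, 0.7))  # support? 0/1

# Step 1: multilevel regression of support on demographic cells.
fit <- glmer(y ~ (1 | age) + (1 | region), data = survey, family = binomial)

# Step 2: predict support in every census cell...
cells <- expand.grid(age = c("18-29", "30-44", "45-59", "60+"),
                     region = paste0("region", 1:6))
cells$N <- rpois(nrow(cells), 100000)   # made-up census counts per cell
cells$p <- predict(fit, newdata = cells, type = "response",
                   allow.new.levels = TRUE)

# Step 3: ...and poststratify, weighting each cell by its population size.
sum(cells$p * cells$N) / sum(cells$N)   # population-level estimate of support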
Filed under Bayesian Statistics, Multilevel Modeling, Political Science, Stan. 6 Comments
IS VS. OUGHT IN THE STUDY OF PUBLIC OPINION: CORONAVIRUS “OPENING UP” EDITION
Posted by Andrew on 18 November 2020, 9:00 am
I came across this argument between two of my former co-bloggers which illustrates a general difficulty when thinking about political attitudes, which is confusion between two things: (a) public opinion, and (b) what we want public opinion to be. This is something I’ve been thinking about for many years, ever since our Red State Blue State project. Longtime blog readers might recall our criticism of political reporter Michael Barone, who told his readers that richer people voted for Democrats and poorer people voted for Republicans—even though the data showed the opposite. And Barone was a data guy! I coined a phrase, “second-order availability bias,” just to try to understand this way of thinking.

The latest example is the debate over how fast to open up the economy. A clear description of the confusion comes in this op-ed by Michelle Goldberg, who writes:
> Lately some commentators have suggested that the coronavirus lockdowns pit an affluent professional class comfortable staying home indefinitely against a working class more willing to take risks to do their jobs. . . . Writing in The Post, Fareed Zakaria tried to make sense of the partisan split over coronavirus restrictions, describing a “class divide” with pro-lockdown experts on one side and those who work with their hands on the other. . . . The Wall Street Journal’s Peggy Noonan wrote: “Here’s a generalization based on a lifetime of experience and observation. The working-class people who are pushing back have had harder lives than those now determining their fate.”

But, no, it seems that Zakaria and Noonan are wrong. Goldberg continues:

> The assumptions underlying this generalization, however, are not based on even a cursory look at actual data. In a recent Washington Post/Ipsos survey, 74 percent of respondents agreed that the “U.S. should keep trying to slow the spread of the coronavirus, even if that means keeping many businesses closed.” Agreement was slightly higher — 79 percent — among respondents who’d been laid off or furloughed. . . .

Goldberg can also do storytelling:

> Meatpacking workers have been sickened with coronavirus at wildly disproportionate rates, and all over the country there have been protests outside of meatpacking plants demanding that they be temporarily closed, sometimes by the workers’ own children. Perhaps because those demonstrators have been unarmed, they’ve received far less coverage than those opposed to lockdown orders. . . . Meanwhile, financial elites are eager for everyone else to resume powering the economy. . . . when it comes to the coronavirus, willingness to ignore public health authorities isn’t a sign of flinty working-class realism. Often it’s the ultimate mark of privilege.
OK, that’s just a story too. But I was curious about the people who Goldberg cited at the beginning of her article, who so confidently got things wrong. So I clicked on each story.

First, Zakaria. He does a David Brooks-style shtick, with lines like, “Imagine you are an American who works with his hands — a truck driver, a construction worker, an oil rig mechanic — and you have just lost your job because of the lockdowns, as have more than 36 million people. You turn on the television and hear medical experts, academics, technocrats and journalists explain that we must keep the economy closed — in other words, keep you unemployed — because public health is important. . . .” In this riff, Zakaria is exhibiting a failure of imagination. He talks about truck drivers who want to go back to work, but not about meatpacking workers who don’t want to be exposed to coronavirus. He talks about various experts who want to “keep you unemployed” but does not talk about the financial elites, not to mention “academics, technocrats and journalists” such as himself who are eager to see everyone else get back to work—even though they can keep working from home as long as they want. Do they just miss going into the TV studio?

There’s also a gender dimension to Zakaria’s article, in that he lists three stereotypically male occupations. In general, men are less concerned about health and safety than women are. So he’s stacking the deck by talking about truck drivers, construction workers, and oil rig mechanics, rather than, say, nurse’s aides, housecleaners, and preschool teachers. Zakaria is making an error, imputing a statement that lower-social-class Americans want to open up the economy, even though the data don’t show this, and even though there are lots of logical reasons to understand why comfortable work-at-home pundits could be just fine with opening up, given that they get to pick and choose when and where to go to work.

Next, Noonan. Unfortunately this link is paywalled, but I do see the sub-headline, “Those who are anxious to open up the economy have led harder lives than those holding out for safety.” Perhaps someone with a Wall Street Journal subscription can tell me what data she cites on this one.

It could be that Noonan is right and Goldberg is wrong here. Goldberg cited this one survey, but that’s just one survey, and it was from 27 Apr to 4 May, and opinions have surely changed since then. For now I’ll go with Goldberg’s take because she brought data to the table.
The analyst I really trust for this sort of thing is sociologist David Weakliem. Let’s go to his blog and see if he wrote anything on this . . . yeah! Here it is:

> Some people have said that the coronavirus epidemic will bring Americans together, uniting us behind a goal that transcends political differences. It doesn’t seem to be working out that way–whether to ease restrictions has become a political issue, with Republicans more in favor of a quick end and Democrats more in favor of keeping restrictions. There have been some claims that it’s also a class issue. The more common version is that the “elites” can work at home, so they are happy to keep going on that way, but most ordinary people can’t, so they want to get back to work (see this article for an entertainingly unhinged example). But you could also argue it the other way—affluent people are getting fed up with online meetings, and tend to have jobs that would let them keep more space from their co-workers, so they want to get back to normal; less affluent people have jobs that would expose them to infection, so they want to stay safe. I couldn’t find individual-level data for any survey, but I did find one report that breaks opinions down by some demographic variables.

It’s a Washington Post – University of Maryland survey from 21-26 Apr.
Here’s what Weakliem found:

> The most relevant question is “Do you think current restrictions on how restaurants, stores and other businesses operate in your state are appropriate, are too restrictive or are they not restrictive enough?”
>
>                 Too restrictive   Appropriate   Not enough
> Republicans           29%             60%           11%
> Democrats              8%             72%           19%
>
> Although majorities of both parties say (or said—the survey was April 21-26) they were appropriate, there is a pretty big difference.
>
> By education:
>
>                 Too restrictive   Appropriate   Not enough
> College grads         15%             72%           12%
> Others                18%             63%           18%
>
> or restricting it to whites:
>
>                 Too restrictive   Appropriate   Not enough
> College grads         17%             72%           10%
> Others                20%             64%           15%
>
> To the extent there is a difference, it’s that less educated people are more likely to have “extreme” opinions of both kinds. Maybe that’s because more educated people tend to have more trust in the authorities. But basically, it’s not a major factor.
> A few other variables: income is similar to education, with lower income people more likely to take both “extreme” positions; non-whites, women, and younger people more likely to say “not restrictive enough” and less likely to say “too restrictive”. All of those differences are considerably smaller than the party differences. Region and urban/rural residence seem relevant in principle, but aren’t included in the report.

Interesting that less educated and lower-income people take more “extreme” positions of both kinds, which gives a slightly different twist on Zakaria and Noonan. As with red state blue state, pundits love talking about the working class, but many of the most intense battles are happening within the elite.

But I promised I’d talk with you about my former co-bloggers . . . Here’s Robin Hanson from 5 May:
> The public is feeling the accumulated pain, and itching to break out. . . . Elites are now loudly and consistently saying that this is not time to open; we must stay closed and try harder to contain. . . . So while the public will uniformly push for more opening, elites and experts push in a dozen different directions. . . . elites and experts don’t speak with a unified voice, while the public does.
This makes no sense to me. To the extent that the polls were capturing public opinion, the public was speaking with a uniform voice _in favor_ of restrictions—the exact opposite of what Hanson was saying. My guess is that Hanson was frustrated that “experts” and “elites” (in his words) did not agree with his opening-up policy preferences, so he was enlisting “the public” to be on his side. Unfortunately, the public did not hold his position either.

Hanson continues, “Many are reading me as claiming that the public is unified in the sense of agreeing on everything. But I only said that the public pushes will tend to be correlated in a particular direction, in contrast with the elite pushes which are much more diverse. Some also read me as claiming that strong majorities of the public support fast opening, but again that’s not what I said.”

I can’t figure out what he’s getting at here. He said, “The public is feeling the accumulated pain, and itching to break out” . . . but the polls didn’t support that take. He also said that the public “speaks with a unified voice”—but, to the extent that was true, the voice was the opposite of what Hanson was saying. Maybe now things have changed and the public is more divided on its policy preferences regarding restrictions or openings—but, if so, that’s really the opposite of a unified voice. Hanson also cites a couple of polls he did on Twitter, but he uses these incoherently, first as evidence of the opinions of elites and experts, then as evidence of public opinion. I don’t think Twitter polls really represent elite opinion, expert opinion, or public opinion, but I guess it all depends on who responds.

And here’s Henry Farrell from 5 May, saying pretty much what I said above, but in a more structured way:
> There is indeed survey evidence to suggest that the public has strong preferences on re-opening. The problem is that that evidence (or, at least, the evidence that I am aware of), is that large majorities of people don’t want to reopen anytime soon. . . . the best empirical evidence I know of as to what individual members of the public want runs exactly contrary to the claims made by public choice scholars (who are presumably methodological individualists) about what the public wants.

It’s not clear to me that Farrell should be taking the blogs of two people (Robin Hanson and Tyler Cowen, who linked to Hanson’s post) as representative of “public choice scholars” more generally. But Farrell does acknowledge they may be “talking about ‘what the public will inevitably end up wanting in the long run as the costs of freezing much economic activity become clear.’” The trouble with this sort of in-the-long-term-the-public-will-agree-with-me attitude, as Farrell points out, is (a) people might agree with you for the wrong reasons (maybe for reasons of partisanship rather than policy), and (b) “the problem with such loosely expressed arguments about what ‘the public wants’ is that they’re likely to blur together ideological priors and empirical claims in a manner that makes them impossible to distinguish.” I agree. This is the public-opinion version of the difficulties that arise when people make empirical statements without data.

If you’re interested, Farrell and Hanson continue the discussion here and here.
The conversation goes in a different direction from my focus here: Farrell is focusing on the whole public-choice thing and Hanson starts talking about how communism can’t work. Farrell might be wrong on the economics, but I think Hanson makes Farrell’s point for him on the public opinion question, pretty much admitting that the belief that “the public” agrees with him, despite what the polls might say, is based on his (Hanson’s) reading of economic theory.

All this does not say that Hanson’s economic analysis and policy preferences are wrong (or that they’re right). That’s a separate question from the study of public opinion, although public opinion is relevant to the question. If you claim to be in agreement with the people, it helps if the people are in agreement with you. Also, opinions can change.

Filed under Decision Theory, Economics, Political Science. 95 Comments
AUTHORS REPEAT SAME ERROR IN 2019 THAT THEY ACKNOWLEDGED AND ADMITTED WAS WRONG IN 2015
Posted by Andrew on 17 November 2020, 9:59 am
David Allison points to this story:
> Kobel et al. (2019) report results of a cluster randomized trial examining the effectiveness of the “Join the Healthy Boat” kindergarten intervention on BMI percentile, physical activity, and several exploratory outcomes. The authors pre-registered their study and described the outcomes and analysis plan in detail previously, which are to be commended. However, we noted four issues that some of us recently outlined in a paper on childhood obesity interventions: 1) ignoring clustering in studies that randomize groups of children, 2) changing the outcomes, 3) emphasizing results that were statistically significant from a host of analyses, and 4) using self-reported outcomes that are part of the intervention.
>
> First and most critically, the statistical analyses reported in the article were inadequate and deviated from the analysis plan in the study’s methods article – an error the authors are aware of and had acknowledged after some of us identified it in one of their prior publications about this same program. . . .
>
> Second, the authors switched their primary and secondary outcomes from their original plan. . . .
>
> Third, while the authors focus on an effect of the intervention of p ≤ 0.04 in the abstract, controlling for migration background in their full model raised this to p = 0.153. Because inclusion or exclusion of migration background does not appear to be a pre-specified analytical decision, this selective reporting in the abstract amounts to spinning of the results to favor the intervention.
>
> Fourth, “physical activity and other health behaviours … were assessed using a parental questionnaire.” Given that these variables were also part of the intervention itself, with the control having “no contact during that year,” subjective evaluation may have resulted in differential, social-desirability bias, which may be of particular concern in family research. Although the authors mention this in the limitations, the body of literature demonstrating the likelihood of these biases invalidating the measurements raises the question of whether they should be used at all.
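To see why issue 1 matters so much, here is a minimal simulation sketch in R (with made-up numbers, not Kobel et al.’s data): when there is no true effect at all, a child-level analysis that ignores the kindergarten-level randomization “finds” effects far more often than the nominal 5%.

# Sketch with hypothetical numbers: a cluster-randomized trial with no
# true intervention effect, analyzed (wrongly) at the child level.
set.seed(123)
reject <- replicate(2000, {
  n_clusters <- 20                        # kindergartens, 10 per arm (assumed)
  n_per      <- 15                        # children per kindergarten (assumed)
  arm        <- rep(0:1, each = n_clusters / 2)
  u          <- rnorm(n_clusters, 0, 0.5) # shared kindergarten-level effect
  y <- rnorm(n_clusters * n_per, mean = rep(u, each = n_per), sd = 1)
  x <- rep(arm, each = n_per)             # child-level treatment indicator
  summary(lm(y ~ x))$coefficients["x", "Pr(>|t|)"] < 0.05
})
mean(reject)  # well above 0.05: ignoring clustering manufactures significance

A multilevel model or an analysis at the cluster level would keep the error rate where it belongs.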
This is a big deal. The authors of the cited paper knew about these problems—to the extent of previously acknowledging them in print—but then did them again. The authors did this thing of making a strong claim and then hedging it in their limitations. That’s bad. From the abstract of the linked paper:

> Children in the IG spent significantly more days in sufficient PA than children in the CG (3.1 ± 2.1 days vs. 2.5 ± 1.9 days; p ≤ 0.005).

Then, deep within the paper:

> Nonetheless, this study is not without limitations, which need to be considered when interpreting these results. Although this study has an acceptable sample size and body composition and endurance capacity were assessed objectively, the use of subjective measures (parental report) of physical activity and the associated recall biases is a limitation of this study. Furthermore, participating in this study may have led to an increased social desirability and potential over-reporting bias with regards to the measured variables as awareness was raised for the importance of physical activity and other health behaviours.

This is a limitation that the authors judge to be worth mentioning in the paper but not in the abstract or in the conclusion, where the authors write that their intervention “should become an integral part of all kindergartens” and is “ideal for integrating health promotion more intensively into the everyday life of children and into the education of kindergarten teachers.”

The point here is not to slam this particular research paper but rather to talk about a general problem with science communication, involving over-claiming of results and deliberate use of methods that are problematic but offer the short-term advantage of allowing researchers to make stronger claims and get published.

P.S. Allison follows up by pointing to this Pubpeer thread.
Filed under Public Health, Zombies. 4 Comments
ESTIMATING EFFICACY OF THE VACCINE FROM 95 TRUE INFECTIONS
Posted by Andrew on 16 November 2020, 5:23 pm
Gaurav writes:
> The 94.5% efficacy announcement is based on comparing 5 of 15k to 90 of 15k:
>
>> On Sunday, an independent monitoring board broke the code to examine 95 infections that were recorded starting two weeks after volunteers’ second dose — and discovered all but five illnesses occurred in participants who got the placebo.
>
> Similar stuff from Pfizer etc., of course. Unlikely to happen by chance, but low baselines. My guess is that the final numbers will be a lot lower than 95%.
He expands:
> The data: the control group is 90 out of 15k and the treatment group is 5 out of 15k. The base rate (control group) is 0.6%. When the base rate is so low, it is generally hard to be confident about the ratio (1 – (5/90)). But noise is not the same as bias. One reason to think why 94.5% is an overestimate is simply that 94.5% is pretty close to the maximum point on the scale.
>
> The other reason to worry about 94.5% is that the efficacy of a flu vaccine is dramatically lower. (There is a difference in the time horizons over which effectiveness is measured for flu and for Covid, with Covid being much shorter, but it is useful to take that as a caveat when trying to project the effectiveness of the Covid vaccine.)
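To see the noise issue concretely, here is a small R sketch (my setup, not Gaurav’s calculation; it assumes equal person-time in the two 15k arms and a uniform prior). With equal exposure, the split of the 95 cases between arms is Binomial(95, theta) with theta = IRR/(1 + IRR), so we can simulate the posterior for theta and push it through VE = 1 - IRR = (1 - 2*theta)/(1 - theta):

set.seed(1)
theta <- rbeta(1e5, 5 + 1, 90 + 1)      # posterior: 5 vaccine vs. 90 placebo cases
ve    <- (1 - 2 * theta) / (1 - theta)  # VE = 1 - IRR, with IRR = theta/(1 - theta)
round(quantile(ve, c(0.025, 0.5, 0.975)), 3)

The posterior median lands near the announced 94.5%, but with only 5 cases in the vaccine arm the interval is not razor-thin, which is the point about low baselines.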
Filed under Public Health. 77 Comments
WHAT WENT WRONG WITH THE POLLS IN 2020? ANOTHER EXAMPLE.
Posted by Andrew on 16 November 2020, 9:19 am
Shortly before the election the New York Times ran this article, “The One Pollster in America Who Is Sure Trump Is Going to Win,” featuring Robert Cahaly, who on election day forecast Biden to win 235 electoral votes. As you may have heard, Biden actually won 306. Our Economist model gave a final prediction of 356.

356 isn’t 306. We were off by 50 electoral votes, and that was kind of embarrassing. We discussed what went wrong, and the NYT ran an article on “why political polling missed the mark.” Fine. We were off by 50 electoral votes (and approximately 2.5 percentage points on the popular vote, as we predicted Biden with 54.4% of the two-party vote and he received about 52%). We take our lumps, and we try to do better next time.

But . . . they were off by 71 electoral votes! So I think they should assess what went wrong with their polls, even more so. The Times article ends with this quote from Cahaly:

> “I think we’ve developed something that’s very different from what other people do, and I really am not interested in telling people how we do it,” he said. “Just judge us by whether we get it right.”
Fair enough: you run a business, and it’s your call whether to make your methods public. Trafalgar Group keeps its polling methods secret, as does Fivethirtyeight with its poll aggregation procedure. As long as things go well, it’s kinda fun to maintain that air of mystery.
But “judge us by whether we get it right” is tricky. Shift 1% of the vote from the Democrats to the Republicans, and Biden still wins the popular vote but loses the electoral college. Shift 1% of the vote from the Republicans to the Democrats, and Biden wins one more state and the Democrats grab another seat in the Senate. From the news articles about Cahaly’s polling, it seems that a key aspect of their method is to measure intensity of preferences, and it seems that Republicans won the voter turnout battle this year. So, looking forward, there could be some benefit to using some of these ideas—but without getting carried away and declaring victory after your forecast was off by 71 electoral votes. Remember item 3 on our list.
Filed under Miscellaneous Statistics, Political Science. 70 Comments
NONPARAMETRIC BAYES WEBINAR
Posted by Eric Novik on 15 November 2020, 3:05 pm

This post is by Eric. A few months ago we started running monthly webinars focusing on Bayes and uncertainty. Next week, we will be hosting Arman Oganisian, a 5th-year biostatistics PhD candidate at the University of Pennsylvania and Associate Fellow at the Leonard Davis Institute for Health Economics. His research focuses on developing Bayesian nonparametric methods for solving complicated estimation problems that arise in causal inference. His application areas of interest include health economics and, more recently, cancer therapies.

ABSTRACT
Bayesian nonparametrics combines the flexibility often associated with machine learning with principled uncertainty quantification required for inference. Popular priors in this class include Gaussian Processes, Bayesian Additive Regression Trees, Chinese Restaurant Processes, and more. But what exactly are “nonparametric” priors? How can we compute posteriors under such priors? And how can we use them for flexible modeling? This talk will explore these questions by introducing nonparametric Bayes at a conceptual level and walking through a few common priors, with a particular focus on the Dirichlet Process prior for regression.

If this sounds interesting to you, please join us this Wednesday, 18 November, at 12 noon ET.
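For readers who want a concrete picture of “a prior over distributions” before the talk, here is a toy R sketch (mine, not Oganisian’s) of one draw from a Dirichlet Process via truncated stick-breaking:

set.seed(42)
alpha <- 2                            # DP concentration (chosen for the demo)
K     <- 50                           # truncation level for the sketch
v     <- rbeta(K, 1, alpha)           # stick-breaking fractions
w     <- v * cumprod(c(1, 1 - v[-K])) # mixture weights (sum to ~1)
atoms <- rnorm(K)                     # atom locations from the N(0, 1) base measure
draws <- sample(atoms, 1000, replace = TRUE, prob = w)  # one realized distribution
head(sort(table(round(draws, 2)), decreasing = TRUE))   # repeated atoms: it's discrete

Each draw of (w, atoms) is itself a random discrete distribution; that is the sense in which the prior is nonparametric: it is a prior over distributions rather than over a fixed finite parameter vector.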
P.S. Last month we had Matthew Kay from Northwestern University discussing his research on visualizing and communicating uncertainty. Here is the link to the video.

Filed under Bayesian Statistics, Teaching. 5 Comments
YOU DON’T NEED A RETINA SPECIALIST TO KNOW WHICH WAY THE WIND BLOWS
Posted by Andrew on 15 November 2020, 9:12 am
Jayakrishna Ambati writes:

> I am a retina specialist and vision scientist at the University of Virginia. I am writing to you with a question on Bayesian statistics.
>
> I am performing a meta analysis of 5 clinical studies. In addition to a random effects meta analysis model, I am running Bayesian meta analysis models using half normal priors. I’ve seen scales of 0.5 or 1.0 being used. What determines this choice? Why can’t it be 0.1 or 0.2, for example? Can I use the value of the heterogeneity tau (obtained from the random effect meta model) to calculate sigma and make that or a multiple of it the value of the scale?

My reply:
With only 5 groups, it can help to use an informative prior on the group-level variance. What’s a good prior to use? It depends on your prior information! How large are the effects that you might see? You can play it safe and use a weak prior, even a uniform prior on the group-level scale parameter: this will, on average, lead to an overestimate of the group-level scale, which in turn will yield an overstatement of uncertainty.

Regarding the specific question of why you’ll see normal+(0,1) or normal+(0,0.5): This depends on the problem under study, but we can get some insight by thinking about scaling. Consider two important special cases:

1. Continuous outcome, multilevel linear regression with predictors and outcomes scaled to have sd’s equal to 0.5 (this is our default choice because a binary variable coded to 0 and 1 will have sd of approx 0.5): we’d expect coefficients to be less than 1 in absolute value, hence a normal+(0,1) prior on the sd of a set of coefs should be weakly informative.

2. Binary outcome, multilevel logistic regression, again scaling predictors to have sd’s equal to 0.5: again, we’d expect coefs to be less than 2 in absolute value (a shift of 2 on the logit scale is pretty big), hence a normal+(0,0.5) prior on the sd of a set of coefs should be weakly informative.

In many cases, normal+(0,0.2) or normal+(0,0.1) will be fine too, in examples such as policy analysis and some areas of biomedicine where we would not expect huge effects. A related question came up last month regarding priors for non-hierarchical regression coefficients.
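A quick prior predictive check makes the “why 0.5 or 1.0, why not 0.1?” question concrete. Here is a sketch in R of what different half-normal scales actually claim about the between-study sd tau:

half_normal_q <- function(s, n = 1e5) {
  tau <- abs(rnorm(n, 0, s))          # draws from half-normal(0, s)
  quantile(tau, c(0.5, 0.95))         # median and 95th percentile of tau
}
sapply(c(0.1, 0.2, 0.5, 1.0), half_normal_q)

A scale of 1 says between-study heterogeneity up to about 2 is plausible; a scale of 0.1 says the studies are nearly identical. Pick the scale by asking how large the underlying effects could plausibly be on the outcome’s scale, which is the scaling logic above.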
Filed under Bayesian Statistics, Multilevel Modeling. 2 Comments
THE RISE AND FALL AND RISE OF RANDOMIZED CONTROLLED TRIALS (RCTS) IN INTERNATIONAL DEVELOPMENT
Posted by Andrew on 14 November 2020, 9:24 am
Gil Eyal sends along this fascinating paper coauthored with Luciana de Souza Leão, “The rise of randomized controlled trials (RCTs) in international development in historical perspective.” Here’s the story:

> Although the buzz around RCT evaluations dates from the 2000s, we show that what we are witnessing now is a second wave of RCTs, while a first wave began in the 1960s and ended by the early 1980s. Drawing on content analysis of 123 RCTs, participant observation, and secondary sources, we compare the two waves in terms of the participants in the network of expertise required to carry out field experiments and the characteristics of the projects evaluated. The comparison demonstrates that researchers in the second wave were better positioned to navigate the political difficulties caused by randomization.

What were the key differences between the two waves? Leão and Eyal start with the most available explanation:

> What could explain the rise of RCTs in international development? Randomistas tend to present it as due to the intrinsic merits of their method, its ability to produce “hard” evidence as compared with the “softer” evidence provided by case studies or regressions. They compare development RCTs to clinical trials in medicine, implying that their success is due to the same “gold standard” status in the hierarchy of evidence: “It’s not the Middle Ages anymore, it’s the 21st century … RCTs have revolutionized medicine by allowing us to distinguish between drugs that work and drugs that don’t work. And you can do the same randomized controlled trial for social policy” (Duflo 2010).

But they don’t buy it:

> This explanation does not pass muster and need not detain us for very long. Econometricians have convincingly challenged the claim that RCTs produce better, “harder” evidence than other methods. Their skepticism is amply supported by evidence that medical RCTs suffer from numerous methodological shortcomings, and that political considerations played a key role in their adoption. These objections accord with the basic insight of science studies, namely, that the success of innovations cannot be explained by their prima facie superiority over others, because in the early phases of adoption such superiority is not yet evident.

I’d like to unpack this argument, because I agree with some but not all of it.
I agree that medical randomized controlled trials have been oversold; and even if I accept the idea of RCT as a gold standard, I have to admit that almost all my own research is observational. I also respect Leão and Eyal’s point that methodological innovations typically start with some external motivation, and it can take some time before their performance is clearly superior. On the other hand, we _can_ port useful ideas from other fields of research, and sometimes new ideas really are better. So it’s complicated.

Consider an example that I’m familiar with: Mister P. We published the first MRP article in 1997, and I knew right away that it was a big deal—but it indeed took something like 20 years for it to become standard practice. I remember in fall, 2000, standing up in front of a bunch of people from the exit poll consortium, telling them about MRP and related ideas, and they just didn’t see the point. It made me want to scream—they were so tied into classical sampling theory, they seemed to have no idea that something could be learned by studying the precinct-by-precinct swing between elections. It’s hard for me to see why two decades were necessary to get the point across, but there you have it.

My point here is that my MRP story is consistent with the randomistas’ story and also with the sociologists’. On one hand, yes, this was a game-changing innovation that ultimately was adopted because it could do the job better than what came before. (With MRP, the job was adjusting for survey nonresponse; with RCT, the job was estimating causal effects; in both cases, the big and increasing concern was unmeasured bias.) On the other hand, why did the methods become popular when they did? That’s for the sociologists to answer, and I think they’re right that the answer has to depend on the social structure of science, not just on the inherent merit or drawbacks of the methods.

As Leão and Eyal put it, any explanation of the recent success of RCTs within economics must “recognize that the key problem is to explain the creation of an enduring link between fields” and address “the resistance faced by those who attempt to build this link,” while avoiding “too much of the explanatory burden on the foresight and interested strategizing of the actors.” Indeed, if I consider the example of MRP, the method itself was developed by putting together two existing ideas in survey research (multilevel modeling for small area estimation, and poststratification to adjust for nonresponse bias), and when we came up with it, yes I thought it was the thing to do, but I also thought the idea was clear enough that it would pretty much catch on right away. It’s not like we had any strategy for global domination.

THE FIRST WAVE OF RCT FOR SOCIAL INTERVENTIONS

Where Leão and Eyal’s article really gets interesting, though, is when they talk about the earlier push for RCTs, several decades ago:

> While the buzz around RCTs certainly dates from the 2000s, the assumption—implicit in both the randomistas’ and their critics’ accounts—that the experimental approach is new to the field of international development—is wrong. In reality, we are witnessing now a second wave of RCTs in international development, while a first wave of experiments in family planning, public health, and education in developing countries began in the 1960s and ended by the early 1980s. In between the two periods, development programs were evaluated by other means.

Just as an aside—I love that above sentence with three dashes. Dashes are great punctuation, way underused in my opinion.

Anyway, they now set up the stylized fact, the puzzle:

> Instead of asking, “why are RCTs increasing now?” we ask, “why didn’t RCTs spread to the same extent in the 1970s, and why were they discontinued?” In other words, how we explain the success of the second wave must be consistent with how we explain the failure of the first.
Good question, illustrating an interesting interaction between historical facts and social science theorizing. Leão and Eyal continue:

> The comparison demonstrates that the recent widespread adoption of RCTs is not due to their inherent technical merits nor to rhetorical and organizational strategies. Instead, it reflects the ability of actors in the second wave to overcome the political resistance to randomized assignment, which has bedeviled the first wave, and to forge an enduring link between the fields of development aid and academic economics.

As they put it:

> The problem common to both the first and second waves of RCTs was how to turn foreign aid into a “science” of development. Since foreign aid is about the allocation of scarce resources, the decisions of donors and policy-makers need to be legitimized.

They argue that a key aspect of the success of the second wave of RCTs was the connection to academic economics.

WHERE NEXT?
I think RCTs and causal inference in economics and political science and international development are moving in the right direction, in that there’s an increasing awareness of variation in treatment effects, and an increasing awareness that doing an RCT is not enough in itself. Also, Leão and Eyal talk a lot about “nudges,” but I think the whole nudge thing is dead, and serious economists are way past it. The nudge people can keep themselves busy with Ted talks, book tours, and TV appearances while the rest of us get on with the real work.

Filed under Causal Inference, Political Science, Sociology. 20 Comments
HOW TO DESCRIBE PFIZER’S BETA(0.7, 1) PRIOR ON VACCINE EFFECT?
Posted by Bob Carpenter on 13 November 2020, 3:00 pm

Now it’s time for some statistical semantics. Specifically, how do we describe the prior that Pfizer is using for their COVID-19 study? Here’s a link to the report:

* A PHASE 1/2/3, PLACEBO-CONTROLLED, RANDOMIZED, OBSERVER-BLIND, DOSE-FINDING STUDY TO EVALUATE THE SAFETY, TOLERABILITY, IMMUNOGENICITY, AND EFFICACY OF SARS-COV-2 RNA VACCINE CANDIDATES AGAINST COVID-19 IN HEALTHY INDIVIDUALS

Way down on pages 101–102, they say (my emphasis):

> A MINIMALLY INFORMATIVE beta prior, beta (0.700102, 1), is proposed for θ = (1-VE)/(2-VE). The prior is centered at θ = 0.4118 (VE=30%) which can be considered pessimistic. The prior allows considerable uncertainty; the 95% interval for θ is (0.005, 0.964) and the corresponding 95% interval for VE is (-26.2, 0.995).

I think “VE” stands for vaccine effect. Here’s the definition from page 92 of the report:

> VE = 100 × (1 – IRR). IRR is calculated as the ratio of first confirmed COVID-19 illness rate in the vaccine group to the corresponding illness rate in the placebo group. In Phase 2/3, the assessment of VE will be based on posterior probabilities of VE1 > 30% and VE2 > 30%.
>
> VE1 represents VE for prophylactic BNT162b2 against confirmed COVID-19 in participants without evidence of infection before vaccination, and VE2 represents VE for prophylactic BNT162b2 against confirmed COVID-19 in all participants after vaccination.

I’m unclear on why they’d want to impose a prior on (1 – VE)/(2 – VE), or even how to interpret that quantity, but that’s not what I’m writing about. The internet’s great, though, and Sebastian Kranz walks us through it in a blog post, A look at Biontech/Pfizer’s Bayesian analysis of their COVID-19 vaccine trial.
It turns out that the prior is on the quantity theta = p_v / (p_v + p_c), where p_v and p_c are, in Kranz’s words, “population probabilities that a vaccinated subject or a subject in the control group, respectively, fall ill to Covid-19.” I’m afraid I still don’t get it. Is the time frame restricted to the trial? What does “fall ill” mean, a positive PCR test or something more definitive? (The answers may be in the report—I didn’t read it.)

WHAT IS A WEAKLY INFORMATIVE PRIOR?

It’s the description “minimally informative” and subsequent posts calling it “weakly informative” that got my attention. For instance, in the post The Pfizer-Biontech vaccine may be a lot more effective than you think (which Andrew summarized in his own post here), Ian Fellows calls it “a Bayesian analysis using a beta binomial model with a weakly-informative prior.” What we mean by weakly informative is that the prior determines the scale of the answer. For example, a standard normal prior (normal(0, 1)) imposes a unit scale, whereas a normal(0, 100) would impose a scale of 100 (like Stan and R, I’m using a scale, or standard deviation, parameterization of the normal so that the two parameters have the same units).

WEAKLY INFORMATIVE IN WHICH PARAMETERIZATION?

Thinking about proportions is tricky, because they’re constrained to fall in the interval (0, 1). The maximum standard deviation achievable with a beta distribution is 0.5 as alpha and beta -> 0, whereas a uniform distribution on (0, 1) has standard deviation 0.29, and a beta(100, 100) has standard deviation 0.035. It helps to transform using logit so we can consider the log odds, mapping a proportion theta to logit(theta) = log(theta / (1 - theta)). A uniform distribution on theta in (0, 1) results in a standard logistic(0, 1) distribution on logit(theta) in (-inf, inf). So even a uniform distribution on the proportion leads to a unit-scale distribution on the log odds. In that sense, a uniform distribution is weakly informative in the sense that we mean it when we recommend weakly informative priors in Stan. All on its own, it’ll control the scale of the unconstrained parameter. (By the way, I think transforming theta in (0, 1) to logit(theta) in (-inf, inf) is the easiest way to get a handle on Jacobian adjustments—it’s easy to see that the transformed variable no longer has a uniform distribution, and it’s the Jacobian of the inverse transform that defines the logistic distribution’s density.)

Fellows is not alone. In the post Warpspeed confidence — what is credible?, which relates Pfizer’s methodology to more traditional frequentist methods, Chuck Powell says, “For purposes of this post I’m going to use a flat, uninformed prior in all cases.” Sure, it’s flat on the (0, 1) scale, but not on the log odds scale. Flat is relative to parameterization. If you work with a logistic prior on the log odds scale and then transform with inverse logit, you get exactly the same answer with a prior that is far from flat—it’s centered at 0 and has a standard deviation of pi / sqrt(3), or about 1.8.
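The parameterization point is easy to check numerically. A quick R sketch:

set.seed(7)
theta <- runif(1e6)                  # "flat" prior draws on the proportion
lo    <- qlogis(theta)               # logit: log(theta / (1 - theta))
c(sd = sd(lo), theoretical = pi / sqrt(3))  # both about 1.81

So a prior that is flat on (0, 1) is roughly unit scale on the unconstrained log-odds scale, which is exactly the weakly informative behavior described above.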
HOW MUCH INFORMATION IS IN A BETA PRIOR?

It helps to reparameterize the beta with a mean and a “count”: beta2(theta | mu, kappa) = beta(theta | mu * kappa, (1 - mu) * kappa), so that mu = alpha / (alpha + beta) and kappa = alpha + beta.

The beta distribution is conjugate to the Bernoulli (and more generally, the binomial), which is what makes it a popular choice. What this means in practice is that it’s an exponential family distribution that can be treated as pseudodata for a Bernoulli distribution. Because beta(1, 1) is a uniform distribution, we think of that as having no prior data, or a total of zero pseudo-observations. From this perspective, beta(1, 1) really is uninformative in the sense that it’s equivalent to starting uniform and seeing no prior data. In the beta2 parameterization, the uniform distribution on (0, 1) is beta2(0.5, 2). This corresponds to pseudodata with count 0, not 2—we need to subtract 2 from kappa to get the pseudocount!

Where does that leave us with the beta(0.7, 1)? Using our preferred parameterization, that’s beta2(0.4117647, 1.7). That means a prior pseudocount of -0.3 observations! We start with negative pseudodata whenever the prior count parameter kappa is less than 2. Spoiler alert—that negative pseudocount is going to be swamped by the actual data.
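In code, the reparameterization and the pseudocount bookkeeping look like this (a sketch; beta2 is just the notation from the text, not a built-in R distribution):

beta2_params <- function(mean, count) {
  c(alpha = mean * count,             # beta2(mean, count) -> beta(alpha, beta)
    beta  = (1 - mean) * count,
    pseudocount = count - 2)          # prior pseudo-observations
}
beta2_params(0.5, 2)                  # uniform beta(1, 1): zero pseudodata
beta2_params(0.4117647, 1.7)          # beta(0.7, 1): pseudocount of -0.3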
What about Pfizer’s beta(0.700102, 1) prior? That’s beta2(0.4118, 1.700102). If you plot beta(theta | 0.7, 1) vs. theta, you’ll see that the log density tends to infinity as theta goes to 0. That makes it look like it’s going to be somewhat or maybe even highly informative. There’s a nice density plot in Kranz’s post.
Of course, the difference between beta(0.700102, 1) and beta(0.7, 1) is negligible—1/10,000th on the prior mean and 1/1000th of a patient in prior pseudocount. They must’ve derived the number from a formula somehow and then didn’t want to round. The only harm in using 0.700102 rather than 0.7 or even 1 is that someone may assume a false sense of precision.

Let’s look at the effect of the prior, in terms of how it affects the posterior. That is, the difference between beta(n + 0.7, N – n + 1) and beta(n + 1, N – n + 1) for a trial with n successes out of N. I’m really surprised they’re only looking at N = 200 and expecting something like n = 30. Binomial data is super noisy, so N = 200 is a small data size unless the effect is huge. Is that 0.00102 in prior pseudocount going to matter? Of course not. Is the difference between beta(1, 1) and beta(0.7, 1) going to matter? Nope. If we compare the posteriors beta(30 + 0.7, 170 + 1) and beta(30 + 1, 170 + 1), their posterior 95% central intervals are (0.107, 0.206) and (0.106, 0.205). So I guess it’s like Andrew’s injunction to vote: it might make a difference on the edge if we impose a three-digit threshold somewhere and just manage to cross it in the last digit.
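Checking that comparison takes two lines of R:

round(qbeta(c(0.025, 0.975), 30 + 0.7, 170 + 1), 3)  # Pfizer's beta(0.7, 1) prior
round(qbeta(c(0.025, 0.975), 30 + 1.0, 170 + 1), 3)  # uniform beta(1, 1) prior

The intervals differ by about 0.001 at each endpoint, matching the numbers above.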
BETA-BINOMIAL AND JEFFREYS PRIORS

I’ll leave it to the Bayes theory wonks to talk about why beta(0.5, 0.5) is the Jeffreys prior for the beta-binomial model. I’ve never dug into the theory enough to understand why anyone cares about these priors other than scale invariance.

Filed under Bayesian Statistics. 24 Comments