
Derek Bowman

What you say here sounds much stronger than the claim you made in the comments of your earlier post:

"My claim is not that the entire hiring process should be automated--that we should merely use algorithms to hire people. I don't know any I-O Psychologist who would advocate that. What I am claiming is that the empirical research indicates that we should use algorithmic means in early stages of selection to whittle down candidates to semi-finalists or finalists. That's all."

That would still leave the final selection in the hands of "subjective human judgers," particularly at the finalist stage where interviews are likely to play a bigger role. But - taken in context - you seem here to be advocating the removal of subjective judgment even at this late stage of decision making.

I'm left wondering both whether you really do think there is any sensible role for individual judgment and, if so, why, given your confidence in its unreliability as compared to more mechanical decision procedures.

Marcus Arvan

Hi Derek: Thanks for your comment! I'm not sure why you think what I'm saying here is stronger than what I previously wrote, including in the passage you quote.

In the present post, I argue that the process of *narrowing down* applicants should be more algorithmic. Nothing I have said here indicates that the final stages of a hire shouldn't include subjective judgments. And indeed, or so my spouse tells me, the kinds of actual, directly work-related demonstrations that occur late in a hiring process (e.g. a teaching demo) have been found to have some predictive power (though here too, I think, candidates should be scored by judgers, and scorings should probably be normalized to correct for whatever gender, race, etc., biases are known to influence such judgments).

Derek Bowman

The reason it seems stronger is that this is in the context of a discussion of interviews, which often play a larger role at the later end of selection for which you previously claimed not to support the substitution of algorithms for individual decisions. How else could these features of late-stage-selection bear on your commitments to the preferability of algorithms?

anon prime

I've followed this discussion across posts and comments -- and I read Marcus in the same way Derek does.

I truly have no idea how an "algorithm" for philosophy job hiring is supposed to work. There is no widely agreed upon standard for what counts as good or even interesting philosophy, or even sometimes what counts as philosophy (e.g., dismissals of work related to gender or race as "sociology"). Nor is there any widely agreed upon standard for how to value and weight the various considerations that go into hiring, including quantity of publications.

For example, some philosophers attach value and weight to academic pedigree, for various reasons that seem legitimate to them -- which many younger philosophers active in the blogosphere seem to regard as a "bias." Other philosophers attach little value and weight, in effect, to gender and racial diversity, inclusion, or balance. Then there are area "bias," publication venue "bias," research vs. teaching "bias," etc.

In short, who would write the algorithm that is supposed to make philosophy job hiring not merely "more objective" but critically so? Even something as modest as countering gender or racial bias early in a search process (viz., through "blind" initial review, which requires no algorithm) is going to run into those same realities once the candidates are unmasked, so to speak. I'm not getting it.

Marcus Arvan

Hi Derek: In all honesty, I waver here. On the one hand, as I understand it, the empirical literature suggests that algorithmic approaches outperform human judgers on predictive reliability in *general*. Insofar as this is the case (to the best of my understanding), I would indeed advocate an algorithm-only approach. But, human beings being what they are, and difficulties setting up algorithms to measure some things (such as "collegiality") being what they are, I suspect that some amount of human judgment will always be with us.

My general contention, therefore, is that it seems especially important to take the algorithmic approach early on in the narrowing/selection process, and (at best) only let subjective judgments into the game at the very end, when judging differences between, say, two finalists. And indeed, as I mentioned before, my understanding (from speaking to my spouse) is that actual *performances* (e.g. a teaching demo) have predictive power (power that early-stage interviews do not).

Derek Bowman

Marcus, thanks for the reply.

I wonder if you understand just how strange your position sounds. It is one thing to say that we should take advantage of the best research in empirical psychology to help us create conditions under which the exercise of our own judgment can operate more reliably. But you seem to be saying that - in general - we would be better off doing away with human judgment in making human decisions entirely, if only we could. The only reason you waver - you seem to say - is that you think, sadly, we may be stuck relying on our own judgment in some areas (at least until we get better measures). Perhaps you mean the "in general" to be scope limited to hiring decisions, but it's hard to see what plausible cognitive model would make that one kind of decision special in this way.

This is not an argument that your position is wrong, but it does make me wonder if I've misunderstood you. It is a very radical claim, but you act surprised when people respond to it as such.


Hi Marcus,

Like others, I am confused about how far you want to take this. What do you think of the following:

"Human biases and irrationality inevitably play a role in hiring decisions, at least, under our current system. If possible, we should take steps to minimize this bias and irrationality. One way to do this is with the help of algorithms that would narrow down the applicant pool. For example, we might use an algorithm to select, say, 30 promising candidates from the original pool. This can work alongside subjective review. For instance, search committee members might argue that a candidate who was not chosen by the algorithm is exceptional in unique ways, and therefore merits further review. If algorithmic selections are obviously problematic to the subjective eye, those applicants can be quickly eliminated."

I think that what is described above would help eliminate current problematic practices, while also leaving ample room for needed human judgement. Of course, there are some worries about the slippery slope of search committees making too many exceptions. Nonetheless, I would rather live with this worry than ban the possibility of live humans overruling an algorithm.

I get the impression you want to go much further than what is suggested above. Is that correct? If so why? Why couldn't we just make use of an algorithm without giving it overriding power?

Marcus Arvan

Hi Derek: Sorry for taking so long to respond (and apologies to others for taking so long as well). I am swamped with work this week, but I *will* respond to everyone's comments as quickly as I can!

To answer your query, yes, I am well aware of "how strange my position sounds." However, I do not think I act surprised when people respond to it as such. Far from it! This is a fight that *empirical psychologists* who work on this have been fighting for decades. The kinds of studies I reference (e.g. http://psycnet.apa.org/journals/law/2/2/293/ ) have explicitly, and repeatedly, addressed "widespread resistance" to these conclusions, rebutting the very kinds of arguments people are raising in these threads!

The conclusions I am defending may sound strange--but they are (in my view) the conclusions empirical findings actually support. It's not the job of science to be "intuitive." The idea that space and time are relative once sounded outrageous, and was rejected wholesale by people (including leading physicists) who found it too "strange." We need to not dismiss empirical findings as counterintuitive, but instead learn from them.

In 2002, Billy Beane of the Oakland A's baseball team radically revamped their drafting process utilizing the actuarial methods I am describing. They fired their scouts, and selected players merely on the basis of on-base percentage and runs-generated--calculating precisely how many runs they would need to make the playoffs. Their scouts called Beane's plan absurd, saying that (obviously!) only they--in their infinite scouting wisdom, with 25+ years (or whatever) of experience--could select the best ballplayers. The scouts then scoffed at some of the players the A's drafted, such as Kevin Youkilis, whom scouts considered too heavy and too slow to draft.

Well, what do you know: with one of the lowest payrolls in the entire major leagues, Beane's purely statistical strategy paid off. The A's made the playoffs every year *just* as their algorithms predicted, and Kevin Youkilis is now a three-time All-Star and two-time World Series champion. Since then, other major league baseball teams have adopted a similar approach.

Now, of course, this is only baseball--and "baseball achievements" (walks, runs, hits, etc.) are easily quantifiable. But...here's the thing: the empirical literature I'm pointing to shows that, generally speaking, whatever you want to predict, to the extent that you can carefully draw up an algorithm (which does take time and careful empirical work to do), that algorithm will tend to be as good or better at predicting *that thing* than human judgers.

This, in brief, is why Industrial-Organizational psychology is the single fastest-growing field in North America (growing at 53% a year http://abcnews.go.com/Business/americas-20-fastest-growing-jobs-surprise/story?id=22364716). The field is increasingly transforming hiring/selection from a haphazard, empirically unsupported process into a genuine science based on measuring the predictive accuracy of different approaches. And, by and large, as I understand it, the evidence broadly supports the use of algorithms, as far as we can

An increasing number of government agencies (NSA, CIA, etc.) and Fortune 500 companies (Kellogg, etc.) are hiring I-O psychologists because the methods demonstrably work, leading to better hiring outcomes, better productivity, etc. It is time, in my view, for academia to broadly follow suit, adopting hiring processes that actually have empirical support and predictive power.

Finally, though, I should add--as I have a few times--that in my understanding actual work-performances (e.g. things like teaching demos, not interviews) have been found to have some predictive value, so this is one place (late in the hiring process) where human judgment probably has a good, legitimate role to play.

anon prime

"But...here's the thing: the empirical literature I'm pointing to shows that, generally speaking, whatever you want to predict, to the extent that you can carefully draw up an algorithm (which does take time and careful empirical work to do), that algorithm will tend to be as good or better at predicting *that thing* than human judgers."

So, if I'm now getting it, your view is largely about procedure. The aim is to better predict the best job candidates given whatever values and priorities an algorithm is designed to reflect. Of course, this would require that an algorithm's designers have in advance a clear idea about those values and priorities.

I'm not sure how well this might address what some of us would think of as deeper, substantive issues of fairness in hiring, especially when not driven by statistics, grant money, awards, or other undisputed measurables. But I do understand how such an algorithm could help a philosophy department, whatever its values and priorities, better satisfy the clear hiring goals it happens to have.

Derek Bowman

Marcus, thanks for the reply (and no rush!).

"We need to not dismiss empirical findings as counterintuitive, but instead learn from them."

This is a very dangerous attitude to have when it comes to moral issues of individual and group choice. One need only consider the many pernicious scientific empirical findings about the relative mental and moral powers of men over women, of white Europeans over other "races" or "civilizations", etc. (See for example Mill's discussion of the science of "brain size" in Chapter 3 of the Subjection of Women). To automatically surrender our judgment in the name of "empirical findings" is a grave abdication of our responsibility to critically incorporate such findings into our own best thinking about the world and our place in it.

"to the extent that you can carefully draw up an algorithm (which does take time and careful empirical work to do), that algorithm will tend to be as good or better at predicting *that thing* than human judgers."

I don't think anyone here has expressed doubts about this claim. What we doubt is that the thing measured by the algorithm will be identical with - or a reliable proxy for - what makes someone a good philosopher-scholar-teacher-colleague. I don't doubt that well-designed algorithms are very good at predicting the things that algorithms can measure and predict. What I doubt is the wisdom of forcing ourselves and our judgments about philosophical merit into such a Procrustean bed.

"in my understanding actual work-performances (e.g. things like teaching demos, not interviews) have been found to have some predictive value, so this is one place (late in the hiring process) where human judgment probably has a good, legitimate role to play."

But we have no plausible cognitive model for why this should be so. By hypothesis, wouldn't we expect to be better off using an algorithm to assess teaching demos, etc? If we're only relying on empirical generalizations, we have no plausible model of human thinking into which to fit these results.

Stacey Goguen

I share Derek's doubt that "the thing measured by the algorithm will be identical with - or a reliable proxy for - what makes someone a good philosopher-scholar-teacher-colleague"--at least given how things stand currently.

My impression is that most philosophers have a hard time articulating what they think makes for a good colleague, or operationalizing it, or coming to consensus or compromise about it with others in their department.

However, I share some of Marcus' support for trying to get to a place where we can have such algorithms. I think it's possible to get to a place where they are reliable--at least for some aspects of hiring in academia. I suspect that we may always want to have human judgment as at least a kind of oversight at the end of the process, but I don't think that what we're looking for is so mystical, or nuanced, that we couldn't offload some of it onto an algorithm.

Again, I worry about us falling into the temptation to think the numbers are more reliable than they are. Or thinking that the only things that exist (or that matter) are the things that can be easily measured.

But even if only to push ourselves to articulate more clearly exactly what it is that we want, I think attempts to quantify aspects of hiring may be beneficial. And I think we can use such quantification without "automatically surrendering" our judgment or shirking our moral responsibility. (Though agreed--that is a danger that must be accounted for.)

Marcus Arvan

Anon prime: That's exactly right. The point is one of procedure. I care very much about the kind of deeper, substantive questions about fairness (etc.) that you mention. By all means, we should examine these issues carefully. However, in the meantime, as Allen Wood's post illustrates, academic searches utilize poor *processes* -- ones that fail to reliably predict outcomes (as well as algorithms) or counteract biases at all (leaving the hiring process entirely up to the whims of individual search committee members and committees). As everyone points out, under prevailing conditions, the job market is little more than a "crapshoot." My point is, yes of course, there are deeper issues to discuss, debate, and improve--but one place we can start is with better processes.

Marcus Arvan

Hi Derek: I think I largely agree with you.

We should not just "defer to science", as science itself can be biased and put to bad social-political uses. We should also not just assume that "the thing measured by the algorithm will be identical with - or a reliable proxy for - what makes someone a good philosopher-scholar-teacher-colleague." These are things that we should -- by all means -- think about carefully.

My points are merely that we need to (A) stop *ignoring* the science, and (B) think carefully about whether prevailing selection/hiring procedures in academia withstand critical empirical scrutiny. Just like we shouldn't ignore climate science in favor of intuitions about the weather [viz. "I don't see global warming. We've had such a cold winter"], we shouldn't ignore the science of selection--especially when, as I'm told, there is such consensus in the field.

I think Wood's posts illustrate just how absurd prevailing hiring procedures are, as well as why algorithms are generally better. As Wood's posts illustrate, prevailing hiring procedures in academic philosophy contain *no* real controls for any form of bias whatsoever (prestige bias, gender bias, etc.). The process tends to be determined by the mere whims of search committee members (and, yes, committee politics, etc.). This, in my view, is why the market is such a "crapshoot", and why it seems to many people not even remotely meritocratic. My point is: if you wanted to come up with a reliable hiring process (in terms of predicting anything), the process we currently have is woefully poor -- not only upon reflection, but given empirical science. And we should at least look to the science to see if we can do better (which, I think, it suggests we can).

My point is that, however one defines "good scholar", "good colleague", or whatever, one should use the best, most validated methods for measuring those things. The empirical literature, as I understand it, strongly suggests that to whatever extent we *can* operationalize things with algorithms, we should--as algorithms are the most reliable way to control for/counteract pernicious biases (biases that don't track merit, but also don't track truth for that matter -- since a lot of the things one looks for in a colleague, "collegiality", can be faked in an interview).

In short, I'm mostly just trying to point out just how problematic prevailing academic hiring methods are, and how they might be improved. I don't pretend to know "The God's Honest Truth" about this stuff...but I do think it is important to discuss and debate it, paying attention to (rather than ignoring) what science there is.

Lovely Colleague


At this point I (and possibly others) would love to know how you conducted your search this year - by "you" I mean both your department and you personally.
