Minimum Pubs for a PhD in CS?


Some of the faculty in my department would prefer that we don’t award a PhD to any candidate who hasn’t published at least three good papers. I’m curious if this is common and if people generally have strong feelings either way about this kind of requirement. Some web searching turned up little information: UConn requires 3 papers, Cornell does not have a requirement, and finally Phil Guo reports that Stanford CS has an informal ~3 paper requirement. It’s pretty easy to put forth both pros and cons for this kind of requirement. I’ll omit my own views on this and maybe write about them later on.

Also see: Discussion at Hacker News.


22 responses to “Minimum Pubs for a PhD in CS?”

  1. I suppose this is put forth as a necessary rather than sufficient requirement. I have a few reactions against it:

    * It may reduce the intellectual authority and accountability of the thesis committee. If the student has published one paper proving P != NP, and the thesis committee believes the work to be correct, shouldn’t the student be granted a PhD? Conversely, a thesis committee might put less effort into verifying the dissertation if it believes the community at large has already vetted it. Note that even for top conferences/journals, reviews often come from grad students.

    * It may discourage good systems development work. For example, Andrew Tridgell authored a very useful tool, rsync, for his dissertation. I don’t actually know how many papers he published about it, but suppose it were fewer than three, and suppose he were faced with choosing between refining the implementation and making additional (perhaps nominal) research contributions. Instead of having an important, widely used tool, we might have a poorly implemented research prototype (that has enough generalities for three publications).

  2. Here in France we don’t have any requirement on the number of publications. Granted, the studies are structured differently and the time assigned to the PhD specifically is much shorter (three years, sometimes four), so requesting 3 publications over this time period would probably be insane for most students (or would require a considerably more generous criterion for giving co-author status to students).

    We are lucky enough to have PhD evaluation committees that do know what they’re reading about, so a thesis manuscript is judged on its own merit, not on surrounding publications if they exist. In fact it is generally the case that the manuscript contains a lot of unpublished results, and post-docs sometimes invest a non-negligible amount of time in publishing chunks of their thesis work (and translating it, if the manuscript was written in French) as conference or journal articles.

    Of course having a pool of committed domain experts can’t last forever, and criteria such as “how many articles published” are taken into account later in the career process, generally when applying for a junior researcher position. But my subjective opinion is that we should enjoy it while it lasts, and that putting pressure on PhD students to get their papers published before writing their manuscript is not necessarily the most productive strategy.

  3. To complete gasche’s comment, in France *in CS* there is no such requirement (but most universities state “The quality and impact of a PhD project, as that of any research project, should be measured through publications and patents that it produces.”)

    In other domains (for instance a bioinformatics PhD which ended up being administratively in the biology department) I’ve seen such restrictions (but AFAIK mostly demanding *one* publication) and they unfortunately often led to bad publications (unfinished work submitted to a low-level journal/conference just in order to fulfill the requirements).

    So… not a good idea, in my opinion.

  4. There is no publication requirement at the University of Chicago’s CS department:
    http://cs.uchicago.edu/info/phd_prog/dissertation

    That said, some of the theory students joke about the staple thesis approach. And certainly there’s a push to publish on it since, as one of my committee members half-joked, “it will be obvious later which sections of your thesis went through peer review at a top-tier conference and which your committee tried to look at before your defense.”

  5. I graduated from the University of Toronto, and there was no such formal requirement there. But my experience was that advisors and committees felt much more comfortable graduating students who had been “vetted by the community” by having papers (2-3?) published in good conferences. My impression when talking to students in other North American schools is that this effect is almost universal.

    But there is something more important than granting, or not, a PhD, with respect to the “magic number” of good papers. It has to do with life after your PhD. If you do an informal poll of PhDs who get academic jobs at good universities and/or go to established research labs, there _does_ seem to be a minimum requirement. And that minimum requirement may be close to 3 top papers (as a lead author, nonetheless).

    The ability to land a job after your PhD has been a major driving force among some PhD candidates and/or their advisors to prolong PhD until their CVs look stellar. Hiring committees seem to be forgiving of the number of years spent on the PhD, but unforgiving in the number of top publications.

    Furthermore, I’ve been getting the impression that this “3” number is becoming outdated. And now the advisable thing to do is to jump into a post-doc so that you can turn 3 top papers into 5 or 6, to raise the chances of standing out next time you are on the job market.

    So while I don’t think it makes sense to have any sort of requirement on number of top/good papers (as someone mentioned, a single P != NP paper can be more than PhD worthy), it ends up being an informal way other institutions judge the quality of graduates. So, because everyone wants their department to be considered great, this pressure will likely always be there.

    One would need to fix the issue of paper counting that is endemic to our community as a whole before having a chance at a meaningful debate about whether $x number of papers means that one’s work is worthy of a PhD.

  6. I believe these norms are also dependent on the advisor sometimes. Some professors prefer to keep the bars high.

    On a side note, another thing that bothers me, which is specific to CS (at least the subarea I work in), is that conference papers get way too much importance. They involve the submission of 8-10 page papers, followed by peer review and rebuttal, which takes over 3 months. On top of that, conferences have deadlines, which cause a lot of stress each year and eat up a lot of valuable time. A bad side effect of all this is that people seem to be doing research just to publish. And sometimes people end up making sub-optimal scientific compromises in their methods just to meet the conference deadlines.

    In my opinion, one should publish only when one has done something phenomenal or has discovered something really interesting to share so that the community at large can benefit from it. If not, one just shouldn’t bother publishing.

    In other scientific fields, the point of conferences is only to network and meet others in your community, which is exactly the motivation behind organizing them. People just send abstracts or a poster and then go to network. And all the real work is published in journals, which don’t have deadlines.

    Also, I always wondered why the journals in CS (at least in my subarea of research, computer vision) have far lower impact factors than journals of biology, physics, or chemistry. I think the impact factor is decided based on the average number of citations for papers in the journal per year. So, I would guess the reason behind a low impact factor could be that there is a lot of reinventing the wheel going on? Correct me if my guess is wrong. And reinventing the wheel means that, at large, people in the community are not building on each other’s work. This means they are always trying to come up with some novel way to solve a problem each time, whether or not it solves the problem better than approaches that have already been published. At the end of the day, as time progresses, we as a community should focus on solving a problem in a better way (improved accuracy) rather than coming up with more and more novel ways. On a philosophical note, I feel that currently we have a tendency to put in more and more plants (and let them die) rather than nurturing the existing plants and growing them into large, magnificent trees.

  7. If you don’t have 3 good papers it shouldn’t disqualify you, but it’s kind of like “code smell” – it raises questions that you’d better be able to answer while defending your thesis.

    The first question if the candidate doesn’t have papers is why? (a) didn’t want to – that should be discouraged, publishing your research is valuable; (b) didn’t have anything to publish – then you don’t have a valid thesis; (c) couldn’t/wasn’t accepted – a whole set of other important questions come to mind.

    Is your topic noteworthy and useful? If your papers aren’t accepted anywhere (or anywhere good) then it suggests that maybe the world doesn’t consider your topic particularly important.
    Are your results/methods/whatever valid? Peer review and citations of your papers are some evidence towards that; the whole process does help people improve their research and fix the weak points.

    Of course, the committee may and will form its own opinions, but in all these cases publications (or the lack of them) provide a valuable outside opinion about the quality, quantity and importance of your research.

  8. @Sam

    I feel your suitemate’s pain on publishing a GC-related paper. After bouncing through several venues, I ended up with mine at a workshop, having given up on a top-tier conference. The reviews often ended up requesting things that were _years_ of work — reimplementing several other types of GC and porting the GC to run in the Jikes RVM to evaluate against them were two of the biggest ones (never mind that our GC relied on language-specific semantic guarantees).

    The only way that I or any of my students (if I am so lucky!) will ever try to publish something GC-related again is if we can concurrently implement it in some larger GC testing framework. I just don’t know how you’d publish at a Serious Venue otherwise, unless you are the very first GC implementation on a new hardware architecture.

  9. I think the way you’ve summarized Phil’s comments about # of papers to graduate is inaccurate. The point made in that essay is that there are a number of ways to get to graduation with a varying number of publications. Using the raw # of pubs just ignores the quality, whereas Phil’s suggesting it’s something closer to quality * quantity. A single top-tier best paper could graduate you, or 0 top-tier but a few second-tier papers.

    In my opinion, trying to standardize this is just silly since there isn’t a real quantitative measure to work with. I’m sure nobody at UConn would have a hard time getting 3 publications if they’re willing to submit to the crappiest conferences. It just puts a lower bar in.

    It also, unfortunately, encourages grad students to aim lower: if I had a minimum of 3 papers, I’d submit earlier, less complete work to second-tier journals to ensure I hit the required number of publications in a reasonable time. Since so few grad students actually stay in academia, it wouldn’t be worth their time to aim for a top-tier conference.

    Of course, this entire discussion was missing the consideration of the topic. Do those papers even need to be in the same area of research, or is it just a raw publication count? Do they even need to be related to your dissertation?

  10. I graduated from the University of Southern California in Computer Science. We didn’t have any such requirement either. I guess the discussion here is about what the bare minimum requirement is to grant someone a PhD. And underlying that question is a generic judgement being passed by the department as a whole, which I find very unfair. There is a great disparity between research areas in terms of how easy it is to make a contribution and get a paper published in the top conference in that field. Many fields (such as robotics) might have conferences that accept short papers with the understanding that the completed work will go to a journal. Turnaround times for journals are much longer than for most conferences. Also, in a given conference, we can find a range of papers from those that make a significant contribution to those that are marginal improvements over known techniques. Finally, reiterating a point others have made, I believe that there is a lot of learning in engineering things that don’t really lead to papers. Building systems, engineering them to work correctly and efficiently, maintaining them for use by others, etc. are indirectly disincentivized in this system.

    Bottom line is that the number of publications is specific to the area and style of research, and is probably best left to the judgement of the advisor.

    Karthik.

  11. I went to Georgia and we did have an unofficial rule about papers. I have mixed feelings on this one. Impact of research is certainly not measured by the quantity of papers, but rather by their quality and how fundamental the work itself is. That said, I have seen students who slack (grad school is a great place to be for someone who doesn’t really want to work) and really contribute nothing. The publication rule pushes them. In my mind, a solution would be to require students to write technical reports and have them reviewed by faculty in the department or by their friends outside (if such work is not yet ready for publication, it will still measure the progress the student is making). You get a Ph.D. as long as it is clear you have made strides in your area and have a good path to a solution. Research is not about solving problems, but rather about clearing the weeds on the way so that someone else can come and lay the road.

  12. Thanks for the responses, folks! It makes me happy to see that hard requirements for publications are uncommon. The tendency to over-publish (mostly, but not entirely, at 2nd-tier and weaker venues) is already doing our field a disservice, and giving students a firm requirement doesn’t seem like a useful way to make this situation better. Unfortunately, at present the people who want to get hired into research faculty jobs need to be paper machines and I think this is probably also hurting the field since it will tend to select the wrong people (or perhaps it’s better to say that it deselects some of the right people).

  13. From my experiences in software engineering I have a couple of observations:

    1. Australian universities are adopting the “Stanford criteria”, yet having 3 yr (4 yr max) PhDs. So the required output is higher for the available time.

    2. Publishing is the key to grants and tenure so PhD students are competing with experienced researchers to get published. When you have 16%-18% acceptances for the “suitable conferences” it can be very hard to get published, even if the material is good.

    3. The requirement to publish early and often favours incremental work instead of truly novel advances. This in turn favours PhD candidates who work on their supervisor’s pet project(s).

    4. PhD research is seen as a stepping stone to a career in research and academia. The system, as it currently stands, disadvantages experienced engineers returning to pursue ideas which are too advanced to be of interest to industry and yet could significantly advance the state-of-the-art.

  14. It completely depends on the subfield, in my opinion.

    But a hard limit is silly, other than advisors warning students with certain job goals they need to jump through certain hoops to have a chance.

  15. AFAIK, CMU has no hard limit, and in some sub areas getting to 3 is hard, but most students who actually graduate do seem to pass the three bar with lots of room to spare.

  16. In Switzerland, at the universities with which I am familiar, the rule of thumb is that three papers is a good number for a PhD. But making that requirement formal is a different thing. It would mean that the entire intellectual contribution of a PhD can be made equivalent to a simple metric: 3 papers? In what venues? And why 3? And should those papers be cited by anybody or not?

    This is going to put pressure on the students to publish, and will lower the bar. I know people who didn’t publish anything during their PhD but whose work now has a very large impact.

    If you must have a rule, then maybe ask for 1 paper. You don’t have to defend the number of papers, only the idea that the community has to accept at least part of the student’s work, and that the student must learn about the publication process.

    Cheers,
    M

  17. At the University of Manchester (UK) there is no limit, formal or informal, per se, although some students believe there is (in spite of what we tell them :)). In addition to the fact that students have different shapes to their course of study, different fields have different publication patterns. (E.g., certain very mathematical areas publish very little.)

    I think publications are potentially useful evidence, but no more. They are neither necessary nor sufficient for a PhD.

    Manchester does have a route for employees to achieve a PhD via publication of three (connected, high quality, journal) papers. I’d favor this as a general alternative to the PhD.

    Publishing in high acceptance venues such as workshops is generally valuable (forces students to package things up, puts the work out there and gets independent feedback, etc.). That’s a bit in tension with using publications as a pass marker.

  18. “Papers where you are not listed as the first author usually don’t help you to graduate, but they can still be valuable learning experiences.”

    (from Guo’s article)

    This is something that (slightly) troubles me — now, and as a member of a hiring committee. Academic evaluation of candidates seems to really focus (perhaps to scale up) on author order, but I know from experience that the group I was in at CMU almost always did pure alphabetical ordering, unless there was someone who essentially contributed nothing beyond a proofread and some very minor idea. Ed Clarke’s group isn’t the only place this happens, either — I know some labs where this is at least frequent if not always done. There are sub-communities in CS where this is more popular — in particular, formal verification seems to have a fairly significant inclination to alphabetical ordering.

  19. In response to Karthik G.

    “In my mind, a solution would be to require students to write technical reports and have them reviewed by faculty in the department or by their friends outside.”

    As a faculty member, the last thing I need is more work on my plate that does not yield any CV-relevant gains (e.g., papers, grants, etc.)

    On a cynical note, I think we should give students credit for writing grants. If a student can write a competitive grant for me that gets accepted, it should count for 5 publications, at least in terms of lowering my blood pressure. (Don’t take this seriously…).

  20. Hi Alex,

    Yeah, inferring from order is annoying. I’m in a sort of mixed group (in my prior position, order was agonized over; the people I came to generally went alpha; now we have some weird hybrid system).

    It’s probably worth encouraging the practice that the bio and medical folks do of including a description of the actual contributions.