Anatomy of a Paper: Part I, Inspiration

How does theoretical physics get done? I had my first exposure to research doing observational astronomy as an undergrad; it was fascinating, following the process all the way from spending freezing nights at the telescope collecting photons, to reducing the data, seeing what the light curves taught you about the stars, to finally writing a paper. But I knew all along that I really wanted to be a theorist. Looking at those papers with their incomprehensible Greek indices filled me with anticipation for the day it would all finally make sense. (Eventually you realize that more and more of it does make sense, but it never all makes sense, or anywhere close. Most of your time is spent thinking about the parts you don’t understand.)

But I had no idea how such papers were actually produced — where did you start? When I was looking at grad schools, I took the train up to Princeton to visit the physics department and knock on people’s doors — rather less planned out than I would advise anyone else to do. (You couldn’t Google people back then.) I found one guy who was sitting in his office, a faint smell of cigar smoke in the background, scribbling equations on a legal pad. Looked promising. I introduced myself and asked a few silly questions, among which was “How do you do research?” He leaned back, propping his sneaker-clad feet onto the desk, fixed me with a look and said “I don’t know. You just have an idea, and then do research about it.” As advice goes, it was more Delphic than practical. I didn’t know at the time that this guy would later be my boss for a while, and eventually win the Nobel Prize.

So I thought it would be fun to describe the process in a bit more detail, using a worked example. It is no exaggeration to say that every paper is different, but there might be some useful lessons in there somewhere. I recently finished a paper with Lotty Ackerman and Mark Wise that is a pretty canonical example — a solid paper, not something earth-shattering that will change the face of science as we know it, but a meaningful contribution with some good ideas and some useful equations. Well, it was “recently finished” when I began writing this monstrously long post, which by now was many months ago. So I’ve decided to divide it into pieces — this will be the first of a three-part series.

Lotty is a grad student here at Caltech; she had previously worked with Mark, who is a respectable particle theorist in the office next to mine. He knew that she was cosmologically inclined, so introduced Lotty and me to each other even before I officially arrived. I suggested to Lotty that we begin to think about density perturbations in inflation (the hypothetical period of accelerated expansion in the early universe), as much because I wanted to learn more about the subject as for any more focused research goal. I’m not the best advisor in the world; I have lots of ideas, but they inevitably start out rather ill-formed, and most of them stay that way. Occasionally one of them coalesces out of the fog into something substantial, and a paper gets written. It’s a harrowing way to operate, especially from the grad-student perspective.

One day Lotty was having lunch with Jonathan Pritchard, another grad student here, and they wondered out loud what would happen if inflation didn’t happen the same way in every direction in space. That is, what the consequences would be if there were some direction picked out throughout the universe, so that inflation occurred at a different rate (or something) parallel to that direction than perpendicular to it. Presumably there would be something different we could observe about the density fluctuations if we looked along that particular direction than if we looked in another direction, but what exactly? How could we tell? And is there some physical mechanism we could imagine introducing that would actually pick out a direction during inflation, and then (just to keep things simple) disappear afterwards so that we wouldn’t notice it today? Don’t ask me why they thought of it. Just the kind of thing you chat about at lunch all the time, if you happen to be a theoretical cosmologist.

This kind of meandering speculation is one way papers get started. You (if you’re like me — I can’t speak for other people) never sit down and say, “Let’s have an idea.” Some people are fortunate enough to have programmatic, focused research agendas — when I was a postdoc at MIT in the early Nineties, Ed Bertschinger had collected around him an amazing set of postdocs and grad students, all focused on understanding temperature anisotropies in the cosmic microwave background and what they could tell us about the universe. It was a great moment to be thinking about those issues, and a lot of those students are now high-powered faculty members with groups of their own. But most theorists are not quite so systematic. You noodle over problems, talk to other people with similar interests (or complementary skill sets), make connections between different ideas. Occasionally a flash of insight will hit just before you fall asleep, or while you’re waiting for the barista to make your latte.

(I should make clear that this particular “What if?” question is not completely unmotivated speculation. Inflation is a great theory, and is likely to be “right” in some yet-to-be-defined sense, but it’s not something that anyone should think we more or less understand. We’re extrapolating well beyond known physics, so it pays to keep an open mind. One way of forcing yourself to keep an open mind is to ask specific and testable questions about the space of possibilities encompassed by your ideas.)

We’re passing over pretty quickly what is the key step in the whole process of paper-writing — asking a good question, and equally importantly, realizing that it’s an interesting question, and that there is a way to answer it. The rest is the straightforward part, staying up late and solving equations. Unfortunately, there’s no known way to formalize this process of recognizing good questions. You’d be surprised at how often, once you’ve had your basic training, you read someone else’s paper and think “I totally should have thought of that first.” But you didn’t.

A quick glance at papers appearing on the arXiv shows that the vast majority of them are collaborative, rather than single-author. That’s a reflection both of how ideas arise in the first place (friends chit-chatting over lunch, or via email, or at conferences) and of how the work actually gets done: often enough, one person will have an idea but someone else will have the expertise necessary to bring it to fruition. I will never understand how people can suggest replacing conferences or seminar visits with talks broadcast over the internet. That’s like trying to improve a restaurant experience by making sure the plates and cutlery are really shiny, and doing away with the food entirely. Conferences aren’t about talks, although those are occasionally interesting. (David Lodge, in Small World, holds up the ideal conference as one in which there aren’t any talks at all.) They’re about the ongoing low-level interaction between the participants at meals and coffee breaks. That’s where the ideas get created! Then you can each go home and apply yourself to the nitty-gritty work of turning those ideas into papers.

Speaking of which — the answer to the inflation-with-a-preferred-direction question wasn’t obvious, so Lotty asked Mark about it. (Who knows where I was — off traveling, probably.) He didn’t know either, but it sounded like an interesting question. So (as one will do) he started scribbling down some models of inflation that might behave that way. Basically, trying to invent a way to allow the negative pressure associated with the inflaton field (the hypothetical field whose energy drives the hypothetical accelerated expansion) to be direction-dependent. We have some general pre-existing ideas about how inflation might conceivably work, and a good field theorist has a bag full of models that can be shaped into different forms depending on the problem under consideration, so it was a matter of asking how easy it would be to tweak those models to give them a preferred direction.

When I did eventually drop by my office, Mark mentioned the idea to me. It sounded interesting, but I didn’t have anything insightful to add off the top of my head. But that afternoon there was a physics colloquium, during which my mind wandered, and I started thinking of different ways the inflaton might get a direction-dependent pressure. After the talk, I went to Mark’s office to say “Your idea is crazy, but here’s an idea that might work.” The next day, Mark gathered Lotty and me into his office to explain why my idea was crazy, but he had a new idea that might work. That process continued for a while, back and forth between the three of us; suggesting models, finding reasons why they should be discarded, realizing that a previously-discarded model might be able to sidestep the previous objections, and so on.

Along the way, we had the good idea of thinking about the problem in a completely phenomenological, model-independent way. In other words, there is one particular final product that we get out of inflation: a power spectrum that tells us how strong the density fluctuations are at any given length scale. From this there is a well-understood procedure to predict temperature fluctuations in the cosmic microwave background, which are the most directly observable consequences of the primordial density fluctuations. So, the phenomenological approach is to forget about particular models of inflation, and simply ask what kind of impact a preferred direction could possibly have on the power spectrum (and thus the CMB).

In principle, the answer is “all sorts of impacts.” The perturbations are generally described in terms of an amplitude defined at each length scale and each direction on the sky. (More technically, we express the power spectrum in Fourier space as a function of the wavevector.) In the usual description, every direction is deep down the same as every other, so really the power spectrum is just a function of the length scale. Even better, to a good approximation inflation predicts that the amplitude of the fluctuations should be scale invariant — a constant value, the same at every length. So really the complete power spectrum is specified by just one number! That’s the “amplitude of the primordial density fluctuations.” Having only one parameter makes your theory extremely predictive, which is why we can squeeze so much information out of the data we get from WMAP, for example. (Of course, we immediately start adding new parameters, but that’s another story.)

But now that we have a preferred direction, we can imagine that the perturbation amplitude really does depend on the direction we look in on the sky. (The fluctuations might be a bit stronger [or weaker] if we happen to be looking exactly along the direction that was picked out as special during inflation, in other words.) Furthermore, in principle it could have a different impact at every different length scale! So, not very predictive.

On the other hand, there’s a good physical reason why the perturbations from ordinary inflation are scale-invariant; the process of inflation itself is basically the same during most of its duration. While inflation is going on, the universe is expanding at an approximately constant rate, and stretching tiny quantum fluctuations into large-scale density perturbations. Because the process of inflation is uniform, the amplitude of the resulting perturbations is (basically) uniform.

Therefore, we should (even in the absence of any particular model) be able to apply similar reasoning once we stick in a preferred direction. Our idea was that there would be some violation of rotational invariance — either inflation would happen slightly more rapidly along some particular direction, or the decay of the inflaton field into ordinary matter and radiation would be more efficient along some direction, or whatever — but it would have a constant magnitude while inflation was happening. So we should expect the new effect (which we were imagining to be small, given that it’s certainly not bloody obvious in the existing CMB observations) would also be scale-invariant! That makes life much simpler. We’re now suggesting that, instead of the primordial perturbations just having a single amplitude that is independent of both direction and length scale, there is a tiny extra modulation of the amplitude that depends on one new pure number (to specify how big the effect is) and one direction on the sky (corresponding to the preferred direction). In other words, three new parameters: one magnitude of the effect, and one position on the sky (specified by right ascension and declination, for example).
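To make the parameter counting concrete, here is a minimal sketch in Python. It assumes, purely for illustration, that the modulation is proportional to the squared cosine of the angle between the direction you look in and the preferred direction; that functional form, and the names `unit_vector`, `modulated_amplitude`, and `strength`, are assumptions of this sketch and not something stated above. The three new parameters are `strength` (the pure number) plus the right ascension and declination of the preferred direction. Nothing in the function depends on length scale, which is the scale-invariance assumption.

```python
import math


def unit_vector(ra_deg, dec_deg):
    """Convert right ascension and declination (in degrees) to a 3D unit vector."""
    ra = math.radians(ra_deg)
    dec = math.radians(dec_deg)
    return (
        math.cos(dec) * math.cos(ra),
        math.cos(dec) * math.sin(ra),
        math.sin(dec),
    )


def modulated_amplitude(base_amplitude, strength,
                        preferred_ra_deg, preferred_dec_deg,
                        look_ra_deg, look_dec_deg):
    """Perturbation amplitude in a given direction on the sky.

    Hypothetical form (an assumption of this sketch):
        amplitude = base_amplitude * (1 + strength * cos(angle)**2)
    where angle is measured between the look direction and the
    preferred direction. There is no dependence on length scale.
    """
    n = unit_vector(preferred_ra_deg, preferred_dec_deg)
    k = unit_vector(look_ra_deg, look_dec_deg)
    cos_angle = sum(a * b for a, b in zip(n, k))
    return base_amplitude * (1.0 + strength * cos_angle ** 2)
```

With this particular choice, looking along the preferred direction and looking in exactly the opposite direction give the same amplitude, and looking perpendicular to it returns the unmodulated value.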

Can we make any money out of this bold assumption? Or have we been led by the drive toward simplicity down an ultimately futile line of speculation? Stay tuned for the exciting Part Two! And then for Part Three!

  1. I’m glad to see discussion and support for talking about asking questions to get started, not just focusing on presenting and discussing fully formed theories and critiques. That is the sort of thing a gifted amateur or outsider can do sometimes, as I have argued here recently. Then the experts can finish what was started in a way the amateur likely could not, possibly leading to new understanding. I realize we may not have found much substance in all the reflections printed in Foundations of Physics, Physics Essays or Speculations in Science and Technology, etc, but those publications give or gave a forum for thinkers (less restrictively filtered than say, at Physical Review) to ruminate in a more speculative way.

  2. I absolutely loved the post. As a freshman in college with possible aspirations to go into physics, I really love reading first hand accounts of what doing physics is like. I can’t wait for part 2!

  3. Hi Sean,

    The breaking of rotational invariance in the dynamics of the perturbations should also generate a ton of vector modes, which will produce lots of B and E polarization modes on the CMB, which should give you a strong constraint on the vector norm. I wanted to do a quick estimate of it one day and email it to you when the paper came out, but I got sidetracked with holidays and other such fun stuff at that time and completely forgot about it. So I figured here is as good a place to mention it….

  5. Eugene, we’re imagining that the vector expectation value magically goes to zero after inflation. So my guess was that any purported vector perturbations would rapidly decay. We didn’t have to make that assumption, of course, it just seemed to make our lives easier.

    If there is something to say, you should write a paper! Or we should write one together. (See how the wise elder colleague tries to coast on the ideas of their [former] students?)

  6. Sean, you have some great posts for aspiring undergraduates. Especially those who contacted more professors than they should have as an undergraduate (I have Google search), technically did Astronomy research as an undergraduate (will be posted to astro-ph any day now, I swear), would like to move on to theoretical cosmology in grad school, and look forward to when their advisor wins the Nobel Prize.

  7. sean! very much looking forward to that completely necessary book “letters to a young physicist” by sean carroll (and whoever else wants to jump on board). seriously. also in all seriousness, thanks for this post.

    recently, i have been thinking a lot about this process, and it is useful to hear about it from outside my normal realm at PI. Specifically, I think one of the challenges for theory students is that all of the guide books about how to do research really target people who are grappling with data as well as ideas. When all you’ve got are ideas and the attempt to stretch them, things get … murky. so, i think we can take all the advice we can get about how to unmurk as much as possible.

    Having said all that, I have a set of questions that have been coming up a lot for me recently: How do you evaluate whether a problem is worth pursuing? When do you decide to give up? As a graduate student with the attending pressures, how are these evaluations made? How do you adjust this decision making as you move up the career ladder?

    I feel as though it is hard to get thoughtful answers to these questions, but I know we can rely on your willingness to “waste” time on pedagogy to give them some real thought if you have a moment to do so. 🙂

  8. Pingback: Anatomy of a Paper: Part II, Calculation | Cosmic Variance

  9. Chanda, these are good questions, of course. Some problems you think are interesting; some problems you think are ones to which you could make a useful contribution; some problems are the kind of things you should work on to advance your career because other people agree that they are interesting. The right thing to do is to work on problems in the intersection of these sets. If the intersection is empty, then you might be in trouble.

    How to find such problems, and when to give up, etc., are all perfectly reasonable questions that don’t have straightforward algorithmic answers. My motivation behind these posts was to provide a close-up look at one specific example of a successful paper. It’s only by considering examples that anyone can hope to develop judgment about how to choose good problems. Unfortunately, every problem is different, and at some point you have to take the plunge and hope for the best.

  10. fair enough. I certainly like the idea of explaining how an idea is generated, studied and published, and it would be nice if more people did something similar. Hopefully other bloggers will feel inspired. Incidentally there was something similarly directed today on asymptotia, although perhaps not purposefully. But again it was a post about how an idea came to be and yet another example which I appreciated seeing.

    cheers and thanks 🙂

  11. That’s the trick; you have to see a variety of examples, and work through some of your own. And then people will be asking you how you do it, and you won’t be able to give them quite a satisfactory answer.

  12. As a graduate student I should say it is common practice to add your advisor to the list of authors of your paper, whether or not he/she has anything to do with it. But that is not really the case: you wouldn’t have the chance to finish your paper as a student without your advisor. What is my point? I want to say that I get confused about who should be in the author list and whose name should go in the acknowledgements.

  13. Hi Sean

    I’m reading your post while I’m working on a paper for a conference, which I should deliver tomorrow. I find this kind of post very inspirational because it reminds me why I decided, after a successful career in industry, to dedicate all my energies to basic research.

    I think it is that indescribable excitement of getting closer and closer to a good idea that keeps you going, even when things don’t follow a clear path from the initial “what if” to the final paper (if they are published at all!).

    Maybe next time you could also talk about the unsuccessful papers, or about those ideas that make sense only after being in the background of your mind for many years, which in my experience are two common cases.

    Now, I’m going back to my paper . . .


  14. A very interesting post, to see a close-up of what a physicist (a speculative ambition, I suppose…) is like. Let me see the next part…
    By the way, there are quite a few things that I don’t understand in this article… Guess I’m still far from knowing “true” physics.

  15. Think about Einstein, Dirac, Feynman: the best theoretical physicists don’t really need to write many-author papers! Nowadays, people get full credit for contributing 1/3 or 1/4 of a paper. How many people have claimed they wrote 1/4 of a paper this year?

  18. Sean:
    I have always wondered how the inflation field (scalar field) is conveyed over space as space expands faster than the speed of light? A typical answer is: not to worry because you cannot communicate any information via this field. Hmmm. No information. Is not the expansion rate dictated by this field? Is this not information? This question has been a source of concern ever since I read about inflation back in the mid eighties. I have yet to read more than the info scenario as the reason for accepting that the expansion rate can be greater than the speed of light. Now you want to suggest some asymmetry in the expansion!! What exactly is the mechanism that transmits this inflation field over space as the space is expanding?

  19. “I have always wondered how the inflation field (scalar field) is conveyed over space as space expands faster than the speed of light? A typical answer is: not to worry because you cannot communicate any information via this field. Hmmm. No information. Is not the expansion rate dictated by this field? Is this not information? This question has been a source of concern ever since I read about inflation back in the mid eighties. I have yet to read more than the info scenario as the reason for accepting that the expansion rate can be greater than the speed of light. Now you want to suggest some asymmetry in the expansion!! What exactly is the mechanism that transmits this inflation field over space as the space is expanding?”

    I am not sure I understand the question.

    The evolution of the inflaton field (and the metric) is governed by causal local equations. So there’s clearly no issue of superluminal communication there.

    There is a bit of a mystery about the initial conditions for inflation to start. They require that the inflaton field (and the metric) be fairly homogeneous over a region somewhat larger than horizon at the start of inflation. That’s hardly impossible, but it is a bit disappointing.

    However, from the text of your message, it doesn’t seem to be what you are asking about. You don’t seem to be worried about the initial conditions, but rather about the subsequent evolution during inflation.

    But I don’t see why.

  20. Jacques:
    Thanks for the reply. OK, let me try to explain why I have a difficult time grasping this expanding scalar field during inflation. If we want to study the propagation of a temperature field (scalar) we solve a PDE. The scalar field propagation speed is clearly limited. What I had in mind were similar equations for the inflaton field. But it seems that the propagation speed of the inflaton field is directly tied to the expansion speed of space, which is also tied to the value of the inflaton field.

    If this is a quantum field and obeys all the QM rules how can it travel faster than the speed of light? I have read all of the layman’s books like Guth’s and others and this question never seems to get answered satisfactorily, at least to me.

    Well if you have a readily available reference I will just refer to that and thank you for your time.

  21. If we want to study the propagation of a temperature field (scalar) we solve a PDE. The scalar field propagation speed is clearly limited. … If this is a quantum field and obeys all the QM rules how can it travel faster than the speed of light?

    Quantum, schmontum! The question is a purely classical one.

    The evolution of the inflaton and metric are governed by a well-known set of coupled PDE’s. In fact, because we are interested in spatially homogeneous solutions, we can reduce these to coupled ODE’s.

    I really don’t grasp what it is about those equations that makes you think their solutions should be acausal.

  22. Nice post Sean!

    In my case rather timely in fact, as I’ve just learnt a little about density perturbations in my Cosmology class.

    One question I have (that my lecturer couldn’t answer and indicated that no one really knows):

    Why did inflation stop?
