Dr. Philipp Hennig

Address: Spemannstr. 38
72076 Tübingen
Room number: 223
Phone: +49 7071 601 572
Fax: +49 7071 601 552
E-Mail: phennig

Philipp Hennig

Position: Group Leader  Unit: Schölkopf


See also my CV for more information on myself.

My work concerns

Probabilistic Inference for the elementary level of Intelligence

Intelligence is the ability to act under uncertainty. It exists across a broad range of scales, not just physical but also computational: from simple strategies like gradient descent, which may be a microbe's way of getting closer to a source of nutrients, to an adult human's reasoning about career goals. Much of modern research in machine learning and artificial intelligence aims for the top of this hierarchy: algorithms capable of building highly structured models and making complicated decisions, at high computational cost. I believe that there is still plenty of room for improvement at the bottom, too.


Algorithms for the bottom end of the intelligence hierarchy are those constructed by numerical mathematics. They are methods that take as input a function and return elementary properties of that function that are not tractable from the analytic form alone: Optimizers return the location of (local or global) extrema. Quadrature methods return the values of integrals. Sampling methods interpret the function as an unnormalised probability distribution to draw random numbers from. Differential equation solvers and control algorithms treat the function as describing a dynamical system to simulate. It is not a new idea, but still a little-known one, that all these methods can be seen as performing inference: making statements about uncertain quantities given observations of related quantities.
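
To make the "quadrature as inference" view concrete, here is a minimal sketch of a classic textbook illustration; it is not code from any of the papers listed on this page. It assumes a Wiener-process prior on the integrand, with covariance k(a, b) = min(a, b) on [0, 1], which also pins f(0) = 0. Under that assumption the posterior mean of the integral should coincide with the trapezoid rule, and the posterior variance gives an error estimate. The names `solve`, `wiener_quadrature` and `trapezoid` are hypothetical helpers written for this example.

```python
import math


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]  # augmented copy
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x


def wiener_quadrature(f, nodes):
    """Posterior mean and variance of the integral of f over [0, 1],
    assuming a Wiener-process prior with k(a, b) = min(a, b)."""
    y = [f(x) for x in nodes]
    K = [[min(a, b) for b in nodes] for a in nodes]
    # z_i is the integral of min(x, x_i) over x in [0, 1].
    z = [x - 0.5 * x * x for x in nodes]
    w = solve(K, z)  # quadrature weights K^{-1} z
    mean = sum(wi * yi for wi, yi in zip(w, y))
    # Prior variance of the integral is 1/3; observations reduce it.
    var = 1.0 / 3.0 - sum(wi * zi for wi, zi in zip(w, z))
    return mean, var


def trapezoid(f, nodes):
    """Trapezoid rule on the nodes, with the point x = 0 prepended."""
    xs = [0.0] + list(nodes)
    ys = [f(x) for x in xs]
    return sum(0.5 * (ys[i] + ys[i - 1]) * (xs[i] - xs[i - 1])
               for i in range(1, len(xs)))


def f(x):
    return math.sin(3.0 * x)  # satisfies f(0) = 0, as the prior assumes


nodes = [i / 10 for i in range(1, 11)]  # last node sits at x = 1
mean, var = wiener_quadrature(f, nodes)
print(mean, trapezoid(f, nodes), var)
```

The point of the sketch is only that a familiar rule falls out of a probabilistic model; a different prior would yield a different rule and a different error estimate.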


These algorithms are the building blocks for the more complex, expensive, fancy top-level intelligence. So they have to be modular, to be re-usable. They have to be robust, because their failure may cause big problems upstream. And of course they have to be cheap. In my work, I try to address these requirements. Here is a selection of some of it. See "publications" for pdfs and detailed citations, and my CV (link above) for more information.


quadratic optimization under noise

Stochastic gradient descent is still the dominant algorithm for the training of many online learning algorithms, like neural networks. All just because more elaborate ideas, like quasi-Newton methods, cannot deal with noise? See what can be done about that: Hennig. "Fast Probabilistic Optimization from Noisy Gradients". ICML 2013


nonparametric quasi-Newton methods

Did you know that BFGS is a least-squares regressor? See what happens when you make it nonparametric: Hennig & Kiefel. "Quasi-Newton methods, a new direction". ICML 2012
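
As a hedged illustration of the least-squares flavour of quasi-Newton updates, the sketch below uses Broyden's rank-one update rather than BFGS itself, because it is the simplest member of the family. It is a standard textbook fact, not the construction from the paper cited above: among all matrices that satisfy the secant equation B_new s = y, Broyden's update is the one closest to the previous estimate B in Frobenius norm. The helper names and the numbers in the example are made up for illustration.

```python
def broyden_update(B, s, y):
    """Broyden's rank-one update: B_new = B + (y - B s) s^T / (s^T s)."""
    n = len(s)
    Bs = [sum(B[i][j] * s[j] for j in range(n)) for i in range(n)]
    r = [y[i] - Bs[i] for i in range(n)]  # residual of the secant equation
    ss = sum(v * v for v in s)
    # Row i of the correction is the minimum-norm solution of d_i . s = r_i.
    return [[B[i][j] + r[i] * s[j] / ss for j in range(n)] for i in range(n)]


def matvec(A, v):
    return [sum(a * b for a, b in zip(row, v)) for row in A]


def frobenius_distance(A, B):
    return sum((a - b) ** 2 for ra, rb in zip(A, B) for a, b in zip(ra, rb)) ** 0.5


# Previous estimate, step s and observed gradient change y (arbitrary values).
B = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
s = [1.0, 2.0, -1.0]
y = [0.5, -1.0, 2.0]
B_new = broyden_update(B, s, y)

# Build a competitor that also satisfies the secant equation by adding a
# perturbation N with N s = 0 (an arbitrary matrix M projected away from s).
M = [[0.3, -0.2, 0.1], [0.0, 0.4, -0.5], [0.2, 0.1, 0.3]]
ss = sum(v * v for v in s)
Ms = matvec(M, s)
N = [[M[i][j] - Ms[i] * s[j] / ss for j in range(3)] for i in range(3)]
B_other = [[B_new[i][j] + N[i][j] for j in range(3)] for i in range(3)]

print(frobenius_distance(B_new, B), frobenius_distance(B_other, B))
```

Both candidates reproduce the observation y exactly, but the Broyden update moves the estimate the least, which is what a minimum-norm least-squares fit to a single data point does.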


information-efficient experimental design

When optimizing experimental parameters in search of a global optimum, algorithms shouldn't try evaluating close to the optimum. They should try to evaluate where they expect to learn most about the optimum. Hennig & Schuler, "Entropy Search for Information Efficient Global Optimization". JMLR 13 (2012).


optimal Bayesian reinforcement learning

Probability theory offers a uniquely coherent view on the infamous exploration/exploitation tradeoff: From the Bayesian view, reinforcement learning is about modelling the effect of possible future observations on the optimality of decisions taken in the present. In general, this decision process is intractable. But under Gaussian process assumptions (which, depending on how one looks at it, is either a quite general or a quite limited set of assumptions), the right answer moves within reach of numerical analysis. Hennig, "Optimal Reinforcement Learning for Gaussian Systems", NIPS 2011


kernel topic models: fast inference in dependent Dirichlet models

Topic modelling is a very popular area of machine learning at the moment. Documents come with metadata, and topics change over time, and from document to document depending on the author, the subject, and many other features. The probabilistic extension of topic models that allows modelling such effects requires an algorithmic link between discrete distributions and continuous domains, often realised as a set of "dependent Dirichlets". We pointed out how to do this, in a numerically extremely efficient way. Hennig, Stern, Herbrich and Graepel, "Kernel Topic Models". AISTATS 2012.


Bayesian tree search

Tree search, finding the optimal leaf of a tree, is exponentially hard in the depth of the tree, because trees are exponentially big in their depth. But what happens during that exponentially long search? Suppose you have a probabilistic belief over the value and location of the optimal leaf, and get one more observation of an individual leaf's value. Shouldn't updating the belief cost only linear time? It does. Hennig, Stern and Graepel. "Coherent Inference on Optimal Play in Game Trees". AISTATS 2010


Articles (5):

Hennig P (2014) Probabilistic Interpretation of Linear Solvers. in revision
Bangert M, Hennig P and Oelfke U (2013) Analytical probabilistic modeling for radiation therapy treatment planning. Physics in Medicine and Biology 58(16) 5401-5419.
Hennig P and Kiefel M (2013) Quasi-Newton Methods: A New Direction. Journal of Machine Learning Research 14 807-829.
Hennig P and Schuler CJ (2012) Entropy Search for Information-Efficient Global Optimization. Journal of Machine Learning Research 13 1809-1837.
Hennig P and Denk W (2007) Point-spread functions for backscattered imaging in the scanning electron microscope. Journal of Applied Physics 102(12) 1-8.

Conference papers (15):

Garnett R, Osborne M and Hennig P (2014) Active Learning of Linear Embeddings for Gaussian Processes. In: Proceedings of the 30th Conference on Uncertainty in Artificial Intelligence, (Ed) NL Zhang and J Tian, UAI 2014, AUAI Press, Corvallis, Oregon, 230-239.
Hennig P and Hauberg S (2014) Probabilistic Solutions to Differential Equations and their Application to Riemannian Statistics. In: Proceedings of the 17th International Conference on Artificial Intelligence and Statistics, JMLR W&CP volume 33, (Ed) S Kaski and J Corander, AISTATS 2014, 347-355.
Kiefel M, Schuler CJ and Hennig P (2014) Probabilistic Progress Bars. 36th German Conference on Pattern Recognition (GCPR) 2014. accepted
Bangert M, Hennig P and Oelfke U (2013) Analytical probabilistic proton dose calculation and range uncertainties. International Conference on the Use of Computers in Radiation Therapy.
Hennig P (2013) Fast Probabilistic Optimization from Noisy Gradients. In: Proceedings of The 30th International Conference on Machine Learning, JMLR W&CP 28(1), (Ed) Sanjoy Dasgupta and David McAllester, ICML, 62-70.
Klenske E, Zeilinger M, Schölkopf B and Hennig P (2013) Nonparametric dynamics estimation for time periodic systems. In: Proceedings of the 51st Annual Allerton Conference on Communication, Control, and Computing, 486-493.
Lopez-Paz D, Hennig P and Schölkopf B (2013) The Randomized Dependence Coefficient. In: Advances in Neural Information Processing Systems 26, (Ed) C.J.C. Burges, L. Bottou, M. Welling, Z. Ghahramani and K.Q. Weinberger, 27th Annual Conference on Neural Information Processing Systems (NIPS 2013), 1-9.
Meier F, Hennig P and Schaal S (2013) Local Gaussian Regression. submitted
Hennig P and Kiefel M (2012) Quasi-Newton Methods: A New Direction. 29th International Conference on Machine Learning (ICML 2012), 1-8.
Bócsi B, Hennig P, Csató L and Peters J (2012) Learning Tracking Control with Forward Models. IEEE International Conference on Robotics and Automation (ICRA 2012), 259-264.
Hennig P, Stern D, Herbrich R and Graepel T (2012) Kernel Topic Models. Fifteenth International Conference on Artificial Intelligence and Statistics (AI & Statistics 2012), 1-9.
Cunningham JP, Hennig P and Lacoste-Julien S (2012) Approximate Gaussian Integration using Expectation Propagation. 1-11. submitted
Hennig P (2011) Optimal Reinforcement Learning for Gaussian Systems. In: Advances in Neural Information Processing Systems 24, (Ed) J Shawe-Taylor, RS Zemel, P Bartlett, F Pereira and KQ Weinberger, Twenty-Fifth Annual Conference on Neural Information Processing Systems (NIPS 2011), 325-333.
Bangert M, Hennig P and Oelfke U (2010) Using an Infinite Von Mises-Fisher Mixture Model to Cluster Treatment Beam Directions in External Radiation Therapy. (Ed) S. Draghici, T.M. Khoshgoftaar, V. Palade, W. Pedrycz, M.A. Wani, X. Zhu, Ninth International Conference on Machine Learning and Applications (ICMLA 2010), IEEE, Piscataway, NJ, USA, 746-751.
Hennig P, Stern D and Graepel T (2010) Coherent Inference on Optimal Play in Game Trees. In: JMLR Workshop and Conference Proceedings Volume 9: AISTATS 2010, (Ed) Y.W. Teh, M. Titterington, Thirteenth International Conference on Artificial Intelligence and Statistics, JMLR, Cambridge, MA, USA, 326-333.

Technical reports (1):

Hennig P: Expectation Propagation on the Maximum of Correlated Normal Variables, Cavendish Laboratory: University of Cambridge, (2009).

Posters (2):

Lopez-Paz D, Hennig P and Schölkopf B (2013): The Randomized Dependence Coefficient, Neural Information Processing Systems (NIPS 2013).
Hennig P, Stern D and Graepel T (2009): Bayesian Quadratic Reinforcement Learning, NIPS 2009 Workshop on Probabilistic Approaches for Robotics and Control, Whistler, BC, Canada.

Theses (1):

Hennig P: Approximate Inference in Graphical Models, University of Cambridge, (2010). PhD thesis
