Speaker: Chris Schwiegelshohn

Title: On Coresets for Logistic Regression

Abstract:
Coresets are one of the central methods to facilitate the analysis of large data sets. We continue a recent line of research applying the theory of coresets to logistic regression.
First, we show a negative result, namely, that no strongly sublinear-sized coresets exist for logistic regression.
To deal with intractable worst-case instances, we introduce a complexity measure $\mu(X)$, which quantifies the hardness of compressing a data set for logistic regression. $\mu(X)$ has an intuitive statistical interpretation that may be of independent interest.
For data sets with bounded $\mu(X)$-complexity, we show that a novel sensitivity sampling scheme produces the first provably sublinear $(1\pm\varepsilon)$-coreset.
Our algorithms are viable in practice, comparing favorably to uniform sampling as well as to state-of-the-art methods in the area.

Joint work with Alexander Munteanu, Christian Sohler, and David Woodruff. To appear at NIPS 2018.

Bio:
Chris Schwiegelshohn is currently a researcher at Sapienza University
of Rome. He did his PhD in Dortmund with a thesis on "Algorithms for
Large-Scale Graph and Clustering Problems". Chris' research interests
include streaming and approximation algorithms as well as machine
learning.

Date: 26th of September 2018, 12:00-13:00

Location: Manno, Galleria 1, 2nd floor, room G1-204

Registration: Pizza and drinks will be offered at the end of the talk.
If you plan to attend, please register in a timely fashion at the
following link so that we will have no shortage of food:
https://doodle.com/poll/sqa6idxyhf83ugba