Speaker: Chris Schwiegelshohn
Title: On Coresets for Logistic Regression
Abstract:
Coresets are one of the central methods to facilitate the analysis of large
data sets. We continue a recent line of research applying the theory of
coresets to logistic regression.
First, we show a negative result, namely, that no coresets of strongly
sublinear size exist for logistic regression.
To deal with intractable worst-case instances, we introduce a complexity
measure $\mu(X)$, which quantifies the hardness of compressing a data set
for logistic regression. $\mu(X)$ has an intuitive statistical
interpretation that may be of independent interest.
For data sets with bounded $\mu(X)$-complexity, we show that a novel
sensitivity sampling scheme produces the first provably sublinear
$(1\pm\varepsilon)$-coreset.
Our algorithms are viable in practice, comparing favorably to uniform
sampling as well as to state-of-the-art methods in the area.
Joint work with Alexander Munteanu, Christian Sohler, and David Woodruff.
To appear at NIPS 2018.
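For readers new to coresets, the idea of a weighted subsample that approximately preserves the logistic loss can be sketched as follows. This is only an illustrative sketch, not the scheme presented in the talk: the importance scores (row norms of an orthonormal basis of the data plus a uniform term), the function names, and the synthetic data are all assumptions made for this example.

```python
import numpy as np

def logistic_loss(X, y, beta, weights=None):
    """(Weighted) sum of log(1 + exp(-y_i * <x_i, beta>)), labels in {-1, +1}."""
    margins = y * (X @ beta)
    losses = np.logaddexp(0.0, -margins)  # numerically stable log(1 + exp(-margin))
    if weights is None:
        return float(losses.sum())
    return float(weights @ losses)

def sample_coreset(X, y, m, rng):
    """Draw m points by importance sampling and reweight them.

    Assumption for illustration: scores are the row norms of an
    orthonormal basis Q of X plus a uniform 1/n term.
    """
    n = X.shape[0]
    Q, _ = np.linalg.qr(X)  # reduced QR, Q has shape (n, d)
    scores = np.linalg.norm(Q, axis=1) + 1.0 / n
    probs = scores / scores.sum()
    idx = rng.choice(n, size=m, replace=True, p=probs)
    weights = 1.0 / (m * probs[idx])  # makes the weighted loss an unbiased estimate
    return X[idx], y[idx], weights

# Synthetic data (hypothetical, for demonstration only).
rng = np.random.default_rng(0)
n, d, m = 2000, 3, 500
X = rng.normal(size=(n, d))
true_beta = np.array([1.0, -2.0, 0.5])
y = np.where(rng.random(n) < 1.0 / (1.0 + np.exp(-X @ true_beta)), 1.0, -1.0)

Xc, yc, wc = sample_coreset(X, y, m, rng)

# Compare the full loss with the weighted loss on the subsample at one query point.
beta = np.array([0.5, -1.0, 0.0])
full = logistic_loss(X, y, beta)
approx = logistic_loss(Xc, yc, beta, wc)
rel_err = abs(full - approx) / full
print(f"full loss: {full:.2f}, subsample estimate: {approx:.2f}, rel. error: {rel_err:.3f}")
```

A $(1\pm\varepsilon)$-coreset asks for more than this check shows: the relative error must stay below $\varepsilon$ for every parameter vector simultaneously, not just for a single query point.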
Bio:
Chris Schwiegelshohn is currently a researcher at Sapienza University
of Rome. He completed his PhD in Dortmund with a thesis on "Algorithms for
Large-Scale Graph and Clustering Problems". Chris' research interests
include streaming and approximation algorithms as well as machine
learning.
Date: 26th of September 2018, 12:00-13:00
Location: Manno, Galleria 1, 2nd floor, room G1-204
Registration: Pizza and drinks will be offered at the end of the talk.
If you plan to attend, please register in a timely fashion at the
following link so that we will have no shortage of food:
https://doodle.com/poll/sqa6idxyhf83ugba