The level of granularity of open data often conflicts with the benefits it ca...
We adapt image inpainting techniques to impute large, irregular missing ...
Nine language-vision AI models trained on web scrapes with the Contrasti...
Differential privacy mechanisms are increasingly used to enable public r...
Machine learning algorithms are commonly specified in linear algebra (LA...
Fairness is increasingly recognized as a critical component of machine l...
Figures are an important channel for scientific communication, used to e...
Emerging transportation modes, including car-sharing, bike-sharing, and ...
Synthetic datasets have long been thought of as second-rate, to be used ...
Fairness is increasingly recognized as a critical component of machine l...
In this paper, we propose a method for clustering image-caption pairs by...
We present a system to support generalized SQL workload analysis and man...
Techniques to deliver privacy-preserving synthetic datasets take a sensi...
We describe customized synthetic datasets for publishing mobility data. ...
Algorithmic decisions often result in scoring and ranking individuals to...
We consider methods for learning vector representations of SQL queries t...
We propose methods for learning vector representations of SQL workloads ...
Data for good implies unfettered access to data. But data owners must be...
Many real-world applications require large-scale data annotation, such a...
Google BigTable's scale-out design for distributed key-value storage ins...