At Scribd (pronounced “scribbed”), we believe reading is more important than ever. Join our cast of characters as we build the world’s largest and most fascinating digital library, giving subscribers access to a growing collection of ebooks, audiobooks, magazines, documents, and more. Alongside works from major publishers and top authors, we create our own original content exclusively for Scribd users. Our community includes over 1M subscribers in more than 190 countries. Join us in turning screen time into quality time!
What you'll do
Data quality and integrity will be the two main areas of focus for your work in our existing, organically grown data infrastructure. You will be in charge of building tools and technology to ensure that downstream customers can have faith in the data they're consuming. Depending on the project, this might involve cross-functional work with the Data Science and Content Engineering teams to repartition or optimize business-critical Hive tables, or working with Core Platform to implement better processing jobs that scale our consumption of streaming data sets. Almost everything you work on will aim to increase "customer satisfaction" for internal customers of Scribd data.