At OCLC, we believe you'll do the best work of your life when you're living the best life possible.
We work hard to build the technology that connects thousands of today's libraries. But we also work hard to make a job at OCLC a meaningful part of a balanced life, not a substitute for one.
The Job Details are as follows:
Discover. Innovate. Collaborate. Inform. A few words we use to describe a career at OCLC.
Technology with a Purpose. OCLC supports thousands of libraries in making information more accessible and more useful to people around the world. OCLC provides shared technology services, original research and community programs that help libraries meet the ever-evolving needs of their users, institutions and communities. With office locations around the globe, OCLC employees are dedicated to offering premier services and software to help libraries cut costs while keeping pace with the demands of our information-driven society.
The OCLC Data Services organization is looking for a Senior Data Scientist. This team and role will be the center of excellence for machine learning and model development throughout the organization. The right candidate should be comfortable with, and even enthusiastic about, working in a greenfield environment. We haven’t figured everything out yet; that’s why we need you!
What you will do:
Work with stakeholders throughout the organization to identify opportunities for leveraging company data to drive business solutions
Mine and analyze data from our multi-petabyte data ecosystem to drive optimization and improvement of product development, marketing techniques and business strategies
Assess the effectiveness and accuracy of new data sources and data gathering techniques
Develop custom data models and algorithms to apply to data sets
Use predictive modeling to improve and optimize customer experiences, revenue generation, ad targeting and other business outcomes
Coordinate with different functional teams to implement models and monitor outcomes
Develop the standardized ML toolchains and pipelines needed to drive model development at scale throughout the organization
Develop processes and tools to monitor and analyze model performance and data accuracy
Qualifications
Master’s degree in Statistics, Mathematics, Computer Science or a similar field, or an equivalent combination of education and experience
6+ years of demonstrated experience in big data, statistics and model development, working with statistical packages and machine learning tools (R, Pandas, PyTorch, TensorFlow, etc.)
Software development experience in Python, Java, Scala, Golang, or similar
Strong problem-solving skills with an emphasis on product development
Knowledge of a variety of machine learning techniques (clustering, decision tree learning, artificial neural networks, etc.) and their real-world advantages/drawbacks
Knowledge of advanced statistical techniques and concepts (regression, properties of distributions, statistical tests and proper usage, etc.) and experience applying those techniques to build new product features
Experience using cloud-based platforms such as AWS SageMaker, Databricks, and Snowflake
Experience pulling together and analyzing data from a mix of internal sources and third-party providers such as Google Analytics
Experience with the Hadoop ecosystem (HDFS, HBase, MapReduce, Hive, Spark, etc.) is a plus
Excellent written and verbal communication skills for coordinating across teams