Senior Data Scientist

Full time

Employment Information

About Ancestry:

When you join Ancestry, you join a human-centered company where every person’s story is important. Ancestry®, the global leader in family history, empowers journeys of personal discovery to enrich lives. With our unparalleled collection of more than 40 billion records, over 3 million subscribers and over 23 million people in our growing DNA network, customers can discover their family story and gain a new level of understanding about their lives. Over the past 40 years, we’ve built trusted relationships with millions of people who have chosen us as the platform for discovering, preserving and sharing the most important information about themselves and their families.

We are committed to our location-flexible work approach, allowing you to choose to work in the nearest office, from your home, or a hybrid of both (subject to location restrictions and roles that are required to be in the office; see the full list of eligible US locations HERE). We will continue to hire and promote beyond the boundaries of our office locations, to enable broadened possibilities for employee diversity.

Together, we work every day to foster a work environment that’s inclusive as well as diverse, and where our people can be themselves. Every idea and perspective is valued so that our products and services reflect the global and diverse clients we serve.

Ancestry encourages applications from minorities, women, the disabled, protected veterans and all other qualified applicants. Passionate about dedicating your work to enriching people’s lives? Join the curious.

Ancestry is hiring an exceptional, passionate, and highly motivated Senior Data Scientist with expertise in areas including AIML, GenAI, LLMs, CV, NLP, embeddings, knowledge graphs, etc. to join our Data Science AI team. The Data Science AI team develops generative AI, CV and NLP models to organize and extract information from billions of historical and genealogical records including text, data, audio and image sources to help customers discover and connect with their family history. In this role, you will be responsible for developing state-of-the-art solutions to a variety of challenging problems supporting family history and products.

What you will do…

  • In partnership with business leaders, define a vision for the use of AIML, LLMs, CV, NLP to extract value and data-driven insights from our billions of genealogical records such as census records, newspapers, city directories, family history books, birth, marriage and death records, etc.
  • Engage in fast prototyping and agile software development, and deliver measurement-driven model improvements;
  • Perform applied research implementing SOTA generative AI, NLP, LLM, CV solutions for NER, relation extraction, summarization, topic analysis, entity resolution, knowledge graphs, embeddings-based information retrieval, story generation, AI-driven chat, etc.
  • Collaborate with ML Ops and Data Science Engineers to deploy datasets, truth sets, models, pipelines, training and inference code to a cloud-based model registry and optimize AIML and GenAI algorithms;
  • Partner with subject matter experts to inject their in-depth knowledge into the model creation process
  • Effectively communicate research results to stakeholders and the research community through documentation, white papers, peer-reviewed publications, and presentations;
  • Help recruit, inspire, and develop high-performing and creative Data Science AI team members

Who you are…

  • Ph.D. or advanced degree in Data Science, Computer Science, Statistics, Mathematics, Linguistics, Engineering or a data-related field;
  • Minimum of 5 years of hands-on technical experience developing and deploying AIML models in production settings;
  • Minimum of 2 years of hands-on technical lead experience mentoring and leading data science or engineering teams;
  • Direct industrial experience with a proven track record of successfully leading efforts to design, implement, and deploy multiple data science projects end-to-end, from idea generation and objective formulation to implementation, performance analysis and deliverables;
  • Extensive background in AIML methods including LLMs, NLP, CNN, RNN, transfer learning, attention mechanisms, transformers, generative models and embedding methods;
  • Experience with NLP techniques such as named entity extraction, document classification, summarization, topic modeling, relation extraction, sentiment analysis, dialogue systems;
  • Knowledge and understanding of language models including variants of BERT, T5, GPT, Falcon, and LLaMA, as well as others such as Hugging Face and OpenAI models.
  • Expertise with AIML technologies including Python, TensorFlow, PyTorch, Keras, SciPy stack and Scikit-learn, NLTK, spaCy, pandas, NumPy, etc.
  • Strong verbal and written communication skills to explain details of complex concepts to non-expert stakeholders in a simple and understandable way, with the ability to document and explain technical details clearly and concisely (peer-reviewed publications or presentations preferred)
  • Self-starter with a high level of motivation and the ability to work well independently as well as motivate and energize teams.

Additional Information:

Ancestry is an Equal Opportunity Employer that makes employment decisions without regard to race, color, religious creed, national origin, ancestry, sex, pregnancy, sexual orientation, gender, gender identity, gender expression, age, mental or physical disability, medical condition, military or veteran status, citizenship, marital status, genetic information, or any other characteristic protected by applicable law. In addition, Ancestry will provide reasonable accommodations for qualified individuals with disabilities.

All job offers are contingent on a background check screen that complies with applicable law. For San Francisco office candidates, pursuant to the San Francisco Fair Chance Ordinance, Ancestry will consider for employment qualified applicants with arrest and conviction records.

Ancestry is not accepting unsolicited assistance from search firms for this employment opportunity. All resumes submitted by search firms to any employee at Ancestry via email, the Internet or in any form and/or method without a valid written search agreement in place for this position will be deemed the sole property of Ancestry. No fee will be paid in the event the candidate is hired by Ancestry as a result of the referral or through other means.

