Senior Data Scientist

Full time

Employment Information

Our digital solutions team is more than a traditional IT organization. We are a team of passionate, collaborative, agile, inventive, customer-centric, results-oriented problem solvers. We are intellectually curious, love advancements in technology and seek to adapt technologies to drive Staples forward. We anticipate the needs of our customers and business partners and deliver reliable, customer-centric technology services.

What you’ll be doing:

  • Understand and ensure that project/department milestones/goals are met and adhere to approved budgets.
  • Oversee, design and deploy data architecture (models) and repositories.
  • Develop data products from proof of concept to scalable solutions based on business priorities and impact, using modern tools and technologies on cloud.
  • Implement and maintain AI and ML models to drive business insights and decision making.
  • Collaborate with data scientists and analysts to develop predictive and classification models and other AI and ML solutions.
  • Stay up to date with the latest advancements in AI, ML, and GenAI technologies and incorporate them into the data and analytics strategy.
  • Enable data-driven decision making: broaden access to information for wide populations of business users by enabling them to access Data & Analytics via easy combinations of elements.
  • Increase adoption of Data & Analytics (D&A) via self-service and data catalog.
  • Assist in managing the flow of data from internal and external sources by leveraging both distributed and local data structures.
  • Optimize the data pipeline platform to support scalable workloads and diverse use cases.
  • Support mission critical applications and near real time data needs in the data platform.
  • Understand the risk potential of various configurations and options for data administration, and communicate issues to managers and the team as required.
  • Work hand in hand with product owners and scrum masters to deliver quality work, exercising independent judgment.
  • Work collaboratively with project managers, technical data managers, data architects, data scientists as well as business partners.
  • Work as part of an engineering team from concept to operations, providing technical subject matter expertise for successful deployment.
  • Typically manage individual contributors.
  • Ensure the ongoing training and development of the team.

What you bring to the table:

  • Strong hands-on coding experience with languages such as Python, PySpark, SQL, and UNIX/Linux scripting to access, extract, manipulate and summarize data.
  • Data Engineering: Experience in core data engineering activities like:
    • Database optimization (partitioning, group and sort keys, indexes, query optimization)
    • Data processing (Spark, SQL Server, PostgreSQL, Hadoop/Hive)
    • Programming and/or scripting (Python, PySpark, Bash)
    • Data cleansing and integration testing (PyTest or unittest)
  • Experience in automation and testing of data workflows, preferably Apache Airflow
  • Familiarity with a broad base of analytical methods like data modeling, variable-based transformation and summarization, and algorithm development.

What’s needed - Basic Qualifications:

  • Bachelor’s Degree or equivalent in Computer Science, Data Analytics, Management Information Systems, Data Engineering, or a related quantitative field.
  • 5+ years of experience in designing and building new scalable on-prem/cloud data engineering solutions using distributed frameworks such as Spark and Hadoop.
  • Experience in developing and deploying AI and ML models.
  • Familiarity with AI and ML tools and frameworks such as TensorFlow, PyTorch, and scikit-learn.
  • Knowledge of GenAI technologies and their applications in data and analytics.

What’s needed - Preferred Qualifications:

  • Experience working with Airflow, Kubernetes, cloud infrastructure (e.g., Azure, GCP), and CI/CD (e.g., Azure DevOps, GitHub Actions, Jenkins, or CircleCI).
  • Experience in implementing AI and ML solutions in a business setting.
  • Knowledge of best practices in AI and ML model development, deployment, and maintenance.
  • Familiarity with GenAI technologies and their applications in data and analytics.
  • Hands-on experience with cloud data warehousing services like Snowflake.
  • Experience with SnowSQL and GCP services such as BigQuery, Cloud SQL, Dataflow, Pub/Sub, and Cloud Functions.
  • Experience in developing User Defined Functions (UDFs) in Python, SQL, etc.
  • Experience owning or overseeing MLOps processes like Continuous Integration (CI), Continuous Delivery (CD), Continuous Training (CT), and Continuous Monitoring (CM) is a plus.
  • Understanding of upcoming data technology trends like:
    • The Data Fabric, from conceptualization to implementation
    • Composable Data & Analytics
  • Experience in enabling a data marketplace solution, increasing reach of data products, and simplifying understanding and access for all consumers.
  • Implementations of data API solutions to expand the reach of the data warehouse to multiple applications.

We Offer:

  • Inclusive culture with associate-led Business Resource Groups
  • Flexible PTO (22 days) and Holiday Schedule
  • Online and Retail Discounts, Company Match 401(k), Physical and Mental Health Wellness programs, and more!
