Data Engineer


Date: May 25, 2023

Location: Pune, IN

Company: Springer Nature Group

Springer Nature opens the doors to discovery for researchers, educators, clinicians and other professionals. Every day, around the globe, our imprints, books, journals, platforms and technology solutions reach millions of people. For over 175 years our brands and imprints have been a trusted source of knowledge to these communities and today, more than ever, we see it as our responsibility to ensure that fundamental knowledge can be found, verified, understood and used by our communities – enabling them to improve outcomes, make progress, and benefit the generations that follow.

Visit: group.springernature.com and follow @SpringerNature

 

Data Engineer

We are looking for a Data Engineer to join a new team that will be part of Springer Nature's highly important Darwin project, which provides a platform combining all services and products under one roof to cover our customers' needs.

SN Digital (SND) is the technology division of Springer Nature. We are responsible for maintaining and delivering millions of articles used by researchers, scientists and students around the globe. In the Researcher Engagement IT domain you will work in an agile team that takes care of the 360° view of researchers and of the recommendations based on aggregated customer data.

What we are looking for:

We are looking for a creative data engineer who is motivated to take a critical role in creating industry-strength solutions that unleash the value of data, innovating in how we transform the traditional document perspective into a world of knowledge.

As a Data Engineer you will be responsible for ensuring a continuous flow of data between data sources. You will develop, test and deploy data pipelines into the production environment.

 

You will work in close partnership with data analysts, data scientists and data engineers, as well as other colleagues from Springer Nature and our technology partners, including Google. You will have opportunities to work with the latest data and analytics technologies, including graph databases and Google BigQuery, and to preview new technologies from Google and other partners.

 

Role responsibilities will include:

 

  • Build data pipelines for extracting, loading and transforming data between various data sources, at scale and in different formats.
  • Work proactively and independently with stakeholders, partners and other product managers to create, own and execute data engineering projects that blend commercial, technical and customer needs.
  • Increase the adoption of data engineering best practices, standardise data engineering tools across the organisation, improve data management and reporting, and make data-led insight accessible to the business.
  • Work closely with data scientists and analysts to understand the requirements and develop data solutions in line with the business and user requirements.
  • Automate/orchestrate various templatable solutions to ensure continuous delivery.
  • Maintain the current cloud infrastructure and help onboard new applications.

 

Role requirements:

  • University degree with a strong analytical/quantitative background or equivalent experience (e.g. Data Science, Statistics, Mathematics, Econometrics, Physics, Computer Science etc.)
  • Strong working knowledge of SQL and at least one programming language (e.g. Java, Kotlin or Python)
  • Excellent problem-solving capabilities
  • Knowledge of at least one distributed processing framework such as Apache Beam or Spark
  • Prior experience with schema design and data modelling
  • Familiarity with Google Cloud products (e.g. BigQuery, Firestore), dbt, or other cloud data platforms is beneficial but not essential
  • Familiarity with machine learning is beneficial but not essential
  • Well organised and accurate with good time management
  • Experience of current software engineering practices

 

What you will be doing

            

Within 3 months you will:

 

  • Get familiar with our emerging technology stack and data landscape. 
  • Align yourself with the work of the data platform team and understand the data requirements and issues facing our users.
  • Collaborate effectively with each discipline on the team.
  • Actively participate in technical discussions and share ideas.
  • Work with architects and other data engineers in the organisation to align the data processing architecture.

 

By 3-6 months you will:

 

  • Have an understanding of the team’s context within the wider organisation.
  • Be a supportive member of the team, developing the platform by using the appropriate technology solutions to solve the problem at hand.
  • Triage support queries and diagnose issues in our live applications.
  • Identify new sources of data across the organisation and build relationships with data providers to gain access.
  • Understand the processes by which data is acquired, and any resulting limitations or bias, and communicate these to the team.
  • Develop and maintain data pipelines that load data into systems like BigQuery and that analyse, clean and join datasets in an automated, repeatable way.
  • Ensure that data is stored securely and in compliance with GDPR.
  • Work with data owners to understand how we can allow them to self-serve their data using tools we develop.

By 6-12 months you will:

  • Develop processes and tools to monitor feeds, test data integrity and completeness, and alert users when a problem occurs.
  • Understand our customers’ needs, both internal and external, and how your work affects their experience.
  • Be able to gauge the complexity or scope of a piece of work, breaking it into smaller pieces when appropriate.
  • Give and receive constructive feedback within your team.
  • Mentor other members of the team in the principles of data engineering and promote best practice.
  • Promote and advocate the use of data across Springer Nature.
  • If you have an interest in data science, there may be opportunities to apply machine learning techniques to these datasets to assist in the work of domain teams.

 

At Springer Nature, we value the diversity of our teams and work to build an inclusive culture, where people are treated fairly and can bring their differences to work and thrive. We empower our colleagues and value their diverse perspectives as we strive to attract, nurture and develop the very best talent.

Springer Nature was awarded Diversity Team of the Year at the 2022 British Diversity Awards. Find out more about our DEI work here.

If you have any access needs related to disability, neurodivergence or a chronic condition, please contact us so we can make all necessary accommodations.