Data Engineer_SQL, Python, Data Warehouse, Cloud

Date: Sep 22, 2022

Location: Pune, IN

Company: Springer Nature Group

Springer Nature opens the doors to discovery for researchers, educators, clinicians and other professionals. Every day, around the globe, our imprints, books, journals, platforms and technology solutions reach millions of people. For over 175 years our brands and imprints have been a trusted source of knowledge to these communities and today, more than ever, we see it as our responsibility to ensure that fundamental knowledge can be found, verified, understood and used by our communities – enabling them to improve outcomes, make progress, and benefit the generations that follow. 

Visit: and follow @SpringerNature


Permanent contract

Located in Pune, India


Springer Nature is seeking a Data Engineer for its highly regarded Data and Analytics Solutions group. The group meets the needs of Springer Nature’s Research division, which includes Nature, Springer, BioMed Central and Scientific American, and develops new data products for the research community. This is an exciting opportunity as Data Engineering is expanding from strong foundations into new solutions, and we are looking for someone who can deliver solutions and work independently, with support from the wider team where necessary.


As a Data Engineer, you will be responsible for ensuring a continuous flow of data with minimum latency between data sources. You will develop, test and deploy data pipelines into the production environment, and you will be part of the team that delivers ML/AI solutions at scale.


You will be working in close partnership with data analysts, data scientists and data engineers, as well as other colleagues from Springer Nature and our technology partners including Google. You will have opportunities to work with the latest data and analytics technologies including graph databases, Google BigQuery, TensorFlow and Plotly Dash, as well as previewing new technologies from Google and other partners.



Role responsibilities will include:


  • Build streaming and batch data pipelines that extract, load and transform data in different formats between various data sources at scale.
  • Work closely with data scientists and analysts to understand requirements and develop data solutions in line with business needs.
  • Explore best practices for delivering, deploying and maintaining in-house ML/AI solutions at scale.
  • Automate and orchestrate templatable solutions to ensure continuous delivery.
  • Maintain the current cloud infrastructure and help onboard new applications.
  • Use creative ideas to make data easy to use within the organisation.



Role requirements:


  • University degree with a strong analytical/quantitative background (e.g. Data Science, Statistics, Mathematics, Econometrics, Physics, Computer Science) or equivalent experience
  • Strong working knowledge of SQL and Python
  • Excellent problem-solving capabilities
  • Knowledge of at least one distributed processing framework such as Apache Beam or Spark
  • Knowledge of Machine Learning concepts is beneficial but not essential as training will be provided
  • Prior experience with schema design and data modelling
  • Familiarity with Google Cloud products (BigQuery, Dataform, Colab) or other cloud data platforms is beneficial but not essential
  • Well organized and accurate with good time management 

At Springer Nature we value the diversity of our teams. We recognize the many benefits of a diverse workforce with equitable opportunities for everyone. We strive for an inclusive workplace that empowers all our colleagues to thrive. Our search for the best talent fully encompasses and embraces these values and principles.

Visit the Springer Nature Editorial and Publishing website at for more information about our Research E&P career opportunities.