Data Engineer

Houghton Mifflin Harcourt

Job Requisition ID: 8061

We’re looking for a Data Engineer to join the Digital Services and LABs team to build a state-of-the-art modern data platform. You will be responsible for selecting optimal solutions for a data platform that supports business use cases such as Customer 360 and digital marketing campaign ROI. You will collaborate with Data Architects, IT team members, and business stakeholders on project goals.




Responsibilities:

·      Design, build, test, deploy, and maintain a highly scalable data management platform.

·      Ensure the system meets business requirements and follows industry best practices.

·      Select and integrate the Big Data tools and frameworks required to provide the requested business capabilities.

·      Work on POCs such as digital marketing ROI and Customer 360.

·      Implement ETL/ELT workflows.

·      Monitor performance and advise on any necessary infrastructure changes.

·      Define data retention and security policies.

·      Define data governance.


Skills and Qualifications:

·      Experience with database modeling, design, and governance.

·      SQL-based technologies (e.g. Oracle, MSSQL, PostgreSQL and MySQL)

·      Experience with NoSQL databases, such as HBase, Cassandra, MongoDB

·      Python, C/C++, Java, R

·      Management of a Hadoop cluster, with all included services.

·      Hadoop, MapReduce, HDFS

·      Experience with Cloudera/AWS EMR/Redshift

·      Experience with Spark and Apache Beam

·      Experience with data catalog or metadata management tools.

·      Experience with cloud computing environments, such as AWS infrastructure, and with DevOps

·      Good knowledge of Big Data querying tools, such as Pig, Hive, and Impala

·      Experience with integration of data from multiple data sources

·      Knowledge of various ETL techniques and frameworks, such as Flume, Sqoop, Informatica

·      Knowledge of various messaging systems, such as Kafka

·      Good knowledge of Big Data ML toolkits, such as SparkML, H2O, AWS ML

·      Predictive modeling, NLP and text analysis

·      Experience building digital marketing data platforms such as Customer 360 is a plus.

·      Experience with digital marketing analytics tools such as Adobe Analytics and Google Analytics is a plus.

·      Excellent oral and written communication skills

·      Ability to work in a collaborative and agile team environment.

·      Bachelor’s or Master’s degree in computer science, software engineering, or a related field


Physical Requirements:
• Might be in a stationary position for a considerable time (sitting and/or standing).
• The person in this position needs to move about inside the office to access file cabinets, office machinery, etc.
• Constantly operates a computer and other office productivity machinery, such as a calculator, copy machine, and computer printer.
• Must be able to collaborate with colleagues face to face, on conference calls, and in online meetings.


Houghton Mifflin Harcourt (NASDAQ:HMHC) is a global learning company dedicated to changing people’s lives by fostering passionate, curious learners. As a leading provider of pre-K–12 education content, services, and cutting-edge technology solutions across a variety of media, HMH enables learning in a changing landscape. HMH is uniquely positioned to create engaging and effective educational content and experiences from early childhood to beyond the classroom.  HMH serves more than 50 million students in over 150 countries worldwide, while its award-winning children’s books, novels, non-fiction, and reference titles are enjoyed by readers throughout the world.


Houghton Mifflin Harcourt is an equal employment opportunity employer and participates in E-Verify. All qualified applicants will receive consideration for employment and will not be discriminated against on the basis of gender, race/ethnicity, gender identity, sexual orientation, protected veteran status, disability, or other protected group status.
