Data Science Intern

at Morningstar
Published November 4, 2022
Location Shenzhen, China
Category Data Science  
Job Type Internship  


The Role:

As a Data Science Intern, you will be contributing to the implementation of Artificial Intelligence (AI) within Data Collections software applications under the supervision of a senior colleague. This role requires significant interaction with both upstream and downstream stakeholders across Technology, Data and Research.

The Data Science Intern will participate in various data collection projects at different stages, from the prototype phase to a fully fledged, scalable consumer service. Often, these services must be integrated into Morningstar’s platform of financial products so that our clients can use these software tools in the investment decision-making process.

We are looking for an individual who possesses strong technical development skills, an ability to follow analyst requirements and technical specifications for robust code, and a passion for investment research.

This position reports to the Tech Manager of the Data Collections AI team.


  • Understand business needs to design and implement machine learning solutions to automate data collection processes.
  • Collaborate with peer engineering teams and downstream data analysts to continuously and iteratively improve workflows and data storage practices.
  • Follow good development practices and adopt innovative frameworks and technology solutions that help the business move faster, e.g., implementing automated model retraining and deployment.
  • Contribute to brainstorming and help other team members in their projects.
  • Prepare written reports or PowerPoint slides in English.


  • Fluent in Python (and its packages, e.g., NumPy, pandas) and experienced in data cleaning and munging techniques.
  • Knowledge of machine learning fundamentals and some practical experience with model implementation.
  • Knowledge of NLP- or CV-related concepts and algorithms, e.g., text classification, NER, machine translation.
  • Knowledge of either TensorFlow or PyTorch.
  • Familiar with SQL and common data storage formats, e.g., HTML, XML, JSON.
  • Strong independent analytical skills and ability to keep improving model performance.

Intermediate knowledge of statistical methods is desirable.

  • Pursuing a degree in computer science, statistics, or a related field is preferred.
  • Familiarity with NLP (e.g., text classification, named entity recognition) and/or Computer Vision (e.g., object detection, object segmentation) is preferred.
  • Familiarity with statistical models, data analytics, and data visualization is a plus.
  • Fluent in both oral and written English.

C99_MstarResShenz Morningstar (Shenzhen) Ltd. Legal Entity