Uni Internship Jan to May 2025 - Development of New Search and Graph Augmented Retrieval Methods

Date: 13 Sep 2024

Location: SG

Company: Synapxe

Synapxe is the national HealthTech agency inspiring tomorrow’s health. The nexus of HealthTech, we connect people and systems to power a healthier Singapore. Together with partners, we create intelligent technological solutions to improve the health of millions of people every day, everywhere.
 
Are you someone who enjoys problem solving, has a creative and curious mind, and strives to create a better and healthier tomorrow? If you say yes to all, do check out our website and find out more about Internship@Synapxe.
 
Join Synapxe as an intern and see how you can contribute to powering a healthier Singapore. We aim to deliver the best experience for all interns, fostering exponential growth and paving the way for your future in the tech industry.

 

The candidate will have the

The project contributes to the advancement of Generative AI applications by exploring novel methods for Retrieval Augmented Generation, including vector stores, new hybrid search methods, and the use of a knowledge graph as a graph store.

 

The selected intern will be involved in, but not limited to, the following:

  • Vector Store Experimentation: Experiment with several vector stores to understand their functionalities and performance.
  • Retriever Strategy Evaluation: Implement and evaluate different retrieval strategies within the selected vector stores. This could include:
    • Semantic search (vector/hybrid)
    • Index strategy (HNSW, RHNSW, IVF_FLAT, etc.)
  • Large Language Model: Test and integrate the chosen retrieval strategy with a large language model (e.g., GPT-3.5/4, Gemini, Claude 3) to evaluate the retrieved context for response generation.
  • Knowledge Graph Exploration: Explore storing chunks or entities as graph nodes.
  • Graph Prompting: Investigate the use of graph-based prompts to guide the large language model for answer retrieval and generation. Explore techniques like subgraph extraction and multi-hop answering as focused prompting to retrieve the relevant answers.
  • Comparative Analysis: Compare the performance of Phase 1 (vector store retrieval) with Phase 2 (graph retrieval) using the established evaluation metrics.
  • Hybrid Retrieval: Investigate the potential for combining vector store retrieval and graph prompting for a more comprehensive approach to contextual retrieval.

 

About you:

  • Pursuing a Bachelor's degree in Business Analytics, Data Science, Computer Engineering, Computer Science or a related discipline
  • Graduating in May/Dec 2025 or May 2026
  • Proficient in Python syntax, data structures and algorithms, including familiarity with common Python libraries and the ability to write clean, efficient, well-documented code
  • Adept at using tools like VS Code for script development and Jupyter notebooks for exploratory analysis
  • Ability to multitask and work effectively as part of a multidisciplinary team
  • Passionate about making a difference and keen to re-imagine the future of HealthTech

 

The intern's work location will be at 1 Maritime Square #12-01 Harbourfront Centre Singapore 099253.

 
