Skill-Matched Recommender

A local-vector scaffold that can later switch to Azure embeddings without requiring Azure AI Search.

Problem

Students describe skills and interests in different words than role descriptions, so exact filters miss good matches.

Users

Students choosing hackathon roles, project ideas, internships, or next learning modules.

Why this track

This practices embeddings, similarity search, and recommendation explanation while respecting the current Azure rule: no Azure AI Search dependency by default.

Architecture

Stay minimal. 5-6 nodes. Each arrow is one network hop.

profile

Student profile text

embed

Embedding model or local vectorizer

corpus

Role/project corpus

ranker

Cosine similarity ranker

evidence

Matched skills + gaps

Recommendation UI

Edges

profile embed — query vector
corpus embed — cached vectors
embed ranker — vectors
ranker evidence — top-k matches
evidence ui — explainable result

Prompt Pack

Starting prompts. Iterate. Move the system prompt into prompts/system.md so it can be versioned.

system

You explain recommender results. Use only the matched role text and skill evidence. Return a short recommendation, two skill gaps, and one next learning action. Do not rank people by sensitive attributes.

user

Profile: Python, Docker, FastAPI. Top matches: backend internship, MLOps assistant, data analyst.

Code Snippet

The pattern shape. Read it, run the matching scaffold, then adapt the idea for your own team.

python

query_vec = embed("Python Docker FastAPI internship")
role_vecs = [embed(text) for text in ROLE_DESCRIPTIONS]
scores = [
    (cosine(query_vec, role_vec), role)
    for role_vec, role in zip(role_vecs, ROLE_DESCRIPTIONS)
]
top_matches = sorted(scores, reverse=True)[:3]
# ... your turn: explain why each match helps the student

Reference: src/techniques/embeddings_search/ in halla-ai/hackathon-sample-2026

Demo Screens

Three screens that prove the prototype works.

Profile input

Student enters skills, interests, and preferred project style.

Top matches

Three roles or ideas with scores and matched evidence.

Learning gaps

Short next-step checklist based on missing skills.

Azure budget

Local vectors are free. If tutors enable text-embedding-3-small, 1M embedding tokens is roughly USD 0.022 before storage.

Pitfalls

• Symptom: every result looks similar. Cause: corpus items are too generic. Fix: make role descriptions specific.
• Symptom: rankings are hard to defend. Cause: only score is shown. Fix: show matched terms or evidence.
• Symptom: cost spikes. Cause: corpus is re-embedded every request. Fix: cache vectors.

Possible Extensions

If you finish the 1-day path early, use one question below to make the project more original.

How would your version combine semantic score with explicit skill tags?
How would you show enough evidence that users trust the match?
How would you update cached vectors when roles change?

Multi-step Service Triage Agent

Document Vision Reader