Easy · ~30 min setup

Structured Interview Kits That Cut Bad Hires

Turn a role into a complete interview kit — the competencies that matter, behavioral questions tied to each, and a scoring rubric every interviewer uses the same way.

Stop interviewing on gut feel. Decide what you're testing before anyone walks in.

A structured interview isn't more rigid — it's more fair and more accurate. You name the few competencies the role actually needs, ask every candidate the same evidence-based questions, and score against one rubric. Gut feel is what's left over after the structure does its job.

Unstructured interviews mostly measure how much the candidate reminds you of yourself. That's the bias that produces a confident yes and a bad hire six weeks later.

The hiring decision is only as good as the thing you agreed to measure before the interview started.

The first kit won't be perfect. After a few hires, you'll know which questions actually predicted performance and which were theater. Keep the predictive ones.

At a glance
Complexity: Easy
Tools needed: Claude Pro or above; Claude Desktop → Cowork mode, on macOS or Windows; Claude Projects
Time to build: ~30 min first time · ~5 min per role after that
Best for: Founder or hiring manager who runs their own interview loops in B2B services and small companies

What this solves

Interviewers asking different questions, scoring on gut feel, and the hiring decision going to whoever argues hardest in the debrief.

The problem

Two interviewers meet the same candidate and come back with opposite reads. One loved them; one didn’t click. Neither can fully say why, because neither asked the same questions or scored against anything written down. The decision comes down to who argues hardest in the debrief — and that’s how a confident hire turns into a six-week regret. The interview felt thorough. It just wasn’t measuring anything consistent.

The fix isn’t more interviews. It’s deciding, before anyone walks in, the handful of things the role actually requires and how you’ll know a candidate has them.

Ingredients

  • Claude Subscriptions: Cowork and Projects aren't available on the free plan
  • Platforms & Modes: Cowork runs in the desktop app only, not on web or mobile. This recipe builds a multi-part kit (competencies, question sets, and a rubric) across files
  • Claude Projects: How your company evaluates people doesn't change role to role: the values, the bar, the way you score. A Project holds that so every kit is built the same way

How it works

1
A Claude Project

You hire for different roles, but how your company evaluates people is constant — your values, your bar, your scoring scale. A Project stores that once so every interview kit is built from the same standard instead of reinvented per opening.

  1. Open Claude Desktop and click Cowork in the mode selector across the top (Chat · Cowork · Code).
  2. In the left panel, find Projects and click the + button.
  3. Choose Start from scratch. Name the project “Interview Kits” or “Hiring” and let Claude set up its folder.
  4. You’ll know it worked when the project appears in the left panel with its own folder and an instructions field.
2
Set Up Your Workspace

Your project has a folder on your computer — that’s where the two working files live.

  1. Ask Claude, right in the project: “Create two files in this project’s folder: hiring-standard.md and roles.md. Leave them empty — I’ll fill them in.” (Or create them yourself in any text editor and save them into the project folder.)
  2. Fill in both files using the descriptions below.
  3. Confirm Claude can see them: ask “List the files you can see in this project.” Both filenames should come back. If they don’t, see If It Doesn’t Work.
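
If Claude does not list both files, a quick local check can confirm whether they are actually in the folder. This is a minimal sketch, assuming you know the path of the project's folder on your computer; the function name `missing_files` is hypothetical, and the check only looks at your disk, not at what Claude can see.

```python
from pathlib import Path

# The two working files this recipe expects in the project's folder.
REQUIRED_FILES = ("hiring-standard.md", "roles.md")


def missing_files(project_folder, required=REQUIRED_FILES):
    """Return the required filenames that are not present in project_folder."""
    folder = Path(project_folder)
    return [name for name in required if not (folder / name).is_file()]
```

Call it with your own folder path, for example `missing_files("path/to/your/project/folder")`. An empty list means both files are where the project expects them.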

hiring-standard.md

How your company evaluates everyone, regardless of role: your scoring scale, your must-have values, and the competencies you always test. If it’s never been written down, the Knowledge File Seeding guide shows how to have Claude interview you into it.

Example: “Score each competency 1–4 (1 = no evidence, 4 = strong evidence with specifics). Always test: ownership, communication, problem-solving under ambiguity. Culture must-haves: gives direct feedback, admits mistakes. Two strong-no votes on a must-have = no hire, regardless of skill score.”
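
To make the rule in that example concrete, here is a minimal sketch of the must-have veto in Python. It rests on two assumptions the example does not spell out: that a score of 1 counts as a "strong no", and that the two votes must land on the same must-have. The names `Score` and `must_have_veto` are hypothetical, used only for illustration.

```python
from collections import Counter
from dataclasses import dataclass

MIN_SCORE, MAX_SCORE = 1, 4   # the 1-4 scale from the example standard
STRONG_NO = 1                 # ASSUMPTION: a score of 1 is a "strong no"
STRONG_NO_LIMIT = 2           # two strong-no votes on a must-have = no hire


@dataclass(frozen=True)
class Score:
    interviewer: str
    competency: str
    value: int
    is_must_have: bool = False


def must_have_veto(scores):
    """Return True if any single must-have collects enough strong-no votes.

    ASSUMPTION: votes are counted per must-have, not pooled across them.
    """
    for s in scores:
        if not MIN_SCORE <= s.value <= MAX_SCORE:
            raise ValueError(f"score out of range for {s.competency}: {s.value}")
    strong_nos = Counter(
        s.competency for s in scores if s.is_must_have and s.value == STRONG_NO
    )
    return any(count >= STRONG_NO_LIMIT for count in strong_nos.values())
```

Writing the rule this way exposes the decisions your standard has to make explicit: which score means "strong no", and whether votes on different must-haves add up.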

roles.md

What’s specific to each role — the two or three competencies that actually predict success in that job, beyond the always-test list.

Example: “Account Executive: discovery skill (can they uncover a real problem, not just pitch), resilience after a no, forecasting honesty. Support Lead: de-escalation, written clarity, judgment on when to escalate.”

3
Prompt Claude

Open your Project in Claude Cowork. Give Claude the specifics for this run, ask for the main output, then follow up for any additional pieces you want. The exact wording for each prompt — with what it’s asking for and why — is in What you actually type below.

4
Review What Comes Back

Check three things before you put the kit in front of an interviewer:

  1. The questions ask for evidence, not opinions. “Tell me about a time you owned a failure” gets a story you can score. “Are you a good problem solver?” gets a yes. If a question can be answered with a self-assessment, ask Claude to rewrite it to demand a specific past example.
  2. The rubric describes what good looks like. A 1–4 scale with no anchors is just vibes with numbers. Each score should have a sentence describing the evidence that earns it. Push back if the anchors are vague.
  3. No two interviewers test the same competency. A loop where everyone asks about communication and nobody probes judgment has a blind spot. Make sure the kit divides coverage so the panel sees the whole candidate.

Before you run the loop: does the kit match the job as it really is, not the job description? If the role’s hardest part is managing an impatient client, the kit should test that. Thirty seconds of your own knowledge keeps the kit honest.
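
Checks 2 and 3 above are mechanical enough to sketch in code. The following is an illustration only, assuming you have copied the rubric and the interviewer assignments into plain Python dictionaries; the data shapes and the function names `missing_anchors` and `coverage_problems` are hypothetical, not something Claude produces in this form.

```python
SCALE = (1, 2, 3, 4)  # the 1-4 scale used in the example hiring standard


def missing_anchors(rubric, scale=SCALE):
    """Return (competency, score) pairs that have no anchor sentence.

    rubric maps competency -> {score: anchor sentence}.
    """
    gaps = []
    for competency, anchors in rubric.items():
        for score in scale:
            if not anchors.get(score, "").strip():
                gaps.append((competency, score))
    return gaps


def coverage_problems(assignments, competencies):
    """Return (uncovered, duplicated) for a panel.

    assignments maps interviewer -> list of competencies they will test.
    uncovered lists competencies nobody tests; duplicated maps each
    competency tested by more than one interviewer to those interviewers.
    """
    owners = {}
    for interviewer, assigned in assignments.items():
        for competency in assigned:
            owners.setdefault(competency, []).append(interviewer)
    uncovered = [c for c in competencies if c not in owners]
    duplicated = {c: names for c, names in owners.items() if len(names) > 1}
    return uncovered, duplicated
```

Check 1, whether a question demands evidence rather than a self-assessment, still needs a human read.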

What you actually type

Name your files explicitly for the first few runs, and ask Claude to show its work on anything that matters.

Prompt A · Generate
Ask for the kit
Build a structured interview kit for the Account Executive role using `hiring-standard.md` and the AE section of `roles.md`. For each competency, give me two behavioral questions that ask for past examples, what a strong answer sounds like, and what a weak answer sounds like. Then build a one-page scorecard interviewers fill in.
Prompt B · Follow up
Follow up to tailor it
Add one question that probes for our culture must-have about giving direct feedback.
Prompt C · Refine
Follow up to tailor it
Turn the scorecard into a shared version with space for evidence quotes, not just numbers.
What you get back

An interview kit with three labeled parts: the competencies for the role, two behavioral questions per competency with what a strong and a weak answer sound like, and a one-page scorecard where every score has an anchor sentence describing the evidence that earns it. If a question can be answered with a self-assessment, or the scale is numbers with no anchors, it missed — send it back.

What this does not do
  • It doesn't interview anyone. The kit decides what to measure; reading a candidate's actual answers in the room is still the interviewer's job.
  • It only encodes the bar you give it. Generic competencies in `roles.md` produce a generic kit — the predictive power comes from your files, not the template.
  • It won't tell you which questions predicted performance. That takes a few hires and your own notes on who worked out. Fold those lessons back into your files.

If it doesn’t work

  • No Cowork tab in Claude Desktop — update the app to the latest version and confirm you’re on a paid plan; Cowork isn’t on the free tier. On Windows, Cowork also needs the Virtual Machine Platform feature enabled — if the tab still won’t appear, that’s the fix.
  • Claude can’t see hiring-standard.md or roles.md — the files aren’t in the project’s folder, or they’re in a different folder than the one the project owns. Open the project, check which folder it points to, and move the files there. Then re-run “list the files you can see.”
  • The kit could be for any role — the questions test “communication” and “teamwork” in the abstract because roles.md names generic competencies. Sharpen them to what predicts success in this job (“discovery skill,” “resilience after a no”) and run it again. The kit is only as specific as the competencies behind it.
  • The scorecard is numbers with no anchors — a 1–4 scale where nothing defines a 3 is gut feel in disguise. The anchors come from hiring-standard.md: add a one-line definition of the evidence each score requires, then ask: “Rebuild the scorecard with an anchor sentence for every score.”

Extra credit

Small additions that pay back the next time you run it.

  • Google Drive — keep hiring-standard.md in a shared Drive folder so every hiring manager builds kits from the same bar. See the Connectors guide.
  • Calendar connector — have Claude draft the loop schedule with each interviewer assigned their competency, then place the slots.
  • Reuse the debrief — after interviews, paste the filled scorecards back and ask Claude to summarize the evidence by competency before the panel meets, so the debrief argues from notes, not memory.
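
The debrief summary in the last item is a grouping of scorecard rows by competency. The recipe has Claude do this from pasted scorecards; the sketch below shows the same grouping locally, assuming each scorecard row is a dictionary with hypothetical keys `interviewer`, `competency`, `score`, and `evidence`.

```python
from collections import defaultdict
from statistics import mean


def evidence_by_competency(scorecards):
    """Group scorecard rows by competency for the debrief.

    Each row is a dict with keys: interviewer, competency, score, evidence.
    Returns competency -> mean score, score spread, and attributed evidence.
    """
    grouped = defaultdict(list)
    for row in scorecards:
        grouped[row["competency"]].append(row)

    summary = {}
    for competency, rows in grouped.items():
        scores = [r["score"] for r in rows]
        summary[competency] = {
            "mean_score": mean(scores),
            # A large spread flags a disagreement to settle from the evidence.
            "spread": max(scores) - min(scores),
            "evidence": [(r["interviewer"], r["evidence"]) for r in rows],
        }
    return summary
```

A wide spread on one competency is where the panel should start reading the quotes, since that is where two interviewers saw different things.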

“You can't compare two candidates fairly if you asked them different questions and scored them on different things.”

What this teaches you about Claude Cowork

The recipe is one application. The principles apply to everything you’d hand to Claude.

Deciding the measure is the work. Most hiring mistakes happen before the interview, when nobody agreed on what to test. Writing competencies and a rubric into a file forces that decision up front — which is where it actually changes the outcome.

Consistency is what makes comparison fair. When every candidate gets the same evidence-based questions scored on the same rubric, the differences you see are real differences, not artifacts of who interviewed whom. The structure is what lets your judgment be trusted.

The file learns which questions predict performance. After a few hires, you know which questions separated the people who worked out from the people who didn’t. Keep those, cut the rest, and your kit gets sharper than any template, because it’s tuned to your actual results.

Who this is for

Founder or hiring manager who runs their own interview loops in B2B services and small companies (5–50 employees).

The pain: Two interviewers, opposite reads, and nothing written down to settle it

The outcome: Every candidate gets the same evidence-based questions scored on one rubric

Published June 9, 2026